As commitment to our database literacy campaign, we're offering our Certified Database Practitioner course—for 50% OFF at $99!

Skip to main content
Completion requirements

This section provides insight into how RDBMS architecture supports database functionality and performance.

Database Engines

Types and Implementations

  • InnoDB (MySQL/MariaDB): Transaction-safe engine with ACID compliance, row-level locking, and foreign key constraints support. Features crash recovery and multi-versioning for concurrent transactions.
  • MyISAM (MySQL): Non-transactional engine optimized for read-heavy workloads with table-level locking. Offers faster performance for read-only scenarios.
  • PostgreSQL's Engine: Single integrated engine with extensible architecture supporting custom data types, operators, and functions. Implements MVCC (Multi-Version Concurrency Control) natively.
  • Oracle Database Engine: Sophisticated engine with proprietary storage structures (tablespaces, segments, extents, blocks) and background processes for managing memory and I/O.
  • SQL Server Engine: Consists of relational engine (query processing) and storage engine (data access and management).

Engine Selection Criteria

  • Transactional requirements (ACID properties)
  • Read vs. write workload balance
  • Concurrency requirements
  • Availability and recovery requirements
  • Scaling needs (vertical vs. horizontal)

Storage Mechanisms

Physical Storage Structures

  • Pages/Blocks: Fundamental unit of I/O (typically 4KB to 16KB)
  • Extents: Collection of contiguous pages/blocks
  • Segments: Logical storage units composed of extents
  • Tablespaces: Container for segments across multiple files

Data Organization

  • Heap Tables: Unordered collections of rows
  • Index-Organized Tables: Data stored in B-tree structure ordered by primary key
  • Partitioning: Horizontal division of tables based on range, list, hash, or composite schemes
  • Sharding: Distributing data across multiple physical servers

Storage Format Optimization

  • Row vs. Column Storage: Tradeoffs between OLTP and OLAP workloads
  • Compression Techniques: Page, row, and column level compression
  • Data Encoding: Binary encoding, dictionary encoding for space efficiency
  • Sparse Data Handling: Null value storage optimization

Memory and Disk Utilization

Memory Components

  • Buffer Pool/Cache: Keeps frequently accessed data pages in memory
  • Query Cache: Stores results of frequent queries
  • Sort Buffer: Dedicated memory for sorting operations
  • Join Buffer: Memory allocated for join operations
  • Log Buffer: Temporarily stores transaction log records

Memory Management Strategies

  • LRU (Least Recently Used): Algorithm for page replacement
  • Pinning: Keeping critical pages in memory
  • Scan-resistant Caching: Preventing full table scans from flushing useful data
  • Adaptive Hash Indexing: Dynamic in-memory indexing based on access patterns

Disk I/O Optimization

  • Read-ahead: Prefetching anticipated pages
  • Write Clustering: Grouping multiple writes into fewer I/O operations
  • Asynchronous I/O: Non-blocking disk operations
  • Direct I/O vs. Buffered I/O: Bypassing OS cache for specialized workloads
  • RAID Configurations: Different redundancy and performance options (RAID 0, 1, 5, 10)

Concurrency and Transaction Management

Transaction Isolation Levels

  • Read Uncommitted: Lowest isolation, allows dirty reads
  • Read Committed: Prevents dirty reads, but allows non-repeatable reads
  • Repeatable Read: Prevents non-repeatable reads, but allows phantom reads
  • Serializable: Highest isolation, prevents all concurrency anomalies

Locking Mechanisms

  • Shared (S) vs. Exclusive (X) Locks: Reading vs. writing
  • Granularity Levels: Row, page, table locks
  • Deadlock Detection and Resolution: Timeout or dependency cycle detection
  • Lock Escalation: Converting multiple fine-grained locks to fewer coarse-grained locks

Concurrency Control Methods

  • Pessimistic Concurrency Control: Locks resources before accessing
  • Optimistic Concurrency Control: Checks for conflicts at commit time
  • MVCC (Multi-Version Concurrency Control): Maintains multiple data versions to enhance concurrency
  • Two-Phase Commit Protocol: Ensures atomic transaction across distributed systems

Transaction Logging

  • Write-Ahead Logging (WAL): Recording changes before applying to data
  • Checkpoint Process: Synchronizing memory state with disk
  • Redo vs. Undo Logs: Forward vs. backward recovery mechanisms
  • Point-in-time Recovery: Ability to restore database to specific moment
Last modified: Thursday, 10 April 2025, 3:59 PM