As commitment to our database literacy campaign, we're offering our Database Foundations course—for FREE!

Skip to main content
Completion requirements

This section provides insight into how RDBMS architecture supports database functionality and performance.

Database Engines

Types and Implementations

  • InnoDB (MySQL/MariaDB): Transaction-safe engine with ACID compliance, row-level locking, and foreign key constraints support. Features crash recovery and multi-versioning for concurrent transactions.
  • MyISAM (MySQL): Non-transactional engine optimized for read-heavy workloads with table-level locking. Offers faster performance for read-only scenarios.
  • PostgreSQL's Engine: Single integrated engine with extensible architecture supporting custom data types, operators, and functions. Implements MVCC (Multi-Version Concurrency Control) natively.
  • Oracle Database Engine: Sophisticated engine with proprietary storage structures (tablespaces, segments, extents, blocks) and background processes for managing memory and I/O.
  • SQL Server Engine: Consists of relational engine (query processing) and storage engine (data access and management).

Engine Selection Criteria

  • Transactional requirements (ACID properties)
  • Read vs. write workload balance
  • Concurrency requirements
  • Availability and recovery requirements
  • Scaling needs (vertical vs. horizontal)

Storage Mechanisms

Physical Storage Structures

  • Pages/Blocks: Fundamental unit of I/O (typically 4KB to 16KB)
  • Extents: Collection of contiguous pages/blocks
  • Segments: Logical storage units composed of extents
  • Tablespaces: Container for segments across multiple files

Data Organization

  • Heap Tables: Unordered collections of rows
  • Index-Organized Tables: Data stored in B-tree structure ordered by primary key
  • Partitioning: Horizontal division of tables based on range, list, hash, or composite schemes
  • Sharding: Distributing data across multiple physical servers

Storage Format Optimization

  • Row vs. Column Storage: Tradeoffs between OLTP and OLAP workloads
  • Compression Techniques: Page, row, and column level compression
  • Data Encoding: Binary encoding, dictionary encoding for space efficiency
  • Sparse Data Handling: Null value storage optimization

Memory and Disk Utilization

Memory Components

  • Buffer Pool/Cache: Keeps frequently accessed data pages in memory
  • Query Cache: Stores results of frequent queries
  • Sort Buffer: Dedicated memory for sorting operations
  • Join Buffer: Memory allocated for join operations
  • Log Buffer: Temporarily stores transaction log records

Memory Management Strategies

  • LRU (Least Recently Used): Algorithm for page replacement
  • Pinning: Keeping critical pages in memory
  • Scan-resistant Caching: Preventing full table scans from flushing useful data
  • Adaptive Hash Indexing: Dynamic in-memory indexing based on access patterns

Disk I/O Optimization

  • Read-ahead: Prefetching anticipated pages
  • Write Clustering: Grouping multiple writes into fewer I/O operations
  • Asynchronous I/O: Non-blocking disk operations
  • Direct I/O vs. Buffered I/O: Bypassing OS cache for specialized workloads
  • RAID Configurations: Different redundancy and performance options (RAID 0, 1, 5, 10)

Concurrency and Transaction Management

Transaction Isolation Levels

  • Read Uncommitted: Lowest isolation, allows dirty reads
  • Read Committed: Prevents dirty reads, but allows non-repeatable reads
  • Repeatable Read: Prevents non-repeatable reads, but allows phantom reads
  • Serializable: Highest isolation, prevents all concurrency anomalies

Locking Mechanisms

  • Shared (S) vs. Exclusive (X) Locks: Reading vs. writing
  • Granularity Levels: Row, page, table locks
  • Deadlock Detection and Resolution: Timeout or dependency cycle detection
  • Lock Escalation: Converting multiple fine-grained locks to fewer coarse-grained locks

Concurrency Control Methods

  • Pessimistic Concurrency Control: Locks resources before accessing
  • Optimistic Concurrency Control: Checks for conflicts at commit time
  • MVCC (Multi-Version Concurrency Control): Maintains multiple data versions to enhance concurrency
  • Two-Phase Commit Protocol: Ensures atomic transaction across distributed systems

Transaction Logging

  • Write-Ahead Logging (WAL): Recording changes before applying to data
  • Checkpoint Process: Synchronizing memory state with disk
  • Redo vs. Undo Logs: Forward vs. backward recovery mechanisms
  • Point-in-time Recovery: Ability to restore database to specific moment
Last modified: Thursday, 10 April 2025, 3:59 PM