Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

InnoDB Multi-Version Concurrency Control: Snapshot Isolation and Undo Log Mechanics

Tech May 7 3

Core Concepts and Scope

Multi-Version Concurrency Control (MVCC) enables the InnoDB storage engine to process concurrent read and write operations without explicit row-level locking. By maintaining multiple temporal versions of data records, readers access consistent snapshots while writers modify newer copies, eliminating contention between read and write transactions. This mechanism operates exclusively under the READ COMMITTED and REPEATABLE READ isolation levels. In READ UNCOMMITTED, transactions read the latest uncommitted modifications directly, bypassing version chains. Under SERIALIZABLE, all operations execute sequentially, rendering versioning unnecessary.

Versioning Analogy

Consider a collaborative document platform where an author revises a published article. While the author modifies a draft copy, readers continue viewing the previously published version. Only after the author publishes the revision does the new content become visible to subsequent readers. MVCC implements similar copy-on-write semantics, allowing write operations to proceed on new versions while read transactions observe stable historical snapshots.

Hidden System Columns

Every user record contains four invisible metadata columns managed by InnoDB:

  • implicit_row_id: A 6-byte auto-incrementing value serving as the clustered index key when no explicit primary key or non-null unique index exists on the table.
  • last_modifier_xid: A 6-byte transaction identifier recording which transaction most recently modified this row. Read-only queries executed without an explicit transaction context receive xid = 0, while explicit transactions (including SELECT statements within BEGIN blocks) receive unique positive identifiers.
  • prev_version_ptr: A 7-byte pointer referencing the previous version of this row stored in the rollback segment.
  • logical_delete_flag: A 1-byte marker indicating the row has been logically deleted and awaits physical purging by background cleanup processes.

Undo Log Architecture

InnoDB maintains two distinct categories of undo records in the rollback segment:

Modification Undo Records preserve pre-update images of existing rows. These entries persist beyond transaction completion to support consistent reads for other transactiosn that require historical versions, and they enable transaction rollback during execution.

Insertion Undo Records track newly created rows and exist solely for rollback purposes. They require minimal metadata because uncommitted inserts remain invisible to other transactions regardless of isolation level.

Version Chain Formation

When a transaction modifies an existing row, InnoDB executes the following sequence:

  1. Acquire an exclusive lock on the target record.
  2. Copy the current row contents to a modification undo record in the rollback segment.
  3. Update the row's data, set last_modifier_xid to the current transaction ID, and adjust prev_version_ptr to reference the undo record.
  4. Release the exclusive lock upon transaction commit.

New versions prepend to the chain rather than append, creating a singly-linked list ordered from newest to oldest. Each node in this chain represents the row state at a specific point in time.

Example Scenario

Consider a products table containing inventory data:

-- Transaction Alpha
BEGIN;
UPDATE products SET price = 29.99 WHERE sku = 'TECH-001';
COMMIT;
-- Transaction Beta  
BEGIN;
UPDATE products SET stock_count = stock_count - 5 WHERE sku = 'TECH-002';
COMMIT;

These operations generate independent version chains for each modified row, with undo records preserving the pre-transaction values.

ReadView Composition

When a transaction executes its first snapshot read (non-locking SELECT), InnoDB instantiates a ReadView—a visibility checkpoint capturing the current system state:

  • creator_xid: The transaction identifier that constructed this snapshot.
  • active_xids: A sorted list of all transaction IDs that remain uncommitted at the moment of snapshot creation.
  • oldest_active_xid: The minimum value within active_xids.
  • next_allocation_xid: The transaction ID that will be assigned to the next transaction started after this snapshot.

Sample Configuration

Assume the system contains transactions T10 through T20, where T12, T15, and T18 remain active while T11, T13, T16, T17, and T19 have committed, and T20 has rolled back. A new read operation generates:

{
  "creator_xid": 0,
  "active_xids": [12, 15, 18],
  "oldest_active_xid": 12,
  "next_allocation_xid": 21
}

Visibility Resolution Algorithm

To locate the appropriate version for a query, InnoDB evaluates the row's last_modifier_xid against the ReadView using the following decision tree:

Step 1: Creator Verification If the row's last_modifier_xid equals the ReadView's creator_xid, the transaction reads its own uncommitted modifications—the current version is visible.

Step 2: Historical Boundary Check If last_modifier_xid is less than oldest_active_xid, the modifying transaction committed before snapshot creation—the current version is visible.

Step 3: Future Transaction Check If last_modifier_xid equals or exceeds next_allocation_xid, the modification occurred after snapshot creation—the current version is invisible, and the system traverses to the previous version via prev_version_ptr.

Step 4: Active Status Verification For values falling between oldest_active_xid and next_allocation_xid:

  • If last_modifier_xid exists in active_xids: The modifying transaction remains uncommitted—the current version is invisible; follow prev_version_ptr to older iterations.
  • If absent from active_xids: The modifying transaction committed after the oldest active transaction started but before snapshot creation—the current version is visible.

Isolation Level Implementation

Read Committed: Each snapshot read generaets a fresh ReadView. Consequently, subsequent queries within the same transaction observe changes committed by other transactions since the previous statement.

Repeatable Read: Only the first snapshot read creates a ReadView; all subsequent queries within the transaction reuse this identical view. This ensures that once a transaction reads a row, it sees the same version throughout its lifetime, preventing non-repeatable reads and phantom reads within the scope of queried data.

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.