MySQL Transaction Isolation and Concurrency Control Mechanisms
ACID Properties and Log Management
Database transactions rely on four core properties to guarantee reliable data processing. Atomicity ensures operations execute as a single, indivisible unit. Partial failures trigger a complete rollback using undo logs to revert buffer pool modifications. Isolation guarantees concurrent transactions operate independently, preventing interference through serialization or multiversioning techniques. Durability persists committed changes via redo logs and the two-phase commit protocol, safeguarding against system crashes even before dirty pages flush to disk. Consistency maintains structural and business constraints, emerging as a direct result of the other three properties enforced by underlying logging systems.
Concurrency Anomalies
Simultaneous transaction execution introduces specific data integrity risks:
- Lost Updates: Overwrites occur when concurrent modifications target the same row, either via rollback or commit overwriting another session's changes.
- Dirty Reads: A transaction retrieves uncommitted modifications from a concurrent session. If the originating session rolls back, the read data becomes invalid.
- Non-Repeatable Reads: Subsequent queries within a single transaction yield different values because an external session committed a modification to the target rows.
- Phantom Reads: Repeated range queries return varying row counts due to concurrent insertions adding new qualifying records.
Distinction: Dirty reads involve uncommitted data, whereas non-repeatable reads stem from committed modifications to the same row. Phantom reads differ by affecting result set cardinality across a query range rather than individual record values.
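The lost-update case can be sketched as a plain in-memory simulation, with no real database involved (the account name and values are illustrative):

```python
# Toy illustration of a lost update: two sessions read the same balance,
# then each writes back its own computed result; the second write
# silently discards the first session's change.

balance = {"acct": 100}

# Session A and Session B both read the current value (100).
read_a = balance["acct"]
read_b = balance["acct"]

# Session A commits a +50 deposit based on its read.
balance["acct"] = read_a + 50   # balance is now 150

# Session B commits a -30 withdrawal based on its stale read of 100.
balance["acct"] = read_b - 30   # balance is now 70: A's deposit is lost

lost_update_result = balance["acct"]
# The correct serial outcome would have been 100 + 50 - 30 = 120.
```

Locking or version checks (shown later in this article) exist precisely to force Session B to observe or conflict with Session A's write.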
Isolation Tiers
The SQL standard defines isolation levels that balance performance and data consistency:
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Blocked | Possible | Possible |
| Repeatable Read (Default) | Blocked | Blocked | Mitigated via Gap Locks |
| Serializable | Blocked | Blocked | Blocked |
- READ UNCOMMITTED: Permits visibility of uncommitted changes. Prevents rollback-based lost updates but allows dirty reads.
- READ COMMITTED: Restricts visibility to committed data only. Eliminates dirty reads but permits non-repeatable reads.
- REPEATABLE READ: Guarantees a consistent dataset snapshot for the transaction lifecycle. Prevents dirty and non-repeatable reads. InnoDB mitigates phantom reads using gap locking.
- SERIALIZABLE: Enforces strict ordering via read/write locks on all accessed data. Eliminates all concurrency anomalies at the cost of significant lock contention and latency.
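The practical difference between READ COMMITTED and REPEATABLE READ is when the read view is established. A minimal model, assuming nothing about InnoDB internals (the `Session` class and `committed` dict are illustrative only):

```python
# RC re-reads the latest committed state on every statement;
# RR pins the snapshot taken at the transaction's first read.

committed = {"x": 1}  # globally committed state

class Session:
    def __init__(self, level):
        self.level = level      # "RC" or "RR"
        self.snapshot = None    # pinned view, used only for RR

    def select(self, key):
        if self.level == "RC":
            return committed[key]            # fresh view per statement
        if self.snapshot is None:
            self.snapshot = dict(committed)  # view created once per txn
        return self.snapshot[key]

rc, rr = Session("RC"), Session("RR")
first_rc, first_rr = rc.select("x"), rr.select("x")  # both see 1

committed["x"] = 2          # another session commits a change

second_rc = rc.select("x")  # RC observes the new commit: 2
second_rr = rr.select("x")  # RR still sees its pinned snapshot: 1
```

The RC session exhibits a non-repeatable read; the RR session does not.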
Locking Strategies and Granularity
InnoDB employs multiple locking dimensions to manage concurrent access:
- Granularity: Table locks offer fast acquisition but severely limit concurrency. Row locks provide fine-grained control with higher overhead but superior concurrency. Page locks sit between both but are largely deprecated in modern architectures.
- Access Modes: Shared locks (S) allow concurrent reads, while exclusive locks (X) block all other access. Intention locks (IS, IX) act as table-level metadata flags signaling row-level lock presence, preventing expensive full-table scans during lock compatibility checks.
- Execution Strategy: Pessimistic locking acquires resources before modification. Optimistic locking defers conflict detection until commit time, typically using version counters or timestamp comparisons.
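The value of intention locks is that compatibility between a new table-level request and existing holders reduces to a small matrix lookup instead of a scan of every row lock. A sketch of the standard InnoDB table-lock compatibility matrix (the function name is illustrative):

```python
# True means the requested mode coexists with a mode held by another
# transaction; X conflicts with everything, IS with everything but X.
COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

def can_grant(requested, held_modes):
    """Grant a table lock only if the requested mode is compatible
    with every mode currently held by other transactions."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)
```

For example, two writers touching different rows both hold IX and coexist, while a full-table S request is blocked by any IX holder.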
InnoDB Row Lock Algorithms
InnoDB implements row-level locking through B+ tree index structures using three core algorithms:
- Record Locks: Secure a single index entry. Supported in both RC and RR isolation levels.
- Gap Locks: Protect the space between index values, preventing phantom inserts into empty ranges. Exclusive to the RR level.
- Next-Key Locks: Combine record and gap locks to secure both a record and its preceding index range, effectively preventing phantom reads in standard operations.
Locking behavior adapts dynamically based on query patterns. Range scans default to Next-Key locks. Equality lookups on unique indexes narrow to Record locks as a performance optimization. Without proper index coverage, the engine cannot pinpoint target rows and must lock every row it scans, which behaves like a full-table lock. Note that InnoDB does not perform row-to-table lock escalation; row locks remain row locks regardless of how many are held.
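The interval coverage of next-key locks for a range scan can be sketched with a simplified model; real locks attach to B+ tree index records, and this function only approximates which (gap, record] intervals a `WHERE lo <= k <= hi` scan would cover over a set of existing index values:

```python
# Each next-key lock covers an index record plus the gap before it.
# The gap above the last match is sealed by locking the next record
# (or the supremum pseudo-record at the end of the index).

def next_key_locks(index_values, lo, hi):
    """Return (gap_start, record] intervals covered by a range scan."""
    locks = []
    prev = float("-inf")
    for v in sorted(index_values):
        if v > hi:
            # lock the first record past the range to seal the upper gap
            locks.append((prev, v))
            return locks
        if v >= lo:
            locks.append((prev, v))
        prev = v
    locks.append((prev, float("inf")))  # supremum pseudo-record
    return locks
```

With index values 10, 20, 30, 40 and the predicate `15 <= k <= 25`, the intervals (10, 20] and (20, 30] are covered, so an insert of 17 or 25 by another session would block rather than create a phantom.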
Deadlock Prevention and Diagnostics
Explicit table locks utilize LOCK TABLES, though implicit row locking is standard for transactional workloads:
LOCK TABLES inventory READ, order_logs WRITE;
UNLOCK TABLES;
Pessimistic row locks are acquired via:
SELECT current_stock FROM warehouse_bins WHERE bin_id = 42 FOR UPDATE;
Optimistic implementations rely on application-level version fields:
UPDATE inventory SET qty = ?, version_seq = version_seq + 1
WHERE sku_id = ? AND version_seq = ?;
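The versioned UPDATE above can be exercised end to end with sqlite3 standing in for MySQL (the schema mirrors the article's column names; the helper function is illustrative). The key detail is checking the affected-row count: zero rows means a concurrent writer bumped the version first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (sku_id INTEGER PRIMARY KEY, "
             "qty INTEGER, version_seq INTEGER)")
conn.execute("INSERT INTO inventory VALUES (1, 10, 0)")

def optimistic_update(conn, sku_id, new_qty, expected_version):
    """Apply the write only if version_seq is still what we read."""
    cur = conn.execute(
        "UPDATE inventory SET qty = ?, version_seq = version_seq + 1 "
        "WHERE sku_id = ? AND version_seq = ?",
        (new_qty, sku_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False: another writer got there first

first = optimistic_update(conn, 1, 7, 0)   # succeeds, version becomes 1
stale = optimistic_update(conn, 1, 99, 0)  # fails: version is no longer 0
```

On failure the application typically re-reads the row and retries, rather than blocking as a pessimistic FOR UPDATE would.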
Deadlocks arise from circular wait conditions. Common scenarios include inconsistent multi-table access ordering, full-table scans promoting row locks, or shared-to-exclusive lock conversion waits. Prevention requires deterministic resource acquisition ordering, minimized transaction scopes, and strict index optimization. Diagnostic commands SHOW ENGINE INNODB STATUS and monitoring Innodb_row_lock_waits metrics expose contention patterns and identify deadlock victims.
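Deterministic acquisition ordering can be sketched with plain thread locks standing in for row locks (the key values and function are illustrative): if every transaction sorts its target keys before locking, no cycle of waiters can form.

```python
import threading

# One lock per row, keyed by a hypothetical sku; sorting the keys gives
# a single global acquisition order shared by all transactions.
row_locks = {sku: threading.Lock() for sku in (101, 205, 307)}

def transfer(skus):
    """Acquire all row locks in ascending key order, release in reverse."""
    ordered = sorted(skus)
    for sku in ordered:
        row_locks[sku].acquire()
    try:
        pass  # row modifications would happen here
    finally:
        for sku in reversed(ordered):
            row_locks[sku].release()

# Two sessions listing the same rows in opposite textual order still
# lock them in the same global order, so neither can deadlock.
t1 = threading.Thread(target=transfer, args=([307, 101],))
t2 = threading.Thread(target=transfer, args=([101, 307],))
t1.start(); t2.start()
t1.join(); t2.join()
```

Without the sort, t1 holding 307 while waiting on 101, and t2 holding 101 while waiting on 307, is exactly the circular wait InnoDB's deadlock detector would break by rolling back a victim.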
Multi-Version Concurrency Control (MVCC)
MVCC decouples read operations from write locks by maintaining historical row versions. Each physical record contains hidden metadata: DB_TRX_ID (creator transaction identifier), DB_ROLL_PTR (pointer to the undo log segment), and a system-generated row identifier. Write operations generate new versions while preserving prior states in the undo log.
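The hidden columns link row versions into a newest-first chain. A toy model (the class mirrors the field names from the text; it is not InnoDB's on-disk layout):

```python
class RowVersion:
    """One version of a row, linked to its predecessor via the undo log."""
    def __init__(self, value, trx_id, roll_ptr=None):
        self.value = value
        self.trx_id = trx_id      # DB_TRX_ID: creating transaction
        self.roll_ptr = roll_ptr  # DB_ROLL_PTR: prior version in undo log

# trx 10 inserted the row; trx 20 and trx 30 each updated it.
v1 = RowVersion("a", trx_id=10)
v2 = RowVersion("b", trx_id=20, roll_ptr=v1)
head = RowVersion("c", trx_id=30, roll_ptr=v2)

def version_chain(head):
    """Walk the undo chain newest-first, as a consistent read does when
    searching for the newest version visible to its ReadView."""
    out, node = [], head
    while node is not None:
        out.append((node.trx_id, node.value))
        node = node.roll_ptr
    return out
```

A snapshot read starts at the clustered-index record (the head) and follows roll pointers until it finds a version whose creator is visible to its ReadView.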
Read operations utilize two modes:
- Snapshot Reads: Standard SELECT statements query historical versions defined by a ReadView. No explicit locks are acquired, enabling high concurrency.
- Current Reads: Modifications (UPDATE/DELETE/INSERT) and explicit locking clauses (SELECT ... FOR UPDATE) bypass history to fetch the latest committed state, acquiring exclusive or shared locks.
ReadView and Visibility Rules
A ReadView snapshot determines version visibility through transaction ID boundaries established at query execution:
- min_trx_id: Lowest active transaction ID in the system.
- max_trx_id: Next transaction ID slated for assignment.
- trx_ids: Snapshot of all currently active transaction identifiers.
- creator_trx_id: Identifier of the transaction generating the view.
Visibility Logic:
- If row.trx_id matches creator_trx_id, the version was written by the current transaction; visible.
- If row.trx_id < min_trx_id, the transaction committed before the snapshot was taken; visible.
- If row.trx_id >= max_trx_id, the transaction started after the snapshot was created; invisible.
- If row.trx_id exists within trx_ids, the transaction was still active when the snapshot was taken; invisible.
- Otherwise (the ID falls between min_trx_id and max_trx_id but is absent from trx_ids), the transaction had already committed; visible.
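The visibility rules transcribe directly into a predicate; a sketch using the field names above:

```python
def visible(row_trx_id, min_trx_id, max_trx_id, trx_ids, creator_trx_id):
    """Return True if a row version is visible to this ReadView."""
    if row_trx_id == creator_trx_id:
        return True                    # our own uncommitted write
    if row_trx_id < min_trx_id:
        return True                    # committed before the snapshot
    if row_trx_id >= max_trx_id:
        return False                   # started after the snapshot
    return row_trx_id not in trx_ids   # hidden if still active at snapshot
```

With a view of min_trx_id=5, max_trx_id=12, trx_ids={5, 8}, and creator_trx_id=8: version 3 is visible (old commit), version 7 is visible (committed between 5 and 12 but not active), versions 5 and 12 are not.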
READ COMMITTED regenerates the ReadView per statement, yielding the freshest committed data. REPEATABLE READ creates the view once per transaction, ensuring stable result sets throughout the session. Phantom reads in RR are suppressed by Next-Key locks during range scans, though mixing current and snapshot reads can still expose transient anomalies.
Background Purge Mechanism
Background purge threads asynchronously reclaim obsolete undo log entries and logically deleted records. Once the oldest active ReadView advances past a transaction's commit sequence number (trx_no), the associated historical versions become obsolete. The purge thread traverses the history list, permanently removing old versions and clearing the deleted_flag markers from the clustered index. This deferred cleanup ensures that long-running queries maintain snapshot consistency without forcing immediate write amplification.
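The purge eligibility rule reduces to a comparison against the oldest ReadView's horizon; a sketch (the dict-based history records and function name are illustrative, not InnoDB structures):

```python
# A historical version is reclaimable once its commit sequence number
# (trx_no) falls below the oldest active ReadView's horizon: no current
# or future reader can ever need to see it.

def purgeable(history_list, oldest_view_trx_no):
    """Split undo history into (reclaimable, retained) by trx_no."""
    reclaim = [v for v in history_list if v["trx_no"] < oldest_view_trx_no]
    retain = [v for v in history_list if v["trx_no"] >= oldest_view_trx_no]
    return reclaim, retain

history = [{"trx_no": 90}, {"trx_no": 120}, {"trx_no": 150}]
reclaim, retain = purgeable(history, 130)  # horizon from the oldest view
```

This is also why a single long-running transaction inflates the history list: its old ReadView pins the horizon, so nothing committed after it can be purged.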