Fading Coder

One Final Commit for the Last Sprint

Home > Tech > Content

Understanding InnoDB Index Structure and Maintenance

Tech May 15 1

Data Structure

A B+ tree is a multi-way balanced search tree. All leaf nodes reside at the same depth. Leaf nodes are interconnected via a doubly linked list to optimize range scans.

Purpose

Indexes expedite query execution by reducing the number of rows scanned.

Index Creation

Values from the indexed columns are sorted, and a B+ tree structure is built on top of them.

Storage Rules

  • Internal nodes hold index keys and pointers to child nodes, stored in interleaved fashion.
  • For a clustered index (primary key), leaf nodes contain the full row data.
  • For a secondary index (non-primary key), leaf nodes store the index key together with the corresponding primary key value.

Query Execution

Starting from the root, a binary search determines which child pointer to follow. This process repeats at each level until a leaf node is reached.

  • If the query uses a clustered index, the row data is retrieved directly from the leaf.
  • If it uses a secondary index, the primary key is obtained from the leaf, and a subsequent lookup on the clustered index (called a "table access by index rowid" or "back to the table") fetches the full row.

Index Rebalancing

Because the B+ tree must remain strictly balanced and ordered, any insert, update, or delete operation may trigger node splitting or merging to preserve these properties.

Impact of Rebalancing

  • InnoDB stores both index and data in units called "pages" (default 16 KB).
  • Page splits allocate new pages and migrate data into them; page merges combine adjacent pages.
  • These operations can leave pages partially filled, especially for clustered indexes where leaf pages contain the actual row data. Over time, storage fragmentation increases—more pages with unused space appear.

Index Rebuilding

When fragmentation reaches a certain threshold, MySQL automatically rebuilds the index. New pages are allocated, a fresh index tree is constructed, and all data is moved into the new structure in sorted order.

Consequences of Rebuilding

  • Fragmentation is eliminated, improving storage efficiency and scan performance.
  • The rebuild consumes significant CPU and I/O resources. To ensure data consistency, MySQL may acquire locks (such as table locks) during the process, potentially causing query delays or brief service stalls.

Related Articles

Understanding Strong and Weak References in Java

Strong References Strong reference are the most prevalent type of object referencing in Java. When an object has a strong reference pointing to it, the garbage collector will not reclaim its memory. F...

Comprehensive Guide to SSTI Explained with Payload Bypass Techniques

Introduction Server-Side Template Injection (SSTI) is a vulnerability in web applications where user input is improper handled within the template engine and executed on the server. This exploit can r...

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

Django’s Admin panel is highly user-friendly, and pairing it with TinyMCE, an effective rich text editor, simplifies content management significantly. Combining the two is particular useful for bloggi...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.