Fading Coder


MySQL Index Types and Implementation Details

Tech May 19 1

In MySQL, an index (also referred to as a key) is a data structure that storage engines use to locate rows rapidly. By functioning similarly to a book's table of contents, indexes prevent the database from scanning the entire table to find relevant data, thereby significantly improving query performance.

Classification of Indexes

MySQL supports various index types to accommodate different data access patterns and integrity requirements.

1. Primary Key Index

A primary key index uniquely identifies each record, and its columns cannot contain NULL. A table is restricted to a single primary key. Auto-incrementing columns are frequently used as primary keys, but a primary key does not have to be auto-incrementing. It is best practice to define the primary key on a column with no business meaning (a surrogate key) and to use a numeric data type, which keeps the key compact and cheap to compare.
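A minimal sketch of the surrogate-key pattern, assuming a hypothetical `customers_demo` table (the table and column names are illustrative and do not come from the original article):

```sql
-- Hypothetical table: the primary key is a numeric surrogate with no business meaning.
CREATE TABLE customers_demo (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL,
    full_name VARCHAR(100),
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- The id is generated by the server; the application does not supply it.
INSERT INTO customers_demo (email, full_name)
VALUES ('ada@example.com', 'Ada Lovelace');

-- Returns the id generated by the most recent INSERT on this connection.
SELECT LAST_INSERT_ID();
```

Business attributes such as `email` stay outside the key, so they can change without the row's identity changing.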

2. Standard Index (INDEX or KEY)

Standard indexes, also known as non-unique indexes, are the most common type. They impose no uniqueness constraint and exist purely to speed up retrieval. They can be declared in CREATE TABLE or added later with CREATE INDEX or ALTER TABLE.

  • Single-Column Index: An index created on a specific column.

    ```
    CREATE INDEX idx_user_email ON employees(email);
    ```

  • Composite Index: An index covering multiple columns. This type follows the "Leftmost Prefix" rule.

    ```
    CREATE INDEX idx_user_profile ON employees(last_name, department_id);
    ```

    For the index above:
    - A query filtering by `last_name` will utilize the index.
    - A query filtering only by `department_id` will likely bypass the index.
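The leftmost-prefix behaviour can be checked with EXPLAIN. The sketch below assumes a hypothetical `employees_demo` table that mirrors the composite index above. The exact plan depends on the MySQL version, table size, and statistics, so treat the comments as what one would typically expect rather than guaranteed output.

```sql
-- Hypothetical table carrying the same composite index as in the text.
CREATE TABLE employees_demo (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(50) NOT NULL,
    department_id INT NOT NULL,
    hire_date DATE,
    INDEX idx_user_profile (last_name, department_id)
) ENGINE=InnoDB;

-- Leftmost column present: idx_user_profile is a candidate (see possible_keys / key).
EXPLAIN SELECT * FROM employees_demo WHERE last_name = 'Nguyen';

-- Both columns present: the index can narrow the search on both key parts.
EXPLAIN SELECT * FROM employees_demo
WHERE last_name = 'Nguyen' AND department_id = 7;

-- Leftmost column missing: the index usually cannot be used for the lookup,
-- and on a populated table the plan is typically a full scan (type = ALL).
EXPLAIN SELECT * FROM employees_demo WHERE department_id = 7;
```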
    
    

3. Unique Index

Unique indexes ensure that all non-NULL values in the indexed column are distinct. While similar to a primary key, they differ in key aspects: a unique index on a nullable column accepts NULL, and it accepts more than one NULL because NULL values are never considered equal to each other, whereas primary key columns cannot contain NULL at all. Additionally, a table can have multiple unique indexes but only one primary key.

ALTER TABLE products ADD UNIQUE INDEX idx_sku (product_sku);
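A short sketch of how a unique index treats duplicates and NULLs, assuming a hypothetical `products_demo` table with a nullable SKU column (names are illustrative):

```sql
-- Hypothetical table with a nullable column under a unique index.
CREATE TABLE products_demo (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    product_sku VARCHAR(32) NULL,
    UNIQUE INDEX idx_sku (product_sku)
) ENGINE=InnoDB;

INSERT INTO products_demo (product_sku) VALUES ('SKU-001');

-- Accepted: several NULLs may coexist, since NULL is never equal to another NULL.
INSERT INTO products_demo (product_sku) VALUES (NULL), (NULL);

-- Rejected with a duplicate-key error (ER_DUP_ENTRY, error 1062).
INSERT INTO products_demo (product_sku) VALUES ('SKU-001');
```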

4. Full-Text Index

Designed for text-based searches, full-text indexes were historically limited to the MyISAM engine but are now supported by InnoDB (MySQL 5.6+). They enable efficient querying of text columns using natural language search patterns.

CREATE TABLE articles (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body TEXT,
    FULLTEXT INDEX ft_search (title, body)
) ENGINE=InnoDB;
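A full-text index is queried with MATCH ... AGAINST rather than LIKE. The sketch below runs against the `articles` table defined above. The sample rows and search terms are made up for illustration, and relevance scores vary with the data and server settings.

```sql
-- Illustrative sample rows.
INSERT INTO articles (title, body) VALUES
    ('MySQL Tutorial', 'An introduction to indexing in a relational database'),
    ('Query Tuning', 'How to read EXPLAIN output and tune slow queries');

-- Natural language mode: rows are ranked by relevance to the search string.
-- The column list in MATCH() must be the same as the FULLTEXT index definition.
SELECT id, title,
       MATCH(title, body) AGAINST('indexing' IN NATURAL LANGUAGE MODE) AS score
FROM articles
WHERE MATCH(title, body) AGAINST('indexing' IN NATURAL LANGUAGE MODE);

-- Boolean mode: + requires a word, - excludes it.
SELECT id, title
FROM articles
WHERE MATCH(title, body) AGAINST('+mysql -tuning' IN BOOLEAN MODE);
```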

Index Data Structures

The underlying storage structure of an index dictates its behavior and efficiency. MySQL generally supports B-Tree, Hash, R-Tree, and Full-Text structures.

1. B-Tree Indexes

B-Tree (and its variant B+Tree) is the default index type for most MySQL storage engines.

  • MyISAM Implementation: The index file stores pointers (disk addresses) to the actual data records. The leaf nodes contain the data address.
  • InnoDB Implementation: InnoDB uses a clustered index structure. For the primary key, the leaf nodes contain the actual data record. Secondary indexes store the primary key value at the leaf nodes.
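Because an InnoDB secondary index stores the primary key at its leaves, a query that needs other columns has to make a second lookup in the clustered index, while a query that needs only the indexed column and the primary key can be answered from the secondary index alone. A sketch, assuming a hypothetical `orders_demo` table; the EXPLAIN comments describe the typical outcome, not guaranteed output:

```sql
-- Hypothetical table: clustered index on id, secondary index on customer_id.
CREATE TABLE orders_demo (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    customer_id INT NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    INDEX idx_customer (customer_id)
) ENGINE=InnoDB;

-- `total` lives only in the clustered index, so each match found in
-- idx_customer needs a further lookup by primary key.
EXPLAIN SELECT total FROM orders_demo WHERE customer_id = 42;

-- customer_id and id are both present in the secondary index leaves,
-- so the Extra column typically shows "Using index" (a covering index).
EXPLAIN SELECT id, customer_id FROM orders_demo WHERE customer_id = 42;
```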

B+Tree vs. B-Tree:

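  • Non-leaf nodes in a B+Tree store only key values (plus child pointers), while leaf nodes store keys and data (or pointers). In a plain B-Tree, by contrast, internal nodes hold data as well, so fewer keys fit in each node and the tree tends to be deeper.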
  • B+Tree leaf nodes are linked via a doubly-linked list, optimizing range scans.

2. Hash Indexes

Hash indexes are based on a hash table. They are primarily supported by the MEMORY engine.

  • Pros: Extremely fast for equality comparisons (=, IN).
  • Cons: They cannot be used for range queries or sorting. They are also susceptible to hash collisions.
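The trade-off can be sketched with a MEMORY table. The `session_cache_demo` table below is hypothetical; `USING HASH` is spelled out for clarity, although hash is already the default index type for MEMORY tables.

```sql
-- Hypothetical in-memory lookup table with an explicit hash index.
CREATE TABLE session_cache_demo (
    session_id CHAR(32) NOT NULL,
    user_id INT NOT NULL,
    PRIMARY KEY USING HASH (session_id)
) ENGINE=MEMORY;

INSERT INTO session_cache_demo (session_id, user_id)
VALUES ('0123456789abcdef0123456789abcdef', 1001);

-- Equality lookup: the hash index can locate the row directly.
EXPLAIN SELECT user_id FROM session_cache_demo
WHERE session_id = '0123456789abcdef0123456789abcdef';

-- Range predicate: a hash index has no ordering, so it cannot serve this
-- condition and the plan is expected to fall back to scanning the table.
EXPLAIN SELECT user_id FROM session_cache_demo
WHERE session_id > '0';
```

If range queries on a MEMORY table matter, the index can be declared with `USING BTREE` instead.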

3. R-Tree Indexes

R-Tree indexes are used for spatial data types (such as GEOMETRY and POINT) and are created with the SPATIAL keyword. Support is limited to specific engines: MyISAM and, as of MySQL 5.7, InnoDB.
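A minimal sketch of a spatial index, assuming MySQL 8.0 and a hypothetical `places_demo` table. The `SRID 0` column attribute is MySQL 8.0 syntax and would be dropped on 5.7. Columns in a SPATIAL index must be NOT NULL.

```sql
-- Hypothetical table with a spatial (R-Tree) index on a POINT column.
CREATE TABLE places_demo (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    location POINT NOT NULL SRID 0,
    SPATIAL INDEX sp_location (location)
) ENGINE=InnoDB;

INSERT INTO places_demo (name, location)
VALUES ('Cafe', ST_GeomFromText('POINT(3 4)', 0));

-- Bounding-box search: the sample point (3 4) lies inside this rectangle.
SELECT name
FROM places_demo
WHERE MBRContains(
    ST_GeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))', 0),
    location
);
```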

Common Causes of Index Invalidation

Indexes are not always used by the query optimizer. Several scenarios can cause an index to be ignored, resulting in a full table scan:

  1. Improper OR Usage: If one column in an OR condition lacks an index, the optimizer may ignore all indexes for that query.
  2. Composite Index Violation: A query that omits the leftmost column of a composite index generally cannot use that index for lookups.
  3. Leading Wildcard: Queries using LIKE '%abc' cannot utilize a B-Tree index, whereas LIKE 'abc%' can.
  4. Implicit Type Conversion: If an indexed column is a string type (e.g., VARCHAR) and the query supplies a numeric value without quotes, the index may be bypassed.
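  5. Functions on Columns: Applying a function to an indexed column (e.g., WHERE YEAR(created_at) = 2023) prevents the optimizer from using the ordinary index on that column, because the index stores the raw column values rather than the function results. Rewriting the condition as a range on the bare column restores index use.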
  6. Low Selectivity: If an index column has very low cardinality (e.g., a "Gender" column with only 'M' and 'F'), the optimizer may calculate that a full table scan is faster than traversing the index.
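Several of these cases can be compared side by side with EXPLAIN. The sketch below assumes a hypothetical `events_demo` table. Actual plans depend on the MySQL version and table statistics, so the comments state the usual expectation only.

```sql
-- Hypothetical table with one index per column under test.
CREATE TABLE events_demo (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    phone VARCHAR(20) NOT NULL,
    title VARCHAR(100) NOT NULL,
    created_at DATETIME NOT NULL,
    INDEX idx_phone (phone),
    INDEX idx_title (title),
    INDEX idx_created (created_at)
) ENGINE=InnoDB;

-- Function on the column: idx_created is not usable for this predicate.
EXPLAIN SELECT * FROM events_demo WHERE YEAR(created_at) = 2023;
-- Equivalent range on the bare column: idx_created becomes a candidate.
EXPLAIN SELECT * FROM events_demo
WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01';

-- Implicit conversion: a numeric literal against a VARCHAR column
-- forces a per-row conversion, so idx_phone is bypassed.
EXPLAIN SELECT * FROM events_demo WHERE phone = 5551234;
-- Quoted literal: types match and idx_phone can be used.
EXPLAIN SELECT * FROM events_demo WHERE phone = '5551234';

-- Leading wildcard: no usable prefix for the B-Tree.
EXPLAIN SELECT * FROM events_demo WHERE title LIKE '%abc';
-- Trailing wildcard: the fixed prefix 'abc' allows a range scan on idx_title.
EXPLAIN SELECT * FROM events_demo WHERE title LIKE 'abc%';
```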

Locking Implications

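In InnoDB, row-level locking is implemented through index records. If a locking statement such as UPDATE, DELETE, or SELECT ... FOR UPDATE fails to use an index (e.g., due to the invalidation reasons above), InnoDB has to scan the whole table and lock every row it examines. The effect is much like a table lock rather than a handful of row locks, potentially impacting concurrency.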
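The effect can be reproduced with two connections. This sketch assumes a hypothetical `accounts_demo` table and the default REPEATABLE READ isolation level. The second session's statements are left as comments because they must be run from a separate connection.

```sql
-- Hypothetical table; `owner` is deliberately left without an index.
CREATE TABLE accounts_demo (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    owner VARCHAR(50) NOT NULL,
    balance DECIMAL(12,2) NOT NULL
) ENGINE=InnoDB;

INSERT INTO accounts_demo (owner, balance)
VALUES ('alice', 100.00), ('bob', 200.00);

-- Session A: the predicate is on the unindexed column, so the statement
-- scans the clustered index and locks every row it reads.
START TRANSACTION;
UPDATE accounts_demo SET balance = balance - 10 WHERE owner = 'alice';

-- Session B (separate connection, while A is still open):
--   UPDATE accounts_demo SET balance = balance + 10 WHERE id = 2;
-- This is expected to wait until A commits or the lock wait times out,
-- even though it touches a different row.

COMMIT;

-- Had Session A filtered on the primary key instead (WHERE id = 1),
-- only that index record would be locked and Session B could proceed.
```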
