Database 2024-04-01

What is a Database Index? How It Works and When You Need It

Understand how database indexes work internally—B-tree and hash structures—when to add or avoid indexes, and how to measure index effectiveness using EXPLAIN.

Read in: ja
What is a Database Index? How It Works and When You Need It

What is an Index?

A mechanism to quickly retrieve records stored in a table.

Consider a query with an O(n) problem like the following:

SELECT * FROM users WHERE first_name = ‘Tom’

To improve the performance of this query, you can add an index as follows:

ALTER TABLE users ADD INDEX (first_name)

Advantages and Disadvantages

Advantages

Disadvantages

When creating or updating data, indexes are also added or updated simultaneously, leading to the above disadvantages.

Index Patterns

Standard (Applied to a single column)

ALTER TABLE users ADD INDEX (first_name)

Partial Index

An effective pattern when you want to improve performance while suppressing storage increase.

Example of applying an index to only the first 4 bytes: ALTER TABLE users ADD INDEX (first_name(4))

Multi-column Index (Also called composite or compound index)

ALTER TABLE users ADD INDEX (last_name, first_name)

In MySQL, only one index per table can be used for a single query execution, but if a multi-index is applied, a valid index will be selected during query execution.

It is generally good to specify a column with high cardinality at the beginning of a multi-column index.

Unique Index

Values will not appear duplicated except for NULL. During record creation or update, it checks all values to ensure the same value does not already exist. In MySQL, specifying a unique key also specifies a unique index.

ALTER TABLE users ADD UNIQUE (first_name)

Measuring Index Effectiveness

Check the execution plan of a query with the EXPLAIN clause.

EXPLAIN SELECT * FROM users WHERE first_name = ‘Tom’

Check the following items:

Index Criteria

List of criteria that might suggest considering an index. These are just guidelines for estimation, so it's better to measure with EXPLAIN.

Clustered Index and Secondary Index

Clustered Index

Indexes that fall under the following are clustered indexes.

Secondary Index

Indexes other than clustered indexes are called secondary indexes. Secondary indexes include the value of the primary key. Although it is assumed to be measured with EXPLAIN, since the primary key value is included, it might be good to remember that a covering index can be achieved with just a secondary index without including the primary key in a composite index. cf. InnoDB Secondary Index Utilization Techniques!

Covering Index

An index that contains all the columns necessary for the query result.

Since it can be covered with just the index without reading the data file, the search is accelerated.

Cautions When Applying Indexes

Arithmetic Operations and SQL Functions on Index Columns

SELECT * FROM users WHERE amount * 2 > 10;

If an index is applied to amount, avoid operators to utilize the index. The amount itself is held in the index, not the result of the operation. The same applies to SQL functions.

SELECT * FROM users WHERE amount > 10/2;

IS_NULL

SELECT * FROM users WHERE amount IS NULL;

IS NULL or IS NOT NULL generally do not effectively utilize indexes (depends on DBMS specifications).

Negation or OR

SELECT * FROM users WHERE amount <> 10;

Negation cannot utilize indexes. The same applies to OR.

LIKE

SELECT * FROM users WHERE name = 'a%';

When using LIKE, only forward matches utilize indexes due to the nature of B-Tree.

Implicit Type Conversion

SELECT * FROM users WHERE age = '10'

If age is a numeric type, implicit conversion from string to number will prevent index utilization.

References

Tags: DB Index MySQL
Share: 𝕏 Post Facebook Hatena
✏️ View source / Discuss on GitHub
☕ Support

If you enjoy this blog, consider supporting it. Every bit helps keep it running!


Related Articles