Home > Database > Mysql Tutorial > Which Composite Index is Optimal for Range Queries: High vs. Low Cardinality Columns?

Which Composite Index is Optimal for Range Queries: High vs. Low Cardinality Columns?

Mary-Kate Olsen
Release: 2024-11-24 14:10:47
Original
258 people have browsed it

Which Composite Index is Optimal for Range Queries: High vs. Low Cardinality Columns?

High Cardinality Column Placement in Composite Indexes with Range Queries

When querying a table with a composite index involving a range condition, the placement of columns within the index can significantly impact performance.

Consider the table files with a primary key (did, filename) and two composite indexes: INDEX(filetime, ext) and INDEX(ext, filetime). Both indexes contain the filetime column, which has higher cardinality than ext.

The query:

WHERE ext = '...'
  AND filetime BETWEEN ... AND ...
Copy after login

requires accessing data based on both ext and filetime. The question arises: which index is optimal for such a query?

Analysis

To determine the optimal index, we can use FORCE INDEX and examine the execution plans:

-- Force range on filetime first
FORCE INDEX(fe) SELECT COUNT(*), AVG(fsize)
FROM files
WHERE ext = 'gif'
  AND filetime >= '2015-01-01'
  AND filetime < '2015-01-01' + INTERVAL 1 MONTH;

-- Force low-cardinality ext first
FORCE INDEX(ef) SELECT COUNT(*), AVG(fsize)
FROM files
WHERE ext = 'gif'
  AND filetime >= '2015-01-01'
  AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
Copy after login

The output shows that INDEX(ext, filetime) (ef) has a significantly lower row count, indicating a more efficient scan.

Optimizer Trace

To further analyze the optimizer's behavior, we can use the optimizer trace:

SELECT explain_format = 'JSON';

SELECT COUNT(*), AVG(fsize)
FROM files
WHERE ext = 'gif'
  AND filetime >= '2015-01-01'
  AND filetime < '2015-01-01' + INTERVAL 1 MONTH;
Copy after login

The trace reveals that the optimizer chooses INDEX(ext, filetime) because it can use both columns of the index to filter and fetch data. In contrast, INDEX(filetime, ext) can only use the first column (filetime) for filtering.

Conclusions

Based on the analysis, the following conclusions can be drawn:

  • For composite indexes used in range queries, the column involved in the equality predicate (ext in this case) should be placed first in the index definition.
  • Query performance is improved when the columns in an index are ordered according to the order in which they are used in the WHERE clause.
  • Cardinality alone is not a decisive factor in determining the optimal index. In scenarios where the range column has higher cardinality but the equality column is involved in an equality predicate, placing the equality column first yields better performance.

The above is the detailed content of Which Composite Index is Optimal for Range Queries: High vs. Low Cardinality Columns?. For more information, please follow other related articles on the PHP Chinese website!

source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Latest Articles by Author
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template