Does field order matter in multi-column index in MySQL?

Question

I know the importance of indexes and how join order changes performance. I've read a lot related to multi-column indexes but didn't find an answer to my question. I'm curious if the order in which they are specified matters if I do a multi-column index. My guess is no, and the engine will treat them as a group where the order doesn't matter. But I want to verify it. For example, from the mysql website (http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.ht

P粉317679342 · Answer

The two indexes are different. This is true in MySQL and other databases. MySQL explains the difference in the documentation.

Consider two indexes:

create index idx_lf on name(last_name, first_name);
create index idx_fl on name(first_name, last_name);

Both should apply equally to:

where last_name = XXX and first_name = YYY

idx_lf is optimal for:

where last_name = XXX
where last_name like 'X%'
where last_name = XXX and first_name like 'Y%'
where last_name = XXX order by first_name

idx_fl will be best suited for the following situations:

where first_name = YYY
where first_name like 'Y%'
where first_name = YYY and last_name like 'X%'
where first_name = XXX order by last_name

For many such cases, it is possible to use two indexes, but one of them is optimal. For example, consider using the idx_lf query:

where first_name = XXX order by last_name

MySQL can use idx_lf to read the entire table and then filter after order by. I don't think this is an optimization option in practice (for MySQL), but it may happen in other databases.

P粉899950720 · Answer

When discussing multi-column indexes, I use the analogy of a phone book. A phone book is basically an index with last name first and then first name. So the sort order is determined by which "column" comes first. Searches are divided into the following categories:

If you are looking for someone with the last name Smith, you can easily find them because the book is sorted by last name.
If you are looking for someone named John, the phone book is not helpful because John is scattered throughout the book. You have to scan the entire phone book to find them.
If you are looking for someone with a specific last name Smith and a specific first name John, this book will help because you will find that the Smiths are sorted together, and within the Smiths, the Johns are also sorted turn up.

If you had a phone book sorted by first name and then by last name, the sorting of the phone book would help you in cases #2 and #3 above, but not in case #1 provide help.

This explains the case of finding an exact value, but what if you're looking by a range of values? Suppose you want to find all people whose first name is John and whose last name begins with "S" (Smith, Saunders, Staunton, Sherman, etc.). Johns are sorted by "J" within each last name, but if you want all Johns with all last names starting with "S", the Johns are not grouped together. They're spread out again, so you end up having to scan for all the names whose last name starts with "S." However, if the phone book was organized by first name and then by last name, you would find that all Johns would be grouped together, and then within John, all the "S" surnames would be grouped together.

Therefore, the order of columns in a multi-column index absolutely matters. A type of query may require a specific column order of the index. If you have multiple types of queries, you may need multiple indexes to help them, with the columns in different order.

You can read my presentation How to Really Design an Index to learn more, or watch the video.