Ignore the concept of index first , if you want to check a certain record directly now, how to search it?
If there are very few records in the table and one page is enough, then there are two situations:
Use the primary key as the search condition: This is the method mentioned in the previous article. Use the dichotomy method to quickly locate the slot in the page directory, then traverse the records corresponding to the group of the slot, and finally find the specified record.
Use other non-primary key columns as search conditions: Because there is no page directory for non-primary key columns in the data page, the slot cannot be quickly located through the dichotomy method. You can only start from the Infimum record once. Traversing each record in a singly linked list is inefficient.
When there are many records in the table, many data pages will be used to store them. In this case, 2 steps are required:
Locate the page where the record is located.
Repeat the above process of searching in a page.
Generally speaking, when there is no index, we cannot quickly locate the page where the record is located. We can only follow the doubly linked list from the first page (the page has the previous page and the next page) Keep searching, and then repeat the above process on each page to query the specified record, which requires traversing all records, which is very time-consuming.
Since the positioning record is too slow due to too many pages, how to solve it? You may wish to refer to the "Page Directory".
The page directory is set up to quickly locate the position of a record in the page based on the primary key. Therefore, we can explore a method of creating an "other directory" to quickly locate the page where the record is located.
But there are two things that need to be done in order to complete this "other directory".
Assuming that each data page can hold up to 3 records (actually many can be placed), then Now insert 3 records into the table, each record has 3 columns c1, c2, and c3. For convenience, the storage row format is also simplified, leaving only key attributes. The virtual records Infimum and Supremum are located at the beginning and end of the user record respectively, with three user records in the middle.
At this time, continue to insert 1 record. In the hypothetical case, at least one new page needs to be allocated, so the two pages will be reallocated and rearranged.
Please note that the two records shown in red font include a newly inserted record with a primary key of 4, which should be placed on a new page. However, in order to satisfy the requirement that the primary key value of the user record on the next page must be greater than the primary key value of the user record on the previous page, operations such as record movement are performed. This process can also be called "page splitting".
Also, why is the new page page 28, not 11? Because the pages may not be next to each other on the disk, they just establish a linked list relationship by maintaining the numbers of the previous page and the next page.
Now continue to add data to the table. The final relationship between multiple pages is as follows:
In order to quickly locate a record from multiple non-adjacent pages, a directory needs to be compiled for them, because these pages may not be contiguous on the disk.
Each page corresponds to a directory entry, and each directory entry includes:
The smallest primary key value in the user record of the page, represented by key
Page number, represented by page_no
So, after cataloging them, the relationship will be like this:
So, now I want to find the record with the primary key value of 20. I will do it in two steps:
Use the dichotomy method to quickly determine the record with the primary key value of 20 from the directory entry. Item 3, and the page number it is on is 9. Knowing that it is on page 9, repeat the previous approach to find the final target record.
At this point, a simple plan is completed. The completed simple directory has an alias called index.
The above-mentioned simple index is the content set up by the author of the original book to help readers understand step by step. This is not the indexing plan of innodb.
Then look at the above suggested index and see what problems there are.
Question 1:
InnoDB uses pages as the basic unit for managing storage space, which means it can only store up to 16kb of continuous storage.
When there are more and more records in the table, a very large continuous storage space is needed to hold all the directory entries, which is unrealistic for tables with large amounts of data.
Question 2:
We often have to add, delete, and modify records, which will affect the whole body.
For example, if I delete all the records on page 28 in the picture above, then page 28 does not need to exist, and directory entry 2 does not need to exist. At this time, you need to move the directory items after directory item 2 forward.
Even if it is not moved, placing directory entry 2 as redundant in the directory entry list will still waste a lot of storage space.
The above is the detailed content of Mysql simple index plan analysis. For more information, please follow other related articles on the PHP Chinese website!