Table of Contents
1 The concept of index
1.1 Definition
1.2 Type
1.3 Function
2 The evolution of the index data structure: the B+ tree
2.3 Question: How to create a directory? Create a table of contents for each page?
3 What is a secondary index tree
3.1 So how is the secondary index tree sorted?
3.2 What is the concept of the index bridge (leftmost matching principle)?
3.3 Table return, covering index, index pushdown
3.4 A few more interview questions
3.5 Summary of the secondary index tree
4 The difference between primary key index and secondary index

MySQL index knowledge point analysis

May 27, 2023 pm 08:38 PM
mysql

    1 The concept of index

    1.1 Definition

    In a relational database, an index is a separate, physical storage structure that sorts the values of one or more columns in a table. It consists of a collection of those column values together with a list of logical pointers to the data pages in the table that physically hold them.
    An index is like the table of contents of a book: you can quickly find the content you need from the page numbers it lists. The database uses the index to find a specific value and then follows the pointer to the row containing that value. This lets SQL statements against the table execute faster and gives quick access to specific information in database tables.

    1.2 Type

    InnoDB has three index types: ordinary index, unique index (the primary key index is a special non-null unique index), and full-text index.

    • Ordinary (Normal): also known as a non-unique index; it has no restrictions.
    • Unique: a unique index requires that key values not repeat (NULL is allowed). The primary key index is a special unique index with one additional restriction: the key value cannot be NULL. Primary key indexes are created with PRIMARY KEY.
    • Full text (Fulltext): intended for relatively large data such as articles, texts, and emails, where one field may take several KB. To avoid the low efficiency of LIKE queries for full-text matching, you can create a full-text index. Only CHAR, VARCHAR, and TEXT columns can have full-text indexes. Both MyISAM and InnoDB support full-text indexing.

    1.3 Function

    One sentence summary:

    An index improves the efficiency of data retrieval and reduces the database's IO cost.

    A question to consider: we trade space for time, but what data structure does an index use, what does a query cost in IO, and how is the data stored?

    2 The evolution of the index data structure: the B+ tree

    We will follow the evolution of the B+ tree from the perspective of the page.

    The page is the basic unit InnoDB uses to manage storage space: InnoDB stores table data in pages, and the page is also the basic unit of exchange between memory and disk. The database reads whole pages from disk into memory and flushes whole pages from memory back to disk.
    By default, one page is 16 KB.

    Suppose we want to execute this SQL and get 10 records:

    SELECT * FROM INNODB_USER LIMIT 0 , 10;

    If one record takes 4 KB, how many records can one page hold?

    16 KB divided by 4 KB gives 4 records.

    Every record in a page has a key attribute called record_type:

    • 0: ordinary user record
    • 1: directory (index) entry record
    • 2: minimum record
    • 3: maximum record

    [Figure: how records are placed in a page]

    This is our page. Each page stores records, ordered by primary key.

    Writing the records out in order is sequential IO, which makes storing convenient. But what is convenient for storing is inconvenient for querying: to find the last record, do we have to traverse the entire page?

    2.1 Question

    What if we want to look up a single record? How can we find it quickly?

    • If the records in a page were connected to each other, which of the data structures we have learned would be the fastest to query?

    • If the records in a page are connected, the problem can be solved. That's right: a linked list.

    How records are connected (records in the same page):

    MySQL connects the records in a page with a singly linked list. If the query is by primary key, binary search locates the record very quickly. If the query is by a column that is not the primary key, we can only traverse the singly linked list starting from the smallest record.
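The two lookup paths inside one page can be sketched in a few lines of Python. This is a toy model, not InnoDB's actual page format: the `Record` class, the `name` column, and the helper names are all made up for illustration, and the key list is rebuilt on every call only to keep the sketch short.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    id: int                            # primary key
    name: str                          # a non-key column (hypothetical)
    next: Optional["Record"] = None    # next record in primary-key order


def build_page(rows):
    """Link rows (sorted by id) into a singly linked list; return (head, records)."""
    records = [Record(i, n) for i, n in sorted(rows)]
    for cur, nxt in zip(records, records[1:]):
        cur.next = nxt
    return (records[0] if records else None), records


def find_by_id(records, target_id):
    """Primary-key lookup: binary search over the id-ordered records."""
    keys = [r.id for r in records]   # rebuilt here for brevity only
    pos = bisect_left(keys, target_id)
    if pos < len(records) and records[pos].id == target_id:
        return records[pos]
    return None


def find_by_name(head, target_name):
    """Non-key lookup: walk the linked list from the smallest record."""
    node = head
    while node is not None:
        if node.name == target_name:
            return node
        node = node.next
    return None


head, records = build_page([(3, "carol"), (1, "alice"), (4, "dave"), (2, "bob")])
print(find_by_id(records, 3).name)    # carol
print(find_by_name(head, "dave").id)  # 4
```

The primary-key path needs a number of comparisons that grows with the logarithm of the page's record count, while the non-key path may touch every record in the page.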

    How pages are connected (records in different pages):

    MySQL links the pages with a doubly linked list, so we can reach the next page from the previous one and the previous page from the next one. Since we cannot directly locate the page that holds a record, we can only start from the first page, walk down the doubly linked list, and search each page for the record as described above for a single page. This is a full table scan.
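A full table scan over linked pages can be sketched as follows. The `Page` class and the `(id, name)` row shape are assumptions for illustration; the point is only that the number of pages read grows with the position of the record being searched for.

```python
class Page:
    def __init__(self, rows):
        self.rows = rows      # list of (id, name) tuples, sorted by id
        self.prev = None      # previous page in the doubly linked list
        self.next = None      # next page in the doubly linked list


def link_pages(pages):
    """Chain the pages in both directions and return the first page."""
    for left, right in zip(pages, pages[1:]):
        left.next, right.prev = right, left
    return pages[0] if pages else None


def full_scan(first_page, predicate):
    """Return (first matching row or None, number of pages read)."""
    pages_read = 0
    page = first_page
    while page is not None:
        pages_read += 1
        for row in page.rows:
            if predicate(row):
                return row, pages_read
        page = page.next
    return None, pages_read


first = link_pages([
    Page([(1, "a"), (2, "b")]),
    Page([(3, "c"), (4, "d")]),
    Page([(5, "e"), (6, "f")]),
])
row, pages_read = full_scan(first, lambda r: r[1] == "f")
print(row, pages_read)  # (6, 'f') 3
```

In this model every page read stands for one disk IO, which is why the cost of the scan is proportional to the number of pages.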

    2.2 Question

    As the number of pages keeps growing, what problems appear in queries, and how can we solve and optimize them?

    As the linked list grows, queries become slow because we cannot jump directly to a record. Thinking it through, a so-called slow query comes down to two problems:

    • The query time complexity is O(N)

    • Too many disk IO operations

    Think about it: when we read a book and want to find the information on a certain topic, what do we do?
    We check the table of contents, right? And what is a table of contents? It is just an index.

    [Figure: a book's table of contents]

    A table of contents carries two very important pieces of information:

    • The content summary (chapter title)

    • The page number

    We borrow the idea of a book's table of contents to query data quickly:

    Add a directory to the data. When looking up a record, we first use the directory to find which page the record is on, which improves query performance.

    2.3 Question: How to create a directory? Create a table of contents for each page?

    Does a directory need to follow some regular order? A dictionary's directory, for example, is built in alphabetical order. What does that remind you of? That's right, the primary key. The auto-increment primary key in MySQL meets our requirements exactly: it is ordered, small, and never repeats, which makes it a perfect directory entry. For each page we store the primary key according to this rule, plus a pointer to the location of the data. At query time we binary-search the directory by primary key value and then fetch the data.
    But do we need a directory for every data page? It seems we do. If a page has no directory, how do we locate a record inside it? With a full page scan?
    Yet if every page has its own directory, the number of directory pages grows too, and traversing them one by one also degrades query performance.
    Can we create a directory for the directories?
    Yes: we can create a directory for the directory pages, extracting one more level upward until we reach a root node, which makes queries much easier.
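The directory idea can be sketched as a two-step binary search: first over the directory to pick a page, then inside that page. The layout below (a list of pages plus a list of each page's smallest key) is a simplification for illustration, not InnoDB's on-disk structure.

```python
from bisect import bisect_left, bisect_right

# Three data pages, each holding primary keys in sorted order.
data_pages = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# Directory: the smallest key of each page; its position doubles as the page pointer.
directory = [page[0] for page in data_pages]   # [1, 5, 9]


def lookup(key):
    """Return (page_no, slot) for key, or None if the key is absent."""
    page_no = bisect_right(directory, key) - 1   # last page whose smallest key <= key
    if page_no < 0:
        return None
    page = data_pages[page_no]
    slot = bisect_left(page, key)
    if slot < len(page) and page[slot] == key:
        return page_no, slot
    return None


print(lookup(7))   # (1, 2)
print(lookup(13))  # None
```

Stacking another directory on top of `directory` repeats the same step one level higher, which is how the tree grows upward toward a single root.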

    [Figure: data pages, directory pages above them, and a root node at the top]

    This tree is organized by primary key, so we call it the primary key index tree. Because the primary key index tree stores all the data of the table, in MySQL we say "the index is the data, and the data is the index".

    This is the data structure of MySQL's B+ tree primary key index. Isn't working it out step by step more memorable than learning it by rote?

    2.4 Index tree, Page splitting and merging

    We have found a way to improve query performance. So what problems arise when records are added to, modified in, or deleted from pages?

    What if the keys increase in order and a new record is added? If the page is full, we have to open a new page.
    The data pages must always satisfy one condition:
    The primary key of every user record in the next data page must be greater than the primary keys of the user records in the previous page. Because the keys increase in order, we can simply append a new page to the end of the doubly linked list of pages.
    What if the keys arrive out of order and a new record is added?

    • Open a new page and find where the new record belongs.

    • Move some of the old records to the new page and put the new record in its ordered position.

    • Records in the leaf nodes keep being shifted around.

    • This triggers splitting and merging of the leaf data pages, which can in turn trigger splitting and merging of the upper-level non-leaf nodes and the root node.

    • This is a case of "pull one hair and the whole body moves", and it is called page splitting.
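The difference between ordered and out-of-order inserts can be made concrete with a small simulation. It models only the leaf level, uses a page capacity of 4 records (the 16 KB / 4 KB example), and splits a full page in half; these are simplifying assumptions, and `insert_keys` is a hypothetical helper, not an InnoDB routine.

```python
import random
from bisect import bisect_right, insort

PAGE_CAPACITY = 4  # records per page, matching the 16 KB / 4 KB example


def insert_keys(keys, capacity=PAGE_CAPACITY):
    """Insert keys into ordered pages; return (pages, number_of_splits)."""
    pages = [[]]
    splits = 0
    for key in keys:
        # Pick the page whose key range should hold this key.
        firsts = [p[0] for p in pages if p]
        idx = max(bisect_right(firsts, key) - 1, 0)
        page = pages[idx]
        if len(page) < capacity:
            insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            pages.append([key])           # ordered growth: append a new page
        else:
            insort(page, key)             # out of order: split the full page in two
            mid = len(page) // 2
            pages[idx:idx + 1] = [page[:mid], page[mid:]]
            splits += 1
    return pages, splits


seq_pages, seq_splits = insert_keys(range(1, 101))

shuffled = list(range(1, 101))
random.Random(42).shuffle(shuffled)
rand_pages, rand_splits = insert_keys(shuffled)

print(seq_splits, len(seq_pages))    # 0 25
print(rand_splits, len(rand_pages))  # depends on the shuffle
```

With ordered keys every page fills completely and no split occurs; with shuffled keys the same 100 records cause repeated splits and leave pages partly empty.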

    Summary: problems encountered when adding, modifying, and deleting records in pages:

    Out-of-order inserts, updates to the primary key ID, and deletions all cause a large number of tree node adjustments. They trigger splitting and merging of leaf pages, of upper-level non-leaf pages, and of the root page, which produces a lot of disk fragmentation and costs database performance. This explains why we should not build indexes on columns that are frequently updated, and why we should not update the primary key.

    Let us summarize:

    Clustered index:

    The primary key index tree is also called the clustered index. In InnoDB, a table has exactly one clustered index tree. If a table defines a primary key, that primary key index is the clustered index. The key values of the clustered index determine the physical storage order of the rows, and the clustered index stores all columns of the table in sorted order. "The index is the data, and the data is the index" refers to this primary key index tree.

    2.5 Based on what we just deduced, here are some interview questions

    Why is it best for the primary key ID to have an increasing trend?

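    You have just read it, so surely you remember: with ordered growth, the primary keys of the user records in the next data page must be greater than those in the previous page. If the primary key trends upward, new records always land at the end of the last page or in a newly appended page, so they do not trigger the page splitting and merging that slows inserts down.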

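    How much data can a three-level B+ tree store?

    What is being tested: the page size and the definition of the B+ tree
    1 GB = 1024 MB, 1 MB = 1024 KB, 1 KB = 1024 bytes

    Answer:
    Given: one index entry takes about 16 bytes, and one page is 16 KB = 16 * 1024 = 16,384 bytes, so a non-leaf page holds about 1,024 entries. Two levels of non-leaf pages can therefore point to about 1,024 * 1,024 ≈ 1.05 million leaf pages. Assuming rows of about 1 KB, each leaf page holds about 16 rows, for roughly 16.8 million rows in total, which is more than ten million. In InnoDB, a B+ tree of depth 3 is therefore enough to store data at the ten-million-row level.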
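The estimate can be reproduced with a short calculation. The 16 KB page and the 16-byte index entry come from the text above; the 1 KB row size is an assumption chosen for illustration, and page headers and per-record overhead are ignored.

```python
PAGE_SIZE = 16 * 1024       # bytes per page (InnoDB default)
INDEX_ENTRY_SIZE = 16       # bytes per directory entry, the figure used above
ROW_SIZE = 1024             # bytes per row: an assumption for illustration


def btree_capacity(levels, page_size=PAGE_SIZE,
                   entry_size=INDEX_ENTRY_SIZE, row_size=ROW_SIZE):
    """Approximate rows addressable by a B+ tree with `levels` levels."""
    fanout = page_size // entry_size          # children per non-leaf page
    rows_per_leaf = page_size // row_size     # rows per leaf page
    return fanout ** (levels - 1) * rows_per_leaf


print(btree_capacity(3))                 # 16777216
print(btree_capacity(3, row_size=4096))  # 4194304
```

The result is sensitive to row size: with the 4 KB rows used in the earlier page example, the same three levels hold about 4.2 million rows rather than 16.8 million.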

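    Why should large MySQL fields be split out?

    A page holds 16 KB of data. Large fields take up a lot of storage, so each page holds fewer records and more pages are needed. More pages mean a deeper tree, more disk IO, lower performance, and slower queries. Large fields are stored in the index whether or not a query uses them, so they occupy a lot of memory and reduce the number of records per page.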
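The effect of row size on tree depth can be sketched by computing how many leaf pages and how many levels a given row count needs. The function below is a rough model under the same assumptions as the text (16 KB pages, 16-byte index entries, no per-page overhead); `tree_shape` is a hypothetical helper name.

```python
def tree_shape(total_rows, row_size, page_size=16 * 1024, entry_size=16):
    """Return (leaf_pages, levels) needed to hold total_rows, overheads ignored."""
    rows_per_leaf = max(page_size // row_size, 1)
    fanout = page_size // entry_size
    leaf_pages = -(-total_rows // rows_per_leaf)   # ceiling division
    pages, levels = leaf_pages, 1
    while pages > 1:
        pages = -(-pages // fanout)                # directory pages one level up
        levels += 1
    return leaf_pages, levels


for row_size in (1024, 8192):
    print(row_size, tree_shape(10_000_000, row_size))
# 1024 (625000, 3)
# 8192 (5000000, 4)
```

For ten million rows, growing the row from 1 KB to 8 KB multiplies the number of leaf pages by eight and adds a fourth level, which means one more page read on every lookup.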

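    Why use a B+ tree?

    A B+ tree is a multi-way balanced search tree. Every query starts at the root node, and only the leaf nodes store data; the root and the other non-leaf nodes store index keys and pointers. The leaf nodes are linked to one another, so data can be queried by key value. The B+ tree offers stronger database and table scanning ability, stronger sorting ability, and more stable query efficiency and performance. It also stores more: a three-level B+ tree is enough for data at the ten-million-row level.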

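    3 What is a secondary index tree

    So far we have looked at the index built from the primary key. What if we do not query by primary key, or the table has no primary key at all? We can also create a joint index on several fields (also called a combined or composite index; it is only a name, so call it what you like).

    The index tree built from the primary key is called the primary key index tree; an index tree built from other fields is called a secondary index tree.

    The following SQL creates a composite index: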

    -- Composite secondary index on (age, user_name, phone)
    ALTER TABLE INNODB_USER ADD INDEX
    SECOND_INDEX_AGE_USERNAME_PHONE (age, user_name, phone);

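    It looks as if only one index was created, but queries on age, on age and user_name, and on age, user_name, and phone can all use it.
    You can think of it as having created these three indexes: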

    -- Conceptual equivalent only: the single composite index above already serves all three prefixes.
    ALTER TABLE INNODB_USER ADD INDEX
    SECOND_INDEX_AGE (age);
    ALTER TABLE INNODB_USER ADD INDEX
    SECOND_INDEX_AGE_USERNAME (age, user_name);
    ALTER TABLE INNODB_USER ADD INDEX
    SECOND_INDEX_AGE_USERNAME_PHONE (age, user_name, phone);

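    3.1 So how is the secondary index tree sorted?

    First, we need to know whether the type of the field being sorted has a natural order.

    If the field is of an ordered type, it is sorted by value, for example (int) 1 2 3 4.
    If the field is of an unordered type, it is sorted by the collation of the column's character set. We will not go deeper into this; it is enough to know about it.

    Suppose I have a composite index (A-B-C). It is sorted in the order in which the fields were declared:
    If A is equal, sort by B; if B is equal, sort by C; if A, B, and C are all equal, sort by the clustered index.

    Our pages store the data in the order of the composite index fields: age, user name, phone number.
    The data structure is otherwise the same.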
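The ordering rule for a composite index matches the way Python compares tuples, so it can be shown with a plain sort. The sample rows are made up, and Python compares strings by code point rather than by a MySQL collation, so this only illustrates the column-by-column rule.

```python
# (age, user_name, phone, id): the three indexed columns plus the primary key.
rows = [
    (30, "bob",   "555-0102", 7),
    (25, "carol", "555-0101", 3),
    (30, "alice", "555-0199", 5),
    (30, "bob",   "555-0101", 9),
    (30, "bob",   "555-0101", 2),
]

# Tuples are compared element by element: age first, then user_name,
# then phone, and finally the primary key as the tie-breaker.
index_entries = sorted(rows)

for entry in index_entries:
    print(entry)
# (25, 'carol', '555-0101', 3)
# (30, 'alice', '555-0199', 5)
# (30, 'bob', '555-0101', 2)
# (30, 'bob', '555-0101', 9)
# (30, 'bob', '555-0102', 7)
```

Note that the two entries with identical age, user name, and phone number end up ordered by their primary key, 2 before 9.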

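    3.2 What is the concept of the index bridge (leftmost matching principle)?

    Take the same index as above: age, user name, phone number (age, user_name, phone).
    The first field is age, so for this index to be used, the query must filter on age first, and then on user_name if user_name is also needed.

    Can a query that filters only on user_name use the index?
    No, it cannot. The index is sorted by age first, so a query must hit age first, then user_name, then phone. This is what we call the leftmost matching principle.

    "Leftmost" follows directly from the fact that entries are stored in the order of the composite index columns. The "index bridge" people often mention exists for the same reason: to hit a composite index you must cross it like a bridge, stepping on the first plank, then the second, and finally the third.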
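The bridge analogy can be written as a small function that walks the index columns from the left and stops at the first one the query does not filter on. This is a simplified model covering equality predicates only; the real optimizer considers more cases, and `usable_prefix` is a hypothetical name.

```python
def usable_prefix(index_columns, predicate_columns):
    """Return the leading index columns that equality predicates can use."""
    used = []
    for column in index_columns:
        if column not in predicate_columns:
            break            # the bridge is broken at the first missing plank
        used.append(column)
    return used


INDEX = ("age", "user_name", "phone")

print(usable_prefix(INDEX, {"age", "user_name"}))  # ['age', 'user_name']
print(usable_prefix(INDEX, {"user_name"}))         # []
print(usable_prefix(INDEX, {"age", "phone"}))      # ['age']
```

The last call shows a partial hit: filtering on age and phone can use the index for age only, because user_name, the second plank, is missing.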

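    3.3 Table return, covering index, index pushdown

    The secondary index tree has three important concepts: table return, covering index, and index pushdown.

    Table return: the data we are querying is not in the secondary index tree, so we take the ID from it and look the row up in the primary key index tree.

    Covering index: all the data we are querying is in the secondary index tree and is returned directly from it.
    Index condition pushdown (ICP): introduced in MySQL 5.6 to optimize queries that require a table return.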
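The three concepts can be compared by counting table returns in a toy model. Here the clustered index is a dict keyed by primary key and the secondary index is a sorted list of (age, user_name, phone, id) tuples; the sample rows, the `email` column, and the `query` function are all invented for illustration. The ICP branch assumes the usual description of the feature: conditions on indexed columns are checked inside the index before the full row is fetched.

```python
from bisect import bisect_left

# Clustered index: primary key -> full row.
clustered = {
    1: {"id": 1, "age": 30, "user_name": "alice", "phone": "555-0101", "email": "a@example.com"},
    2: {"id": 2, "age": 30, "user_name": "bob",   "phone": "555-0202", "email": "b@example.com"},
    3: {"id": 3, "age": 30, "user_name": "carol", "phone": "555-0102", "email": "c@example.com"},
    4: {"id": 4, "age": 41, "user_name": "dave",  "phone": "555-0103", "email": "d@example.com"},
}
INDEX_COLUMNS = ("age", "user_name", "phone")
# Secondary index entries: (age, user_name, phone, id), kept sorted.
secondary = sorted((r["age"], r["user_name"], r["phone"], r["id"]) for r in clustered.values())


def query(age, phone_prefix, columns, use_icp):
    """SELECT columns WHERE age = ? AND phone LIKE 'prefix%'; returns (rows, table_returns)."""
    covered = set(columns) <= set(INDEX_COLUMNS) | {"id"}
    rows, table_returns = [], 0
    start = bisect_left(secondary, (age,))        # first entry with this age
    for entry in secondary[start:]:
        if entry[0] != age:
            break
        in_index = dict(zip(INDEX_COLUMNS + ("id",), entry))
        phone_ok = in_index["phone"].startswith(phone_prefix)
        if (use_icp or covered) and not phone_ok:
            continue                              # filtered inside the index
        if covered:
            rows.append({c: in_index[c] for c in columns})
            continue
        table_returns += 1                        # fetch the full row by primary key
        full = clustered[in_index["id"]]
        if phone_ok:
            rows.append({c: full[c] for c in columns})
    return rows, table_returns


print(query(30, "555-01", ["email"], use_icp=False)[1])               # 3
print(query(30, "555-01", ["email"], use_icp=True)[1])                # 2
print(query(30, "555-01", ["user_name", "phone"], use_icp=False)[1])  # 0
```

Without ICP all three entries with age 30 cause a table return; with ICP the entry whose phone does not match is discarded in the index, leaving two; and when the selected columns are all in the index, no table return is needed at all.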

    3.4 A few more interview questions:

    Why do columns with low dispersion not use the index?

    What is dispersion? The more identical values a column has, the lower its dispersion; the fewer identical values, the higher its dispersion.
    If the values are all the same, how can they be sorted? They cannot.
    When the B+ tree contains too many duplicate values, the MySQL optimizer may find that using the index costs about the same as a full table scan, and then it will not use the index even though it exists. Whether an index is used is decided by the MySQL optimizer.
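Dispersion can be estimated as the ratio of distinct values to total rows. This ratio is only a rule of thumb for judging a column; it is not the optimizer's actual cost model, and the sample columns below are made up.

```python
def selectivity(values):
    """Distinct values divided by total rows: 1.0 is fully unique, near 0 is highly duplicated."""
    values = list(values)
    return len(set(values)) / len(values) if values else 0.0


gender = ["M", "F"] * 5000       # 2 distinct values in 10,000 rows
user_id = list(range(10_000))    # every value distinct

print(selectivity(gender))   # 0.0002
print(selectivity(user_id))  # 1.0
```

A column like `gender` narrows the search to half the table at best, so an index on it saves little compared with scanning every row.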

    Are more indexes always better?

    Space: an index trades space for time, and every index occupies disk space.
    Time: hitting an index speeds up queries. Inserts, updates, and deletes, however, cause pages to split and merge, which lengthens the response time of those statements and lowers performance.
    For a column that is updated frequently, creating an index is not recommended, because page splitting and merging would be triggered often.

    3.5 Summary of the secondary index tree

    Also called a combined index (composite index) when it covers several columns, the secondary index tree stores its entries in the order of the columns named when the index was created. It holds only part of each row: the columns used to create the index. The secondary index tree exists to assist queries and improve query efficiency. It involves three behaviors: table return, covering index, and index pushdown. Of these, the covering index performs best.

    4 The difference between primary key index and secondary index

    [Figure: differences between the primary key index and the secondary index]
