If you want to learn MySQL in depth, you should start from the macro architecture. In this article, we will learn the process of executing MySQL query statements. I hope it will be helpful to everyone!
This article is based on MySQL 8.0.18.
Architecture diagram
Parser
The function of the parser is to perform the following work on the SQL statement sent from the client:
- Lexical analysis: split the SQL statement into tokens such as keywords, table names, and field names
- Syntax analysis: check the syntax of the SQL statement (whether brackets and quotation marks are closed, and so on) and build the tokens into a parse tree
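To make the two parser tasks concrete, here is a minimal sketch of a tokenizer and a bracket check for a tiny SQL subset. The names `TOKEN_RE`, `KEYWORDS`, `tokenize`, and `brackets_balanced` are made up for this illustration; MySQL's real lexer and grammar are far more elaborate and do not work this way internally.

```python
import re

# Hypothetical token pattern for a tiny SQL subset (an assumption for this sketch).
TOKEN_RE = re.compile(
    r"\s*(?:(?P<string>'[^']*')"
    r"|(?P<number>\d+)"
    r"|(?P<word>[A-Za-z_][A-Za-z_0-9.]*)"
    r"|(?P<symbol><=|>=|<>|[()=<>*,;]))"
)

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "EXISTS", "LIKE"}


def tokenize(sql):
    """Split a statement into (kind, text) tokens; raise SyntaxError on bad input."""
    tokens, pos = [], 0
    sql = sql.rstrip()
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            # An unclosed quotation mark ends up here, because no pattern matches it.
            raise SyntaxError(f"unexpected input at position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "word" and text.upper() in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
        pos = match.end()
    return tokens


def brackets_balanced(tokens):
    """Return True if every '(' has a matching ')' in the token stream."""
    depth = 0
    for kind, text in tokens:
        if kind != "symbol":
            continue
        if text == "(":
            depth += 1
        elif text == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


tokens = tokenize("SELECT * FROM t1 WHERE a1<10")
print(tokens)
print(brackets_balanced(tokens))
```

A real parser would go on to arrange these tokens into a parse tree according to the SQL grammar; the sketch stops at the token list.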
Preprocessor
The parser only checks lexical and syntactic correctness. A statement can pass both checks and still fail to execute, because the table or field it refers to does not exist.
So the role of the preprocessor is semantic analysis: it determines whether the parse tree is semantically valid, including whether the tables and fields exist. Preprocessing produces a new parse tree.
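The existence check can be sketched as a lookup against a catalog of known tables and fields. The `CATALOG` dictionary and the `check_semantics` function below are hypothetical names chosen for this illustration; the real preprocessor works on the parse tree and the data dictionary.

```python
# Hypothetical catalog: table name -> set of field names (an assumption for this sketch).
CATALOG = {
    "t1": {"a1", "b1"},
    "t2": {"a2", "b2"},
}


def check_semantics(table, columns, catalog=CATALOG):
    """Return a list of semantic errors for one table reference."""
    if table not in catalog:
        return [f"table '{table}' does not exist"]
    return [
        f"column '{col}' does not exist in table '{table}'"
        for col in columns
        if col not in catalog[table]
    ]


print(check_semantics("t1", ["a1"]))        # no errors
print(check_semantics("t2", ["a2", "zz"]))  # unknown column
print(check_semantics("t3", ["a1"]))        # unknown table
```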
Query optimizer
Query optimizer structure
A SQL statement can be executed in MySQL in more than one way. Although every way produces the same result in the end, the costs differ. The query optimizer decides which execution method is used. For example:
- There are multiple indexes in the table that can be selected. Which index should be selected?
- When we join multiple tables, which table should be used as the base (driving) table?
The query optimizer is a cost-based optimizer. Working from the parse tree, it estimates the cost of each candidate execution plan and selects the plan with the lowest estimated cost as the final plan.
However, the plan with the lowest estimated cost is not necessarily the best one in practice. For example, a full table scan may be chosen where an index should have been used. Although "optimizer" is in its name, the query optimizer is not omnipotent. In many cases it matters more whether the SQL statement itself is written sensibly.
Logical query optimization
Logical query optimization mainly applies relational algebra transformations to rewrite SQL statements so that they execute more efficiently.
A few cases give a brief picture of logical query optimization.
- Subquery merging
Before merging
SELECT * FROM t1 WHERE a1<10 AND (
EXISTS(SELECT a2 FROM t2 WHERE t2.a2<5 AND t2.b2=1) OR
EXISTS(SELECT a2 FROM t2 WHERE t2.a2<5 AND t2.b2=2)
);
After merging
SELECT * FROM t1 WHERE a1<10 AND (
EXISTS(SELECT a2 FROM t2 WHERE t2.a2<5 AND (t2.b2=1 OR t2.b2=2))
);
The subqueries are merged into one by combining their conditions, which reduces multiple table scans and joins to a single table scan and a single join.
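That the two forms return the same rows can be checked with a small experiment. The sketch below uses Python's built-in `sqlite3` module with an in-memory database, so it runs on SQLite rather than MySQL; the sample rows are made up, and it only demonstrates that the rewrite preserves the result, not how MySQL performs it.

```python
import sqlite3

# In-memory SQLite database with made-up sample data (not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a1 INTEGER);
    CREATE TABLE t2 (a2 INTEGER, b2 INTEGER);
    INSERT INTO t1 VALUES (1), (5), (20);
    INSERT INTO t2 VALUES (3, 2), (7, 1);
""")

before = """
    SELECT * FROM t1 WHERE a1 < 10 AND (
        EXISTS (SELECT a2 FROM t2 WHERE t2.a2 < 5 AND t2.b2 = 1) OR
        EXISTS (SELECT a2 FROM t2 WHERE t2.a2 < 5 AND t2.b2 = 2)
    ) ORDER BY a1
"""

after = """
    SELECT * FROM t1 WHERE a1 < 10 AND (
        EXISTS (SELECT a2 FROM t2 WHERE t2.a2 < 5 AND (t2.b2 = 1 OR t2.b2 = 2))
    ) ORDER BY a1
"""

rows_before = conn.execute(before).fetchall()
rows_after = conn.execute(after).fetchall()
print(rows_before)
print(rows_after)
print(rows_before == rows_after)
```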
- Equivalent predicate rewriting
Take the familiar LIKE fuzzy query: an index range query can be used only when the % is written at the end of the pattern. This is in fact the work of the query optimizer.
Assume that the fields used in the condition are indexed. Before rewriting
SELECT * FROM USERINFO WHERE name LIKE 'Abc%';
After rewriting
SELECT * FROM USERINFO WHERE name >= 'Abc' AND name < 'Abd';
This is why a LIKE condition with a trailing % can be answered with an index range query.
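The rewrite from a prefix pattern to a range can be sketched as follows. The function name `prefix_to_range` is invented for this illustration, and the sketch assumes plain code-point string ordering, which is what Python uses; MySQL compares strings according to the column's collation, so the actual bounds it derives may differ.

```python
def prefix_to_range(prefix):
    """Return (low, high) such that low <= s < high holds exactly for
    strings starting with prefix, under code-point ordering (an assumption)."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Increment the last character: 'Abc' -> 'Abd'.
    # Assumes the last character is not the highest possible code point.
    high = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, high


low, high = prefix_to_range("Abc")
names = ["Abb", "Abc", "Abcd", "Abd", "abc"]

by_like = [n for n in names if n.startswith("Abc")]   # name LIKE 'Abc%'
by_range = [n for n in names if low <= n < high]      # name >= 'Abc' AND name < 'Abd'
print(low, high)
print(by_like == by_range)
```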
- Conditional simplification
Conditional simplification also relies on equalities and algebraic relationships to simplify conditions.
- Remove redundant brackets in expressions and reduce the levels of AND and OR trees generated during syntax analysis, such as
((a AND b) AND (c AND d))
is simplified to a AND b AND c AND d
- Constant transfer, such as
col1 = col2 AND col2 = 3
is simplified to col1 = 3 AND col2 = 3
- Expression calculation, some expressions that can be directly solved will be converted into the final calculation result, such as
col1 = 1 + 2
is simplified to col1 = 3
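The three simplifications above can be sketched on a toy expression representation. Here conditions are nested tuples and the function names `flatten_and`, `propagate_constants`, and `fold_constants` are hypothetical; this is an illustration of the idea, not MySQL's internal data structures.

```python
def flatten_and(expr):
    """Flatten nested ("AND", left, right) tuples into a flat list of conditions."""
    if isinstance(expr, tuple) and expr[0] == "AND":
        return flatten_and(expr[1]) + flatten_and(expr[2])
    return [expr]


def propagate_constants(equalities):
    """Constant transfer over ANDed (left, right) equalities.
    Column names are strings, constants are integers (an assumption of this sketch)."""
    constants = {left: right for left, right in equalities if not isinstance(right, str)}
    result = []
    for left, right in equalities:
        if isinstance(right, str) and right in constants:
            right = constants[right]
        result.append((left, right))
    return result


def fold_constants(expr):
    """Evaluate ("+", a, b) when both operands are integer constants."""
    if isinstance(expr, tuple) and expr[0] == "+":
        left, right = fold_constants(expr[1]), fold_constants(expr[2])
        if isinstance(left, int) and isinstance(right, int):
            return left + right
        return ("+", left, right)
    return expr


# ((a AND b) AND (c AND d)) -> a AND b AND c AND d
print(flatten_and(("AND", ("AND", "a", "b"), ("AND", "c", "d"))))
# col1 = col2 AND col2 = 3 -> col1 = 3 AND col2 = 3
print(propagate_constants([("col1", "col2"), ("col2", 3)]))
# col1 = 1 + 2 -> col1 = 3
print(("col1", fold_constants(("+", 1, 2))))
```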
Physical query optimization
The main work of physical query optimization is to evaluate the cost of each candidate execution plan for the SQL statement.
Physical query optimization mainly solves the following problems:
- For a single-table scan, which method is the least expensive? (an index scan with lookups back to the table, or a full table scan)
- When tables are joined, which join method is the least expensive?
Let's briefly look at cost evaluation. Cost is evaluated along two dimensions: CPU cost and IO cost.
| Scanning method | Cost evaluation formula |
| --- | --- |
| Sequential scan | N_page * a_page_IO_time + N_tuple * a_tuple_CPU_time |
| Index scan | C_index + N_page_index * a_page_IO_time |
The above parameters are explained as follows:
- a_page_IO_time: the IO time of loading one data page
- N_page: the number of data pages
- N_tuple: the number of tuples (a tuple can be understood as a row of data)
- a_tuple_CPU_time: the CPU time spent parsing one tuple from a data page
- C_index: the IO time spent on the index
- N_page_index: the number of index pages
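As a worked example, the two formulas can be written as functions and compared. The function names and all the numbers below are made up for illustration (arbitrary cost units); they are not MySQL's actual cost constants.

```python
def sequential_scan_cost(n_page, a_page_io_time, n_tuple, a_tuple_cpu_time):
    """N_page * a_page_IO_time + N_tuple * a_tuple_CPU_time"""
    return n_page * a_page_io_time + n_tuple * a_tuple_cpu_time


def index_scan_cost(c_index, n_page_index, a_page_io_time):
    """C_index + N_page_index * a_page_IO_time"""
    return c_index + n_page_index * a_page_io_time


def cheapest(costs):
    """Return the name of the plan with the lowest estimated cost."""
    return min(costs, key=costs.get)


# Made-up inputs in arbitrary cost units.
costs = {
    "sequential scan": sequential_scan_cost(
        n_page=1000, a_page_io_time=100, n_tuple=100000, a_tuple_cpu_time=1
    ),
    "index scan": index_scan_cost(c_index=300, n_page_index=50, a_page_io_time=100),
}
print(costs)            # {'sequential scan': 200000, 'index scan': 5300}
print(cheapest(costs))  # index scan
```

With different inputs, for instance a small table where the index touches most pages, the sequential scan can come out cheaper, which is the kind of trade-off the optimizer weighs.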
For details on index cost calculation, see the article "Why did MySQL query choose to use this index? Based on MySQL 8.0.22 index cost calculation".
Execution plan
The execution plan is the product of the query optimizer and is eventually handed over to the storage engine for execution. It shows us how MySQL will execute the SQL statement.
Use the EXPLAIN keyword to view the execution plan of a SQL statement. It provides the following information:
- id: the identifier of each SELECT, which indicates the execution order within a nested query
- possible_keys: the indexes that may be used by this query
- key: the index actually used
- rows: the estimated number of rows that must be examined to get the result
- type: the join (access) type used for each table; the separate select_type column gives the kind of SELECT, such as SIMPLE or SUBQUERY
- Extra: additional information, such as whether a covering index or index condition pushdown is used
Storage engine
The MySQL server defines a specification for how data is stored, retrieved, and updated, and storage engines implement it. Each storage engine implements it differently, so each has its own features and characteristics. The most commonly used storage engines are InnoDB and MyISAM.
Let's briefly go over the characteristics of these two storage engines.
InnoDB:
- Supports foreign keys and transactions, which ensures the integrity and consistency of data
- Supports finer lock granularity, which allows better control of locks and higher read and write efficiency
MyISAM:
- Does not support transactions and only supports table-level locks, so it is suited to read-mostly data scenarios
We will not go deeper into storage engines here. Other articles will continue to compare them and will analyze in detail how InnoDB updates data.
Summary
In the past, I only knew how to write a SQL statement in client software, click execute, and get the data.
Now I finally understand that after a query statement reaches the MySQL server, it goes through the following series of operations:
The parser checks the statement lexically and syntactically. If there are no errors, it splits the statement into nodes by keyword and finally forms a parse tree.
The preprocessor checks the semantics of the SQL statement, such as whether it is ambiguous and whether its tables and fields exist, and forms a new parse tree.
The query optimizer takes the candidate execution plans derived from this parse tree and, after logical and physical query optimization, obtains the execution plan with the lowest cost.
The execution engine takes this execution plan and calls the storage engine interface.
The storage engine queries the data according to the execution plan. To do so, it calls file system interfaces of the operating system, completes the data query, and the result is finally returned to the client.