A high-performance database needs table structure optimization, index optimization, and query optimization to go hand in hand.

1. Why are queries slow?

A query is composed of a series of subtasks. Optimizing a query therefore means either eliminating some of those subtasks or reducing the number of times they are executed.

2. Slow query basics: Optimizing data access

(1) Whether unnecessary data is requested from the database

1) Unnecessary data is queried:
For example, we select a large number of rows and close the result set after reading the first N. MySQL still generates the complete result set; the client receives part of it and discards the rest, so the extra work is wasted. If only the first N rows are needed, add a LIMIT clause to the query.

2) All columns are returned from a multi-table join
In multi-table queries we often see
mysql>select * from …….
Such a query returns every column of every joined table, which hurts performance. Name the specific columns instead of using the wildcard *.

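3) All columns are always retrieved
Avoid statements such as select *. Fetching columns that are not needed adds I/O, memory, and network overhead, and it can prevent the optimizer from using a covering index.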
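
As a minimal sketch of the points above, the following Python snippet names only the needed columns and lets LIMIT bound the result set. It uses the standard-library sqlite3 module as a stand-in for MySQL, an assumption made only so the example is self-contained; the `post` table and its columns are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO post (title, body) VALUES (?, ?)",
    [(f"title {i}", "x" * 1000) for i in range(1000)],
)

def first_titles(conn, n):
    # Name only the needed columns and let LIMIT bound the result set,
    # instead of SELECT * followed by discarding rows on the client.
    cur = conn.execute("SELECT id, title FROM post ORDER BY id LIMIT ?", (n,))
    return cur.fetchall()

rows = first_titles(conn, 10)
print(len(rows), rows[0])
```

The large `body` column is never read or transferred, and the database stops producing rows once the limit is reached.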

(2) Whether MySQL scanned additional records

After confirming that the query returns only the data it needs (specific columns rather than the wildcard *), the next thing to check is whether the query scanned too many rows to produce that result. The three simplest cost metrics in MySQL are:
(1) Response time

(2) Number of rows scanned

(3) Number of rows returned.

Response time
Response time is the sum of service time (the time actually spent executing the query) and queuing time (the time spent blocked, waiting for a resource such as I/O or a lock).

Number of rows scanned and number of rows returned
When analyzing a query, it is very helpful to look at the number of rows it scanned. To a certain extent, comparing that number with the number of rows returned shows how efficient the query is.

Number of rows scanned and access type
MySQL has several access methods for finding and returning a row: full table scan, index scan, range scan, unique index lookup, constant reference, and so on.

This is where indexes come in: an index lets MySQL use a more efficient access method and find the required rows while scanning far fewer of them.
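
The effect of an index on the access method can be sketched as follows. This is an illustration only: it uses Python's standard sqlite3 module and SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL, where the equivalent check is `EXPLAIN`, whose `type` column shows the access type and whose `rows` column shows the estimated number of rows examined. The `users` table and index name are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; in MySQL the equivalent check is EXPLAIN SELECT ...
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [(f"user{i}", i % 90) for i in range(5000)],
)

def plan(conn, sql, params=()):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

query = "SELECT id, name, age FROM users WHERE name = ?"

before = plan(conn, query, ("user42",))  # no index on name: full table scan
conn.execute("CREATE INDEX idx_users_name ON users (name)")
after = plan(conn, query, ("user42",))   # index lookup on name

print(before)
print(after)
```

Before the index exists the plan is a scan of the whole table; afterwards it is a search through `idx_users_name`, so only the matching rows are touched.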

3. How to restructure the query

The goal of restructuring a query is to find a better way to obtain the results that are actually needed.

(1) One complex query or multiple simple queries
A question we often need to consider when writing SQL is whether a complex query should be divided into several simple queries.

For MySQL, connecting and disconnecting are very lightweight, and returning a small query result is very efficient. Using as few queries as possible is still a good default, but it is sometimes worth breaking a large query into smaller ones, after measuring whether doing so actually reduces the total work.

(2) Segmentation query
This is the idea of divide and conquer. Sometimes we need to split a large query into pieces, execute them one at a time, and pause between steps, so that a lot of data is not locked for a long time.

For example, deleting all the unwanted rows in a single statement may hold a transaction open, and its locks, for a long time. Instead, we can use a condition and a row limit to split one large delete into many small deletes, which reduces the impact on the server and on other queries.
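
A batched delete along these lines might look like the sketch below. It runs against Python's standard sqlite3 module as a stand-in for MySQL; the `messages` table, the 90-day condition, and the batch size are all hypothetical. In MySQL a single-table `DELETE` accepts a `LIMIT` clause directly, whereas SQLite needs a subquery to bound each batch.

```python
import sqlite3
import time

# SQLite stands in for MySQL; the `messages` table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, age_days INTEGER)")
conn.executemany(
    "INSERT INTO messages (age_days) VALUES (?)",
    [(i % 200,) for i in range(10000)],
)
conn.commit()

def delete_in_batches(conn, batch_size=1000, pause_seconds=0.0):
    """Delete rows older than 90 days in small batches; return rows deleted."""
    total = 0
    while True:
        # In MySQL this would simply be:
        #   DELETE FROM messages WHERE age_days > 90 LIMIT 1000
        # SQLite needs the subquery form to bound each batch.
        cur = conn.execute(
            "DELETE FROM messages WHERE id IN ("
            "SELECT id FROM messages WHERE age_days > 90 LIMIT ?)",
            (batch_size,),
        )
        conn.commit()              # each batch is its own short transaction
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
        time.sleep(pause_seconds)  # give other work a chance between batches

deleted = delete_in_batches(conn)
remaining = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(deleted, remaining)
```

Each batch commits on its own, so locks are held only briefly; the pause between batches is the "delay between steps" described above and would be tuned to the workload.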

(3) Decompose join queries
Many high-performance applications decompose join queries into several single-table queries. For example:

mysql>select * from tag
left join tag_post on tag_post.tag_id=tag.id
left join post on tag_post.post_id = post.id
where tag.tag='mysql';

can be decomposed into

mysql>select * from tag where tag='mysql';
mysql>select * from tag_post where tag_id=1234;
mysql>select * from post where post.id in (123,345,456,8933);

Why decompose a query this way?
(1) Caching becomes more efficient. For example, if the tag queried above is already cached, the application can skip the first query.

(2) Each query touches a single table and runs briefly, which can reduce lock contention.

(3) In some cases the queries themselves are more efficient. For example, looking up posts with an IN() list of ids, as in the decomposed version above, can be cheaper than the equivalent join.
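
The decomposition can be sketched in application code as follows. This is a minimal illustration, not a definitive implementation: Python's standard sqlite3 module stands in for MySQL, the schema and rows are made up to mirror the queries above, the in-process dictionary `tag_cache` is a hypothetical application cache, and the comparison query uses inner joins for simplicity.

```python
import sqlite3

# SQLite stands in for MySQL; schema and data are made up to mirror the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag_post (tag_id INTEGER, post_id INTEGER);
    INSERT INTO tag VALUES (1234, 'mysql'), (1235, 'php');
    INSERT INTO post VALUES (123, 'a'), (345, 'b'), (456, 'c'), (500, 'd');
    INSERT INTO tag_post VALUES (1234, 123), (1234, 345), (1234, 456), (1235, 500);
""")

def posts_for_tag_joined(conn, name):
    # The single-query version, for comparison.
    sql = """SELECT post.id, post.title FROM tag
             JOIN tag_post ON tag_post.tag_id = tag.id
             JOIN post ON tag_post.post_id = post.id
             WHERE tag.tag = ? ORDER BY post.id"""
    return conn.execute(sql, (name,)).fetchall()

def posts_for_tag_decomposed(conn, name, tag_cache):
    # Step 1: resolve the tag id, skipping the query if the application cached it.
    if name not in tag_cache:
        row = conn.execute("SELECT id FROM tag WHERE tag = ?", (name,)).fetchone()
        if row is None:
            return []
        tag_cache[name] = row[0]
    # Step 2: fetch the post ids linked to that tag.
    ids = [r[0] for r in conn.execute(
        "SELECT post_id FROM tag_post WHERE tag_id = ?", (tag_cache[name],))]
    if not ids:
        return []
    # Step 3: fetch the posts with an IN list, one placeholder per id.
    marks = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id, title FROM post WHERE id IN ({marks}) ORDER BY id", ids
    ).fetchall()

cache = {}
print(posts_for_tag_decomposed(conn, "mysql", cache))
```

On a second call with the same tag, step 1 is served from `tag_cache` and only two queries reach the database.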

4. The basis of query execution

First, let’s take a look at the schematic diagram of the query execution path:
[Figure: MySQL query execution path]

The steps are as follows:
(1) The client sends a query to the server;

(2) The server first checks the query cache. If the cache is hit, it immediately returns the results stored in the cache, otherwise it goes to the next step.

(3) The server parses and preprocesses the SQL, and then the optimizer generates the corresponding execution plan.

(4) MySQL will call the API of the storage engine to execute the query based on the execution plan generated by the optimizer.

(5) Return the result to the client.

(1) MySQL client/server communication protocol

We do not need to understand how the communication protocol is implemented internally; we only need to understand how it behaves.

MySQL's client/server communication protocol is half-duplex, which means that at any given moment only one side can be sending data: either the server is sending to the client or the client is sending to the server.

(2) Query cache

Before parsing a SQL statement, if the query cache is enabled, MySQL first checks whether the query hits the query cache. On a hit, the result set is read directly from the cache and returned to the client; on a miss, the query moves on to the next stage. Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this step applies only to older versions.

(3) Query Optimizer

The most important component at this stage is the query optimizer. A query can be executed in many different ways that all return the same result; the optimizer's job is to find the most efficient execution plan among them.

The following are the optimization types that the MySQL query optimizer can automatically handle:
(1) Reorder joins: tables are not always joined in the order written in the query. The optimizer chooses the join order.

(2) Convert outer joins to inner joins: not every outer join has to be executed as one. Factors such as the WHERE clause and the table schema can make an outer join equivalent to an inner join.

(3) Apply equivalent transformation rules: the optimizer can reduce the number of comparisons and remove conditions that are always true. For example, (5=5 and a>5) is rewritten as (a > 5).

(4) Optimize COUNT(), MIN(), and MAX(): indexes and column nullability often help MySQL optimize these expressions. To find the minimum value of an indexed column, MySQL only needs to read the leftmost record of the B-Tree index; likewise, MAX() only needs the rightmost record. For COUNT(*) without a WHERE clause, the MyISAM storage engine keeps the table's total row count in a variable, so no scan is needed.

(5) Covering index scan: when an index contains all the columns a query needs, MySQL can return the data directly from the index without reading the corresponding table rows.

(6) Subquery optimization

(7) Terminate the query early: MySQL can stop processing a query as soon as it finds that the query's requirements have been met. A LIMIT clause is a typical example.

(8) IN() list comparison: MySQL first sorts the values in the IN() list and then uses binary search to check whether a value is in the list, which is an O(log n) operation. The equivalent chain of OR conditions costs O(n).
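
The complexity difference in the last item can be illustrated with a toy model. The Python sketch below does not reproduce MySQL's internals; it only counts comparisons for a linear chain of equality tests versus a binary search over a pre-sorted list. The function names and data are hypothetical.

```python
def or_chain_matches(value, candidates):
    """Model of a = c1 OR a = c2 ...: test candidates one by one, O(n)."""
    comparisons = 0
    for c in candidates:
        comparisons += 1
        if value == c:
            return True, comparisons
    return False, comparisons

def in_list_matches(value, sorted_candidates):
    """Model of IN (...): binary search over a pre-sorted list, O(log n)."""
    comparisons = 0
    lo, hi = 0, len(sorted_candidates)
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_candidates[mid] < value:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_candidates) and sorted_candidates[lo] == value
    return found, comparisons

candidates = list(range(0, 2000, 2))    # 1000 even numbers
sorted_candidates = sorted(candidates)  # the list is sorted once, up front

print(or_chain_matches(1998, candidates))
print(in_list_matches(1998, sorted_candidates))
```

For a list of 1000 values, the worst case for the OR model is 1000 comparisons, while the binary search needs about ten.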

(4) Sorting optimization

Sorting is a costly operation in any case, so avoid sorting large amounts of data and, wherever possible, sort by indexed columns so that rows can be read in index order. When an index cannot be used to produce the sorted result, MySQL has to sort the rows itself, an operation it calls filesort: small data sets are sorted in memory, while large ones are sorted with the help of temporary files on disk.
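
The difference between sorting through an index and sorting the rows explicitly can be observed in a query plan. The sketch below is an illustration under stated assumptions: it uses Python's standard sqlite3 module as a stand-in for MySQL, and the `post` table and index name are hypothetical. SQLite reports an explicit sort as a temporary B-tree; in MySQL, `EXPLAIN` reports the analogous situation as "Using filesort" in its `Extra` column.

```python
import sqlite3

# SQLite stands in for MySQL; the `post` table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany(
    "INSERT INTO post (created_at) VALUES (?)",
    [((i * 7919) % 10007,) for i in range(5000)],
)

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT id, created_at FROM post ORDER BY created_at LIMIT 10"

without_index = plan(conn, query)  # rows must be sorted explicitly
conn.execute("CREATE INDEX idx_post_created_at ON post (created_at)")
with_index = plan(conn, query)     # rows are read in index order, no sort step

print(without_index)
print(with_index)
```

Once the index exists, the plan no longer contains a sort step, and with the LIMIT the database can stop after reading the first ten index entries.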
