How to optimize the join statement in MySQL
Simple Nested-Loop Join
Let’s take a look at how mysql works when performing a join operation. What are the common join methods?
As shown in the figure, when we perform a connection operation, the table on the left is Driver table, and the table on the right is Driven table
Simple Nested-Loop Join This join operation is to take a record from the driving table and then match the records of the driven table one by one. If the conditions match, the result will be returned. Then, continue to match the next record in the driver table until all the data in the driver table has been matched
Because it is time-consuming to fetch data from the driver table each time, MySQL does not use this algorithm To perform the join operation
Block Nested-Loop Join
In order to avoid the time-consuming fetching data from the driver table each time, we can add a batch of The data is retrieved from the driver table at once and matched in memory. After this batch of data is matched, a batch of data is fetched from the driver table and placed in the memory until all the data in the driver table is matched.
Batch data retrieval can reduce a lot of IO operations, so the execution efficiency is relatively high. This kind of connection operation is also used by MySQL
By the way, this memory has a proper name in MySQ, called join buffer. We can execute the following statement to view the size of the join buffer
show variables like '%join_buffer%'
Move out the single_table table we used before, create 2 tables based on the single_table table, and insert 1w random records into each table
CREATE TABLE single_table ( id INT NOT NULL AUTO_INCREMENT, key1 VARCHAR(100), key2 INT, key3 VARCHAR(100), key_part1 VARCHAR(100), key_part2 VARCHAR(100), key_part3 VARCHAR(100), common_field VARCHAR(100), PRIMARY KEY (id), KEY idx_key1 (key1), UNIQUE KEY idx_key2 (key2), KEY idx_key3 (key3), KEY idx_key_part(key_part1, key_part2, key_part3) ) Engine=InnoDB CHARSET=utf8; create table t1 like single_table; create table t2 like single_table;
If you use the join statement directly, MySQL optimization The server may select table t1 or t2 as the driving table, which will affect our process of analyzing sql statements, so we use straight_join to let mysql use a fixed connection method to execute the query
select * from t1 straight_join t2 on (t1.common_field = t2.common_field)
The running time is 0.035s
The execution plan is as follows
Using join buffer is seen in the Extra column, indicating that the connection operation is based on Block Nested -Loop Join Algorithm
Index Nested-Loop Join
After understanding the Block Nested-Loop Join algorithm, you can see that each record in the driver table will Matching all the records in the driven table is very time-consuming. Can the efficiency of matching in the driven table be improved?
I guess you have also thought of this algorithm, which is to add indexes to the columns connected by the driven table, so that the matching process is very fast, as shown in the figure
Let’s take a look at how fast it is to perform queries based on joins based on index columns?
select * from t1 straight_join t2 on (t1.id = t2.id)
The execution time is 0.001 seconds. It can be seen that it is more than one level faster than connecting based on ordinary columns.
The execution plan is as follows
Not all columns of the driver table records will be put into the join buffer. Only the columns in the query list and the columns in the filter conditions will be put into the join buffer. Therefore We don't want to use * as the query list, we just need to put the columns we care about in the query list, so that more records can be placed in the join buffer
How to choose the driver table?
Now that we know the specific implementation of join, let’s talk about a common question, that is, how to choose the driver table?
If it is Block Nested-Loop Join algorithm:
When the join buffer is large enough, it does not matter who does the driving table
When the join buffer is not large enough, a small table should be selected as the driving table (the small table has less data and the number of times it is put into the join buffer is small, which reduces the number of scans of the table)
If it is Index Nested-Loop Join algorithm
Assume that the number of rows in the driver table is M, so M rows of the driver table need to be scanned
Every time from When obtaining a row of data from the driven table, you need to first search for index a, and then search for the primary key index. The number of rows in the driven table is N. The approximate complexity of searching a tree each time is the logarithm of base 2 N, so the time complexity of searching a row on the driven table is 2∗ l o g 2 N 2*log2^N 2∗log2N
Each row of data in the driver table must be searched once in the driven table. The approximate complexity of the entire execution process is M M ∗ 2 ∗ l o g 2 N M M*2*log2^N M M∗2∗log2N
Obviously M has a greater impact on the number of scanned rows, so the small table should be used as the driving table. Of course, the premise of this conclusion is that the index of the driven table can be used
In short, we can just let the small table be the driving table
When the join statement is executed slowly, we can optimize it through the following methods
When performing a connection operation, the index of the driven table can be used
The small table is used as the driving table
Increase the size of the join buffer
Do not use * as the query list, only return the required columns
The above is the detailed content of How to optimize the join statement in MySQL. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



MySQL is suitable for beginners because it is simple to install, powerful and easy to manage data. 1. Simple installation and configuration, suitable for a variety of operating systems. 2. Support basic operations such as creating databases and tables, inserting, querying, updating and deleting data. 3. Provide advanced functions such as JOIN operations and subqueries. 4. Performance can be improved through indexing, query optimization and table partitioning. 5. Support backup, recovery and security measures to ensure data security and consistency.

You can open phpMyAdmin through the following steps: 1. Log in to the website control panel; 2. Find and click the phpMyAdmin icon; 3. Enter MySQL credentials; 4. Click "Login".

MySQL is an open source relational database management system. 1) Create database and tables: Use the CREATEDATABASE and CREATETABLE commands. 2) Basic operations: INSERT, UPDATE, DELETE and SELECT. 3) Advanced operations: JOIN, subquery and transaction processing. 4) Debugging skills: Check syntax, data type and permissions. 5) Optimization suggestions: Use indexes, avoid SELECT* and use transactions.

Create a database using Navicat Premium: Connect to the database server and enter the connection parameters. Right-click on the server and select Create Database. Enter the name of the new database and the specified character set and collation. Connect to the new database and create the table in the Object Browser. Right-click on the table and select Insert Data to insert the data.

MySQL and SQL are essential skills for developers. 1.MySQL is an open source relational database management system, and SQL is the standard language used to manage and operate databases. 2.MySQL supports multiple storage engines through efficient data storage and retrieval functions, and SQL completes complex data operations through simple statements. 3. Examples of usage include basic queries and advanced queries, such as filtering and sorting by condition. 4. Common errors include syntax errors and performance issues, which can be optimized by checking SQL statements and using EXPLAIN commands. 5. Performance optimization techniques include using indexes, avoiding full table scanning, optimizing JOIN operations and improving code readability.

You can create a new MySQL connection in Navicat by following the steps: Open the application and select New Connection (Ctrl N). Select "MySQL" as the connection type. Enter the hostname/IP address, port, username, and password. (Optional) Configure advanced options. Save the connection and enter the connection name.

Recovering deleted rows directly from the database is usually impossible unless there is a backup or transaction rollback mechanism. Key point: Transaction rollback: Execute ROLLBACK before the transaction is committed to recover data. Backup: Regular backup of the database can be used to quickly restore data. Database snapshot: You can create a read-only copy of the database and restore the data after the data is deleted accidentally. Use DELETE statement with caution: Check the conditions carefully to avoid accidentally deleting data. Use the WHERE clause: explicitly specify the data to be deleted. Use the test environment: Test before performing a DELETE operation.

Steps to perform SQL in Navicat: Connect to the database. Create a SQL Editor window. Write SQL queries or scripts. Click the Run button to execute a query or script. View the results (if the query is executed).
