How to write high-performance SQL

(*-*)浩

May 09, 2019 am 09:34 AM

First of all, we need to understand what an execution plan is.

An execution plan is the query plan the database builds from the SQL statement and the statistics of the tables involved. The query optimizer produces it automatically. For example, if a SQL statement looks up 1 record in a table of 100,000 records, the query optimizer will choose an index seek. If the table is later archived and only 5,000 records remain, the query optimizer may change the plan and use a full table scan instead.
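To make the idea of an execution plan concrete, here is a minimal sketch that asks a database for its plan. It uses Python's built-in sqlite3 module as a stand-in, since it runs anywhere; SQL Server and MySQL expose plans through their own tools. The table, column, and index names are made up for the illustration.

```python
import sqlite3

# Hypothetical table and index, only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, note TEXT)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, note) VALUES (?, ?)",
    [(i % 100, f"note {i}") for i in range(1000)],
)
conn.commit()


def plan(sql):
    """Return the optimizer's readable plan steps for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Filter on an indexed column: the optimizer can search the index.
indexed_plan = plan("SELECT * FROM orders WHERE customer_id = 42")
# Filter on a column without an index: the optimizer has to scan the table.
unindexed_plan = plan("SELECT * FROM orders WHERE note = 'note 7'")

print(indexed_plan)
print(unindexed_plan)
```

The first plan should describe an index search and the second a table scan; the exact wording of the plan text varies between SQLite versions.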

So the execution plan is not fixed; it is tailored to the statement and to the data. Two things matter for generating a correct execution plan:

Does the SQL statement clearly tell the query optimizer what it wants to do?

Are the database statistics that the query optimizer relies on up to date and correct?

Write SQL statements in a consistent way

Programmers think the following two SQL statements are the same, but the database query optimizer treats them as different.

select*from dual 
select*From dual

The only difference is letter case, yet the query analyzer treats them as two different SQL statements, parses each one, and generates two execution plans. So as a programmer, make sure the same query is written identically everywhere; even one extra space makes it a different statement!
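The effect can be pictured with a toy plan cache keyed by the exact statement text. This is only a model written for this article, not how any particular database implements its cache; the function name get_plan and the "plan#N" strings are hypothetical.

```python
# Toy model: a plan cache whose key is the exact SQL text.
plan_cache = {}


def get_plan(sql_text):
    """Return (plan, was_cached) for a statement."""
    if sql_text in plan_cache:
        return plan_cache[sql_text], True
    # Stands in for the expensive parse and optimize step.
    plan = f"plan#{len(plan_cache) + 1}"
    plan_cache[sql_text] = plan
    return plan, False


first = get_plan("select*from dual")
second = get_plan("select*From dual")  # differs only in case: parsed again
third = get_plan("select*from dual")   # exact repeat: served from the cache

print(first, second, third)
print(len(plan_cache), "plans cached")
```

Two of the three calls miss the cache even though all three statements mean the same thing, which is why a single, consistent spelling of each query pays off.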

Don't make SQL statements too complex

I often see a SQL statement captured from the database that fills two A4 pages when printed. Generally speaking, statements that complex usually have problems. I once took such a two-page statement to its original author, and he said it had been too long ago and he could not make sense of it right away. If even the original author is confused by the statement, the database will be confused as well.

Typically the result of one SELECT is used as a subset, and a further query runs against that subset. One level of nesting like this is common, but in my experience, once nesting goes beyond three levels the query optimizer easily produces a bad execution plan, because it gets lost. Something like artificial intelligence is ultimately no match for human judgment; if a person gets dizzy reading the statement, I can guarantee the database will too.

In addition, execution plans can be reused, and the simpler the SQL statement, the more likely it is to be reused. A complex SQL statement has to be re-parsed if even one character changes, and each new version leaves another plan sitting in memory. You can imagine how inefficient the database becomes.

Use temporary tables to hold intermediate results

An important way to simplify SQL statements is to store intermediate results in temporary tables, but the benefits go well beyond simplification. Once the intermediate results are in a temporary table, the follow-up queries run against tempdb. This avoids scanning the main table several times in the program, and it greatly reduces the cases where shared locks block update locks during execution, which reduces blocking and improves concurrency.
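A minimal sketch of the pattern, again using sqlite3 as a stand-in (SQL Server would use a #temp table in tempdb). The table contents and the contact_totals name are invented for the example; the point is that the main table is read once and the follow-up queries only touch the temporary table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderheader ("
    "orderid INTEGER PRIMARY KEY, contactid INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orderheader (contactid, amount) VALUES (?, ?)",
    [(1, 10.0), (1, 25.0), (2, 40.0), (3, 5.0)],
)

# One pass over the main table; the aggregate lands in a temporary table.
conn.execute(
    """
    CREATE TEMP TABLE contact_totals AS
    SELECT contactid, SUM(amount) AS total, COUNT(*) AS order_count
    FROM orderheader
    GROUP BY contactid
    """
)

# Follow-up queries read the small temporary table, not orderheader.
top_contact = conn.execute(
    "SELECT contactid FROM contact_totals ORDER BY total DESC LIMIT 1"
).fetchone()[0]
repeat_contacts = [
    row[0]
    for row in conn.execute(
        "SELECT contactid FROM contact_totals WHERE order_count > 1"
    )
]

print(top_contact, repeat_contacts)
```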

SQL statements in OLTP systems should use bind variables

select*from orderheader where changetime >'2010-10-20 00:00:01' 
select*from orderheader where changetime >'2010-09-22 00:00:01'

The query optimizer considers the two statements above to be different SQL statements and parses each of them. With a bind variable, the statement becomes:

select*from orderheader where changetime >@chgtime

The @chgtime variable can take any value, so a large number of similar queries can reuse one execution plan, which greatly reduces the parsing load on the database. Parse once, reuse many times: that is the principle for improving database efficiency.
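From application code, a bind variable usually means a parameterized query. The sketch below uses sqlite3, whose named placeholder syntax is :chgtime rather than SQL Server's @chgtime; the sample rows and the changed_since helper are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderheader (orderid INTEGER PRIMARY KEY, changetime TEXT)"
)
conn.executemany(
    "INSERT INTO orderheader (changetime) VALUES (?)",
    [
        ("2010-09-01 08:00:00",),
        ("2010-10-01 08:00:00",),
        ("2010-11-01 08:00:00",),
    ],
)

# One statement text for every call; only the bound value changes.
QUERY = "SELECT COUNT(*) FROM orderheader WHERE changetime > :chgtime"


def changed_since(chgtime):
    """Count orders changed after the given timestamp."""
    return conn.execute(QUERY, {"chgtime": chgtime}).fetchone()[0]


since_september = changed_since("2010-09-22 00:00:01")
since_october = changed_since("2010-10-20 00:00:01")
print(since_september, since_october)
```

Because the statement text never changes, the database sees a single statement however many different timestamps are passed in. Binding values instead of concatenating them into the SQL string also avoids quoting mistakes.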

Bind variable peeking

Everything has two sides. Bind variables suit most OLTP processing, but there are exceptions, for example when the field in the where condition is a "skewed field".

A "skewed field" is a column in which the vast majority of values are the same. For example, in a census table, more than 90% of the values in the "ethnicity" column are Han. If a SQL statement queries the number of Han people who are 30 years old, the "ethnicity" column has to appear in the where condition. Using the bind variable @nation here causes a big problem.

Suppose the first value passed in for @nation is "Han": the execution plan will inevitably choose a table scan. Then the second value passed in is "Buyei". The proportion of "Buyei" may be only one in ten thousand, so an index seek should be used. However, because the plan parsed for "Han" the first time is reused, the second query also uses a table scan. This is the well-known "bind variable peeking" problem. It is recommended not to use bind variables on skewed fields.
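The reasoning above can be put into numbers with a toy model. Everything here is an assumption made for illustration: the 10% threshold, the choose_plan function, and the sample data are invented, and real optimizers use far richer cost models.

```python
from collections import Counter

# Hypothetical cutoff: seek the index only when a value matches under 10% of rows.
INDEX_THRESHOLD = 0.10


def choose_plan(value_counts, total_rows, value):
    """Pick the plan a cost-based optimizer might prefer for one value."""
    selectivity = value_counts.get(value, 0) / total_rows
    return "index seek" if selectivity < INDEX_THRESHOLD else "table scan"


# Made-up skewed column: 92% of the rows share a single value.
nation_column = ["Han"] * 9200 + ["Zhuang"] * 799 + ["Buyei"] * 1
counts = Counter(nation_column)
total = len(nation_column)

best_for_han = choose_plan(counts, total, "Han")
best_for_buyei = choose_plan(counts, total, "Buyei")

# With a bind variable, the plan compiled for the first value is reused.
cached_plan = best_for_han

print("best for Han:", best_for_han)
print("best for Buyei:", best_for_buyei)
print("plan actually used for Buyei:", cached_plan)
```

The plan that is ideal for the common value is the wrong one for the rare value, and with a single cached plan only one of them can win.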

Use begin tran only when necessary

In SQL Server, a single SQL statement is a transaction by default and is committed as soon as the statement finishes. This is in effect the smallest form of begin tran: as if a begin tran were implied at the start of each statement and a commit at the end.

In some cases we need to declare begin tran explicitly. For example, when an insert, delete, or update operation has to modify several tables at once, we require that either all of the modifications succeed or none of them do. begin tran does exactly this: it executes several SQL statements together and commits them together at the end. The advantage is that data consistency is guaranteed, but nothing is perfect. The price of begin tran is that none of the resources locked by those SQL statements can be released until the commit.

So if begin tran wraps too many SQL statements, database performance will be terrible. Until the large transaction commits, other statements are inevitably blocked, and a lot of blocking builds up.

The principle for begin tran is: as long as data consistency is guaranteed, the fewer SQL statements wrapped in begin tran, the better! In some cases triggers can be used to keep data in sync, and begin tran is not necessarily needed.
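The all-or-nothing behavior can be sketched with sqlite3, where `with conn:` plays the role of begin tran and commit (and rolls back on an error). The two tables and the place_order helper are hypothetical; note that the transaction wraps only the statements that must succeed or fail together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orderheader ("
    "orderid INTEGER PRIMARY KEY, contactid INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE orderline (orderid INTEGER NOT NULL, item TEXT NOT NULL)"
)
conn.commit()


def place_order(orderid, contactid, items):
    """Insert a header and its lines as one unit; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orderheader VALUES (?, ?)", (orderid, contactid)
            )
            conn.executemany(
                "INSERT INTO orderline VALUES (?, ?)",
                [(orderid, item) for item in items],
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = place_order(1, 7, ["pen", "ink"])
# The None item violates NOT NULL, so the whole order is rolled back.
failed = place_order(2, 7, ["paper", None])

header_count = conn.execute("SELECT COUNT(*) FROM orderheader").fetchone()[0]
line_count = conn.execute("SELECT COUNT(*) FROM orderline").fetchone()[0]
print(ok, failed, header_count, line_count)
```

The second order leaves no trace in either table, including the header row and the first line that had already been inserted before the error.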

Add nolock to some SQL queries

Adding nolock to a SQL statement is an important way to improve the concurrency of SQL Server. Oracle does not need it, because its design is more reasonable here: an undo tablespace keeps the "before image" of the data. If a modification has not been committed yet, what you read is the copy from before the modification, and that copy lives in the undo tablespace. As a result, Oracle's reads and writes do not block each other, which is something Oracle is widely praised for. In SQL Server, reads and writes block each other. To improve concurrency, nolock can be added to some queries so that writes are allowed while the read is in progress. The drawback is that uncommitted dirty data may be read. There are three principles for using nolock.

(1) If the query results are used for an insert, delete, or update, do not add nolock!

(2) If the queried table has frequent page splits, use nolock with caution!

(3) A temporary table can also preserve the "before image" of the data, playing a role similar to Oracle's undo tablespace.

If a temporary table can give you the concurrency you need, do not use nolock.

If the clustered index is not built on a sequential field, the table is prone to page splits

For example, an order table has the order number orderid and the customer number contactid. Which field should the clustered index go on? Order numbers are assigned sequentially, so with a clustered index on orderid every new row is added at the end, and page splits rarely occur. However, most queries filter by customer number, so from a query point of view the clustered index only makes sense on contactid. And for the order table, contactid is not a sequential field.

For example, if the contactid of "Zhang San" is 001, then the order information of "Zhang San" must sit on the first data page of the table. If "Zhang San" places a new order today, that order cannot go on the last page of the table; it has to go on the first page! What if the first page is full? Sorry, the page has to split: roughly half of its rows are moved to a new page to make room for the record.

SQL Server's indexes are different from Oracle's. A SQL Server clustered index actually sorts the table in the order of the clustered index fields, which is equivalent to Oracle's index-organized table. The clustered index is an organizational form of the table itself, so it is very efficient. For the same reason, a newly inserted record is not placed just anywhere; it goes on the data page where it belongs in key order. If that data page has no space left, a page split results. So clearly, when the clustered index is not built on a sequential field of the table, the table is prone to page splits.
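The difference between sequential and non-sequential keys can be seen in a toy simulation. This is a deliberately simplified model written for this article, not SQL Server's storage engine: pages hold at most four keys (PAGE_CAPACITY is a made-up number), a full page in the middle splits in half, and a key beyond the end of the last page simply starts a new page.

```python
import bisect
import random

PAGE_CAPACITY = 4  # hypothetical rows per page, tiny so splits are easy to see


def insert_all(keys):
    """Insert unique keys into a toy clustered index.

    Returns (page_count, split_count).
    """
    pages = [[]]  # pages in key order; each page is a sorted list of keys
    splits = 0
    for key in keys:
        # Locate the page whose key range should hold this key.
        first_keys = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end: allocate a fresh page, no split.
            pages.append([key])
        else:
            # The page is full and the key belongs inside it: split in half.
            half = PAGE_CAPACITY // 2
            left, right = page[:half], page[half:]
            pages[idx:idx + 1] = [left, right]
            splits += 1
            bisect.insort(left if key < right[0] else right, key)
    return len(pages), splits


sequential_keys = list(range(1, 101))   # like orderid
shuffled_keys = sequential_keys[:]      # like contactid arrival order
random.Random(0).shuffle(shuffled_keys)

seq_pages, seq_splits = insert_all(sequential_keys)
rnd_pages, rnd_splits = insert_all(shuffled_keys)
print("sequential:", seq_pages, "pages,", seq_splits, "splits")
print("shuffled:  ", rnd_pages, "pages,", rnd_splits, "splits")
```

Sequential keys never split a page in this model, while the same keys arriving in shuffled order split pages repeatedly and leave the pages only partly full.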

I once ran into a case where a friend's insert performance dropped sharply after he rebuilt the indexes on a table. My guess at what happened is this. The table's clustered index was probably not built on a sequential field, and the table was archived regularly, so its data was sparse. For example, Zhang San had placed 20 orders, but only 5 of them in the last three months. The archiving policy kept 3 months of data, so Zhang San's 15 older orders were archived, leaving 15 free slots that could be reused as new inserts arrived. With free slots available, no page splits occurred. Query performance, however, was relatively low, because queries had to scan over those empty slots.

The situation changed after the clustered index was rebuilt, because rebuilding the clustered index rearranges the data in the table. The free slots were gone and the pages were filled almost completely, so inserts frequently caused page splits and performance dropped sharply.

For tables whose clustered index is not built on a sequential field, should we set a relatively low page fill factor? Should we avoid rebuilding the clustered index? These are questions worth thinking about!

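With nolock, queries on tables with frequent page splits can easily skip or repeat rows

With nolock, queries can run while inserts, deletes, and updates are in progress. But because those modifications happen at the same time, a data page will sometimes fill up and a page split becomes unavoidable while the nolock query is still running. For example, a record already read on page 100 may be moved to page 101 by the split, so the nolock query may read the same record again when it reaches page 101, producing a "repeated read". Likewise, if data on page 100 is moved to page 99 before it has been read, the nolock query may miss that record, producing a "skipped read".

The friend mentioned above saw errors in some operations after adding nolock. My guess is that a nolock query produced a repeated read, and inserting two identical records into another table naturally caused a primary key conflict.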
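The repeated and skipped reads described above can be reproduced in a toy model. This is only an illustration under simplifying assumptions, not SQL Server's actual scan logic: pages are plain Python lists, the reader walks them by position without any locking, and a hook function plays the role of a concurrent writer. All names are hypothetical.

```python
def scan(pages, on_page_read=None):
    """Read pages in position order without locks.

    The optional hook runs after each page is read and stands in for a
    concurrent writer that changes the pages during the scan.
    """
    seen = []
    position = 0
    while position < len(pages):
        seen.extend(pages[position])
        if on_page_read:
            on_page_read(pages, position)
        position += 1
    return seen


def split_first_page(pages, position):
    # After page 0 has been read, a split moves rows 3 and 4 to a new page 1.
    if position == 0:
        pages[0:1] = [[1, 2], [3, 4]]


def move_row_back(pages, position):
    # After page 0 has been read, unread row 5 moves from page 1 to page 0.
    if position == 0:
        pages[1].remove(5)
        pages[0].append(5)


repeated = scan([[1, 2, 3, 4], [5, 6]], split_first_page)
skipped = scan([[1, 2, 3], [5, 6]], move_row_back)

print("scan during a split:   ", repeated)
print("scan during a row move:", skipped)
```

In the first scan rows 3 and 4 are returned twice; in the second, row 5 is never returned even though it was in the table the whole time.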

Take care when using like for fuzzy queries

Sometimes you need a fuzzy query such as:

select*from contact where username like '%yue%'

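Because the keyword %yue% has a "%" in front of yue, the query cannot seek on an index and has to do a full table scan. Unless it is truly necessary, do not put % in front of the keyword.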
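The difference shows up directly in the query plan. The sketch below uses sqlite3 as a stand-in; the contact table is made up, and the NOCASE collation is a SQLite-specific detail that lets its case-insensitive like use the index. Other databases have their own rules, but the leading-wildcard effect is the same in spirit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# COLLATE NOCASE is what allows SQLite's case-insensitive LIKE to use the index.
conn.execute(
    "CREATE TABLE contact ("
    "id INTEGER PRIMARY KEY, username TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX idx_contact_username ON contact (username)")


def plan(sql):
    """Return the optimizer's readable plan steps for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


# Prefix match: the index can narrow the search to a key range.
prefix_plan = plan("SELECT * FROM contact WHERE username LIKE 'yue%'")
# Leading wildcard: every row has to be examined.
wildcard_plan = plan("SELECT * FROM contact WHERE username LIKE '%yue%'")

print(prefix_plan)
print(wildcard_plan)
```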

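The effect of implicit data type conversion on query efficiency

On a SQL Server 2000 database, our program did not submit the field's value with a strong type when sending the SQL statement, leaving SQL Server 2000 to convert the data type automatically. The incoming parameter then did not match the type of the primary key field, and in that case SQL Server 2000 may use a full table scan. We have not seen this problem on SQL Server 2005, but it is still worth watching for.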
