The Ultimate Guide - How to Write Better SQL Queries?
Implicit in the reverse model is the fact that there is a difference between set-based and program-based approaches to query building.
- The procedural approach to querying is one very similar to programming: you tell the system what needs to be done and how to do it. For example, as in the example in the previous article, query the database by executing one function and then calling another function, or use a logical approach involving loops, conditions, and user-defined functions (UDFs) to obtain the final query results. You will find that in this way, you are always requesting a subset of the data in each layer. This approach is also often referred to as step-by-step or row-by-row querying.
- The other is a collection-based method, where you only need to specify the operations that need to be performed. What you have to do with this method is specify the conditions and requirements for the results you want to obtain through the query. When retrieving data, you don't need to pay attention to the internal mechanisms that implement the query: the database engine determines the best algorithm and logic to execute the query.
Since SQL is set-based, this approach is more efficient than the procedural approach, which explains why in some cases, SQL can work faster than code.
Set-based query methods are also skills that the data mining analysis industry requires you to master! Because you need to be skilled in switching between these two methods. If you find that you have procedural queries in your queries, you should consider whether this part needs to be rewritten.
Reverse mode is not static. As you progress towards becoming a SQL developer, avoiding query reverse models and rewriting queries can be a daunting task. So you often need to use tools to optimize your queries in a more structured way.
Thinking about performance requires not only a more structured approach, but also a deeper approach.
However, this structured and in-depth approach is primarily based on query plans. The query plan is first parsed into a "parse tree" and defines exactly what algorithm is used for each operation and how the operations are coordinated.
Query OptimizationWhen optimizing a query, you will most likely need to manually inspect the plan generated by the optimizer. In this case, you will need to analyze your query again by looking at the query plan.
To master such a query plan, you need to use some tools provided by the database management system. You can use some of the following tools:
- Some software package functionality tools can generate graphical representations of query plans.
- Other tools can provide you with a text description of the query plan.
Note that if you are using PostgreSQL, you can differentiate between different EXPLAINs, you just get a description of how the planner executes the query without running the plan. At the same time, EXPLAIN ANALYZE will execute the query and return you an analysis report that evaluates the query plan and the actual query plan. Generally speaking, the actual execution plan will actually execute the plan, while the evaluated execution plan can solve this problem without executing the query. Logically, the actual execution plan is more useful because it contains additional details and statistics about what actually happened when the query was executed.
Next you will learn more about XPLAIN and ANALYZE, and how to use these two commands to further understand your query plans and query performance. To do this, you need to start doing some examples using two tables: one_million and half_million.
You can use EXPLAIN to retrieve the current information of the one_million table: make sure you put it in the first place when running the query, and after the run is completed, it will be returned to the query plan:
EXPLAIN SELECT * FROM one_million; QUERY PLAN <span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-emphasis">___</span>_ Seq Scan on one_million (cost=0.00..18584.82 rows=1025082 width=36) (1 row)
In the above example, we see that the cost of the query is 0.00..18584.82, the number of rows is 1025082, and the column width is 36.
At the same time, you can also use ANALYZE to update statistical information.
ANALYZE one_million; EXPLAIN SELECT * FROM one_million; QUERY PLAN <span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-emphasis">___</span>_ Seq Scan on one_million (cost=0.00..18334.00 rows=1000000 width=37) (1 row)
In addition to EXPLAIN and ANALYZE, you can also use EXPLAIN ANALYZE to retrieve the actual execution time:
EXPLAIN ANALYZE SELECT * FROM one_million; QUERY PLAN <span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span>_ Seq Scan on one_million (cost=0.00..18334.00 rows=1000000 width=37) (actual time=0.015..1207.019 rows=1000000 loops=1) Total runtime: 2320.146 ms (2 rows)
The disadvantage of using EXPLAIN ANALYZE is that you need to actually execute the query, which is worth noting!
All the algorithms we have seen so far are sequential scans or full table scans: this is a method of performing a scan on a database in which each row of the table is scanned in sequential (serial) order When reading, each column is checked to see if it meets the criteria. In terms of performance, a sequential scan is not the best execution plan because the entire table needs to be scanned. But if you use a slow disk, sequential reads will also be fast.
There are also some examples of other algorithms:
EXPLAIN ANALYZE SELECT * FROM one<span class="hljs-emphasis">_million JOIN half_</span>million ON (one<span class="hljs-emphasis">_million.counter=half_</span>million.counter); QUERY PLAN <span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span><span class="hljs-strong">_____</span>_ Hash Join (cost=15417.00..68831.00 rows=500000 width=42) (actual time=1241.471..5912.553 rows=500000 loops=1) Hash Cond: (one<span class="hljs-emphasis">_million.counter = half_</span>million.counter) <span class="hljs-code"> -> Seq Scan on one_million</span> <span class="hljs-code"> (cost=0.00..18334.00 rows=1000000 width=37)</span> <span class="hljs-code"> (actual time=0.007..1254.027 rows=1000000 loops=1)</span> <span class="hljs-code"> -> Hash (cost=7213.00..7213.00 rows=500000 width=5)</span> <span class="hljs-code"> (actual time=1241.251..1241.251 rows=500000 loops=1)</span> <span class="hljs-code"> Buckets: 4096 Batches: 16 Memory Usage: 770kB</span> <span class="hljs-code"> -> Seq Scan on half_million</span> <span class="hljs-code"> (cost=0.00..7213.00 rows=500000 width=5)</span> (actual time=0.008..601.128 rows=500000 loops=1) Total runtime: 6468.337 ms
We can see that the query optimizer selected Hash Join. Remember this operation because we need to use this to evaluate the time complexity of the query. We noticed that there is no half_million.counter index in the above example, we can add the index in the following example:
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">ON</span> half_million(counter); <span class="hljs-keyword">EXPLAIN</span> <span class="hljs-keyword">ANALYZE</span> <span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> one_million <span class="hljs-keyword">JOIN</span> half_million <span class="hljs-keyword">ON</span> (one_million.counter=half_million.counter); QUERY PLAN ______________________________________________________________ <span class="hljs-keyword">Merge</span> <span class="hljs-keyword">Join</span> (<span class="hljs-keyword">cost</span>=<span class="hljs-number">4.12</span>.<span class="hljs-number">.37650</span><span class="hljs-number">.65</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">500000</span> width=<span class="hljs-number">42</span>) (actual <span class="hljs-keyword">time</span>=<span class="hljs-number">0.033</span>.<span class="hljs-number">.3272</span><span class="hljs-number">.940</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">500000</span> loops=<span class="hljs-number">1</span>) <span class="hljs-keyword">Merge</span> Cond: (one_million.counter = half_million.counter) -> <span class="hljs-keyword">Index</span> <span class="hljs-keyword">Scan</span> <span class="hljs-keyword">using</span> one_million_counter_idx <span class="hljs-keyword">on</span> one_million (<span class="hljs-keyword">cost</span>=<span class="hljs-number">0.00</span>.<span class="hljs-number">.32129</span><span class="hljs-number">.34</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">1000000</span> width=<span class="hljs-number">37</span>) (actual <span class="hljs-keyword">time</span>=<span class="hljs-number">0.011</span>.<span class="hljs-number">.694</span><span class="hljs-number">.466</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">500001</span> loops=<span class="hljs-number">1</span>) -> <span class="hljs-keyword">Index</span> <span class="hljs-keyword">Scan</span> <span class="hljs-keyword">using</span> half_million_counter_idx <span class="hljs-keyword">on</span> half_million (<span class="hljs-keyword">cost</span>=<span class="hljs-number">0.00</span>.<span class="hljs-number">.14120</span><span class="hljs-number">.29</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">500000</span> width=<span class="hljs-number">5</span>) (actual <span class="hljs-keyword">time</span>=<span class="hljs-number">0.010</span>.<span class="hljs-number">.683</span><span class="hljs-number">.674</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">500000</span> loops=<span class="hljs-number">1</span>) Total runtime: <span class="hljs-number">3833.310</span> ms (<span class="hljs-number">5</span> <span class="hljs-keyword">rows</span>)
By creating the index, the query optimizer has decided how to find the Merge join when the index is scanned.
Please note the difference between index scan and full table scan (sequential scan): the latter (also called "table scan") finds suitable results by scanning all data or indexing all pages, while the former Scan only every row in the table.
The second part of the tutorial is introduced here. The final article in the series "How to Write Better SQL Queries" will follow, so stay tuned.
Please indicate the source of reprinting: Grape City Control
The above is the detailed content of The Ultimate Guide - How to Write Better SQL Queries?. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

DeepSeek is a powerful intelligent search and analysis tool that provides two access methods: web version and official website. The web version is convenient and efficient, and can be used without installation; the official website provides comprehensive product information, download resources and support services. Whether individuals or corporate users, they can easily obtain and analyze massive data through DeepSeek to improve work efficiency, assist decision-making and promote innovation.

There are many ways to install DeepSeek, including: compile from source (for experienced developers) using precompiled packages (for Windows users) using Docker containers (for most convenient, no need to worry about compatibility) No matter which method you choose, Please read the official documents carefully and prepare them fully to avoid unnecessary trouble.

BITGet is a cryptocurrency exchange that provides a variety of trading services including spot trading, contract trading and derivatives. Founded in 2018, the exchange is headquartered in Singapore and is committed to providing users with a safe and reliable trading platform. BITGet offers a variety of trading pairs, including BTC/USDT, ETH/USDT and XRP/USDT. Additionally, the exchange has a reputation for security and liquidity and offers a variety of features such as premium order types, leveraged trading and 24/7 customer support.

Ouyi OKX, the world's leading digital asset exchange, has now launched an official installation package to provide a safe and convenient trading experience. The OKX installation package of Ouyi does not need to be accessed through a browser. It can directly install independent applications on the device, creating a stable and efficient trading platform for users. The installation process is simple and easy to understand. Users only need to download the latest version of the installation package and follow the prompts to complete the installation step by step.

Gate.io is a popular cryptocurrency exchange that users can use by downloading its installation package and installing it on their devices. The steps to obtain the installation package are as follows: Visit the official website of Gate.io, click "Download", select the corresponding operating system (Windows, Mac or Linux), and download the installation package to your computer. It is recommended to temporarily disable antivirus software or firewall during installation to ensure smooth installation. After completion, the user needs to create a Gate.io account to start using it.

Ouyi, also known as OKX, is a world-leading cryptocurrency trading platform. The article provides a download portal for Ouyi's official installation package, which facilitates users to install Ouyi client on different devices. This installation package supports Windows, Mac, Android and iOS systems. Users can choose the corresponding version to download according to their device type. After the installation is completed, users can register or log in to the Ouyi account, start trading cryptocurrencies and enjoy other services provided by the platform.

Gate.io is a highly acclaimed cryptocurrency trading platform known for its extensive token selection, low transaction fees and a user-friendly interface. With its advanced security features and excellent customer service, Gate.io provides traders with a reliable and convenient cryptocurrency trading environment. If you want to join Gate.io, please click the link provided to download the official registration installation package to start your cryptocurrency trading journey.

This tutorial guides you through installing and configuring Nginx and phpMyAdmin on an Ubuntu system, potentially alongside an existing Apache server. We'll cover setting up Nginx, resolving potential port conflicts with Apache, installing MariaDB (
