分析MySQL中优化distinct的技巧_MySQL
有这样的一个需求:select count(distinct nick) from user_access_xx_xx;
这条sql用于统计用户访问的uv,由于单表的数据量在10G以上,即使在user_access_xx_xx上加上nick的索引,
通过查看执行计划,也为全索引扫描,sql在执行的时候,会对整个服务器带来抖动;
root@db 09:00:12>select count(distinct nick) from user_access; +———————-+ | count(distinct nick) | +———————-+ | 806934 | +———————-+ 1 row in set (52.78 sec)
执行一次sql需要花费52.78s,已经非常的慢了
现在需要换一种思路来解决该问题:
我们知道索引的值是按照索引字段升序的,比如我们对(nick,other_column)两个字段做了索引,那么在索引中的则是按照nick,other_column的升序排列:
我们现在的sql:select count(distinct nick) from user_access;则是直接从nick1开始一条条扫描下来,直到扫描到最后一个nick_n,
那么中间过程会扫描很多重复的nick,如果我们能够跳过中间重复的nick,则性能会优化非常多(在oracle中,这种扫描技术为loose index scan,但在5.1的版本中,mysql中还不能直接支持这种优化技术):
所以需要通过改写sql来达到伪loose index scan:
root@db 09:41:30>select count(*) from ( select distinct(nick) from user_access)t ; | count(*) | +———-+ | 806934 | 1 row in set (5.81 sec)
Sql中先选出不同的nick,最后在外面套一层,就可以得到nick的distinct值总和;
最重要的是在子查询中:select distinct(nick) 实现了上图中的伪loose index scan,优化器在这个时候的执行计划为Using index for group-by ,
需要注意的是mysql把distinct优化为group by,它首先利用索引来分组,然后扫描索引,对需要的nick只扫描一次;
两个sql的执行计划分别为:
优化写法:
root@db 09:41:10>explain select distinct(nick) from user_access-> ; +—-+————-+——————————+——-+—————+————-| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +—-+————-+——————————+——-+—————+————- | 1 | SIMPLE | user_access | range | NULL | ind_user_access_nick | 67 | NULL | 2124695 | Using index for group-by | +—-+————-+——————————+——-+—————+————-
原始写法:
root@db 09:42:55>explain select count(distinct nick) from user_access; +—-+————-+——————————+——-+—————+————- | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +—-+————-+——————————+——-+—————+————- | 1 | SIMPLE | user_access | index | NULL | ind_user_access | 177 | NULL | 19546123 | Using index |

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



MySQL is suitable for beginners because it is simple to install, powerful and easy to manage data. 1. Simple installation and configuration, suitable for a variety of operating systems. 2. Support basic operations such as creating databases and tables, inserting, querying, updating and deleting data. 3. Provide advanced functions such as JOIN operations and subqueries. 4. Performance can be improved through indexing, query optimization and table partitioning. 5. Support backup, recovery and security measures to ensure data security and consistency.

MySQL is an open source relational database management system. 1) Create database and tables: Use the CREATEDATABASE and CREATETABLE commands. 2) Basic operations: INSERT, UPDATE, DELETE and SELECT. 3) Advanced operations: JOIN, subquery and transaction processing. 4) Debugging skills: Check syntax, data type and permissions. 5) Optimization suggestions: Use indexes, avoid SELECT* and use transactions.

You can open phpMyAdmin through the following steps: 1. Log in to the website control panel; 2. Find and click the phpMyAdmin icon; 3. Enter MySQL credentials; 4. Click "Login".

Create a database using Navicat Premium: Connect to the database server and enter the connection parameters. Right-click on the server and select Create Database. Enter the name of the new database and the specified character set and collation. Connect to the new database and create the table in the Object Browser. Right-click on the table and select Insert Data to insert the data.

MySQL and SQL are essential skills for developers. 1.MySQL is an open source relational database management system, and SQL is the standard language used to manage and operate databases. 2.MySQL supports multiple storage engines through efficient data storage and retrieval functions, and SQL completes complex data operations through simple statements. 3. Examples of usage include basic queries and advanced queries, such as filtering and sorting by condition. 4. Common errors include syntax errors and performance issues, which can be optimized by checking SQL statements and using EXPLAIN commands. 5. Performance optimization techniques include using indexes, avoiding full table scanning, optimizing JOIN operations and improving code readability.

You can create a new MySQL connection in Navicat by following the steps: Open the application and select New Connection (Ctrl N). Select "MySQL" as the connection type. Enter the hostname/IP address, port, username, and password. (Optional) Configure advanced options. Save the connection and enter the connection name.

Recovering deleted rows directly from the database is usually impossible unless there is a backup or transaction rollback mechanism. Key point: Transaction rollback: Execute ROLLBACK before the transaction is committed to recover data. Backup: Regular backup of the database can be used to quickly restore data. Database snapshot: You can create a read-only copy of the database and restore the data after the data is deleted accidentally. Use DELETE statement with caution: Check the conditions carefully to avoid accidentally deleting data. Use the WHERE clause: explicitly specify the data to be deleted. Use the test environment: Test before performing a DELETE operation.

Redis uses a single threaded architecture to provide high performance, simplicity, and consistency. It utilizes I/O multiplexing, event loops, non-blocking I/O, and shared memory to improve concurrency, but with limitations of concurrency limitations, single point of failure, and unsuitable for write-intensive workloads.
