Highly liked sharing: MySQL optimization ideas that are in line with production-Mysql Tutorial-php.cn

Table of Contents

Statistical Business-OLAP" >Statistical Business-OLAP

Summary" >Summary

Home

Database

Mysql Tutorial

Highly liked sharing: MySQL optimization ideas that are in line with production

藏色散人

Nov 05, 2021 pm 02:16 PM

mysql

Preface

The starting point of writing this article is to record my accumulated experience in dealing with data problems at work. As I write and write, I find that every point will lead to Other background knowledge should be provided, such as when optimizing indexes, you need to have a certain understanding of slow query, Explain and other related functions. For example, introducing Elasticsearch requires solving data synchronization, learning Elasticsearch knowledge, etc. Due to the length of the article, it is impossible to cover every point. They are all detailed like video tutorials, and I can only summarize them based on my limited knowledge and some general points. Even so, the length of the article is already very long. If you are interested in a certain point, please go to Baidu/Google for in-depth knowledge of individual details.

The article is quite long, so if you are interested, you may want to read it through. I hope you haven’t wasted dozens of minutes. [Recommended learning: "mysql video tutorial"]

Thinking Perspective

Database technology has gone through the manual management stage and the file system stage so far. and database system stage.

In the early days when there was no software system, real-world operations of a certain business could also be realized through the manual management stage of manual accounting and verbal agreements. This form existed for a long time and was relatively inefficient. a plan. In the next stage, with the development of computer technology, there was a file system stage that replaced manual accounting with excel tables, which improved productivity to a certain extent. In the software system stage, which is a database system with simple operation and high efficiency, productivity has been improved again, specific problems in the real world are abstracted into data, and real-world business is represented through the flow and change of data. In software systems, data storage is generally composed of a relational database and multiple non-relational databases.

The database is strongly related to the system business. This requires the product manager to understand the data storage and query process when designing the business. At the beginning of the design, it is clear what impact the business changes will have on the database and whether A new technology stack needs to be referenced. For example, a business designed by the product manager is to conduct statistical analysis and summary of data on multiple MySQL tables with a single table volume of millions. If MySQL multi-table query is directly used, slow queries will definitely occur and cause the msyql service to go down. In this case, the solution is Either compromise on the product side or change the technology stack.

In the system architecture and database solution, we should choose the one that is more suitable for the company's team capabilities. In the early stage of the system, simple database optimization with banknote capabilities will be the most cost-effective solution, but when it comes to mysql database banknote capabilities, there is nothing we can do. , introducing software services that focus on key functions will become the most cost-effective solution. How to choose the appropriate solution when encountering problems is the time to reflect your value.

A poor boy falls in love with a rich girl. The short-term sweetness cannot match the real class inequality. The happy ending only exists in the fantasy of the poor boy and the TV series of Teacher Qiong Yao.

How to improve the performance of data storage at a limited cost is the central idea of this article.

Background knowledge

I believe that everyone will often come into contact with the following content in their daily work. Let me briefly summarize it.

Relational database

Relational database is a data organization composed of two-dimensional tables and the relationships between them, providing transaction data consistency, Functions such as data persistence are the core storage services of software systems. They are the databases we most often come into contact with during development and interviews. For some small outsourcing projects, one MySQL is enough to meet all business needs. It is something that we often come into contact with, and it is actually full of tricks. We will discuss the tricks in detail in the following chapters.
Advantages:

Transaction
Persistence
Relatively common SQL language

Problems

The requirements for hard disk I/O are very high
The aggregation query efficiency of large amounts of data is low
Index misses
The leftmost matching principle of the index leads to mismatches Suitable for full-text retrieval
Improper use of transactions will cause lock congestion
Various problems caused by horizontal expansion are difficult to deal with

Non-relational Type database - NoSql

MySQL database, as a relational data storage software, has advantages and obvious disadvantages. Therefore, when the data volume of the software system continues to expand and the business complexity continues to increase, We cannot expect to solve all problems by enhancing the capabilities of the MySQL database. Instead, we need to introduce other storage software and use various types of NoSQL to solve the problems of the software system's expanding data volume and increasing business complexity.
Relational database is an optimization of relational database in different scenarios. It does not mean that everything will be fine if you introduce some kind of NoSQL. It means that you should fully understand the types and application difficulties of NoSQL on the market and choose the appropriate storage in the appropriate scenario. Software is the way to go.

Key-Value type

In business, the contents of certain tables are often queried, but most of the query results remain unchanged, so Key-value storage software, mainly Memcached and Redis, has emerged and is widely used in cache modules in the system. Redis has more data structures and persistence than Memcached, making it the most widely used among KV-type NoSQL.

Search type

In the scenario of full-text search, query optimization of MySQLB tree index, like query cannot hit the index, and every like keyword query is one time Full table scan can be supported in tables with tens of thousands of data, but slow queries will occur when the data is at the end. If the business code is not well written and the Like query is called in the transaction, a read lock will occur. ElasticSearch, with inverted index as its core, can perfectly meet the scenario of full-text search. At the same time, ElasticSearch also supports massive data very well, and the documentation and ecology are also very good. ElasticSearch is a representative product of search type.

Document type

Document type NoSql refers to a type of NoSql that stores semi-structured data as documents. Document type NoSql usually stores data in JSON or XML format. , so document-type NoSql does not have Schema. Since there is no Schema feature, we can store and read data at will. Therefore, the emergence of document-type NoSql solves the problem of inconvenient expansion of relational database table structures. The author has never used

column formula

For enterprises of a certain size, the business often involves some real-time and flexible data summary, which is not suitable for this kind of business Use the solution of calculating in advance to solve the problem. Even if you can write the business using the solution of calculating and summarizing in advance, as the number of summarized data increases, the final step of accumulating the summarized data will gradually become very slow. Column-based NoSQL is the product of this scenario. It is one of the most representative technologies in the big data era. The most common one is HBase, but the application of HBase is very heavy and often requires a complete set of Hadoop ecosystem to run. The author's company Alibaba Cloud's AnalyticDB is used, a column storage software compatible with MySql query statements. The powerful query capabilities of summary column storage software are sufficient to support various real-time and flexible data summary services.

Case

Taking 2021 as the time node, most systems start with the following plan in the early stage. Next, I will use this case Make some adjustments slowly.

Highly liked sharing: MySQL optimization ideas that are in line with production

#The benefits brought by hardware upgrades are lower as time goes by. This is the fastest optimization solution when time and personnel are tight. The benefits brought by software optimization are higher in the future, but the level of technical personnel required is also higher in the future. When time and personnel permit, it is the most cost-effective optimization solution. Hardware and software optimization are not mutually exclusive. When needed, both can approach the upper limit of MYSQL performance at the same time.

Highly liked sharing: MySQL optimization ideas that are in line with production

Hard Optimization - Cash Ability

Phase One
- Improvement For disk I/O, try to use SSD disks (quality improvement)
- Increase memory and increase query cache space
- Increase the number of CPU cores and increase execution threads
Phase 2
- Replace the self-built mysql with the service provider mysql service
- Enable the built-in read and write separation function
Phase 3
- The service provider’s mysql service is replaced with a cloud-native distributed database
- Enable the built-in read and write separation function
- Enable the built-in sub-table function

Soft Optimization - Query - OLTP

OLTP is mainly used to record the occurrence of certain types of business events, such as user behavior. When the behavior occurs, the system will record When and where the user did something, such a row (or multiple rows) of data will be updated in the database in the form of additions, deletions, and modifications. This requires high real-time performance, strong stability, and ensuring that the data is updated successfully in a timely manner. Common business systems all belong to OLTP, and the databases used are transaction databases, such as MySlq, Oracle, etc. For OLTP, improving query speed and service stability are the core of optimization

Highly liked sharing: MySQL optimization ideas that are in line with production

Slow query
- Discover the SQL with efficiency problems through the slow query log
Problem SQL troubleshooting direction
- There is a problem with the index design
- There is a problem with the SQL statement
- Wrong index selection for the database
- The single table is large
Explain specific analysis
- View sql execution comparison rate
- Check the index hit situation (key points)
mysql optimizer
- When the optimizer selects an index, it will refer to the index Cardinality
- The cardinality is automatically maintained and estimated by MySQL, and may not be accurate
- If the index does not hit or the wrong index is used, there is a problem with the optimizer step
- analyze can re-count index information and recalculate the base number
Force index
- The force keyword can force the use of index and forcefully specify index on the business code
Covered index - the most ideal hit index
- Covered index means that the same index (unique, ordinary, joint index, etc.) is used from the execution of the query statement to the returned result
- Covering indexes can reduce back table queries
- If the data query uses more than one index, it is not a covering index
- You can optimize the SQL statement or optimize the joint index. Use covering index
count() function
- count(non-indexed field) - covering index cannot be used, theoretically the slowest
- count(index field) - can overwrite the index, but still needs to determine whether the field is null each time
- count (primary key) - Same as above
- count(1) - only scans the index tree, there is no process of parsing the data rows , theoretically faster, but it will still determine whether 1 is null
- count(*) - MySQL specifically optimizes the count(*) function to directly return the number of data in the index tree, optimal
ORDER BY
- Minimize additional sorting and specify where condition
- The combination of where statement and ORDER BY statement satisfies the leftmost prefix
- The most efficient - Index coverage (few scenarios, low chance of encounter)
  - Index coverage can skip generating the intermediate result set and directly output the query results
  - The ORDER field needs to be indexed and consistent with the WHERE conditions and output The contents are all in the same index
Paging query
- First find a way to cover the index
- Find out what you need first The id of the data, return to the table to get the final result set
Index pushdown
- KEY store_id_guide_id (store_id,guide_id) USING BTREE
- select * from table where store_id in (1,2) and guide_id = 3;
- Before MySQL5.6, you need to use the index to query store_id in (1,2), then add all tables to verify film_id = 3
- MySQL5.6, if it can be read in the index, directly use index filtering
Loose index scan
- KEY store_id_guide_id (store_id,guide_id) USING BTREE
- select film_id from table where guide_id = 3
- New features of MySQL8.0
- Loose index scan can break the "left-hand principle" and solve the problem of losing the leading brother
- The efficiency is lower than that of the joint index
Function operation
- If you perform function operation on the index field, the optimizer will give up the index
- This situation may include: time function, converting string to number, character encoding conversion
- Optimize the use of server-side logic to replace mysql functions
The single table size is too large
- Upgrade mysql, the single table size that different mysql software can carry It is different. Based on my current experience, there is no problem in querying the hit index when the single table of Alibaba Cloud polardb cluster version is 200 million (high priority)
- Data settlement - such as pipeline data can be The latest value is obtained through settlement at a certain point in time, and the settled flow is transferred to the backup table (medium priority)
- Separation of hot and cold data - data that cannot be settled is distinguished according to the frequency of query, and the frequency is low Transfer the query to another table, and distinguish the entry point of the query (medium priority) in terms of business
- Distributed database table splitting-enable the table splitting function of the distributed database with orders, and the distributed database component management is Insertion and query after table splitting (medium priority)
- Code to implement table splitting - split a single table into multiple tables according to certain rules, after splitting in most framework ORMs of PHP and GO It is necessary to make certain modifications to the framework ORM. The ORM in JAVA has native support. It is recommended to consider it at the early stage of the project. The later it becomes more difficult (lower priority)

Soft Optimization - Write Update Delete

Highly liked sharing: MySQL optimization ideas that are in line with production

Lock
- According to granularity MySQL locks can be divided into global locks, table-level locks, and row locks
- Global lock
  - 自google/baidu
- Table-level locks are divided into table locks (data locks) and metadata locks
  - Table locks
    - 自google/baidu
  - Metadata lock
    - 自google/baidu
- ##Row lock will lock the data row, divided into Shared locks and exclusive locks
Solution to deadlock
- Parameter configuration
  - Adjust the innodb_lock_wait_timeout parameter
    - The default is 50 seconds, that is, if the lock is not acquired after waiting for 50 seconds, the current statement will report an error
    - If the waiting time expires Long, you can shorten this parameter appropriately
  - Active deadlock detection: innodb_deadlock_detect
    - Roll back less costly transactions when a deadlock is found
    - Enabled by default
- Do not open the transaction if it is not necessary
- Try to place the query outside the transaction to reduce the number of locked rows
- Avoid The transaction time is too long, do not trigger http requests in the transaction
- Actively check the transaction status
```
show  processlist;SELECT * FROM information_schema.INNODB_TRX; //长事务SELECT * FROM information_schema.INNODB_LOCKs; //查看锁SELECT * FROM information_schema.INNODB_LOCK_waits; //查看阻塞事务
```
  Copy after login

Search business

The number of search rows is less than 100,000 - mysql is hard to carry
- Improve the cpu, io and memory hardware of mysql
The number of search rows is more than 100,000 - introduction Elasticsearch

Highly liked sharing: MySQL optimization ideas that are in line with production

Elasticsearch’s inverted index is suitable for full-text search, but the data structure has poor flexibility.

Data synchronization
- When the business code changes data, it is synchronized to Elasticsearch at the same time
- Canel subscription mysql log triggers synchronization
Elasticsearch-index
- is composed of a list of documents with the same fields - analogous to mysql's table
- Once the field type is set, modification is prohibited and new fields are allowed
- Specific The method is self-directed google/baidu
Elasticsearch-Document
- The user’s data document stored in es - analogy to the row of mysql
- is composed of metadata and Json Object composition
- Metadata and Json Object details are provided by google/baidu
Elasticsearch-Word Segmenter
- by Google/baidu
Elasticsearch-Inverted Index (Key Points)
- 自google/baidu

Statistical Business-OLAP

OLAP is used for decision-making analysis of data relative to OLTP transaction processing scenarios. It is an offline data warehouse idea used in big data analysis, not a specific technology stack. If your solution can embody the idea of OLAP analysis and processing, then the solution is OLAP.

Early data warehouse construction mainly refers to modeling and summarizing enterprise business databases such as ERP, CRM, SCM and other data into the data warehouse engine according to the requirements of decision-making analysis. Its application is mainly for reporting purposes. It is to support the decision-making of management and business personnel (medium- and long-term strategic decisions). As IT technology moves toward the Internet and mobility, data sources become more and more abundant. Unstructured data appears on the basis of the original business database, such as website logs, IoT device data, APP buried data, etc. The amount of these data It is several orders of magnitude larger than previous structured data.

No matter how the business faced by OLAP changes, it is inseparable from the following steps: Determine the analysis field->Synchronize business data to the computing library->Data cleaning modeling->Synchronize to the data warehouse- >Exposed to the outside

The calculation source database is specially used for data cleaning, and the purpose is to avoid affecting the performance of the business database during data cleaning. By cleaning the data in the calculation source database according to business and dimensions, the usability and reusability of the data are increased, and the final real-time detailed data is obtained, which is transferred to the data warehouse, and the data warehouse provides the final decision analysis data.

DEMO plan

Highly liked sharing: MySQL optimization ideas that are in line with production

Production plan

Highly liked sharing: MySQL optimization ideas that are in line with production

The software in each link can use the same functions If the software is replaced by the team's most confident software implementation solution, then the solution is OLAP.

Summary

Optimization must be down-to-earth, with capability accumulation step by step, multiple rounds of iterations, and cannot be achieved overnight. Conduct multiple rounds of iterations based on your own foundation, business scenarios and future development expectations.

The principle of iteration is to first improve the efficiency of the software through soft optimization and hard optimization of a single software service. When the optimization cost is lower than the benefit, based on future development expectations, refer to mature solutions on the market and follow the solutions. Introduce new software as needed for combined innovation, and avoid blind copying. Only through organic integration can the effects of 1 1>2, 2 1>3 be achieved. When the referenced software encounters a bottleneck, repeat this process.

Thank you for reading this. The above is all the content of the article. The optimization points and solutions proposed in the content are not necessarily the optimal solutions. They are the best practices in personal work. If you have different opinions, you are welcome to discuss them. comminicate.

The above is the detailed content of Highly liked sharing: MySQL optimization ideas that are in line with production. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks ago By DDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks ago By DDD

Where to find the Crane Control Keycard in Atomfall

3 weeks ago By DDD

Saving in R.E.P.O. Explained (And Save Files)

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Assassin's Creed Shadows - How To Find The Blacksmith And Unlock Weapon And Armour Customisation

4 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7566

CakePHP Tutorial

1386

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

106

Related knowledge

MySQL: Simple Concepts for Easy Learning Apr 10, 2025 am 09:29 AM

MySQL is an open source relational database management system. 1) Create database and tables: Use the CREATEDATABASE and CREATETABLE commands. 2) Basic operations: INSERT, UPDATE, DELETE and SELECT. 3) Advanced operations: JOIN, subquery and transaction processing. 4) Debugging skills: Check syntax, data type and permissions. 5) Optimization suggestions: Use indexes, avoid SELECT* and use transactions.

How to open phpmyadmin Apr 10, 2025 pm 10:51 PM

You can open phpMyAdmin through the following steps: 1. Log in to the website control panel; 2. Find and click the phpMyAdmin icon; 3. Enter MySQL credentials; 4. Click "Login".

MySQL: An Introduction to the World's Most Popular Database Apr 12, 2025 am 12:18 AM

MySQL is an open source relational database management system, mainly used to store and retrieve data quickly and reliably. Its working principle includes client requests, query resolution, execution of queries and return results. Examples of usage include creating tables, inserting and querying data, and advanced features such as JOIN operations. Common errors involve SQL syntax, data types, and permissions, and optimization suggestions include the use of indexes, optimized queries, and partitioning of tables.

Why Use MySQL? Benefits and Advantages Apr 12, 2025 am 12:17 AM

MySQL is chosen for its performance, reliability, ease of use, and community support. 1.MySQL provides efficient data storage and retrieval functions, supporting multiple data types and advanced query operations. 2. Adopt client-server architecture and multiple storage engines to support transaction and query optimization. 3. Easy to use, supports a variety of operating systems and programming languages. 4. Have strong community support and provide rich resources and solutions.

How to use single threaded redis Apr 10, 2025 pm 07:12 PM

Redis uses a single threaded architecture to provide high performance, simplicity, and consistency. It utilizes I/O multiplexing, event loops, non-blocking I/O, and shared memory to improve concurrency, but with limitations of concurrency limitations, single point of failure, and unsuitable for write-intensive workloads.

MySQL and SQL: Essential Skills for Developers Apr 10, 2025 am 09:30 AM

MySQL and SQL are essential skills for developers. 1.MySQL is an open source relational database management system, and SQL is the standard language used to manage and operate databases. 2.MySQL supports multiple storage engines through efficient data storage and retrieval functions, and SQL completes complex data operations through simple statements. 3. Examples of usage include basic queries and advanced queries, such as filtering and sorting by condition. 4. Common errors include syntax errors and performance issues, which can be optimized by checking SQL statements and using EXPLAIN commands. 5. Performance optimization techniques include using indexes, avoiding full table scanning, optimizing JOIN operations and improving code readability.

MySQL's Place: Databases and Programming Apr 13, 2025 am 12:18 AM

MySQL's position in databases and programming is very important. It is an open source relational database management system that is widely used in various application scenarios. 1) MySQL provides efficient data storage, organization and retrieval functions, supporting Web, mobile and enterprise-level systems. 2) It uses a client-server architecture, supports multiple storage engines and index optimization. 3) Basic usages include creating tables and inserting data, and advanced usages involve multi-table JOINs and complex queries. 4) Frequently asked questions such as SQL syntax errors and performance issues can be debugged through the EXPLAIN command and slow query log. 5) Performance optimization methods include rational use of indexes, optimized query and use of caches. Best practices include using transactions and PreparedStatemen

Monitor Redis Droplet with Redis Exporter Service Apr 10, 2025 pm 01:36 PM

Effective monitoring of Redis databases is critical to maintaining optimal performance, identifying potential bottlenecks, and ensuring overall system reliability. Redis Exporter Service is a powerful utility designed to monitor Redis databases using Prometheus. This tutorial will guide you through the complete setup and configuration of Redis Exporter Service, ensuring you seamlessly build monitoring solutions. By studying this tutorial, you will achieve fully operational monitoring settings

See all articles