Today, database operations have increasingly become the performance bottleneck of the entire application, especially for Web applications. Regarding the performance of the database, this is not only something that DBAs need to worry about, but it is also something that our programmers need to pay attention to
Today, database operations have increasingly become the performance bottleneck of the entire application. This is especially true for Web applications. Regarding the performance of the database, this is not just something that DBAs need to worry about, but this is something that we programmers need to pay attention to. When we design the database table structure, we need to pay attention to the performance of data operations when operating the database (especially the SQL statements when looking up tables). Here, we will not talk too much about the optimization of SQL statements, but only focus on MySQL, the database with the most Web applications.
Mysql'sPerformance Optimization cannot be achieved overnight. You must take it step by step and optimize from all aspects, and the final performance will be greatly improved.
Optimization technology of Mysql database
Optimization of mysql is a comprehensive technology , mainly including
•Rationalization of table design (in line with 3NF)
•Add appropriate index (index) [four types: ordinary index, primary key index, unique index unique , Full text index]
•Table splitting technology (horizontal split, vertical split)
•Read and write [write: update/delete/add] separation
•Stored Procedure [Modular programming, can improve speed]
•Optimize mysql configuration [Configure the maximum number of concurrency my.ini, adjust the cache size]
•mysql server hardware upgrade
•Regularly clear unnecessary data and perform regular defragmentation (MyISAM)
Database optimization work
For a data-centric application, the quality of the database directly affects the performance of the program, so database performance is crucial. Generally speaking, to ensure the efficiency of the database, the following four aspects must be done:
① Database design
② SQL statement optimization
③ Database parameter configuration
④ Appropriate hardware resources and operating system
In addition, using appropriate stored procedures can also improve performance.
This order also shows the impact of these four tasks on performance
Database table design
A common understanding of the three paradigms is of great benefit to database design. In database design, in order to better apply the three paradigms, it is necessary to understand the three paradigms in a popular way (the popular understanding is a sufficient understanding, not the most scientific and accurate understanding):
First normal form: 1NF is an atomicity constraint on attributes, which requires attributes (columns) to be atomic and cannot be decomposed; (as long as it is a
, it satisfies 1NF)
Second Normal Form: 2NF is a unique constraint on records, requiring records to have unique identifiers, that is, the uniqueness of entities;
Third Normal Form: 3NF is a restriction on the uniqueness of records. Field redundancy constraints, which require fields to be non-redundant. No redundant database design can do that.
However, a database without redundancy may not be the best database. Sometimes in order to improve operating efficiency, it is necessary to lower the paradigm standard and appropriately retain redundant data. The specific approach is: abide by the third paradigm when designing the conceptual data model, and put the work of lowering the paradigm standard into consideration when designing the physical data model. Lowering the normal form means adding fields and allowing for redundancy.
Relational database: mysql/oracle/db2/informix/sysbase/sql server
Non-relational database: (Characteristics:
or collection)
NoSql database: MongoDB (Characteristics are document-oriented)
Give an example of what is moderate redundancy, or justified redundancy!
The above is inappropriate redundancy, the reason is:
Here, in order to improve the retrieval efficiency of student activity records, the unit name is redundantly added to the student activity record table. There are 500 records of unit information, and student activity records have about 2 million data volumes in one year. If the student activity record table does not redundant the unit name field, it only contains three int fields and a timestamp field, occupying only 16 bytes, and is a very small table. After adding a varchar(32) field redundantly, the size is three times that of the original value, and the retrieval process also requires so much more I/O. Moreover, the number of records is very different, 500 VS 2,000,000, resulting in the need to update 4,000 redundant records when updating a unit name. It can be seen that this redundancy is simply counterproductive.
Price in the order table is a redundant field, because we can calculate the price of this order from the order details table, but this redundancy is reasonable and can also improve query performance.
A conclusion can be drawn from the above two examples:
1---n redundancy should occur on the 1 side.
SQL statement optimization
General steps for SQL optimization
1. Pass The show status command understands the execution frequency of various SQLs.
2. Locate SQL statements with low execution efficiency - (emphasis on select)
3. Analyze inefficient SQL statements through explain
4. Determine the problem and take appropriate optimization measures
-- select语句分类 Select Dml数据操作语言(insert update delete) dtl 数据事物语言(commit rollback savepoint) Ddl数据定义语言(create alter drop..) Dcl(数据控制语言) grant revoke -- Show status 常用命令 --查询本次会话 Show session status like 'com_%'; //show session status like 'Com_select' --查询全局 Show global status like 'com_%'; -- 给某个用户授权 grant all privileges on *.* to 'abc'@'%'; --为什么这样授权 'abc'表示用户名 '@' 表示host, 查看一下mysql->user表就知道了 --回收权限 revoke all on *.* from 'abc'@'%'; --刷新权限[也可以不写] flush privileges;
SQL statement optimization-show parameters
MySQL client connection successful Afterwards, server status information can be provided by using the show [session|global] status command. The session represents the statistical results of the current connection, and global represents the statistical results since the database was last started. The default is session level.
The following example:
show status like 'Com_%';
where Com_XXX represents the number of times the XXX statement has been executed.
Important note: Com_select, Com_insert, Com_update, Com_delete. Through these parameters, you can easily understand whether the current database application is mainly based on insert and update operations or query operations, as well as various types of What is the approximate execution ratio of SQL.
There are also several commonly used parameters to help users understand the basic situation of the database.
Connections: The number of attempts to connect to the MySQL server
Uptime: The time the server works (in seconds)
Slow_queries: Slow queries Number of times (default is slow query time 10s)
show status like 'Connections' show status like 'Uptime' show status like 'Slow_queries'
How to query the slow query time of mysql
Show variables like 'long_query_time';
Modify the slow query time of mysql
set long_query_time=2
SQL statement Optimization - Locating slow queries
The problem is: How to quickly locate slow execution statements in a large project. (Locating slow queries)
First of all, we understand how to query some running status of mysql database (for example, if you want to know the current running time of mysql/how many times it has been executed in total
select/update/delete../current connection)
In order to facilitate testing, we build a large table (4 million)-> Use stored procedures Build
By default, mysql thinks 10 seconds is A slow query.
Modify the slow query of mysql.
show variables like 'long_query_time' ; //可以显示当前慢查询时间 set long_query_time=1 ;//可以修改慢查询时间
Build a large table->There are requirements for the records in the large table. It is only useful if the records are different, otherwise the test effect will be different from the real one The difference is huge. Create:
CREATE TABLE dept( /*部门表*/ deptno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0, /*编号*/ dname VARCHAR(20) NOT NULL DEFAULT "", /*名称*/ loc VARCHAR(13) NOT NULL DEFAULT "" /*地点*/ ) ENGINE=MyISAM DEFAULT CHARSET=utf8 ; CREATE TABLE emp (empno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0, /*编号*/ ename VARCHAR(20) NOT NULL DEFAULT "", /*名字*/ job VARCHAR(9) NOT NULL DEFAULT "",/*工作*/ mgr MEDIUMINT UNSIGNED NOT NULL DEFAULT 0,/*上级编号*/ hiredate DATE NOT NULL,/*入职时间*/ sal DECIMAL(7,2) NOT NULL,/*薪水*/ comm DECIMAL(7,2) NOT NULL,/*红利*/ deptno MEDIUMINT UNSIGNED NOT NULL DEFAULT 0 /*部门编号*/ )ENGINE=MyISAM DEFAULT CHARSET=utf8 ; CREATE TABLE salgrade ( grade MEDIUMINT UNSIGNED NOT NULL DEFAULT 0, losal DECIMAL(17,2) NOT NULL, hisal DECIMAL(17,2) NOT NULL )ENGINE=MyISAM DEFAULT CHARSET=utf8;
Test data
INSERT INTO salgrade VALUES (1,700,1200); INSERT INTO salgrade VALUES (2,1201,1400); INSERT INTO salgrade VALUES (3,1401,2000); INSERT INTO salgrade VALUES (4,2001,3000); INSERT INTO salgrade VALUES (5,3001,9999);
In order for the stored procedure to execute normally, we need to modify the command execution end character delimiter $$
Create Function, this function will return a random string of specified length
create function rand_string(n INT) returns varchar(255) #该函数会返回一个字符串 begin #chars_str定义一个变量 chars_str,类型是 varchar(100),默认值'abcdefghijklmnopqrstuvwxyzABCDEFJHIJKLMNOPQRSTUVWXYZ'; declare chars_str varchar(100) default 'abcdefghijklmnopqrstuvwxyzABCDEFJHIJKLMNOPQRSTUVWXYZ'; declare return_str varchar(255) default ''; declare i int default 0; while i < n do set return_str =concat(return_str,substring(chars_str,floor(1+rand()*52),1)); set i = i + 1; end while; return return_str; end
Create a stored procedure
create procedure insert_emp(in start int(10),in max_num int(10)) begin declare i int default 0; #set autocommit =0 把autocommit设置成0 set autocommit = 0; repeat set i = i + 1; insert into emp values ((start+i) ,rand_string(6),'SALESMAN',0001,curdate(),2000,400,rand()); until i = max_num end repeat; commit; end #调用刚刚写好的函数, 1800000条记录,从100001号开始 call insert_emp(100001,4000000);
At this time, if we have a statement that takes more than 1 second to execute, it will be counted.
If you record the slow query sql into one of our logs
By default, the lower version of mysql will not record the slow query, you need to start mysql At this time, you can specify the slow query record
bin\mysqld.exe - -safe-mode - -slow-query-log [mysql5.5 can be specified in my.ini]
bin\mysqld.exe –log-slow-queries=d:/abc.log [The lower version mysql5.0 can be specified in my.ini]
The slow query log will be In the data directory [in mysql5.0 version, it is placed under the mysql installation directory/data/]. Under mysql5.5.19, you need to check the datadir="C of
my.ini: /Documents and Settings/All Users/Application Data/MySQL/MySQL Server 5.5/Data/" to determine.
In mysql5.6, the default is to start record slow query, my.ini The directory is: C:\ProgramData\MySQL\MySQL Server 5.6, which has a configuration item
slow-query-log=1
to start for mysql5.5 There are two methods for slow query
bin\mysqld.exe - -safe-mode - -slow-query-log
It can also be configured in the my.ini file :
[mysqld] # The TCP/IP Port the MySQL Server will listen on port=3306 slow-query-log
Locate SQL statements with low execution efficiency through slow query logs. The slow query log records all SQL statements whose execution time exceeds the setting of long_query_time.
show variables like 'long_query_time'; set long_query_time=2;
Add data to dept table
desc dept; ALTER table dept add id int PRIMARY key auto_increment; CREATE PRIMARY KEY on dept(id); create INDEX idx_dptno_dptname on dept(deptno,dname); INSERT into dept(deptno,dname,loc) values(1,'研发部','康和盛大厦5楼501'); INSERT into dept(deptno,dname,loc) values(2,'产品部','康和盛大厦5楼502'); INSERT into dept(deptno,dname,loc) values(3,'财务部','康和盛大厦5楼503');UPDATE emp set deptno=1 where empno=100002;
****Test statement***[The record of emp table can be 3600000, the effect is obviously slow]
select * from emp where empno=(select empno from emp where ename='研发部')
If you bring order by e.empno, the speed will be even slower, sometimes more than 1 minute.
Test statement
select * from emp e,dept d where e.empno=100002 and e.deptno=d.deptno;
View the slow query log: the default is in the data directory data host-name-slow.log. Lower versions of mysql need to be configured by using -log-slow-queries[=file_name] when opening mysql
##SQL statement optimization-explain analysis problem
Explain select * from emp where ename=“wsrcla”
possible_keys:表示查询时,可能使用的索引
key:表示实际使用的索引
key_len:索引字段的长度
rows:扫描出的行数(估算的行数)
Extra:执行情况的描述和说明
explain select * from emp where ename='JKLOIP'
如果要测试Extra的filesort可以对上面的语句修改
explain select * from emp order by ename\G
EXPLAIN详解
id
SELECT识别符。这是SELECT的查询序列号
id 示例
SELECT * FROM emp WHERE empno = 1 and ename = (SELECT ename FROM emp WHERE empno = 100001) \G;
select_type
PRIMARY :子查询中最外层查询
SUBQUERY : 子查询内层第一个SELECT,结果不依赖于外部查询
DEPENDENT SUBQUERY:子查询内层第一个SELECT,依赖于外部查询
UNION :UNION语句中第二个SELECT开始后面所有SELECT,
SIMPLE
UNION RESULT UNION 中合并结果
Table
显示这一步所访问数据库中表名称
Type
对表访问方式
ALL:
SELECT * FROM emp \G
完整的表扫描 通常不好
SELECT * FROM (SELECT * FROM emp WHERE empno = 1) a ;
system:表仅有一行(=系统表)。这是const联接类型的一个特
const:表最多有一个匹配行
Possible_keys
该查询可以利用的索引,如果没有任何索引显示 null
Key
Mysql 从 Possible_keys 所选择使用索引
Rows
估算出结果集行数
Extra
查询细节信息
No tables :Query语句中使用FROM DUAL 或不含任何FROM子句
Using filesort :当Query中包含 ORDER BY 操作,而且无法利用索引完成排序,
Impossible WHERE noticed after reading const tables: MYSQL Query Optimizer
通过收集统计信息不可能存在结果
Using temporary:某些操作必须使用临时表,常见 GROUP BY ; ORDER BY
Using where:不用读取表中所有信息,仅通过索引就可以获取所需数据;
The above is the detailed content of Do you know about Mysql database performance optimization?. For more information, please follow other related articles on the PHP Chinese website!