Recently I have been doing a lot of database-related work, and my company's existing specifications are not very comprehensive. Drawing on the standards published by various experts online, I have compiled a set of usage conventions for my own use. Corrections are welcome.
Database environment
dev: Development environment
Developers can read and write data and modify table structures. They may change table structures and data freely, but must make sure this does not affect other developers.
test: Test environment
Developers can read and write data, and can modify table structures through tools.
online: Online environment
Developers are not allowed to operate on the database directly in the online environment. If an operation is required, it must be performed by the DBA and recorded. Stress testing is prohibited.
The key point is that the user permissions on the MySQL server in each environment must be clearly separated and identifiable, and must distinguish specific business scenarios.
Naming convention
Basic naming rules
Why use lower case for all database, table, and field names?
In MySQL, databases and tables correspond to directories and to files under those directories. The case sensitivity of the operating system's file system therefore determines the case sensitivity of database and table names.
Table naming
Tables in the same module should share a prefix where possible, and table names should be as descriptive as possible. All log tables start with log_.
Field naming
Index naming
Constraint naming
Table design specifications
The table engine depends on the actual application scenario. MyISAM is recommended for log and report tables, and InnoDB is recommended for tables related to transactions, audits, and amounts. Unless otherwise stated, use the InnoDB engine when creating tables.
Use the utf8mb4 character set by default, with utf8mb4_general_ci as the collation. (Because the database definition already sets these defaults, tables do not strictly need to declare them, but writing them out is recommended to be safe.)
Why not choose utf8 as the character set and utf8_general_ci as the collation?
MySQL's utf8 encoding stores at most 3 bytes per character, so it cannot save Emoji, which occupy 4 bytes. To let the back-end project fully support Emoji entered by the client, upgrading the encoding to utf8mb4 is the best solution. If the JDBC connection string sets characterEncoding to utf8, or emoji data still cannot be inserted after the configuration above, specify utf8mb4 as the connection character set in the code.
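The 3-byte limit can be checked outside the database. The following is a minimal Python sketch, not MySQL code: the helper names utf8_byte_length and fits_in_utf8mb3 are made up for illustration, and the check only models the byte-width rule described above.

```python
def utf8_byte_length(ch: str) -> int:
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 bytes.

    Assumption: this models only the byte-width limit of MySQL's legacy
    utf8 character set, nothing else about the server's behavior.
    """
    return all(utf8_byte_length(ch) <= 3 for ch in text)


if __name__ == "__main__":
    for sample in ["a", "中", "😀"]:
        print(repr(sample), utf8_byte_length(sample), "byte(s)")
    print(fits_in_utf8mb3("hello 中文"))
    print(fits_in_utf8mb3("hello 😀"))
```

A check like this can be useful in application code that still writes to a legacy utf8 column and needs to reject or strip 4-byte characters before inserting.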
All tables and fields should use the comment attribute to describe the real meaning of the table or field. If a field holds enumeration values, it is recommended to list all the values it uses in the comment.
Unless otherwise stated, the first field in the table must be id, and it must be the auto-increment primary key. Passing it as context or as a condition outside a transaction is prohibited. Using a varchar column as the primary key is prohibited.
Unless otherwise stated, the table must contain the create_time and modify_time fields, which record the creation time and the modification time.
Unless otherwise stated, the table must contain is_del, which indicates whether the row has been deleted. In principle, physically deleting database data is not allowed.
What is the difference between DATETIME and TIMESTAMP?
Similar points:
The display format of the TIMESTAMP column is the same as that of the DATETIME column. The display width is fixed at 19 characters, and the format is YYYY-MM-DD HH:MM:SS.
Difference:
TIMESTAMP
How to use TIMESTAMP's automatic assignment attribute?
1. Use the current time as the default value of ts: ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP.
2. Update the value of ts whenever the row is updated: ts TIMESTAMP DEFAULT 0 ON UPDATE CURRENT_TIMESTAMP.
3. Combine 1 and 2: ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP.
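How to use INT UNSIGNED to store IP addresses?
Use INT UNSIGNED instead of char(15) to store IPv4 addresses, and convert with the MySQL functions INET_ATON and INET_NTOA. For IPv6, MySQL 5.6.3 and later provide INET6_ATON and INET6_NTOA, whose binary result can be stored in a VARBINARY(16) column. On older versions there is no conversion function, and the address has to be stored using DECIMAL or two BIGINT columns.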
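To see what the conversion produces, here is a small Python sketch that mirrors the IPv4 conversion with the standard ipaddress module. The Python function names are chosen to echo the MySQL ones for readability; they are not a MySQL client API.

```python
import ipaddress


def inet_aton(addr: str) -> int:
    """Dotted-quad IPv4 string -> unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def inet_ntoa(value: int) -> str:
    """Unsigned 32-bit integer -> dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    n = inet_aton("192.168.0.1")
    print(n)             # 192*2**24 + 168*2**16 + 0*2**8 + 1
    print(inet_ntoa(n))
```

The largest address, 255.255.255.255, maps to 4294967295, which is exactly the upper bound of INT UNSIGNED. This is why a signed INT is not enough.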
What does M mean in INT[M]?
The number in brackets after an integer type only specifies the display width and has nothing to do with the storage range. Many people think that the value ranges of INT(4) and INT(10) are (-9999 to 9999) and (-9999999999 to 9999999999) respectively. This is wrong. The M value only has a visible effect when combined with the ZEROFILL attribute, where it makes the column values equal in width. Whatever the value of M in INT[M], the range is still -2147483648 to 2147483647 when signed and 0 to 4294967295 when unsigned.
The display width does not limit the range of values that can be stored in the column, nor does it prevent values wider than the specified width from being displayed. When used with the optional ZEROFILL attribute, the default padding with spaces is replaced by padding with zeros. For example, for a column declared as INT(5) ZEROFILL, the value 4 is retrieved as 00004. Note that if you store values wider than the display width in an integer column, MySQL may run into problems when generating temporary tables for complex joins, because in those cases it assumes the data fits in the original column width. If you specify ZEROFILL for a numeric column, MySQL automatically adds the UNSIGNED attribute to the column.
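The two ideas above, range set by storage size and width used only for padding, can be sketched in Python. This is an illustration under the assumption that INT occupies 4 bytes. The helper names int_range and zerofill are hypothetical, and zerofill only imitates the padding behavior described in the text.

```python
from typing import Tuple


def int_range(storage_bytes: int, unsigned: bool = False) -> Tuple[int, int]:
    """Value range of an integer column, derived from its storage size only."""
    bits = storage_bytes * 8
    if unsigned:
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def zerofill(value: int, display_width: int) -> str:
    """Left-pad a non-negative value with zeros up to the display width.

    Values wider than the display width are returned unchanged, which
    mirrors the point that M never truncates or limits the value.
    """
    return str(value).zfill(display_width)


if __name__ == "__main__":
    print(int_range(4))                 # INT, signed
    print(int_range(4, unsigned=True))  # INT UNSIGNED
    print(zerofill(4, 5))
    print(zerofill(123456, 5))
```

Note that int_range takes no display width at all: INT(4) and INT(10) go through exactly the same calculation.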
Use VARBINARY to store case-sensitive variable-length strings
When to use CHAR and when to use VARCHAR?
The CHAR and VARCHAR types are similar, but they are stored and retrieved differently. They also differ in maximum length and in whether trailing spaces are preserved. The length declared for a CHAR or VARCHAR column is the maximum number of characters you want to store. For example, CHAR(30) can hold up to 30 characters.
The length of a CHAR column is fixed to the length declared when the table is created, which can be any value from 0 to 255. When a CHAR value is stored, it is right-padded with spaces to the declared length. When a CHAR value is retrieved, trailing spaces are removed. No case conversion is performed during storage or retrieval.
The values in the VARCHAR column are variable-length strings. The length can be specified as a value between 0 and 65,535. (The maximum effective length of VARCHAR is determined by the maximum row size and the character set used. The overall maximum length is 65,532 bytes). Compared with CHAR, the VARCHAR value only saves the required number of characters, plus one byte to record the length (if the declared length of the column exceeds 255, two bytes are used). VARCHAR values are saved without padding. Trailing spaces are retained when the value is saved and retrieved, conforming to standard SQL.
char is suitable for storing the MD5 hash value of the user password, and its length is always the same. char is also better than varchar for values that change frequently, because fixed-length rows are less prone to fragmentation, and char is also more efficient than varchar for very short columns. The char(1) string will only occupy one byte for a single-byte character set, but the varchar(1) string will occupy 2 bytes, because 1 byte is used to store the length information.
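The byte counts quoted above can be reproduced with a simplified model. The Python sketch below assumes a single-byte character set and follows the rule as stated in this article (one length byte, or two when the declared length exceeds 255). Real storage also depends on the engine and row format, and the function names are hypothetical.

```python
def char_storage_bytes(declared_length: int) -> int:
    """CHAR(n) in a single-byte character set: always n bytes (space padded)."""
    return declared_length


def varchar_storage_bytes(value: str, declared_length: int) -> int:
    """VARCHAR(n) in a single-byte character set: data bytes plus length prefix.

    Simplified model based on the rule described in the text above.
    """
    if len(value) > declared_length:
        raise ValueError("value is longer than the declared column length")
    length_prefix = 1 if declared_length <= 255 else 2
    return len(value.encode("latin-1")) + length_prefix


if __name__ == "__main__":
    print(char_storage_bytes(1))              # CHAR(1)
    print(varchar_storage_bytes("a", 1))      # VARCHAR(1)
    print(char_storage_bytes(32))             # CHAR(32), e.g. an MD5 hex digest
    print(varchar_storage_bytes("a" * 32, 32))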
Index design specifications
MySQL query speed relies on good index design, so indexes are crucial for high performance. Reasonable indexes speed up queries, including UPDATE and DELETE, because MySQL loads the page containing the row into memory before performing the UPDATE or DELETE. Unreasonable indexes slow things down. A MySQL index lookup is similar to the pinyin and radical lookup in the Xinhua Dictionary: when no pinyin or radical index exists, the only option is to turn the pages one by one. When a query cannot use an index, MySQL performs a full table scan, which consumes a lot of IO. Indexes serve four purposes: deduplication, faster positioning, avoiding sorts, and covering indexes.
What is a covering index?
In the InnoDB storage engine, a secondary index (non-primary-key index) stores primary key values rather than row addresses. If a query needs a column that is not included in the secondary index, it first finds the primary key value through the secondary index and then fetches the other columns through the primary key, so two lookups are needed. A covering index means the query can be answered entirely from one index, which is more efficient. A primary key query is naturally covered. Creating indexes sensibly and writing queries to match them lets you benefit from covering indexes. For example, for SELECT email,uid FROM user_email WHERE uid=xx, if uid is not the primary key, you can create the index as index(uid,email) where appropriate to improve performance.
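The effect can be observed with a query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in, because it runs without a server. SQLite is not InnoDB, but its planner also distinguishes an index that covers the query from one that does not. The nickname column and the index name idx_uid_email are assumptions added for the demonstration. In MySQL the equivalent signal is "Using index" in the Extra column of EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_email ("
    "id INTEGER PRIMARY KEY, uid INTEGER, email TEXT, nickname TEXT)"
)
# Composite index matching the example in the text: index(uid, email).
conn.execute("CREATE INDEX idx_uid_email ON user_email (uid, email)")


def query_plan(sql: str) -> str:
    """Return the human-readable plan text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


# Every selected column lives in the index, so no table lookup is needed.
covered = query_plan("SELECT email, uid FROM user_email WHERE uid = 1")
# nickname is not in the index, so the row itself must be fetched as well.
not_covered = query_plan("SELECT nickname FROM user_email WHERE uid = 1")

if __name__ == "__main__":
    print(covered)
    print(not_covered)
```

Both queries use the same index to find the matching entries. Only the second one has to go back to the table for the missing column.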
Basic specifications for indexes
Why can't there be too many indexes in a table?
InnoDB secondary indexes are stored as B+ trees, so the trees have to be adjusted on every UPDATE, DELETE, and INSERT. Too many indexes slow down writes.
Use prefix indexes for strings. The prefix length should not exceed 8 characters. Give priority to prefix indexes, and if necessary add a pseudo column and index that instead.
Do not index blob/text or other large fields. Doing so makes the index take up too much storage space.
What is a prefix index?
Simply put, a prefix index indexes only the first few characters of the text (the number of characters specified when creating the index). The resulting index is smaller, so queries are faster. Prefix indexes effectively reduce the size of index files and improve indexing speed. They also have disadvantages: MySQL cannot use a prefix index for ORDER BY or GROUP BY, and a prefix index cannot serve as a covering index.
Syntax for establishing a prefix index: ALTER TABLE table_name ADD KEY(column_name(prefix_length));
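A common way to pick prefix_length is to compare how selective each candidate prefix is against the full column. The Python sketch below shows the calculation on an in-memory list. The sample addresses and the helper name prefix_selectivity are made up for illustration, and on a real table the same ratio would be computed with COUNT(DISTINCT LEFT(column_name, n)) / COUNT(*).

```python
from typing import Sequence


def prefix_selectivity(values: Sequence[str], prefix_length: int) -> float:
    """Distinct prefixes divided by total rows; 1.0 means every row is unique."""
    if not values:
        return 0.0
    distinct_prefixes = {value[:prefix_length] for value in values}
    return len(distinct_prefixes) / len(values)


# Hypothetical sample data standing in for a string column.
emails = [
    "alice@example.com",
    "alina@example.com",
    "bob@example.com",
    "bobby@example.com",
]

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, prefix_selectivity(emails, n))
```

The shortest prefix whose selectivity is close to that of the full column is usually a reasonable choice, since a longer prefix only makes the index larger.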
Primary key guidelines
Important SQL statements must be indexed, and core SQL statements should give priority to covering indexes.
Put the most selective fields first.
Index taboos
Limitations on indexes in MySQL
The total index key length in the MyISAM storage engine cannot exceed 1000 bytes.
BLOB and TEXT columns can only have prefix indexes.
MySQL versions before 8.0.13 do not support function indexes.
When using not equal to (!= or <>), MySQL usually cannot use the index.
After a function is applied to the filtered field (such as abs(column)), MySQL cannot use the index.
MySQL cannot use the index when the field types in a join condition are inconsistent.
When using LIKE, if the pattern starts with a wildcard (such as '%abc…'), MySQL cannot use the index.
When using non-equality queries, MySQL cannot use Hash indexes.
Statement design specifications
Use precompiled statements
Avoid implicit conversion
It will cause index failure
Make full use of prefix index
Do not use negative queries, such as not in/like
Avoid using stored procedures, triggers, UDFs, events, etc.
Avoid using JOIN on large tables
What MySQL is best at is the primary key/secondary index query of a single table
JOIN consumes more memory and generates temporary tables
Avoid mathematical operations in the database
Reduce the number of interactions with the database
Use paging sensibly
Limit the number of pages displayed by paging
Only allow previous page and next page navigation
Use delayed association
How to use paging correctly?
Suppose there is a paging statement like this: SELECT * FROM table ORDER BY id LIMIT 10000, 10. MySQL processes LIMIT with an offset by reading OFFSET + LIMIT rows, discarding the first OFFSET rows, and returning the remaining LIMIT rows. When the OFFSET value is large, query performance is therefore very poor. This can be solved with a condition of the form id > n, for example SELECT * FROM table WHERE id > 10000 ORDER BY id LIMIT 10.
The id > n approach has a limitation: when ids are not continuous, n cannot be derived from the page number. This can be solved by passing in the boundary id of the current page when turning the page.
// next page
http://example.com/page.php?last=100
select * from table where id<100 order by id desc limit 10
// previous page
http://example.com/page.php?first=110
select * from table where id>110 order by id asc limit 10
The previous-page query sorts in ascending order so that it returns the 10 rows closest to id 110. Reverse them before display.
The biggest disadvantage of this method is that if rows are inserted or deleted while the user is browsing, the pages already turned are not updated, while the total number of pages may still be calculated from a fresh count(*), so some records may end up unreachable. To fix this, you can additionally track the current page number and whether any insertions or deletions since the last page turn changed the total number of records, and cache that information.
Delayed association first locates the starting id through the index and then reads one page of rows from there:
select * from table where id >= (select id from table order by id limit #offset#, 1) order by id limit 10
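The id-based approach can be tried end to end with a small runnable sketch. It uses Python's built-in sqlite3 module in place of MySQL, since WHERE id > ? ORDER BY id LIMIT ? behaves the same way in both. The article table, its contents, and the function names are assumptions made for the demonstration. The ids deliberately have gaps to show that passing the last seen id still works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
# Only odd ids from 1 to 59, so the ids are not continuous.
conn.executemany(
    "INSERT INTO article (id, title) VALUES (?, ?)",
    [(i, "post %d" % i) for i in range(1, 60, 2)],
)


def next_page(last_id, page_size=10):
    """Keyset paging: seek past the last id seen, then read one page."""
    rows = conn.execute(
        "SELECT id FROM article WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()
    return [row[0] for row in rows]


def offset_page(offset, page_size=10):
    """Classic LIMIT/OFFSET paging, kept only for comparison."""
    rows = conn.execute(
        "SELECT id FROM article ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


first = next_page(0)
second = next_page(first[-1])  # pass the last id of the previous page

if __name__ == "__main__":
    print(first)
    print(second)
    print(second == offset_page(10))
```

Both functions return the same rows, but next_page lets the database seek directly to the boundary id through the primary key instead of reading and discarding the skipped rows.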
The EXPLAIN statement (executed in the MySQL client) shows how MySQL executes a SELECT statement. By running EXPLAIN on a SELECT statement, you can see whether MySQL uses indexes, full table scans, temporary tables, sorting, and so on when executing it. Try to avoid full table scans, temporary tables, and sorting. See the official documentation for details.
Use union all instead of union
What is the difference between union all and union?
The union and union all keywords both merge two result sets into one, but they are different in terms of usage and efficiency.
UNION filters out duplicate records after combining the tables, so it sorts the combined result set, deletes the duplicates, and then returns the result. For example:
select * from test_union1 union select * from test_union2
When this SQL runs, it first fetches the results of both tables, then sorts them in the sort area to remove duplicate records, and finally returns the result set. If the tables are large, the sort may spill to disk.
UNION ALL simply combines the two results and returns them. If the two result sets contain duplicate data, the returned result set contains the duplicates too.
In terms of efficiency, UNION ALL is much faster than UNION. So if you can confirm that the two result sets being merged contain no duplicate data, use UNION ALL, as follows:
select * from test_union1 union all select * from test_union2
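The difference in results is easy to verify. The sketch below runs both statements through Python's built-in sqlite3 module, used here as a stand-in because the deduplication semantics of UNION and UNION ALL are standard SQL. The single column v and the sample values are assumptions for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE test_union1 (v INTEGER);
    CREATE TABLE test_union2 (v INTEGER);
    INSERT INTO test_union1 VALUES (1), (2), (3);
    INSERT INTO test_union2 VALUES (2), (3), (4);
    """
)

# UNION removes the rows that appear in both tables.
union_rows = conn.execute(
    "SELECT v FROM test_union1 UNION SELECT v FROM test_union2"
).fetchall()

# UNION ALL just concatenates the two result sets.
union_all_rows = conn.execute(
    "SELECT v FROM test_union1 UNION ALL SELECT v FROM test_union2"
).fetchall()

if __name__ == "__main__":
    print(sorted(row[0] for row in union_rows))
    print(sorted(row[0] for row in union_all_rows))
```

The values 2 and 3 appear once with UNION and twice with UNION ALL. The extra deduplication work is what makes UNION slower on large inputs.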
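Table splitting specifications
Consider splitting a table when a single table is expected to exceed 5 million rows or 10 GB within one to two years. Plan ahead for migrating historical data or for having the application delete it. Either equal, balanced splitting or splitting by business rules is acceptable. The splitting strategy for any table to be split must be discussed with the DBA.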
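One possible reading of equal, balanced splitting is modulo routing on a numeric key. The Python sketch below illustrates that idea only. The function name shard_table, the user_order table, the suffix format, and the choice of 16 sub-tables are all hypothetical, and the real strategy should come out of the discussion with the DBA.

```python
from collections import Counter


def shard_table(base_name: str, key: int, shard_count: int) -> str:
    """Route a numeric key to one of shard_count equally sized sub-tables."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    index = key % shard_count
    # Zero-pad the suffix so table names sort and align consistently.
    width = len(str(shard_count - 1))
    return "%s_%0*d" % (base_name, width, index)


if __name__ == "__main__":
    print(shard_table("user_order", 23, 16))
    # Sequential keys spread evenly across all sub-tables.
    spread = Counter(shard_table("user_order", k, 16) for k in range(1600))
    print(len(spread), set(spread.values()))
```

Modulo routing keeps the sub-tables balanced for evenly distributed keys, but changing the number of sub-tables later requires moving data, which is one reason to plan the migration of historical data up front.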
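Behavior specifications
Other specifications
Storing log data in MySQL is not recommended. Prefer HBase or OceanBase. If it must be stored in MySQL, ask the DBA to evaluate storing it in compressed tables.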