How to design primary key in MySQL

May 23, 2020 am 11:21 AM
mysql

This article looks at MySQL primary keys from the perspective of "why": why a primary key is needed, why it should be short, and why an auto-increment ID is recommended. It then moves on to primary key generation schemes. The aim is to have more to say than CRUD the next time someone asks about MySQL.

1. Why a primary key is needed

Data records must be unique (first normal form)

Data needs to be related to other data through joins

The database's underlying index uses it to retrieve data

What follows is a long digression; feel free to skip straight to the next section.

“Information is that which eliminates random uncertainty” (Shannon). People understand and change the world by obtaining and distinguishing information from nature and society, which lets them tell different things apart. Data is a record of the attributes of objective things and is the concrete form that information takes. Processed data becomes information, and information must be digitized into data before it can be stored and transmitted. A database stores data records. Records are therefore (relatively) definite information, and being definite means being unique. This gives the first reason:

1. Data records need to be unique

The world consists of objective things and the relationships among them, and data models those things and relationships digitally. Beyond its descriptive value, the value of data lies in how it connects to other data. For those connections to be accurate, each record needs an identifier that other records can reference. In data storage terms, this is the second role of the primary key and the second reason for its existence:

2. Data needs to be associated

Data describes objective reality and is meaningless by itself. It only becomes meaningful once it is organized according to people's needs in a way that helps them understand things. So data has to be retrieved and organized. Hence the third role of the primary key:

3. The database's underlying index uses it to retrieve data

2. Why the primary key should not be too long

The issue here is length. What advantage does a shorter key have over a longer one? A shorter key takes up less space. But the disk space saved is insignificant compared with the total data volume, and we generally do not read the primary key column itself very often. So the real reason must be speed, and it has little to do with the row data. The natural conclusion is that it has to do with indexes and how they are read. So why does a long primary key hurt index performance?

The figure shows InnoDB's index structure. On the left is the clustered index, which locates data rows by primary key. On the right is a secondary index, which indexes a column's values and maps each value to the primary key of its row. A query through a secondary index proceeds as the figure shows: first the primary key is looked up in the secondary index tree, then the data row is looked up by that primary key in the clustered index. The leaf nodes of a secondary index store the primary key values themselves, not pointers to the rows. So the longer the primary key, the fewer index records fit in each page of a secondary index. With a limited buffer pool, fewer index entries stay in memory, more disk reads are needed, and performance drops.
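To make the effect concrete, here is a rough back-of-the-envelope sketch in Python. It assumes InnoDB's default 16 KB page and ignores per-record and per-page overhead, so the results are upper bounds rather than measurements. The function name and the sample column sizes are invented for illustration.

```python
PAGE_SIZE = 16 * 1024  # InnoDB's default page size, in bytes


def entries_per_page(indexed_col_bytes: int, pk_bytes: int) -> int:
    """Upper bound on secondary-index leaf entries per page.

    Each leaf entry holds the indexed column value plus the primary key
    value. Record headers and page overhead are ignored (assumption).
    """
    return PAGE_SIZE // (indexed_col_bytes + pk_bytes)


# Hypothetical secondary index on an 8-byte column.
bigint_pk = entries_per_page(8, 8)   # 8-byte integer primary key
string_pk = entries_per_page(8, 36)  # 36-byte string primary key (e.g. a UUID string)

print(bigint_pk, string_pk)  # 1024 372
```

Under these simplified assumptions a page holds 1024 entries with the 8-byte key but only 372 with the 36-byte key, so the same secondary index needs nearly three times as many pages.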

3. Why is it recommended to use auto-increment ID

InnoDB uses a clustered index: as the figure above shows, the data rows themselves are stored in the leaf nodes of the primary index (a B+Tree). This requires the rows within a leaf node (one memory page or disk page in size) to be stored in primary key order, so whenever a new row is inserted, MySQL places it in the appropriate node and position according to its primary key. If the page reaches its load factor (15/16 by default in InnoDB), a new page (node) is allocated.

If the table uses an auto-increment primary key, each new row is appended right after the last row of the current index node, and when a page is full a new page is allocated automatically. The result is a compact index structure that is filled approximately sequentially. Since existing data does not have to be moved on each insert, inserts are efficient and index maintenance adds little overhead, as shown on the left side of the figure below. Otherwise, since each inserted primary key value is approximately random, each new row has to go somewhere in the middle of an existing index page, and MySQL has to move data to make room for it at the right position, as shown on the right side of the figure below. This adds overhead: MySQL may need to flush the buffer frequently to maintain the index, which increases disk I/O, and the index structure often has to be reorganized.
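The difference can be illustrated with a toy model in Python. This is only a sketch: it keeps all keys in one sorted list and counts how many existing entries sit after each insert position, whereas InnoDB works on fixed-size pages and splits them. The function name is invented for illustration.

```python
import bisect
import random


def records_moved(keys):
    """Insert keys one at a time into a sorted list and count how many
    existing entries sit after each insert position (i.e. would shift)."""
    ordered = []
    moved = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        moved += len(ordered) - pos  # entries that must make room
        ordered.insert(pos, key)
    return moved


sequential = list(range(1, 1001))  # auto-increment style keys
shuffled = sequential[:]
random.Random(42).shuffle(shuffled)  # random-looking keys

print(records_moved(sequential))  # 0: every key is appended at the end
print(records_moved(shuffled))    # large: most inserts land in the middle
```

Monotonically increasing keys never displace anything in this model, while the same keys in random order displace existing entries on almost every insert.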

4. Business Key VS Logical Key

A business key (also called a natural key) uses an ID with business meaning as the key, for example using the order number as the primary key of the order table. A logical key (also called a surrogate key) has nothing to do with the business and is generated according to some rule, for example an auto-increment key.

Advantages of business keys

The key has business meaning and can be used directly as a search term in queries

No extra column or index space is needed

Some join operations can be avoided

Disadvantages of business keys

When the business changes, the primary key sometimes has to change too

Keys spanning multiple columns are harder to work with

Business keys are often longer and take up more space, which means more disk I/O

Data cannot be persisted before the key is known. Sometimes we want to add a record first and fill in the business key later

Designing a key generation scheme that is both easy to use and performant is harder

Advantages of logical keys

The key does not need to change when the business changes

Simple to operate and easy to manage

Logical keys are usually smaller and perform better

Uniqueness is easier to guarantee

Easier to optimize

Disadvantages of logical keys

The extra primary key column and its index require additional disk space

Additional I/O is required when inserting and updating data

More joins may be needed

Without a separate uniqueness constraint, duplicate business records can easily appear

Key values differ between the test and production environments, which makes troubleshooting harder

The key value is unrelated to the data, which does not conform to the three normal forms

The key cannot be used as a search term

It depends on the specific implementation of each database system, which makes it harder to replace the underlying database
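One common way to get the benefits of a logical key without allowing duplicate business records is to keep the business key as a separate column with a unique constraint. The following is a minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical (surrogate) primary key plus a UNIQUE business key column.
conn.execute(
    """
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,  -- logical key
        order_no TEXT    NOT NULL UNIQUE,            -- business key
        amount   INTEGER NOT NULL
    )
    """
)

conn.execute("INSERT INTO orders (order_no, amount) VALUES (?, ?)", ("SO-0001", 100))

# A second row with the same business key is rejected by the constraint.
try:
    conn.execute("INSERT INTO orders (order_no, amount) VALUES (?, ?)", ("SO-0001", 250))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```

The cost is the extra index on the business key column, which is exactly the space and I/O overhead listed among the disadvantages above.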

5. Primary key generation

Normally we use MySQL's auto-increment ID as a table's primary key. It is simple and, as shown above, also performs best. But once data is sharded across multiple databases and tables, a per-table auto-increment ID no longer meets the need, because IDs must be unique across shards. It is worth looking at how different databases generate IDs, and at some distributed ID generation schemes, as a basis for thinking about, or even implementing, your own distributed ID generation service.

Database implementations

MySQL auto-increment

MySQL maintains an auto-increment counter in memory. Each time the counter is accessed, InnoDB takes a lock called AUTO-INC and holds it until the end of the statement (note: only until the end of the statement, not the end of the transaction). The AUTO-INC lock is a special table-level lock; releasing it at statement end rather than transaction end improves the concurrency of inserts into tables with an auto_increment column. Depending on the innodb_autoinc_lock_mode setting, simple inserts may use a lighter-weight mutex instead of this table-level lock.

In a distributed setup, ID generation can be handed to a separate service with its own database, which still relies on MySQL's auto-increment capability to issue IDs centrally to the services that call it. For performance, each business can use its own table.
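The idea of a dedicated ID service with one auto-increment table per business can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module in place of a real MySQL server, and the class name, table naming scheme, and business names are all hypothetical.

```python
import sqlite3


class TicketServer:
    """Hands out IDs from one auto-increment table per business."""

    def __init__(self):
        # Stand-in for the ID service's own dedicated database.
        self.conn = sqlite3.connect(":memory:")

    def next_id(self, business: str) -> int:
        # Table names cannot be bound as parameters, so validate first.
        if not business.isidentifier():
            raise ValueError(f"invalid business name: {business!r}")
        table = f"ticket_{business}"
        self.conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT)"
        )
        cursor = self.conn.execute(f"INSERT INTO {table} DEFAULT VALUES")
        self.conn.commit()
        return cursor.lastrowid  # the value the database just generated


server = TicketServer()
print(server.next_id("orders"))  # 1
print(server.next_id("orders"))  # 2
print(server.next_id("users"))   # 1 (separate table, separate sequence)
```

Every ID costs one database write in this sketch, which is why such a service becomes a bottleneck and a single point of failure unless it is optimized further.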

MongoDB ObjectId

To prevent primary key conflicts, MongoDB uses an ObjectId as the primary key. It is a 12-byte value, usually written as 24 hexadecimal characters, made up of the following parts:

Time: timestamp. 4 bytes. Second precision.

Machine: machine identifier. 3 bytes. Usually a hash of the machine's host name, so that different hosts get different values and do not conflict in a distributed deployment, while the same machine always gets the same value.

PID: process ID. 2 bytes. The Machine field keeps ObjectIds generated on different machines from conflicting; the PID keeps ObjectIds generated by different MongoDB processes on the same machine from conflicting.

INC: auto-incrementing counter. 3 bytes. The first nine bytes ensure that ObjectIds generated in the same second by different processes on different machines do not conflict. The counter ensures that ObjectIds generated within the same second by the same process do not conflict: 3 bytes allow 256^3 = 16,777,216 unique values. (Recent MongoDB versions replace the Machine and PID fields with a single 5-byte random value generated once per process; the 12-byte total is unchanged.)
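The four-part layout described above can be assembled by hand to see how the 12 bytes fit together. The sketch below follows that description only and is not MongoDB's or any driver's actual code: the use of SHA-256 for the host name hash, the random starting counter, and the function name are assumptions for illustration.

```python
import hashlib
import os
import socket
import struct
import time

# Counter shared by all IDs in this process; random start is an assumption.
_counter = int.from_bytes(os.urandom(3), "big")


def make_object_id(now=None, hostname=None, pid=None) -> bytes:
    """Build a 12-byte ID: 4-byte time + 3-byte machine + 2-byte pid + 3-byte counter."""
    global _counter
    seconds = int(time.time() if now is None else now)
    host = (socket.gethostname() if hostname is None else hostname).encode()
    machine = hashlib.sha256(host).digest()[:3]      # 3-byte host name hash
    process = (os.getpid() if pid is None else pid) & 0xFFFF
    _counter = (_counter + 1) & 0xFFFFFF             # wrap at 256**3
    return (
        struct.pack(">I", seconds)                   # big-endian, sorts by time
        + machine
        + struct.pack(">H", process)
        + _counter.to_bytes(3, "big")
    )


oid = make_object_id()
print(len(oid), oid.hex())  # 12 bytes, shown as 24 hex characters
```

Because the timestamp comes first and is big-endian, IDs generated later compare greater than earlier ones at second granularity, which keeps inserts roughly sequential.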

Cassandra TimeUUID

Cassandra builds a unique ID (a version 1 UUID) from the following parts: time + MAC address + sequence
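Python's standard uuid module produces the same kind of time-based (version 1) UUID, so it can be used to look at the three parts. This is not Cassandra's implementation, only an illustration of the format; the helper name and the fixed node and clock sequence values are made up for the example.

```python
import uuid

# 100-nanosecond intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
GREGORIAN_TO_UNIX_100NS = 0x01B21DD213814000


def describe_time_uuid(u: uuid.UUID) -> dict:
    """Split a version 1 UUID into its time, sequence, and node parts."""
    return {
        "unix_seconds": (u.time - GREGORIAN_TO_UNIX_100NS) / 1e7,
        "clock_seq": u.clock_seq,   # sequence that guards against clock resets
        "node": f"{u.node:012x}",   # 48-bit node ID, normally the MAC address
    }


# Fixed node and clock_seq (hypothetical values) so the output is readable.
example = uuid.uuid1(node=0x001122334455, clock_seq=42)
info = describe_time_uuid(example)
print(example.version, info["node"], info["clock_seq"])  # 1 001122334455 42
```

Without explicit arguments, uuid.uuid1() fills in the host's hardware address and a random clock sequence itself.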

Schemes

ZooKeeper auto-increment: implemented with ZooKeeper's incrementing sequence mechanism (for example, sequential znodes).

Redis auto-increment: implemented with Redis's atomic increment commands (such as INCR).

UUID: use a UUID string as the key.

Snowflake algorithm: similar in spirit to MongoDB's approach. The ID consists of 1 sign bit, a 41-bit timestamp (millisecond precision), a 10-bit machine ID, and a 12-bit sequence number within the millisecond.
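The snowflake bit layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a production generator: the custom epoch, the class name, and the injectable clock parameter are assumptions, and a clock that moves backwards is simply treated as an error.

```python
import threading
import time

DEFAULT_EPOCH_MS = 1577836800000  # 2020-01-01 00:00:00 UTC (arbitrary choice)


class Snowflake:
    """64-bit IDs: 1 sign bit | 41-bit ms timestamp | 10-bit machine | 12-bit sequence."""

    def __init__(self, machine_id, epoch_ms=DEFAULT_EPOCH_MS, clock=None):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return ((now - self.epoch_ms) << 22) | (self.machine_id << 12) | self._sequence


generator = Snowflake(machine_id=5)
first, second = generator.next_id(), generator.next_id()
print(first < second, (first >> 12) & 0x3FF)  # True 5
```

Since the timestamp occupies the high bits, IDs from one machine are strictly increasing and IDs across machines are roughly time-ordered, which suits a clustered index, while the value still fits in a signed 64-bit integer column.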

Open source implementations

Baidu UidGenerator: based on the snowflake algorithm.

Meituan Leaf: implements both an (optimized) MySQL auto-increment based mode and a snowflake based mode.
