Key Points You Must Know When Creating Tables in MySQL

For backend developers, working with databases is essential.

Core user data is usually stored securely in databases such as MySQL or Oracle.

Daily work often involves creating databases and tables to meet business requirements, and tables are created far more often than databases.

This article focuses on table creation, because overlooking key details can lead to costly maintenance problems after deployment.

Poor database design can also make your API respond slowly under high concurrency. The following image shows the performance test results of an API measured with the EchoAPI tool.

[Image: API performance test results in EchoAPI]

Today, let us go through 18 tips for creating tables in a database.

Many of the details in this article come from my own experience and the challenges I ran into at work, and I hope they will be helpful to you.

1. Naming

When creating tables, fields, and indexes, giving them good names matters a great deal.

1.1 Meaningful Names

Names are the face of tables, fields, and indexes, and they make the first impression.

Good names are concise and self-explanatory, which makes communication and maintenance easier.

Bad names are ambiguous and confusing, which leads to chaos and frustration.

Bad examples:

Field names like abc, abc_name, name, user_name_123456789 will leave you confused.

Good example:

Field name as user_name.

A small reminder: names should not be too long either, ideally no more than 30 characters.

1.2 Case

It is best to use lowercase letters for names, as they are easier on the eyes.

Bad examples:

Field names like PRODUCT_NAME or PRODUCT_name are not intuitive. Mixing uppercase and lowercase is harder to read.

Good example:

Field name as product_name, which is more comfortable to read.

1.3 Separators

Names often contain several words to make them easier to understand.

Which separator should be used between the words?

Bad examples:

Field names like productname, productName, product name, or product@name are not recommended.

Good example:

Field name as product_name.

Using an underscore _ between words is strongly recommended.

1.4 Table Names

For table names, use meaningful, concise names together with a business prefix.

For order-related tables, prefix the table name with order_, for example order_pay or order_pay_detail.

For product-related tables, prefix it with product_, for example product_spu or product_sku.

This practice makes it easy to group tables that belong to the same business domain.

In addition, if a non-order business also needs a table named pay, it can be named finance_pay, which avoids naming conflicts.

1.5 Field Names

Field names allow the most freedom, which also makes them the easiest to get confused.

For example, using flag to represent state in one table and status in another leads to inconsistency.

It is advisable to standardize on status to represent state.

When a table references the primary key of another table, append _id or _sys_no to the field name, for example product_spu_id or product_spu_sys_no.

Also standardize the creation time as create_time, the modification time as update_time, and the deletion status as delete_status.

Other common fields should likewise follow one naming convention across tables.

1.6 Index Names

A database has several kinds of indexes, including primary keys, regular indexes, unique indexes, and composite indexes.

A table generally has a single primary key, usually named id or sys_no.

Regular and composite indexes can use the prefix ix_, for example ix_product_status.

Unique indexes can use the prefix ux_, for example ux_product_code.
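
As a rough sketch, the naming conventions from this section could come together as follows. The table and its columns are hypothetical and only illustrate the prefixes and standard field names described above.

```sql
-- Hypothetical table; only the naming pattern matters here.
CREATE TABLE order_pay (
  id            BIGINT      NOT NULL AUTO_INCREMENT,
  order_id      BIGINT      NOT NULL,            -- references another table's key: suffix _id
  pay_code      VARCHAR(32) NOT NULL,
  status        TINYINT     NOT NULL DEFAULT 0,  -- always "status", never "flag"
  create_time   DATETIME    NOT NULL DEFAULT CURRENT_TIMESTAMP,
  update_time   DATETIME    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  delete_status TINYINT     NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY ux_pay_code (pay_code),             -- unique index: ux_ prefix
  KEY ix_order_id_status (order_id, status)      -- regular or composite index: ix_ prefix
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```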

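2. Field Types

When designing tables, there is plenty of freedom in choosing field types.

Time fields can be date, datetime, timestamp, and so on.

Character types include varchar, char, text, and so on.

Numeric types include int, bigint, smallint, and tinyint.

Choosing an appropriate field type is crucial.

Overestimating the type (for example, using bigint for a field that only stores values from 1 to 10) wastes space; tinyint is sufficient.

Conversely, underestimating it (for example, using int for an 18-digit ID) causes inserts to fail, because a signed int only holds values up to 2147483647.

Here are some principles for choosing field types:

Prefer the smallest type that satisfies normal business needs, working from small to large.
Use char for strings of fixed or near-fixed length, and varchar for variable lengths.
Use bit for boolean fields.
Use tinyint for enumeration fields.
Use bigint for primary key fields.
Use decimal for monetary fields.
Use timestamp or datetime for time fields.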
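
The principles above might translate into a definition like the one below. This is a minimal sketch: the table name, column names, and lengths are assumptions chosen for illustration, not taken from a real schema.

```sql
-- Hypothetical table applying the type-selection principles.
CREATE TABLE product_sku_demo (
  id           BIGINT        NOT NULL AUTO_INCREMENT PRIMARY KEY, -- bigint for the primary key
  country_code CHAR(2)       NOT NULL,                -- fixed-length string: char
  name         VARCHAR(100)  NOT NULL,                -- variable-length string: varchar
  is_on_sale   BIT(1)        NOT NULL DEFAULT b'0',   -- boolean: bit
  status       TINYINT       NOT NULL DEFAULT 0,      -- small enumeration: tinyint
  price        DECIMAL(10,2) NOT NULL DEFAULT 0.00,   -- money: decimal
  create_time  DATETIME      NOT NULL DEFAULT CURRENT_TIMESTAMP   -- time: datetime
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```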

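3. Field Lengths

After naming the fields and choosing appropriate types, attention should turn to field lengths, such as varchar(20) or bigint(20).

What does the length of a varchar denote: bytes or characters?

Answer: in MySQL, the length of varchar and char counts characters, not bytes. For integer types, the number in parentheses is only a display width.

For example, bigint(4) specifies the display width, not the storage size, which is still 8 bytes.

If the zerofill attribute is set, numbers with fewer than 4 digits are left-padded with zeros for display, but the underlying storage is still 8 bytes.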
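
Both points can be tried out with a short sketch like the following. It assumes a client that sends UTF-8 text, and the table name width_demo is made up for the example. Note that integer display widths and ZEROFILL are deprecated in recent MySQL 8.0 releases, so the CREATE TABLE statement may produce warnings there.

```sql
-- CHAR_LENGTH counts characters, LENGTH counts bytes.
-- With a UTF-8 client this should report 3 characters and 9 bytes.
SELECT CHAR_LENGTH(_utf8mb4'数据库') AS chars,
       LENGTH(_utf8mb4'数据库')      AS bytes;

-- Display width only affects padding, not the stored range.
CREATE TABLE width_demo (n BIGINT(4) ZEROFILL NOT NULL);
INSERT INTO width_demo VALUES (7), (123456);
SELECT n FROM width_demo;  -- 7 is displayed as 0007; 123456 is shown in full
```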

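4. Number of Fields

When designing a table, it is important to limit the number of fields.

I have seen tables with dozens or even hundreds of fields, which makes each row large and queries slow.

If this happens, consider splitting the large table into smaller tables that share the same primary key.

As a rule of thumb, keep the number of fields per table under 20.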
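
A split along these lines could look like the sketch below, where frequently read columns stay in the main table and bulky, rarely read columns move to a companion table keyed by the same ID. The table name product_spu_detail and all columns are hypothetical.

```sql
-- Frequently queried columns stay in the main table.
CREATE TABLE product_spu (
  id     BIGINT       NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name   VARCHAR(100) NOT NULL,
  status TINYINT      NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Rarely queried, bulky columns move to a companion table.
CREATE TABLE product_spu_detail (
  product_spu_id BIGINT        NOT NULL PRIMARY KEY,  -- same value as product_spu.id
  description    VARCHAR(2000) NOT NULL DEFAULT '',
  packing_list   VARCHAR(500)  NOT NULL DEFAULT ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Join only when the detail columns are actually needed.
SELECT s.name, d.description
FROM product_spu AS s
JOIN product_spu_detail AS d ON d.product_spu_id = s.id
WHERE s.id = 1;
```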

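5. Primary Keys

Create a primary key when you create the table.

A primary key comes with a primary key index, and queries on it are more efficient because they need no extra lookup back to the table.

In a single database, the primary key can use AUTO_INCREMENT to grow automatically.

For distributed databases, especially sharded architectures, it is better to use an external algorithm such as Snowflake to generate globally unique IDs.

In addition, keep primary keys independent of business values to reduce coupling and ease future extension.

However, for one-to-one relationships, such as a user table and a user extension table, reusing the user table's primary key directly is acceptable.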
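
The two key strategies can be contrasted in a small sketch. The tables product_brand and order_main are hypothetical, and the ID generator for the sharded case is assumed to live in the application, outside MySQL.

```sql
-- Single database: MySQL generates the key; the business code is a
-- separate unique column rather than the primary key.
CREATE TABLE product_brand (
  id   BIGINT      NOT NULL AUTO_INCREMENT,
  code VARCHAR(20) NOT NULL,
  name VARCHAR(50) NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY ux_code (code)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Sharded setup: no AUTO_INCREMENT; the application supplies a globally
-- unique ID (for example from a Snowflake-style generator) on insert.
CREATE TABLE order_main (
  id      BIGINT  NOT NULL,
  user_id BIGINT  NOT NULL,
  status  TINYINT NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```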

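6. Storage Engines

Before MySQL 5.5, the default storage engine was MyISAM; since MySQL 5.5 it has been InnoDB.

Historically, there was a lot of debate about which storage engine to choose.

MyISAM stores indexes and data separately, which helped read performance, but it lacks support for transactions and foreign keys.

InnoDB was somewhat slower for reads, but it supports transactions and foreign keys, which makes it more robust.

The earlier advice was to use MyISAM for read-heavy workloads and InnoDB for write-heavy workloads.

However, optimizations in MySQL have narrowed the performance gap, so in MySQL 8 and later it is recommended to simply use the default InnoDB storage engine without any changes.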
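
To check which engine a server and its tables actually use, statements like these can help. The table name legacy_log in the last statement is a placeholder for an existing MyISAM table.

```sql
-- List available engines; the DEFAULT row marks the server default.
SHOW ENGINES;

-- Show the engine of every table in the current database.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();

-- Convert a table to InnoDB (this rebuilds the table).
ALTER TABLE legacy_log ENGINE = InnoDB;
```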

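7. NOT NULL

When creating a field, decide whether it may be NULL.

It is recommended to define fields as NOT NULL whenever possible.

Why?

In InnoDB, nullable columns need extra space to record NULLs, and NULLs complicate index use and query logic.

NULL values can only be matched with IS NULL or IS NOT NULL, because a comparison with = evaluates to NULL rather than true, so it never matches a row.

Therefore, define fields as NOT NULL wherever feasible.

However, if a field is defined as NOT NULL without a default and a value is omitted on insert, the insert fails.

This typically happens when a new field is added and the migration script runs before the new code is deployed: the old code does not supply the field, which produces a "doesn't have a default value" error.

For newly added NOT NULL fields, setting a default value is essential: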

ALTER TABLE product_sku ADD COLUMN brand_id INT(10) NOT NULL DEFAULT 0;
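
The comparison behavior of NULL mentioned earlier in this section can be checked with a single query. This is a small illustration and does not depend on any table.

```sql
SELECT NULL = NULL   AS eq_result,       -- NULL: neither true nor false
       NULL IS NULL  AS is_null_result,  -- 1
       NULL <=> NULL AS null_safe_eq;    -- 1: <=> is MySQL's NULL-safe equality
```

A WHERE clause keeps only rows for which the condition is true, which is why a condition such as `col = NULL` never returns anything.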

8. Foreign Keys

Foreign keys in MySQL are used to ensure data consistency and integrity.

For example:

CREATE TABLE class (
  id INT(10) PRIMARY KEY AUTO_INCREMENT,
  cname VARCHAR(15)
);

This creates a class table.

A student table that references it can then be created:

CREATE TABLE student(
  id INT(10) PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(15) NOT NULL,
  gender VARCHAR(10) NOT NULL,
  cid INT,
  FOREIGN KEY (cid) REFERENCES class(id)
);

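Here, cid in the student table references id in the class table.

Attempting to delete a record in class while student records still reference it through cid raises a foreign key constraint error:

Cannot delete or update a parent row: a foreign key constraint fails

In this way, consistency and integrity are preserved.

Note that among the commonly used storage engines, foreign keys are enforced only by InnoDB; MyISAM does not support them.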
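
Using the class and student tables defined above, the constraint can be observed with a few statements. The sample values are made up, and the sketch assumes both tables start empty so that the first class row gets id 1.

```sql
INSERT INTO class (cname) VALUES ('Class A');
INSERT INTO student (name, gender, cid) VALUES ('Tom', 'male', 1);

-- Fails while a student row still references class 1:
-- ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails
DELETE FROM class WHERE id = 1;

-- Removing the child rows first lets the parent delete succeed.
DELETE FROM student WHERE cid = 1;
DELETE FROM class WHERE id = 1;
```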

If only two tables are linked, it might be manageable, but with several tables, deleting a parent record requires synchronously deleting many child records, which can impact performance.

Thus, for internet systems, it is generally advised to avoid using foreign keys to prioritize performance over absolute data consistency.

In addition to foreign keys, stored procedures and triggers are also discouraged due to their performance impact.

9. Indexes

When creating tables, beyond specifying primary keys, it’s essential to create additional indexes.

For example:

CREATE TABLE product_sku(
  id INT(10) PRIMARY KEY AUTO_INCREMENT,
  spu_id INT(10) NOT NULL,
  brand_id INT(10) NOT NULL,
  name VARCHAR(15) NOT NULL
);

This table includes spu_id (from the product group) and brand_id (from the brand table).

In situations that save IDs from other tables, a regular index can be added:

CREATE TABLE product_sku (
  id INT(10) PRIMARY KEY AUTO_INCREMENT,
  spu_id INT(10) NOT NULL,
  brand_id INT(10) NOT NULL,
  name VARCHAR(15) NOT NULL,
  KEY `ix_spu_id` (`spu_id`) USING BTREE,
  KEY `ix_brand_id` (`brand_id`) USING BTREE
);

Such indexes significantly enhance query efficiency.

However, do not create too many indexes: every insert, update, and delete must also maintain each index, which slows down writes and uses extra storage.

A single table should ideally have no more than five indexes.

If the number of indexes exceeds five during table creation, consider replacing some regular indexes with composite indexes.

Also, when creating composite indexes, follow the leftmost prefix rule so that the index is actually usable: a composite index on (a, b) helps queries that filter on a, or on a and b, but generally not queries that filter only on b.

For fields with high duplication rates (like status), avoid creating separate regular indexes. MySQL may skip the index and choose a full table scan instead if it’s more efficient.
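
For the composite index rule, a sketch against the product_sku table from this section might look as follows. The index name ix_spu_id_brand_id is an assumption, and the plan that EXPLAIN reports depends on the MySQL version, the other indexes present, and the data.

```sql
-- One composite index covering both columns.
ALTER TABLE product_sku ADD KEY ix_spu_id_brand_id (spu_id, brand_id);

-- These conditions start with the leftmost column, so the index can be used.
EXPLAIN SELECT id FROM product_sku WHERE spu_id = 10;
EXPLAIN SELECT id FROM product_sku WHERE spu_id = 10 AND brand_id = 3;

-- This condition skips spu_id, so the optimizer cannot seek into the
-- composite index on brand_id alone and has to pick another index or a scan.
EXPLAIN SELECT id FROM product_sku WHERE brand_id = 3;
```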

I’ll address index inefficiency issues in a separate article later, so let’s hold off on that for now.

10. Time Fields

The range of types available for time fields in MySQL is fairly extensive: date, datetime, timestamp, and varchar.

Using varchar might be for API consistency where time data is represented as a string.

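However, querying by time range on a varchar column can be inefficient: values are compared as strings, so range conditions are only correct if every value uses the same fixed format, and date functions applied to the column generally prevent index use.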

Date is intended only for dates (e.g., 2020-08-20), while datetime and timestamp are suited for complete date and time.

There are subtle differences between them.

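Timestamp: uses 4 bytes and spans from 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC. Values are converted between the session time zone and UTC, so it is time-zone-sensitive.

Datetime: occupies 8 bytes before MySQL 5.6.4 and 5 bytes plus fractional-second storage since then, with a range from 1000-01-01 00:00:00 to 9999-12-31 23:59:59, independent of time zones.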

Using datetime to save date and time is preferable for its wider range.

As a reminder, when setting default values for time fields, avoid 0000-00-00 00:00:00: it is not a valid date, it is rejected under strict SQL mode with NO_ZERO_DATE enabled, and it can cause errors during queries.
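
A datetime column with a sensible default and an index for range queries could be declared as in this sketch. The table order_event and its columns are hypothetical; DEFAULT CURRENT_TIMESTAMP on datetime columns requires MySQL 5.6.5 or later.

```sql
CREATE TABLE order_event (
  id          BIGINT   NOT NULL AUTO_INCREMENT PRIMARY KEY,
  event_time  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  update_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  KEY ix_event_time (event_time)
) ENGINE=InnoDB;

-- A half-open range on the raw column lets MySQL use ix_event_time.
SELECT id
FROM order_event
WHERE event_time >= '2020-08-20 00:00:00'
  AND event_time <  '2020-08-21 00:00:00';
```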

11. Monetary Fields

MySQL provides several types for numbers with a fractional part: float and double (floating-point) and decimal (fixed-point).

Given that float and double may lose precision, it’s recommended to use decimal for monetary values.

Typically, such a field is defined as decimal(m,n), where n is the number of decimal places and m is the total number of digits in the integer and decimal portions combined.

For example, decimal(10,2) allows for 8 digits before the decimal point and 2 digits after it.
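
The precision difference can be seen without any table. In MySQL, literals written in scientific notation such as 0.1E0 are treated as approximate (double) values, while plain literals such as 0.1 are exact decimals. The table order_amount_demo below is a hypothetical example.

```sql
SELECT 0.1E0 + 0.2E0 = 0.3E0 AS double_equal,   -- 0: binary floating point is inexact
       0.1 + 0.2 = 0.3       AS decimal_equal;  -- 1: exact decimal arithmetic

CREATE TABLE order_amount_demo (
  id         BIGINT        NOT NULL AUTO_INCREMENT PRIMARY KEY,
  pay_amount DECIMAL(10,2) NOT NULL DEFAULT 0.00
) ENGINE=InnoDB;

-- The largest value that fits decimal(10,2): 8 integer digits, 2 decimals.
INSERT INTO order_amount_demo (pay_amount) VALUES (99999999.99);
```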

12. JSON Fields

During table structure design, you may encounter fields needing to store variable data values.

For example, in an asynchronous Excel export feature, a field in the async task table may need to save user-selected query conditions, which can vary per user.

Traditional database fields don’t handle this well.

Using MySQL’s json type enables structured data storage in JSON format for easy saving and querying.

MySQL also supports querying JSON data by field names or values.
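
For the export-task scenario, a minimal sketch might look like this. The table async_export_task and the JSON keys status and keyword are invented for illustration; the ->> operator needs MySQL 5.7.13 or later.

```sql
CREATE TABLE async_export_task (
  id              BIGINT   NOT NULL AUTO_INCREMENT PRIMARY KEY,
  query_condition JSON     NOT NULL,
  create_time     DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=InnoDB;

-- Each user can store a differently shaped set of conditions.
INSERT INTO async_export_task (query_condition)
VALUES ('{"status": 1, "keyword": "phone"}');

-- ->> extracts a value by path and returns it as an unquoted string.
SELECT id,
       query_condition->>'$.keyword' AS keyword
FROM async_export_task
WHERE query_condition->>'$.status' = '1';
```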

13. Unique Indexes

Unique indexes are frequently used in practice.

You can apply unique indexes to individual fields, like an organization’s code, or create composite unique indexes for multiple fields, like category numbers, units, specifications, etc.

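Unique indexes on individual fields are straightforward, but for composite unique indexes, if any field is NULL, the uniqueness constraint no longer protects that row: MySQL treats NULL values as distinct from each other, so rows that look identical can coexist.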

Another common issue is having unique indexes while still producing duplicate data.

Due to its complexity, I’ll elaborate on unique index issues in a later article.

When creating unique indexes, ensure that none of the involved fields contain NULL values to maintain their uniqueness.
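
The effect of NULL on a composite unique index can be reproduced with a small hypothetical table (product_spec and its columns are assumptions for this example):

```sql
CREATE TABLE product_spec (
  id          BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  category_id BIGINT NOT NULL,
  unit_id     BIGINT NULL,
  UNIQUE KEY ux_category_unit (category_id, unit_id)
) ENGINE=InnoDB;

-- Both inserts succeed: NULL is never equal to NULL, so no duplicate is detected.
INSERT INTO product_spec (category_id, unit_id) VALUES (1, NULL);
INSERT INTO product_spec (category_id, unit_id) VALUES (1, NULL);

-- With non-NULL values the constraint works as expected.
INSERT INTO product_spec (category_id, unit_id) VALUES (1, 2);
INSERT INTO product_spec (category_id, unit_id) VALUES (1, 2);  -- ERROR 1062: Duplicate entry
```

Declaring unit_id as NOT NULL with a default such as 0 would close this gap.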

14. Character Set

MySQL supports various character sets, including latin1, utf8, utf8mb4, etc.

Here’s a table summarizing MySQL character sets:

Character Set	Description	Max Bytes per Character	Notes
latin1	Encounters encoding issues; rarely used in real projects	1 byte	Limited support for international characters
utf8 (alias of utf8mb3)	Efficient in storage but cannot store emoji	3 bytes	Suitable for most text but lacks emoji support
utf8mb4	Supports all Unicode characters, including emoji	4 bytes	Recommended for modern applications

It’s advisable to set the character set to utf8mb4 during table creation to avoid potential issues.
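
Setting and checking the character set might look like the sketch below. The table names feedback and legacy_comment are placeholders; converting an existing table rewrites its data, so it is worth testing on a copy first.

```sql
-- New table: declare utf8mb4 up front.
CREATE TABLE feedback (
  id      BIGINT       NOT NULL AUTO_INCREMENT PRIMARY KEY,
  content VARCHAR(500) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Existing table: convert its columns and default character set.
ALTER TABLE legacy_comment CONVERT TO CHARACTER SET utf8mb4;

-- Check what each table in the current database uses.
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();
```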

15. Collation

When creating tables in MySQL, the COLLATE parameter can be configured.

For example:

CREATE TABLE `order` (
  `id` BIGINT NOT NULL AUTO_INCREMENT,
  `code` VARCHAR(20) COLLATE utf8mb4_bin NOT NULL,
  `name` VARCHAR(30) COLLATE utf8mb4_bin NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `un_code` (`code`),
  KEY `un_code_name` (`code`,`name`) USING BTREE,
  KEY `idx_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

The collation determines how character sorting and comparison are conducted.

The available collations depend on the character set; those for utf8mb4 all start with utf8mb4_. Common ones include utf8mb4_general_ci and utf8mb4_bin.

The utf8mb4_general_ci collation is case-insensitive for alphabetical characters, while utf8mb4_bin is case-sensitive.

This distinction is important. For example, if the order table contains a record with the name YOYO and you query it using lowercase yoyo under utf8mb4_general_ci, it retrieves the record. Under utf8mb4_bin, it will not.

Choose collation based on the actual business needs to avoid confusion.
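
The YOYO example can be reproduced without a table by comparing two literals under each collation. The _utf8mb4 introducers are there so the statement does not depend on the connection character set.

```sql
SELECT _utf8mb4'YOYO' = _utf8mb4'yoyo' COLLATE utf8mb4_general_ci AS ci_match,   -- 1
       _utf8mb4'YOYO' = _utf8mb4'yoyo' COLLATE utf8mb4_bin        AS bin_match;  -- 0
```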

16. Large Fields

Special attention is warranted for fields that consume substantial storage space, such as comments.

A user comment field might require limits, like a maximum of 500 characters.

Defining a bounded field like this as text can waste storage, so it’s often better to use varchar with an explicit length.

For much larger data types, like contracts that can take up several MB, it may be unreasonable to store directly in MySQL.

Instead, such data could be stored in MongoDB, with the MySQL business table retaining the MongoDB ID.
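
Both cases could be modeled roughly as follows. The tables user_comment and contract, and the column mongo_doc_id, are hypothetical; CHAR(24) assumes the MongoDB document is identified by the 24-character hex form of an ObjectId.

```sql
-- Bounded text: varchar with an explicit limit instead of text.
CREATE TABLE user_comment (
  id      BIGINT       NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_id BIGINT       NOT NULL,
  content VARCHAR(500) NOT NULL,
  KEY ix_user_id (user_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Very large payload: keep only a reference to the external document.
CREATE TABLE contract (
  id           BIGINT      NOT NULL AUTO_INCREMENT PRIMARY KEY,
  contract_no  VARCHAR(32) NOT NULL,
  mongo_doc_id CHAR(24)    NOT NULL,  -- ID of the document stored in MongoDB
  UNIQUE KEY ux_contract_no (contract_no)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```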

17. Redundant Fields

To enhance performance and query speed, some fields can be redundantly stored.

For example, an order table typically contains a userId to identify users.

However, many order query pages also need to display the user ID along with the user’s name.

If both tables are small, a join is feasible, but for large datasets, it can degrade performance.

In that case, creating a redundant userName field in the order table can resolve performance issues.

While this adjustment allows direct querying from the order table without joins, it requires additional storage and may lead to inconsistency if user names change.

Therefore, carefully evaluate if the redundant fields strategy fits your particular business scenario.
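
A redundant name column and the extra maintenance it brings might be sketched like this. The table order_info, its columns, and the sample values are assumptions made for the example.

```sql
CREATE TABLE order_info (
  id        BIGINT      NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_id   BIGINT      NOT NULL,
  user_name VARCHAR(30) NOT NULL,  -- redundant copy of the name in the user table
  KEY ix_user_id (user_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- The order list page no longer needs a join to show the name.
SELECT id, user_id, user_name
FROM order_info
WHERE user_id = 42;

-- The cost: when a user is renamed, the copies have to be updated as well.
UPDATE order_info
SET user_name = 'new_name'
WHERE user_id = 42;
```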

18. Comments

When designing tables, be sure to add clear comments for tables and their fields.

For example:

CREATE TABLE `sys_dept` (
  `id` BIGINT NOT NULL AUTO_INCREMENT COMMENT 'ID',
  `name` VARCHAR(30) NOT NULL COMMENT 'Name',
  `pid` BIGINT NOT NULL COMMENT 'Parent Department',
  `valid_status` TINYINT(1) NOT NULL DEFAULT 1 COMMENT 'Valid Status: 1=Valid, 0=Invalid',
  `create_user_id` BIGINT NOT NULL COMMENT 'Creator ID',
  `create_user_name` VARCHAR(30) NOT NULL COMMENT 'Creator Name',
  `create_time` DATETIME(3) DEFAULT NULL COMMENT 'Creation Date',
  `update_user_id` BIGINT DEFAULT NULL COMMENT 'Updater ID',
  `update_user_name` VARCHAR(30)  DEFAULT NULL COMMENT 'Updater Name',
  `update_time` DATETIME(3) DEFAULT NULL COMMENT 'Update Time',
  `is_del` TINYINT(1) DEFAULT '0' COMMENT 'Is Deleted: 1=Deleted, 0=Not Deleted',
  PRIMARY KEY (`id`) USING BTREE,
  KEY `index_pid` (`pid`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4 COMMENT='Department';

Detailed comments clarify the purpose of tables and fields.

Particularly for fields representing statuses (like valid_status), it immediately conveys the intent behind the data, such as indicating valid versus invalid.

Avoid situations where numerous status fields exist without comments, leading to confusion about what values like 1, 2, or 3 signify.

Initially, one might remember, but after a year of operation, it’s easy to forget, potentially leading to significant pitfalls.

Thus, when designing tables, meticulous commenting and regular updates of these comments are essential.
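
Comments on an existing table can be reviewed and updated with statements like these, shown here against the sys_dept table above. The extra status value 2=Frozen is invented to illustrate an update.

```sql
-- Review the comments currently stored for each column.
SELECT COLUMN_NAME, COLUMN_TYPE, COLUMN_COMMENT
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'sys_dept'
ORDER BY ORDINAL_POSITION;

-- Update a comment when a new status value is introduced.
-- MODIFY COLUMN needs the full column definition, not just the comment.
ALTER TABLE sys_dept
  MODIFY COLUMN `valid_status` TINYINT(1) NOT NULL DEFAULT 1
  COMMENT 'Valid Status: 1=Valid, 0=Invalid, 2=Frozen';
```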

That wraps up the technical section of this article. If you have a different opinion, let me know.
