How to design a MySQL high-performance table-LINUX-php.cn

Home

System Tutorial

LINUX

How to design a MySQL high-performance table

王林

Jan 07, 2024 pm 11:54 PM

linux linux tutorial Red Hat linux system linux command linux certification red hat linux linux video

Good logical design and physical design are the cornerstones of high performance. The schema should be designed according to the query statements that the system will execute, which often requires weighing various factors.

How to design a MySQL high-performance table

1. Select the optimized data type

MySQL supports many data types. Choosing the correct data type is crucial to achieving high performance.

Smaller is usually better

Smaller data types are generally faster because they occupy less disk, memory, and CPU cache, and require fewer CPU cycles to process.

Just keep it simple

Operations on simple data types generally require fewer CPU cycles. For example, integer operations are cheaper than character operations because the character set and collation rules (collation) make character comparisons more complex than integer comparisons.

Try to avoid NULL

If the query contains NULL columns, it is more difficult for MySQL to optimize, because NULL columns make indexes, index statistics, and value comparisons more complex. Columns that can be NULL use more storage space and require special handling in MySQL. When NULLable columns are indexed, each index record requires an extra byte, which in MyISAM can even cause a fixed-size index (such as an index with only one integer column) to become a variable-size index.

Of course there are exceptions. For example, InnoDB uses a separate bit to store NULL values, so it has good space efficiency for sparse data.

1.Integer type

There are two types of numbers: whole numbers and real numbers. If you store integers, you can use these integer types: TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT. Use 8, 16, 24, 32, and 64-bit storage space respectively.

The integer type has the optional **UNSIGNED ** attribute, which means that negative values are not allowed, which roughly doubles the upper limit of positive numbers. For example, TINYINT.UNSIGNED can store the range 0 - 255, while the storage range of TINYINT is -128 -127.

Signed and unsigned types use the same storage space and have the same performance, so you can choose the appropriate type according to the actual situation.

Your choice determines how MySQL saves data in memory and disk. However, integer calculations generally use 64-bit BIGINT integers, even in a 32-bit environment. (The exception is some aggregate functions, which use DECIMAL or DOUBLE for calculations).

MySQL can specify the width for integer types, such as INT(11), which is meaningless for most applications: it does not limit the legal range of values, but only specifies some interactive tools of MySQL (such as the MySQL command line client end) is used to display the number of characters. For storage and calculation purposes, INT(1) and INT(20) are the same.

2.Real number type

Real numbers are numbers with a decimal part. However, they are not just for storing decimal parts, DECIMAL can also be used to store integers larger than BIGINT.

FLOAT and DOUBLE types support approximate calculations using standard floating point operations.

DECIMAL type is used to store precise decimals.

Both floating point and DECIMAL types can specify precision. For DECIMAL columns, you can specify the maximum number of digits allowed before and after the decimal point. This affects the column's space consumption.

There are many ways to specify the precision required for floating-point columns, which will cause MySQL to choose a different data type, or to round off the value when storing. These precision definitions are non-standard, so we recommend specifying only the data type and not the precision.

Floating point types usually use less space than DECIMAL when storing values in the same range. FLOAT uses 4 bytes of storage. DOUBLE occupies 8 bytes and has higher precision and larger range than FLOAT. As with integer types, all you can choose is the storage type; MySQL uses DOUBLE as the type for internal floating point calculations.

Because of the additional space and computational overhead required, you should try to use DECIMAL only when performing precise calculations on decimals. But when the data is relatively large, you can consider using BIGINT instead of DECIMAL. Just multiply the currency unit to be stored by the corresponding multiple according to the number of decimal places.

3. String type

VARCHAR

Used to store variable length strings, the length is supported to 65535
Need to use 1 or 2 extra bytes to record the length of the string
Suitable for: the maximum length of the string is much larger than the average length; updates are rare

CHAR

Fixed length, the length range is 1~255
Suitable for: storing very short strings, or all values close to the same length; frequently changed

Generosity is unwise

The space overhead of using VARCHAR(5) and VARCHAR(200) to store 'hello' is the same. So are there any advantages to using shorter columns?

It turns out there are great advantages. Longer columns consume more memory because MySQL typically allocates fixed-size blocks of memory to hold internal values. This is especially bad when using in-memory temporary tables for sorting or operations. It's equally bad when sorting using disk temporary tables.

So the best strategy is to allocate only the space you really need.

4.BLOB and TEXT types

BLOB and TEXT are both string data types designed to store large amounts of data, and are stored in binary and character modes respectively.

Unlike other types, MySQL treats each BLOB and TEXT value as an independent object. Storage engines usually do special processing when storing. When the BLOB and TEXT values are too large, InnoDB will use a dedicated "external" storage area for storage. At this time, each value requires 1 - 4 bytes to be stored in the row. The storage area stores the actual value.

The only difference between BLOB and TEXT is that the BLOB type stores binary data without collation rules or character sets, while the TEXT type has character sets and collation rules

5. Date and time types

Most of the time there are no alternatives to the type, so there is no question of what is the best choice. The only problem is what needs to be done when saving the date and time. MySQL provides two similar date types: DATE TIME and TIMESTAMP.

But currently we recommend the method of storing timestamps, so there will no longer be too much explanation on DATE TIME and TIMESTAMP here.

6.Other types

6.1 Select identifier

The smallest data type should be selected on the premise that it can meet the needs of the value range and reserve room for future growth.

Integer type

Integers are usually the best choice for identity columns because they are fast and can use AUTO_INCREMENT.

ENUM and SET types

The EMUM and SET types are generally a poor choice for identity columns, although they may be fine for some static "definition tables" that only contain fixed states or types. ENUM and SET columns are suitable for storing fixed information, such as ordered status, product type, and person's gender.

String type

If possible, you should avoid using string types as identity columns because they are space-consuming and generally slower than numeric types.

You also need to pay more attention to completely "random" strings, such as strings generated by MDS(), SHAl() or UUID(). The new values generated by these functions are arbitrarily distributed over a large space, which can cause INSERT and some SELECT statements to be very slow. If UUID values are stored, the "-" sign should be removed.

6.2 Special type data

Some types of data wells are not directly consistent with built-in types. Timestamps with low kilosecond precision are one example. Another example is a 1Pv4 address. People often use VARCHAR(15) columns to store IP addresses. However, they are actually 32-bit unsigned integers, not strings. The representation of the address divided into four segments using decimal points is just to make it easier for people to read. So IP addresses should be stored as unsigned integers. MySQL provides INET_ATON() and INET_NTOA() functions to convert between these two representation methods.

2. Table structure design

1. Paradigm and anti-paradigm

There are usually many ways to represent any given data, from fully normalized to fully denormalized, and a compromise between the two. In a normalized database, each fact appears exactly once. In contrast, in a denormalized database, information is redundant and may be stored in multiple places.

Advantages and Disadvantages of Paradigms

When considering performance improvement, it is often recommended to normalize the schema, especially in write-intensive scenarios.

Normalized update operations are usually faster than denormalized.
When data is well normalized, there is little or no duplicate data, so less data needs to be modified.
Normalized tables are usually smaller and fit better in memory, so operations will be performed faster.
Less redundant data means fewer DISTINCT or GROUP BY statements are needed to retrieve list data.

Advantages and disadvantages of anti-paradigm

Without the need for related tables, the worst case for most queries—even if the table does not use an index—is a full table scan. This can be much faster than associative when the data is larger than memory because random I/0 is avoided.

Individual tables can also use more efficient indexing strategies.

Mixing normalization and denormalization

In practical applications, it is often necessary to mix them, and partially normalized schemas, cache tables, and other techniques may be used.

Add redundant fields appropriately to the table, such as performance priority, but it will increase complexity. Table related queries can be avoided.

Simple and familiar database paradigm

<br> First normal form (1NF): Field values are atomic and cannot be divided (all relational database systems satisfy the first normal form);<br> For example: name field, where the surname and the first name are a whole. If the surname and the first name are distinguished, two independent fields must be set up;

Second Normal Form (2NF): A table must have a primary key, that is, each row of data can be uniquely distinguished;
Note: The first normal form must be satisfied first;

Third Normal Form (3NF): A table cannot contain information about non-key fields in other related tables, that is, the data table cannot have redundant fields;
Note: The second normal form must be satisfied first;

2. Table fields are less precise

I/O efficient
Separate fields are easy to maintain
Single table 1G volume 500W evaluation
A single line shall not exceed 200Byte
No more than 50 INT fields in a single table
No more than 20 CHAR(10) fields in a single table
It is recommended that the number of fields in a single table be controlled within 20
Split TEXT/BLOB, the TEXT type processing performance is much lower than VARCHAR, forcing the generation of hard disk temporary tables wastes more space.

The above is the detailed content of How to design a MySQL high-performance table. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

How to fix KB5055523 fails to install in Windows 11?

3 weeks ago By DDD

How to fix KB5055518 fails to install in Windows 10?

3 weeks ago By DDD

Strength Levels for Every Enemy & Monster in R.E.P.O.

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Roblox: Dead Rails - How To Tame Wolves

3 weeks ago By DDD

Blue Prince: How To Get To The Basement

3 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial

1655

CakePHP Tutorial

1413

Laravel Tutorial

1306

PHP Tutorial

1252

C# Tutorial

1226

Related knowledge

Linux Architecture: Unveiling the 5 Basic Components Apr 20, 2025 am 12:04 AM

The five basic components of the Linux system are: 1. Kernel, 2. System library, 3. System utilities, 4. Graphical user interface, 5. Applications. The kernel manages hardware resources, the system library provides precompiled functions, system utilities are used for system management, the GUI provides visual interaction, and applications use these components to implement functions.

vscode terminal usage tutorial Apr 15, 2025 pm 10:09 PM

vscode built-in terminal is a development tool that allows running commands and scripts within the editor to simplify the development process. How to use vscode terminal: Open the terminal with the shortcut key (Ctrl/Cmd). Enter a command or run the script. Use hotkeys (such as Ctrl L to clear the terminal). Change the working directory (such as the cd command). Advanced features include debug mode, automatic code snippet completion, and interactive command history.

How to check the warehouse address of git Apr 17, 2025 pm 01:54 PM

To view the Git repository address, perform the following steps: 1. Open the command line and navigate to the repository directory; 2. Run the "git remote -v" command; 3. View the repository name in the output and its corresponding address.

How to run java code in notepad Apr 16, 2025 pm 07:39 PM

Although Notepad cannot run Java code directly, it can be achieved by using other tools: using the command line compiler (javac) to generate a bytecode file (filename.class). Use the Java interpreter (java) to interpret bytecode, execute the code, and output the result.

What is the main purpose of Linux? Apr 16, 2025 am 12:19 AM

The main uses of Linux include: 1. Server operating system, 2. Embedded system, 3. Desktop operating system, 4. Development and testing environment. Linux excels in these areas, providing stability, security and efficient development tools.

vscode Previous Next Shortcut Key Apr 15, 2025 pm 10:51 PM

VS Code One-step/Next step shortcut key usage: One-step (backward): Windows/Linux: Ctrl ←; macOS: Cmd ←Next step (forward): Windows/Linux: Ctrl →; macOS: Cmd →

vscode terminal command cannot be used Apr 15, 2025 pm 10:03 PM

Causes and solutions for the VS Code terminal commands not available: The necessary tools are not installed (Windows: WSL; macOS: Xcode command line tools) Path configuration is wrong (add executable files to PATH environment variables) Permission issues (run VS Code as administrator) Firewall or proxy restrictions (check settings, unrestrictions) Terminal settings are incorrect (enable use of external terminals) VS Code installation is corrupt (reinstall or update) Terminal configuration is incompatible (try different terminal types or commands) Specific environment variables are missing (set necessary environment variables)

How to run sublime after writing the code Apr 16, 2025 am 08:51 AM

There are six ways to run code in Sublime: through hotkeys, menus, build systems, command lines, set default build systems, and custom build commands, and run individual files/projects by right-clicking on projects/files. The build system availability depends on the installation of Sublime Text.

See all articles