
The Evolution of High-Performance MySQL (Part 1): Optimizing Data Types

Jun 07, 2016, 03:01 PM

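Database performance tuning draws on a wide range of knowledge: whether column attributes are set appropriately, whether indexes are built properly, whether the table structure is well designed, whether the database and operating system are configured correctly, and so on. Each of these topics could be a field of its own.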

 

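In my view, among the key techniques for improving database performance, optimizing columns is relatively easy and has a very large effect. MySQL supports many data types, each with its own characteristics, yet when picking a type people often just grab one that works without asking whether it is the best choice. Before describing the individual types, here are the main principles for choosing a data type: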

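a)      Choose the smallest type that fits
Smaller types take less space both on disk and in memory, and the temporary tables used for queries and sorts need less space as well. With small data sets the difference may be unnoticeable, but it becomes significant as the data grows.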

 

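For example, suppose a "product information" table holds 20 million rows and has a "remaining stock" (COUNT) column. SMALLINT UNSIGNED (16 bits, range 0-65535) is normally enough for this column. If you instead declare it as BIGINT UNSIGNED (64 bits, range 0-18446744073709551615), the program will still run correctly, but this single column adds about 114 MB of disk storage ((64-16)/8 × 20,000,000 = 120,000,000 bytes), and roughly the same amount of extra memory whenever the whole column is selected or sorted. Database performance inevitably suffers as a result.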

The smallest type must still be able to meet future business needs, because altering the structure of a large table is slow and troublesome.
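The storage arithmetic in the example above can be checked with a few lines of Python. This is only a sketch: the function name `extra_bytes` and the `TYPE_BITS` table are illustrative, and the calculation counts raw column bytes only, ignoring row headers, indexes, and page overhead.

```python
# Bit widths of MySQL's integer types, as listed in this article.
TYPE_BITS = {"TINYINT": 8, "SMALLINT": 16, "MEDIUMINT": 24, "INT": 32, "BIGINT": 64}


def extra_bytes(chosen: str, sufficient: str, rows: int) -> int:
    """Raw bytes wasted over `rows` rows by using `chosen` instead of `sufficient`."""
    per_row = (TYPE_BITS[chosen] - TYPE_BITS[sufficient]) // 8
    return per_row * rows


wasted = extra_bytes("BIGINT", "SMALLINT", 20_000_000)
print(wasted)                      # 120000000 bytes
print(round(wasted / 1024**2, 1))  # 114.4 (MiB)
```

Six wasted bytes per row look harmless, but multiplied by 20 million rows they exceed 100 MB for a single column.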

 

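b)    Choose simple, appropriate types

Simple types usually need fewer CPU cycles when a table is queried or sorted. For example, MySQL server compares integer values more simply and quickly than string values, so when you need to sort a table, prefer an integer column as the sort key.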

 

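c)       Declare columns NOT NULL where possible
In general, if you do not explicitly declare a column NOT NULL, the database treats it as nullable. This default causes three problems:
(1) MySQL's query optimizer has a harder time optimizing queries that touch the column
(2) MySQL needs extra storage and extra processing for nullable columns
(3) If a nullable column is part of an index, the index becomes less effective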

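 Because this principle yields only a modest performance gain, there is no need to go back and alter an existing schema that has nullable columns or nullable indexed columns. For newly designed tables and indexes, though, follow it wherever possible.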

 

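With the selection principles covered, the rest of this post goes through the common MySQL data types and what to watch for when optimizing performance.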

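·        Integers
MySQL's integer family consists of TINYINT (8 bits), SMALLINT (16 bits), MEDIUMINT (24 bits), INT (32 bits), and BIGINT (64 bits).

For an n-bit type the signed range is -2^(n-1) to 2^(n-1)-1 and the unsigned range is 0 to 2^n-1. The signed and unsigned versions of a type take the same storage space, so choosing between them costs nothing: pick whichever covers the range your data needs.

 MySQL lets you specify a width when defining an integer type, for example INT(10). The width only affects how client and command-line tools display the value; as far as MySQL server is concerned, INT(10) and INT(32) are identical in storage space, computation cost, and numeric range.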
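The range formulas above are easy to turn into a small helper that picks the narrowest integer type for a known value range. This is a sketch under stated assumptions: `int_range` and `smallest_type` are hypothetical helper names, not MySQL functions, and the helper prefers the signed variant when both fit.

```python
# Bit widths of MySQL's integer types, narrowest first.
INT_BITS = {"TINYINT": 8, "SMALLINT": 16, "MEDIUMINT": 24, "INT": 32, "BIGINT": 64}


def int_range(bits: int, unsigned: bool = False) -> tuple[int, int]:
    """Return the (min, max) values representable in `bits` bits."""
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def smallest_type(lo: int, hi: int) -> str:
    """Name the narrowest integer type whose range covers lo..hi."""
    for name, bits in INT_BITS.items():  # dicts keep insertion order
        for unsigned in (False, True):
            low, high = int_range(bits, unsigned)
            if low <= lo and hi <= high:
                return name + (" UNSIGNED" if unsigned else "")
    raise ValueError("range does not fit in any MySQL integer type")


print(int_range(16))               # (-32768, 32767)
print(int_range(16, True))         # (0, 65535)
print(smallest_type(0, 65535))     # SMALLINT UNSIGNED
print(smallest_type(-1, 100))      # TINYINT
```

When you use a helper like this, size the range for the values the column may hold in the future, not only for the values it holds today.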

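·        Decimals
MySQL's fractional types are the floating-point types FLOAT (4 bytes) and DOUBLE (8 bytes) and the exact fixed-point type DECIMAL. Fractional values generally cost more to store and compute than integers, so avoid these types unless you really need them.

When creating a fractional column you can specify its precision, for example FLOAT(10,3) or DECIMAL(10,3). In MySQL 5.0 and later a DECIMAL can hold up to 65 digits in total, of which at most 30 can follow the decimal point.

MySQL packs DECIMAL digits into a binary string, so the more precision you ask for, the more storage space and CPU cycles the column is likely to consume.

 Fractional types may consume more storage and CPU, and in early MySQL versions arithmetic on two fractional values could even lose precision, but in many cases they are unavoidable, for example when storing monetary amounts in finance. To cut storage overhead and guarantee exactness, a common approach is to scale the value up to an integer for storage and convert it back in the application. For example, an account balance of 999.35 yuan is stored as 99935 fen; the bank's program reads 99935 fen, converts it to 999.35 yuan, and then processes it.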
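The scale-to-integer approach for money can be sketched on the application side as follows. The function names `yuan_to_fen` and `fen_to_yuan` are illustrative, and the sketch assumes amounts arrive as decimal strings with at most two fractional digits; it uses Python's `decimal` module so that no binary floating-point rounding is involved.

```python
from decimal import Decimal


def yuan_to_fen(amount: str) -> int:
    """Convert a yuan amount such as '999.35' to an integer number of fen."""
    scaled = Decimal(amount) * 100
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount!r} has more than two decimal places")
    return int(scaled)


def fen_to_yuan(fen: int) -> Decimal:
    """Convert an integer number of fen read from the database back to yuan."""
    return Decimal(fen) / 100


stored = yuan_to_fen("999.35")
print(stored)               # 99935, the value kept in an integer column
print(fen_to_yuan(stored))  # 999.35
```

The database then only ever sees an integer column, and all conversion between fen and yuan stays in one place in the application.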

 

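·       Strings

In any language the string is an important and complicated type, and MySQL is no exception.
MySQL has two main string types, VARCHAR and CHAR. How they are stored on disk and in memory is decided by the storage engine, and different engines may store them differently. Even within one engine the on-disk and in-memory formats usually differ, and the engine converts the data as it moves between disk and memory.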
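VARCHAR
MySQL stores VARCHAR values as variable-length data. Compared with fixed-length storage it uses only as much space as each value needs, which saves space, so it is a sensible default when you have no special requirements.

VARCHAR can be variable-length because every value carries a length prefix of 1 or 2 bytes. For example, "I Love Java" is stored as the length 11 (in 1 byte) followed by the 11 characters "I Love Java". Values that can be longer than 255 bytes, such as a 1000-byte string, need a 2-byte prefix. Two bytes can express lengths only up to 2^16-1 = 65535, so a longer string cannot be stored in a VARCHAR column; use a TEXT type (MySQL's equivalent of CLOB) for such strings.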
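The length-prefix idea can be modelled in a few lines. This is a simplified illustration, not the byte layout of any particular storage engine: `encode_varchar` is a hypothetical name, the sketch assumes a single-byte character set (latin-1), it chooses the prefix size from the column's declared maximum length, and the little-endian byte order is an arbitrary choice made for the example.

```python
def encode_varchar(value: str, max_len: int) -> bytes:
    """Model a VARCHAR(max_len) value as <length prefix><data>."""
    data = value.encode("latin-1")  # assumption: 1 byte per character
    if max_len > 65535:
        raise ValueError("a 2-byte prefix cannot describe more than 65535 bytes")
    if len(data) > max_len:
        raise ValueError("value is longer than the column allows")
    prefix_size = 1 if max_len <= 255 else 2
    return len(data).to_bytes(prefix_size, "little") + data


short = encode_varchar("I Love Java", 20)
print(short)        # b'\x0bI Love Java' (0x0b is 11)
print(len(short))   # 12: 1 prefix byte + 11 data bytes

wide = encode_varchar("I Love Java", 1000)
print(len(wide))    # 13: 2 prefix bytes + 11 data bytes
```

The same model shows why a one-character value costs two bytes in a VARCHAR column: one byte of prefix plus one byte of data.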

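MySQL versions differ in how they handle trailing spaces in VARCHAR columns:
Version >= 5.0: trailing spaces are preserved
Version < 5.0: trailing spaces are stripped
Taking MySQL 5.6 as an example:
[Screenshot: trailing-space handling of a VARCHAR column in MySQL 5.6]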

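Storing 'hello' takes the same space in VARCHAR(5) and VARCHAR(200). So is there any advantage to the shorter column?

There is a big advantage. The larger column can use much more memory, because MySQL often allocates fixed-size chunks of memory to hold values internally. This is especially bad for sorting or other operations that use in-memory temporary tables, and sorts that use on-disk temporary tables suffer in the same way.

So the best strategy is to allocate only as much space as you really need.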
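A rough way to see the effect is to compare worst-case buffer sizes when every value is given a fixed-size slot as wide as the column's declared maximum. This is a deliberately simplified model, not MySQL's actual allocator: the function name `worst_case_buffer_bytes` and the one-million-row figure are assumptions made for illustration.

```python
def worst_case_buffer_bytes(rows: int, max_chars: int, bytes_per_char: int = 1) -> int:
    """Bytes needed if each row reserves room for the declared maximum length."""
    return rows * max_chars * bytes_per_char


rows = 1_000_000
narrow = worst_case_buffer_bytes(rows, 5)    # VARCHAR(5)
wide = worst_case_buffer_bytes(rows, 200)    # VARCHAR(200)
print(narrow)          # 5000000
print(wide)            # 200000000
print(wide // narrow)  # 40
```

Under this model the same 'hello' values need forty times more buffer space in the wider column, even though they occupy identical space on disk.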

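CHAR
The biggest difference between CHAR and VARCHAR is that CHAR is fixed-length. Compared with VARCHAR it has these characteristics:
 1) In all MySQL versions, trailing spaces are stripped
2) It is a good choice for short values of nearly identical length, such as MD5 hashes or ID numbers
3) It is more efficient for columns that are updated frequently
4) It also saves space for very short values. To store "Y" or "N", CHAR(1) needs only one byte, whereas VARCHAR(1) needs two (1 length byte + 1 value byte)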

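For a fixed-length CHAR column, MySQL server pads values with spaces up to the defined length so that each value occupies the full allocated space. Note that padding and trailing-space stripping for VARCHAR and CHAR are performed by MySQL server itself and do not depend on the storage engine.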
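The pad-on-store and strip-on-read behaviour described for CHAR can be imitated as follows. This is a sketch of the behaviour only: `char_store` and `char_retrieve` are hypothetical names, and plain Python strings stand in for the server's internal buffers.

```python
def char_store(value: str, length: int) -> str:
    """Pad a value with spaces to the declared CHAR(length) width."""
    if len(value) > length:
        raise ValueError("value is longer than the column allows")
    return value.ljust(length)


def char_retrieve(stored: str) -> str:
    """Strip the trailing spaces when the value is read back."""
    return stored.rstrip(" ")


stored = char_store("abc  ", 6)
print(repr(stored))                 # 'abc   ' (always 6 characters)
print(repr(char_retrieve(stored)))  # 'abc' (the original trailing spaces are gone)
```

Because the padding is indistinguishable from spaces the caller supplied, a value that ends in spaces does not survive a round trip through a CHAR column.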


DATE/TIMESTAMP, BLOB/CLOB/TEXT, ENUM, and BIT will be covered in the next post:

The Evolution of High-Performance MySQL (Part 2): Optimizing Data Types, Continued


