The editor below will bring you a detailed explanation of the partition function and example analysis of mysql. The editor thinks it is quite good, so I will share it with you now and give it as a reference for everyone. Let’s follow the editor and take a look.
1. What is a database partition?
I wrote an article about mysql tables some time ago Article, let’s talk about what database partitioning is, taking mysql as an example. The data in the mysql database is stored on the disk in the form of files. It is placed under /mysql/data by default (can be viewed through the datadir in my.cnf). One table mainly corresponds to three files, one is frm storage In the table structure, one is myd that stores table data, and the other is myi that stores table indexes. If the amount of data in a table is too large, then myd and myi will become very large, and searching for data will become very slow. At this time, we can use the partition function of mysql to physically correspond to this table. The three files are divided into many small blocks. In this way, when we search for a piece of data, we don’t have to search all of them. We only need to know which block the data is in, and then search in that block. If the data in the table is too large, it may not fit on one disk. At this time, we can allocate the data to different disks.
Two ways of partitioning
1. Horizontal partitioning
What is What about horizontal partitions? It means partitioning horizontally. For example, if there are 1 million pieces of data, it is divided into ten parts. The first 100 thousand pieces of data are placed in the first partition, the second 100 thousand pieces of data are placed in the second partition, and so on. That is to say, the table is divided into ten parts, and merge is used to divide the table. It is a bit like this. When a piece of data is retrieved, this data contains all the fields in the table structure, which means horizontal partitioning does not change the structure of the table.
2, vertical partitioning
What is vertical partitioning? It is partitioned vertically. For example, when designing the user table, I did not consider it well at the beginning, but put all the personal information into one table, so that there will be relatively large fields in this table. , such as personal profile, and these profiles may not be viewed by many people, so when someone wants to read them, when searching and dividing tables, such large fields can be separated Come.
It feels like the partitioning of the database is like cutting an apple. Should it be cut horizontally or vertically? It depends on personal preference. The partitions provided by mysql belong to the first type, horizontal partitioning, and are subdivided into many ways. . An example will be given below.
Second, mysql partitioning
I think there is only one way to partition mysql, but it uses different algorithms and rules Just distribute the data into different blocks.
1, mysql5.1 and above support partition function
InstallationWhen installing, we can check it
[root@BlackGhost mysql-5.1.50]# ./configure --help |grep -A 3 Partition === Partition Support === Plugin Name: partition Description: MySQL Partitioning Support Supports build: static Configurations: max, max-no-ndb
Check it. If you find the above thing, it means that it supports partitioning and is turned on by default. If you have already installed mysql,
mysql> show variables like "%part%"; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | have_partitioning | YES | +-------------------+-------+ 1 row in set (0.00 sec)
Check the variables. If it is supported, there will be the above prompt.
2, rangePartition
The table partitioned according to RANGE is partitioned in the following way. Each partition contains Those rows where the value of the partition expression lies within a given continuous interval
//创建range分区表 mysql> CREATE TABLE IF NOT EXISTS `user` ( -> `id` int(11) NOT NULL AUTO_INCREMENT COMMENT '用户ID', -> `name` varchar(50) NOT NULL DEFAULT '' COMMENT '名称', -> `sex` int(1) NOT NULL DEFAULT '0' COMMENT '0为男,1为女', -> PRIMARY KEY (`id`) -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 -> PARTITION BY RANGE (id) ( -> PARTITION p0 VALUES LESS THAN (3), -> PARTITION p1 VALUES LESS THAN (6), -> PARTITION p2 VALUES LESS THAN (9), -> PARTITION p3 VALUES LESS THAN (12), -> PARTITION p4 VALUES LESS THAN MAXVALUE -> ); Query OK, 0 rows affected (0.13 sec) //插入一些数据 mysql> INSERT INTO `test`.`user` (`name` ,`sex`)VALUES ('tank', '0') -> ,('zhang',1),('ying',1),('张',1),('映',0),('test1',1),('tank2',1) -> ,('tank1',1),('test2',1),('test3',1),('test4',1),('test5',1),('tank3',1) -> ,('tank4',1),('tank5',1),('tank6',1),('tank7',1),('tank8',1),('tank9',1) -> ,('tank10',1),('tank11',1),('tank12',1),('tank13',1),('tank21',1),('tank42',1); Query OK, 25 rows affected (0.05 sec) Records: 25 Duplicates: 0 Warnings: 0 //到存放数据库表文件的地方看一下,my.cnf里面有配置,datadir后面就是 [root@BlackGhost test]# ls |grep user |xargs du -sh 4.0K user#P#p0.MYD 4.0K user#P#p0.MYI 4.0K user#P#p1.MYD 4.0K user#P#p1.MYI 4.0K user#P#p2.MYD 4.0K user#P#p2.MYI 4.0K user#P#p3.MYD 4.0K user#P#p3.MYI 4.0K user#P#p4.MYD 4.0K user#P#p4.MYI 12K user.frm 4.0K user.par //取出数据 mysql> select count(id) as count from user; +-------+ | count | +-------+ | 25 | +-------+ 1 row in set (0.00 sec) //删除第四个分区 mysql> alter table user drop partition p4; Query OK, 0 rows affected (0.11 sec) Records: 0 Duplicates: 0 Warnings: 0 /**存放在分区里面的数据丢失了,第四个分区里面有14条数据,剩下的3个分区 只有11条数据,但是统计出来的文件大小都是4.0K,从这儿我们可以看出分区的 最小区块是4K */ mysql> select count(id) as count from user; +-------+ | count | +-------+ | 11 | +-------+ 1 row in set (0.00 sec) //第四个区块已删除 [root@BlackGhost test]# ls |grep user |xargs du -sh 4.0K user#P#p0.MYD 4.0K user#P#p0.MYI 4.0K user#P#p1.MYD 4.0K user#P#p1.MYI 4.0K user#P#p2.MYD 4.0K user#P#p2.MYI 4.0K user#P#p3.MYD 4.0K user#P#p3.MYI 12K user.frm 4.0K user.par /*可以对现有表进行分区,并且会按規则自动的将表中的数据分配相应的分区 中,这样就比较好了,可以省去很多事情,看下面的操作*/ mysql> alter table aa partition by RANGE(id) -> (PARTITION p1 VALUES less than (1), -> PARTITION p2 VALUES less than (5), -> PARTITION p3 VALUES less than MAXVALUE); Query OK, 15 rows affected (0.21 sec) //对15数据进行分区 Records: 15 Duplicates: 0 Warnings: 0 //总共有15条 mysql> select count(*) from aa; +----------+ | count(*) | +----------+ | 15 | +----------+ 1 row in set (0.00 sec) //删除一个分区 mysql> alter table aa drop partition p2; Query OK, 0 rows affected (0.30 sec) Records: 0 Duplicates: 0 Warnings: 0 //只有11条了,说明对现有的表分区成功了 mysql> select count(*) from aa; +----------+ | count(*) | +----------+ | 11 | +----------+ 1 row in set (0.00 sec)
3, listPartition
## The definition and selection of each partition in the #LIST partition is based on the value of a column belonging to a value in a value list set, while the RANGE partition belongs to a set of continuous interval values.
//这种方式失败 mysql> CREATE TABLE IF NOT EXISTS `list_part` ( -> `id` int(11) NOT NULL AUTO_INCREMENT COMMENT '用户ID', -> `province_id` int(2) NOT NULL DEFAULT 0 COMMENT '省', -> `name` varchar(50) NOT NULL DEFAULT '' COMMENT '名称', -> `sex` int(1) NOT NULL DEFAULT '0' COMMENT '0为男,1为女', -> PRIMARY KEY (`id`) -> ) ENGINE=INNODB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 -> PARTITION BY LIST (province_id) ( -> PARTITION p0 VALUES IN (1,2,3,4,5,6,7,8), -> PARTITION p1 VALUES IN (9,10,11,12,16,21), -> PARTITION p2 VALUES IN (13,14,15,19), -> PARTITION p3 VALUES IN (17,18,20,22,23,24) -> ); ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function //这种方式成功 mysql> CREATE TABLE IF NOT EXISTS `list_part` ( -> `id` int(11) NOT NULL COMMENT '用户ID', -> `province_id` int(2) NOT NULL DEFAULT 0 COMMENT '省', -> `name` varchar(50) NOT NULL DEFAULT '' COMMENT '名称', -> `sex` int(1) NOT NULL DEFAULT '0' COMMENT '0为男,1为女' -> ) ENGINE=INNODB DEFAULT CHARSET=utf8 -> PARTITION BY LIST (province_id) ( -> PARTITION p0 VALUES IN (1,2,3,4,5,6,7,8), -> PARTITION p1 VALUES IN (9,10,11,12,16,21), -> PARTITION p2 VALUES IN (13,14,15,19), -> PARTITION p3 VALUES IN (17,18,20,22,23,24) -> ); Query OK, 0 rows affected (0.33 sec)
When creating the list partition above, if there is a primary key, the primary key must be in it during partitioning, otherwise an error will be reported. If I don't use the primary key, the partition will be created successfully. Under normal circumstances, a table will definitely have a primary key. This is considered a limitation of a partition.
If you test the data, please refer to the range partition test to operate4, hash partition
HASH partitioning is mainly used to ensure that data is evenly distributed among a predetermined number of partitions. All you have to do is specify a column value or expression based on the column value that will be hashed, and specify that the partitioned table will be hashed. The number of partitions to split into.
mysql> CREATE TABLE IF NOT EXISTS `hash_part` ( -> `id` int(11) NOT NULL AUTO_INCREMENT COMMENT '评论ID', -> `comment` varchar(1000) NOT NULL DEFAULT '' COMMENT '评论', -> `ip` varchar(25) NOT NULL DEFAULT '' COMMENT '来源IP', -> PRIMARY KEY (`id`) -> ) ENGINE=INNODB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 -> PARTITION BY HASH(id) -> PARTITIONS 3; Query OK, 0 rows affected (0.06 sec)
5, key partitioning
Partitioning according to KEY is similar to HASH Partitioning, in addition to HASH partitioning uses user-defined expressions, and the KEY partitioning hash function is provided by the MySQL server.
mysql> CREATE TABLE IF NOT EXISTS `key_part` ( -> `news_id` int(11) NOT NULL COMMENT '新闻ID', -> `content` varchar(1000) NOT NULL DEFAULT '' COMMENT '新闻内容', -> `u_id` varchar(25) NOT NULL DEFAULT '' COMMENT '来源IP', -> `create_time` DATE NOT NULL DEFAULT '0000-00-00 00:00:00' COMMENT '时间' -> ) ENGINE=INNODB DEFAULT CHARSET=utf8 -> PARTITION BY LINEAR HASH(YEAR(create_time)) -> PARTITIONS 3; Query OK, 0 rows affected (0.07 sec)
测试请参考range分区的操作
6,子分区
子分区是分区表中每个分区的再次分割,子分区既可以使用HASH希分区,也可以使用KEY分区。这 也被称为复合分区(composite partitioning)。
1,如果一个分区中创建了子分区,其他分区也要有子分区
2,如果创建了了分区,每个分区中的子分区数必有相同
3,同一分区内的子分区,名字不相同,不同分区内的子分区名子可以相同(5.1.50不适用)
mysql> CREATE TABLE IF NOT EXISTS `sub_part` ( -> `news_id` int(11) NOT NULL COMMENT '新闻ID', -> `content` varchar(1000) NOT NULL DEFAULT '' COMMENT '新闻内容', -> `u_id` int(11) NOT NULL DEFAULT 0s COMMENT '来源IP', -> `create_time` DATE NOT NULL DEFAULT '0000-00-00 00:00:00' COMMENT '时间' -> ) ENGINE=INNODB DEFAULT CHARSET=utf8 -> PARTITION BY RANGE(YEAR(create_time)) -> SUBPARTITION BY HASH(TO_DAYS(create_time))( -> PARTITION p0 VALUES LESS THAN (1990)(SUBPARTITION s0,SUBPARTITION s1,SUBPARTITION s2), -> PARTITION p1 VALUES LESS THAN (2000)(SUBPARTITION s3,SUBPARTITION s4,SUBPARTITION good), -> PARTITION p2 VALUES LESS THAN MAXVALUE(SUBPARTITION tank0,SUBPARTITION tank1,SUBPARTITION tank3) -> ); Query OK, 0 rows affected (0.07 sec)
官方网站说不同分区内的子分区可以有相同的名字,但是mysql5.1.50却不行会提示以下错误
ERROR 1517 (HY000): Duplicate partition name s1
三,分区管理
1,删除分区
1.mysql> alter table user drop partition p4;
2,新增分区
//range添加新分区 mysql> alter table user add partition(partition p4 values less than MAXVALUE); Query OK, 0 rows affected (0.06 sec) Records: 0 Duplicates: 0 Warnings: 0 //list添加新分区 mysql> alter table list_part add partition(partition p4 values in (25,26,28)); Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 //hash重新分区 mysql> alter table hash_part add partition partitions 4; Query OK, 0 rows affected (0.12 sec) Records: 0 Duplicates: 0 Warnings: 0 //key重新分区 mysql> alter table key_part add partition partitions 4; Query OK, 1 row affected (0.06 sec) //有数据也会被重新分配 Records: 1 Duplicates: 0 Warnings: 0 //子分区添加新分区,虽然我没有指定子分区,但是系统会给子分区命名的 mysql> alter table sub1_part add partition(partition p3 values less than MAXVALUE); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> show create table sub1_part\G; *************************** 1. row *************************** Table: sub1_part Create Table: CREATE TABLE `sub1_part` ( `news_id` int(11) NOT NULL COMMENT '新闻ID', `content` varchar(1000) NOT NULL DEFAULT '' COMMENT '新闻内容', `u_id` varchar(25) NOT NULL DEFAULT '' COMMENT '来源IP', `create_time` date NOT NULL DEFAULT '0000-00-00' COMMENT '时间' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 !50100 PARTITION BY RANGE (YEAR(create_time)) SUBPARTITION BY HASH (TO_DAYS(create_time)) (PARTITION p0 VALUES LESS THAN (1990) (SUBPARTITION s0 ENGINE = InnoDB, SUBPARTITION s1 ENGINE = InnoDB, SUBPARTITION s2 ENGINE = InnoDB), PARTITION p1 VALUES LESS THAN (2000) (SUBPARTITION s3 ENGINE = InnoDB, SUBPARTITION s4 ENGINE = InnoDB, SUBPARTITION good ENGINE = InnoDB), PARTITION p2 VALUES LESS THAN (3000) (SUBPARTITION tank0 ENGINE = InnoDB, SUBPARTITION tank1 ENGINE = InnoDB, SUBPARTITION tank3 ENGINE = InnoDB), PARTITION p3 VALUES LESS THAN MAXVALUE (SUBPARTITION p3sp0 ENGINE = InnoDB, //子分区的名子是自动生成的 SUBPARTITION p3sp1 ENGINE = InnoDB, SUBPARTITION p3sp2 ENGINE = InnoDB)) 1 row in set (0.00 sec)
3,重新分区
//range重新分区 mysql> ALTER TABLE user REORGANIZE PARTITION p0,p1,p2,p3,p4 INTO (PARTITION p0 VALUES LESS THAN MAXVALUE); Query OK, 11 rows affected (0.08 sec) Records: 11 Duplicates: 0 Warnings: 0 //list重新分区 mysql> ALTER TABLE list_part REORGANIZE PARTITION p0,p1,p2,p3,p4 INTO (PARTITION p0 VALUES in (1,2,3,4,5)); Query OK, 0 rows affected (0.28 sec) Records: 0 Duplicates: 0 Warnings: 0 //hash和key分区不能用REORGANIZE,官方网站说的很清楚 mysql> ALTER TABLE key_part REORGANIZE PARTITION COALESCE PARTITION 9; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'PARTITION 9' at line 1
四,分区优点
1,分区可以分在多个磁盘,存储更大一点
2,根据查找条件,也就是where后面的条件,查找只查找相应的分区不用全部查找了
3,进行大数据搜索时可以进行并行处理。
4,跨多个磁盘来分散数据查询,来获得更大的查询吞吐量
The above is the detailed content of Detailed explanation of mysql partition function and sample code analysis. For more information, please follow other related articles on the PHP Chinese website!