Whether it is daily business data processing or database import and export, you may encounter insertions that require processing of large amounts of data. Both the insertion method and the database engine will affect the insertion speed. This article aims to analyze and compare various methods from a theoretical and practical perspective to facilitate the selection of insertion methods in future applications.
The time required to insert a record in MySQL is composed of the following factors, and the numbers represent approximate proportions:
Connection: (3 )
Send query to server: (2)
Analyze query: (2)
Insert record: (1x record size)
Insert index: (1x index)
Close: (1)
If we execute a SQL statement every time we insert an item, then we need to execute all steps except connection and closing N times. This is very time-consuming. There are several optimization methods:
Write multiple rows in each insert statement, batch insert
Write all query statements into the transaction
Use Load Data to import data
The performance of each method is as follows.
InnoDB provides MySQL with transaction-safe (ACID compliant) transaction (commit), rollback (rollback) and crash recovery capabilities (crash recovery capabilities) ) type table. InnoDB provides row locking (locking on row level) and foreign key constraints (FOREIGN KEY constraints).
InnoDB is designed to handle large-capacity database systems, and its CPU utilization is unmatched by other disk-based relational database engines. Technically, InnoDB is a complete database system placed on the MySQL background. InnoDB establishes its own dedicated buffer pool in main memory for caching data and indexes.
Macbook Air 12mid apache2.2.26 php5.5.10 mysql5.6.16
Total 100W pieces of data
After insertion, the database size is 38.6MB ( No index), 46.8 (with index)
Total time taken for single insert without index: 229s Peak memory: 246KB
Single insert with index Total time taken: 242s Peak memory: 246KB
Batch insert without index Total time taken: 10s Peak memory: 8643KB
Batch insert with index Total time taken: 16s Peak memory: 8643KB
Total time taken for transaction insertion without index: 78s Peak memory: 246KB
Insertion with indexed transaction Total time taken: 82s Peak memory: 246KB
Load without index Total time spent inserting data: 12s Peak memory: 246KB
Load with index Total data insertion time: 11s Peak memory: 246KB
MyISAM is the default storage engine of MySQL. The design is simple and supports full-text search.
Macbook Air 12mid apache2.2.26 php5.5.10 mysql5.6.16
Total 100W pieces of data
After insertion, the database size is 19.1MB ( No index), 38.6 (with index)
Total time taken for single insert without index: 82s Peak memory: 246KB
Single insert with index Total time taken: 86s Peak memory: 246KB
Batch insert without index Total time taken: 3s Peak memory: 8643KB
Batch insert with index Total time taken: 7s Peak memory: 8643KB
Load without index Total time taken for data insertion: 6s Peak memory: 246KB
Load with index Total data insertion time: 8s Peak memory: 246KB
The amount of data I tested is not very large, but I can roughly understand the speed of these insertion methods. The fastest method must be the Load Data method. This method is relatively troublesome because it involves writing files, but it can take into account both memory and speed.
<?php $dsn = 'mysql:host=localhost;dbname=test'; $db = new PDO($dsn,'root','',array(PDO::ATTR_PERSISTENT => true)); //删除上次的插入数据 $db->query('delete from `test`'); //开始计时 $start_time = time(); $sum = 1000000; // 测试选项 $num = 1; if ($num == 1){ // 单条插入 for($i = 0; $i < $sum; $i++){ $db->query("insert into `test` (`id`,`name`) values ($i,'tsetssdf')"); } } elseif ($num == 2) { // 批量插入,为了不超过max_allowed_packet,选择每10万插入一次 for ($i = 0; $i < $sum; $i++) { if ($i == $sum - 1) { //最后一次 if ($i%100000 == 0){ $values = "($i, 'testtest')"; $db->query("insert into `test` (`id`, `name`) values $values"); } else { $values .= ",($i, 'testtest')"; $db->query("insert into `test` (`id`, `name`) values $values"); } break; } if ($i%100000 == 0) { //平常只有在这个情况下才插入 if ($i == 0){ $values = "($i, 'testtest')"; } else { $db->query("insert into `test` (`id`, `name`) values $values"); $values = "($i, 'testtest')"; } } else { $values .= ",($i, 'testtest')"; } } } elseif ($num == 3) { // 事务插入 $db->beginTransaction(); for($i = 0; $i < $sum; $i++){ $db->query("insert into `test` (`id`,`name`) values ($i,'tsetssdf')"); } $db->commit(); } elseif ($num == 4) { // 文件load data $filename = dirname(__FILE__).'/test.sql'; $fp = fopen($filename, 'w'); for($i = 0; $i < $sum; $i++){ fputs($fp, "$i,'testtest'\r\n"); } $db->exec("load data infile '$filename' into table test fields terminated by ','"); } $end_time = time(); echo "总耗时", ($end_time - $start_time), "秒\n"; echo "峰值内存", round(memory_get_peak_usage()/1000), "KB\n"; ?>
The above is the performance analysis and comparison of various methods for inserting large amounts of data into MySQL. For more related content, please pay attention to the PHP Chinese website (www.php.cn)!