I have a table and a stored procedure as shown below,
```sql
CREATE TABLE `inspect_call` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `task_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `cc_number` varchar(63) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `created_at` bigint(20) unsigned NOT NULL DEFAULT '0',
  `updated_at` bigint(20) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  KEY `task_id` (`task_id`)
) ENGINE=InnoDB AUTO_INCREMENT=234031 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

CREATE PROCEDURE inspect_proc(IN task bigint, IN number varchar(63))
  INSERT INTO inspect_call(task_id, cc_number) VALUES (task, number);
```
I had assumed that calling the stored procedure would be much faster than just calling the insert. But to my surprise, that's not the case. When I insert 10000 rows of records, the insert command takes about 4 minutes and the stored procedure takes about 15 minutes.
I have run the test multiple times to confirm this. The MySQL server is not a high-end machine, but I don't understand why calling the stored procedure is so much slower.
```python
# using mysql-connector-python 8.0.31
command = ("INSERT INTO inspect_call (task_id, cc_number) "
           "VALUES (%s, %s)")
for i in range(rows):
    cursor.execute(command, (task_id, f"{cc}{i}"))
    # cursor.callproc("inspect_proc", (task_id, f"{cc}{i}"))
cnx.commit()
```
BTW, I read some articles saying I could set `innodb_flush_log_at_trx_commit = 2` to increase insert speed, but I'm not going to do that.
--- Update ---
Based on the answers I got, I tried batch inserting (`executemany`) to see if there was any improvement, but to my surprise there was none.

```python
cursor = cnx.cursor(buffered=True)
for i in range(int(rows / 1000)):
    data = []
    for j in range(1000):
        data.append((task_id, f"{cc}{i*1000+j}"))
    cursor.executemany(command, data)
    cnx.commit()

# no improvement compared to
cursor = cnx.cursor()
for i in range(rows):
    cursor.execute(command, (task_id, f"{cc}{i}"))
```

I tried many times (I also tried `executemany` in one shot of 100 records) and found that their performance was basically the same.
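When `executemany` doesn't help, one alternative is to build a single multi-row `INSERT` by hand, so each batch is one statement rather than many. The sketch below only constructs the SQL string and flattened parameter list; `make_bulk_insert` is an illustrative helper of mine (not part of mysql-connector-python), and the table and column names are taken from the question.

```python
# Sketch: build one multi-row INSERT instead of many single-row ones.
# make_bulk_insert is an illustrative helper, not a library function.

def make_bulk_insert(rows):
    """Return (sql, flat_params) for a single multi-row INSERT."""
    placeholders = ", ".join(["(%s, %s)"] * len(rows))
    sql = ("INSERT INTO inspect_call (task_id, cc_number) VALUES "
           + placeholders)
    # Flatten [(a, b), (c, d)] into [a, b, c, d] for the driver.
    params = [value for row in rows for value in row]
    return sql, params

task_id, cc = 42, "cc"
batch = [(task_id, f"{cc}{i}") for i in range(3)]
sql, params = make_bulk_insert(batch)
# sql    -> "INSERT INTO inspect_call (task_id, cc_number) VALUES (%s, %s), (%s, %s), (%s, %s)"
# params -> [42, 'cc0', 42, 'cc1', 42, 'cc2']
```

With a live connection you would then run `cursor.execute(sql, params)` once per batch instead of once per row.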
--- Update 2 ---
I finally understand why the insertion is so slow: I was running the script from my laptop and accessing the database through its external hostname. Once I uploaded the script to the server and accessed the database from within the intranet, it was much faster. Inserting 10,000 records takes about 3 to 4 seconds; inserting 100,000 records takes about 36 seconds. I never expected network latency to make such a difference!
But `executemany` did not improve the performance in my case.
Your benchmark does not favor stored procedures because it does not exercise any of their advantages.
The main advantages of stored procedures are:

- they are compiled once and then live on the server;
- they can reduce network traffic, because the logic runs next to the data instead of shipping rows back and forth.
Suppose you have logic complex enough that it cannot be expressed in a single UPDATE, and you want to implement it in, for example, Python. That requires:

- reading the relevant rows from the server over the network,
- processing them in Python on the client,
- sending the modified rows back to the server.
Now imagine the same logic implemented as a stored procedure. In that kind of example, there is a good chance that the stored procedure will really make a difference.
In your example there is no logic at all, just row inserts. That is an I/O-bound use case, so there is little or no benefit in having a compiled program. And you'll have just as many network exchanges as with plain INSERT statements: either way, each row must be sent to the server, so network traffic is not reduced either.
In your example, bulk inserts can perhaps help achieve the best performance.
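For very large batches, MySQL's `LOAD DATA INFILE` is typically the fastest bulk path: the client writes the rows to a file once and the server loads them in a single statement. Below is a minimal sketch assuming the same `inspect_call` table from the question; the actual `cursor.execute` call is left commented out because it needs a live server with `local_infile` enabled (and `allow_local_infile=True` on the Connector/Python connection).

```python
import csv
import os
import tempfile

# Sketch: stage rows in a CSV, then load them with one statement.
# Table/column names come from the question; values are sample data.
task_id, cc, rows = 42, "cc", 5
path = os.path.join(tempfile.mkdtemp(), "inspect_call.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f, lineterminator="\n")
    writer.writerows((task_id, f"{cc}{i}") for i in range(rows))

load_sql = (
    f"LOAD DATA LOCAL INFILE '{path}' INTO TABLE inspect_call "
    "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' "
    "(task_id, cc_number)"
)
# cursor.execute(load_sql)  # requires a connection opened with allow_local_infile=True
```

The file format and the `FIELDS`/`LINES` clauses must match exactly, which is why the CSV writer is pinned to `\n` line endings here.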