python多线程爬虫往mysql里面写数据导致死锁
PHP中文网
PHP中文网 2017-04-18 10:10:53
0
3
435

我在爬虫里面用的是MySQLdb这个包进行insert操作,我一开始是所有子线程公用一个mysql连接,结果发现数据写不进去,然后我又试过所有子线程公用一个cursor游标,出现了只有部分数据写进去了,而且自增id居然还自增了。(也就是数据没有写入进去,但是自增id的自增数增加了),最后我干脆每个子线程在每一次写入的时候创建一个连接句柄,然后就出现了如图症状,请问这该怎么办啊?有什么优化方法吗?

PHP中文网
PHP中文网

认证高级PHP讲师

reply all(3)
刘奇

Thanks for the invitation. Metadata Lock (MDL) was introduced into mysql's metadata protection mechanism in 5.5. 5.5 The protection of metadata is at the transaction level, and the MDL is only released after the transaction ends.
When a session performs a DML operation in the main database and has not yet submitted, another session performs a DDL operation on the same object such as drop table. Since MySQL's binlog records based on the order of transaction submission, the slave database When applying, there will be a situation where the table is dropped first and then inserted into the table, causing an error in the application from the library. Therefore, MySQL introduced the Metadata lock after version 5.5.3. The Metadata lock will only be released after the transaction ends, so DDL operations cannot be performed before the transaction is committed or rolled back.
The causes of Waiting for table metadata lock are generally the following simple scenarios:
Scenario 1: Long transaction running, blocking DDL, and then blocking all subsequent operations on the same table
Scenario 2: Uncommitted transactions, blocking DDL, Then block all subsequent operations on the same table

So how to solve it?
View all transaction locks in waiting blocked state

USE INFORMATION_SCHEMA
SELECT * 
FROM INNODB_LOCKS 
WHERE LOCK_TRX_ID IN (SELECT BLOCKING_TRX_ID FROM INNODB_LOCK_WAITS);

Or you can directly target a specific table

SELECT * FROM INNODB_LOCKS 
WHERE LOCK_TABLE = db_name.table_name;

Deleting the relevant records directly should unlock the lock.

Recommendation: Use a connection pool instead of sharing a connection for all threads, and remember to commit or rollback each transaction.
Recommended reference:
https://dev.mysql.com/doc/ref...
http://www.cnblogs.com/cchust...
https://gold.xitu.io/entry/57 ...
http://www.cnblogs.com/digdee...

Peter_Zhu

The table is from Innodb, but automatic submission is not enabled and there is no manual submission.

黄舟

First of all, multiple threads share a connection, which is equivalent to concurrency. Many threads need to wait.
The utilization rate of one connection per thread is not high, so it is generally a connection pool. There is no need for recycling, and the connection is always maintained, avoiding frequent disconnections.

With pictures, whether the subject has opened the transaction

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template