Troubleshooting MySQL Deadlock ERROR 1213 (40001) in a Web Game


WBOY

Jun 07, 2016, 04:34 PM


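Discovering the case:
In the exception log of a web game we operate, we saw errors raised when the program executed MySQL statements. Most were "Deadlock found when trying to get lock; try restarting transaction"; a few were "Error number: 1205: Lock wait timeout exceeded; try restarting transaction", as shown below: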

001 --> 2012-11-22 06:05:36 --> ERROR   -->system/database/Driver.php--777--log--Debug
002 --> 2012-11-22 06:05:36 --> ERROR   -->system/database/Driver.php--295--error--JV_Driver
003 --> 2012-11-22 06:05:36 --> ERROR   -->system/database/ActiveRecord.php--947--query--JV_Driver
004 --> 2012-11-22 06:05:36 --> ERROR   -->server/models/MRoleMonster.php--84--update--JV_ActiveRecord
005 --> 2012-11-22 06:05:36 --> ERROR   -->server/daemon/update.php--392--kill--MRoleMonster
006 --> 2012-11-22 06:05:36 --> ERROR   -->   DATABASE: xxx_roles_xxx(10.1.1.75)
    -->  Error number: 1205:#####Lock wait timeout exceeded; try restarting transaction#####
    -->  Error Message: #####db_query_error --> Query Error: UPDATE `monster` SET `kills` = kills + 1 WHERE `id` = '30036' AND `role_id` = '19863'.#####
    -->  query elapsed counter: 184293;time 590.4272678
    -->  Database Connection has be closed:dbwRole
001 --> 2012-11-28 15:59:47 --> ERROR   -->system/database/Driver.php--777--log--Debug
002 --> 2012-11-28 15:59:47 --> ERROR   -->system/database/Driver.php--295--error--JV_Driver
003 --> 2012-11-28 15:59:47 --> ERROR   -->system/database/ActiveRecord.php--948--query--JV_Driver
004 --> 2012-11-28 15:59:47 --> ERROR   -->server/models/MRole.php--1143--update--JV_ActiveRecord
005 --> 2012-11-28 15:59:47 --> ERROR   -->server/daemon/update_other.php--283--updateRoleState--MRole
006 --> 2012-11-28 15:59:47 --> ERROR   -->   DATABASE: xxx_roles_xxx(10.1.1.72)
    -->  Error number: 1213:#####Deadlock found when trying to get lock; try restarting transaction#####
    -->  Error Message: #####db_query_error --> Query Error: UPDATE `role_state` SET `state` = 1
WHERE `role_id` = '53016'.#####
    -->  query elapsed counter: 4972;time 4.2417307
    -->  Database Connection has be closed:dbwRole
007 --> 2012-11-28 15:59:47 --> ERROR   -->system/database/Driver.php--616--log--Debug
008 --> 2012-11-28 15:59:47 --> ERROR   -->server/daemon/combat_update.php--308--transComplete--JV_Driver
009 --> 2012-11-28 15:59:47 --> ERROR   --> DB Transaction Failure.


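Reading the error text, two exceptions occurred: a deadlock and a transaction lock wait timeout. Both were hit by our resident back-end PHP daemons. The code lines in the traces are just ordinary calls that execute a SQL statement, nothing special. An experienced programmer will recognize that this is not a bug in the program's logic but lock contention between MySQL transactions; the client (the PHP daemon) has no syntax errors. So how do we troubleshoot it?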

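A string of questions:
What kind of problem is this? When does a deadlock occur? How do I know one has occurred? Where do I look afterward, and how? How do I determine all the SQL statements in the transactions involved, and which transactions they belong to? Who took the lock first? Who took it second? Who failed to get it? Who reported the deadlock error? What is a deadlock? Why did it happen? How can it be avoided? What other factors are involved?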

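No leads at all:
This is data exchanged between programs, so bring out strace, the go-to tool?
Trace whom? The client (PHP)? Do you know which client will hit the problem, or when? How much data would pile up between starting the capture and finally catching a deadlock?
Trace whom? The server (MySQL)? Surely that is a joke. MySQL handles each client connection on its own thread, so with strace -ff, imagine how many trace files would be created. Would the data volume be small?
"Taking the enemy general's head from the midst of ten thousand troops" is not a skill I have... let's give up on strace for this kind of error.
Who reported the error? Clearly MySQL, so start from the source: the MySQL server.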

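Capturing the scene:
We need to reconstruct the scene of the incident. Fortunately, our monitoring records both the binlog and the output of SHOW ENGINE INNODB STATUS. On the affected MySQL server, run "show engine innodb status" to get the current InnoDB engine state; the output looks roughly like this: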

......
------------------------
LATEST DETECTED DEADLOCK
------------------------
121128 15:59:46
*** (1) TRANSACTION:
TRANSACTION AC512256, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 1
MySQL thread id 122562823, OS thread handle 0x7fa5c4fbe700, query id 7457663621 10.1.1.8 s001_gamedb Updating
UPDATE `role_state` SET `state` = 1
WHERE `role_id` = '53016'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 477 page no 1386 n bits 128 index `PRIMARY` of table `xxx_roles_xxx`.`role_state` trx id AC512256 lock_mode X locks rec but not gap waiting
Record lock, heap no 17 PHYSICAL RECORD: n_fields 80; compact format; info bits 0
 0: len 3; hex 00cf18; asc    ;;
......
......
*** (2) TRANSACTION:
TRANSACTION AC512250, ACTIVE 0 sec inserting, thread declared inside InnoDB 500
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 2
MySQL thread id 122679850, OS thread handle 0x7fac007ff700, query id 7457663711 10.1.1.8 s001_gamedb update
REPLACE INTO `role_fight` (`role_id`, `life_max`, `mana_max`, `attack_physical`, `attack_internal`,****) VALUES ('53016', 4967, 3291, 350, 174, ***)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 477 page no 1386 n bits 128 index `PRIMARY` of table `xxx_roles_xxx`.`role_state` trx id AC512250 lock_mode X locks rec but not gap
Record lock, heap no 17 PHYSICAL RECORD: n_fields 80; compact format; info bits 0
......
......
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 427 page no 488 n bits 192 index `PRIMARY` of table `xxx_roles_xxx`.`role_fight` trx id AC512250 lock_mode X locks rec but not gap waiting
Record lock, heap no 64 PHYSICAL RECORD: n_fields 51; compact format; info bits 0
......
*** WE ROLL BACK TRANSACTION (1)
......


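This is a trimmed version: I kept the LATEST DETECTED DEADLOCK section, which describes the most recent deadlock InnoDB detected. For a fuller explanation, see the MySQL reference manual's description of Standard Monitor and Lock Monitor Output.
OK, we have a case. Save this InnoDB status output for later, then quickly check the program's exception log for a deadlock at the same timestamp. Sure enough, our exception log recorded this incident (the log at the top of this article).
Then pull from the binlog the SQL statements MySQL executed in roughly the 10 minutes before and after that time.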
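Since the status output is saved for later analysis, pulling the key facts out of it can be scripted. The following is a minimal Python sketch, assuming the saved text has the same layout as the excerpt above; the function names `extract_latest_deadlock` and `summarize_deadlock` are hypothetical helpers, not part of MySQL or any client library, and `SAMPLE_STATUS` is a shortened copy of the output shown in this article.

```python
import re

# Shortened copy of the status output from this article (illustration only).
SAMPLE_STATUS = """\
------------------------
LATEST DETECTED DEADLOCK
------------------------
121128 15:59:46
*** (1) TRANSACTION:
TRANSACTION AC512256, ACTIVE 0 sec starting index read
MySQL thread id 122562823, OS thread handle 0x7fa5c4fbe700, query id 7457663621 10.1.1.8 s001_gamedb Updating
*** (2) TRANSACTION:
TRANSACTION AC512250, ACTIVE 0 sec inserting, thread declared inside InnoDB 500
MySQL thread id 122679850, OS thread handle 0x7fac007ff700, query id 7457663711 10.1.1.8 s001_gamedb update
*** WE ROLL BACK TRANSACTION (1)
------------
TRANSACTIONS
------------
"""


def extract_latest_deadlock(status_text):
    """Return the lines of the LATEST DETECTED DEADLOCK section, or []."""
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "LATEST DETECTED DEADLOCK":
            break
    else:
        return []
    section = []
    # Skip the dashed rule under the heading; stop at the next dashed rule,
    # which is assumed to open the following section.
    for line in lines[i + 2:]:
        if re.fullmatch(r"-{3,}", line.strip()):
            break
        section.append(line)
    return section


def summarize_deadlock(section):
    """Map transaction number -> MySQL thread id, and find the rolled-back one."""
    threads = {}
    victim = None
    current = None
    for line in section:
        m = re.match(r"\*\*\* \((\d+)\) TRANSACTION:", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"MySQL thread id (\d+)", line)
        if m and current is not None:
            threads[current] = int(m.group(1))
            continue
        m = re.match(r"\*\*\* WE ROLL BACK TRANSACTION \((\d+)\)", line)
        if m:
            victim = int(m.group(1))
    return {"threads": threads, "victim": victim}


summary = summarize_deadlock(extract_latest_deadlock(SAMPLE_STATUS))
```

The thread ids recovered this way are what the later steps use to match transactions against the binlog.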

Analyzing the case:
The engine status shows the data MySQL kept about the lock contention between two transactions.

Transaction 1 "executed" (note the quotation marks):

UPDATE `role_state` SET `state` = 1 WHERE `role_id` = '53016'


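It found that the record to be modified was already held under lock_mode X locks (see: InnoDB lock modes), and began waiting for that lock to be released.

Transaction 2 executed: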

REPLACE INTO `role_fight` (`role_id`, `life_max`, `mana_max`, `attack_physical`, `attack_internal`,****) VALUES ('53016', 4967, 3291, 350, 174, ***)


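It also found that the record it needed was held under lock_mode X locks.

The last part of the output gives a key fact, "WE ROLL BACK TRANSACTION (1)": MySQL rolled back transaction 1. Since MySQL rolled back transaction 1, it must have been transaction 1's statement that triggered the deadlock, so transaction 1 is the one recorded in our program's exception log. MySQL went on to execute transaction 2, so transaction 2's SQL statements must be recorded in the binlog.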

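Peeling back the layers:
How do we determine which SQL statements transaction 1 and transaction 2 executed?
From the show engine innodb status output, we know the following about transaction 2:

SQL statement (role_id is the business-level identifier): REPLACE INTO `role_fight` (`role_id`, `life_max`, `mana_max`, `attack_physical`, `attack_internal`,****) VALUES ('53016', 4967, 3291, 350, 174, ***)
Thread ID (MySQL's identifier): MySQL thread id 122679850
Execution time (timeline): 121128 15:59:46

With these three identifiers, plus the binlog's BEGIN and COMMIT markers, we can determine the SQL statements in that transaction with near 100% certainty.

The binlog looks roughly like this: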

# at 511750764
#121128 15:59:46 server id 1  end_log_pos 511750843 	Query	thread_id=122679850	exec_time=0	error_code=0
SET TIMESTAMP=1354089586/*!*/;
BEGIN
/*!*/;
# at 511750843
#121128 15:59:46 server id 1  end_log_pos 511751090 	Query	thread_id=122679850	exec_time=0	error_code=0
use xxx_roles_xxx/*!*/;
SET TIMESTAMP=1354089586/*!*/;
UPDATE `role_pet` SET `in_supporting` = 0, `levelup_pause_time` = 1354089587, `auto_feed` = 0, `supporting_pause_time` = 1354089587
WHERE `role_id` = '53016'
AND `id` = 9234
/*!*/;
# at 511751090
#121128 15:59:46 server id 1  end_log_pos 511751240 	Query	thread_id=122679850	exec_time=0	error_code=0
SET TIMESTAMP=1354089586/*!*/;
UPDATE `role_state` SET `pet` = 0, `pet_level` = 0
WHERE `role_id` = '53016'
/*!*/;
# at 511751240
#121128 15:59:46 server id 1  end_log_pos 511751885 	Query	thread_id=122679850	exec_time=0	error_code=0
SET TIMESTAMP=1354089586/*!*/;
REPLACE INTO `role_fight` (`role_id`, `life_max`, `mana_max`, `attack_physical`, `attack_internal`, `defend_physical`, `defend_internal`, `dodge_rate`, `critical_rate`, `hit_rate`, `speed`, `defend_physical_plus`, `defend_internal_plus`, `dodge_level`,*****) VALUES ('53016', 4967, 3291, 350, 174, 518, 254, 500, 300, 9500, 913, 668, 668, 261, 700, 97, 133, 40.9, 34, *****)
/*!*/;
# at 511751885
#121128 15:59:46 server id 1  end_log_pos 511751912 	Xid = 7457663579
COMMIT/*!*/;


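OK, we have all of transaction 2's SQL statements. What about transaction 1? How do we find those?

From the PHP exception report we know the main SQL statement involved and the code lines in the trace; from the code logic we can work out all the SQL statements in that transaction. Then we look in the binlog for a similar entry for the same user and the same operation: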

# at 511805324
#121128 15:59:53 server id 1  end_log_pos 511805403 	Query	thread_id=122562823	exec_time=0	error_code=0
SET TIMESTAMP=1354089593/*!*/;
BEGIN
/*!*/;
# at 511805403
#121128 15:59:53 server id 1  end_log_pos 511805560 	Query	thread_id=122562823	exec_time=0	error_code=0
use xxx_roles_xxx/*!*/;
SET TIMESTAMP=1354089593/*!*/;
UPDATE `role_fight` SET `last_update_life` = '1354089587'
WHERE `role_id` = '53016'
/*!*/;
# at 511805560
#121128 15:59:53 server id 1  end_log_pos 511805695 	Query	thread_id=122562823	exec_time=0	error_code=0
SET TIMESTAMP=1354089593/*!*/;
UPDATE `role_state` SET `state` = 1
WHERE `role_id` = '53016'
/*!*/;
# at 511805695
#121128 15:59:53 server id 1  end_log_pos 511805889 	Query	thread_id=122562823	exec_time=0	error_code=0
use xxx_roles_xxx/*!*/;
SET TIMESTAMP=1354089593/*!*/;
DELETE FROM `queue_combats_update_roles`
WHERE `combat_id` = 'f27d62dad8efcaeb04cd8f5d7c0424db'
AND `role_id` = '53016'
/*!*/;
# at 511805889
#121128 15:59:53 server id 1  end_log_pos 511805916 	Xid = 7457670215
COMMIT/*!*/;


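(Don't be puzzled that the thread_id in the binlog above matches the thread id in show engine innodb status. Our program is a resident process whose MySQL connection is never closed or destroyed, so the id stays the same. Also, this binlog entry is the transaction the program resubmitted after it detected the deadlock and MySQL rolled it back, so it is the same transaction at a different point in time.)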
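The matching step described above (thread id plus the BEGIN and COMMIT markers) can be scripted against the text that mysqlbinlog prints. Below is a minimal Python sketch, assuming statement-format output laid out like the excerpts in this article; `group_transactions` is a hypothetical helper name, and `SAMPLE_BINLOG` is a shortened copy of the first excerpt with only one of its statements kept.

```python
import re

THREAD_RE = re.compile(r"thread_id=(\d+)")
DELIMITER = "/*!*/;"  # statement terminator used in mysqlbinlog text output

# Shortened copy of the first binlog excerpt (illustration only).
SAMPLE_BINLOG = """\
# at 511750764
#121128 15:59:46 server id 1  end_log_pos 511750843  Query  thread_id=122679850  exec_time=0  error_code=0
SET TIMESTAMP=1354089586/*!*/;
BEGIN
/*!*/;
# at 511751090
#121128 15:59:46 server id 1  end_log_pos 511751240  Query  thread_id=122679850  exec_time=0  error_code=0
SET TIMESTAMP=1354089586/*!*/;
UPDATE `role_state` SET `pet` = 0, `pet_level` = 0
WHERE `role_id` = '53016'
/*!*/;
# at 511751885
#121128 15:59:46 server id 1  end_log_pos 511751912  Xid = 7457663579
COMMIT/*!*/;
"""


def group_transactions(binlog_text):
    """Return [(thread_id, [statements])] for each BEGIN ... COMMIT block."""
    transactions = []
    current = None    # (thread_id, statements) of the open transaction
    thread_id = None  # thread id from the most recent event header
    buffer = []       # lines of the statement being assembled
    for raw in binlog_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            m = THREAD_RE.search(line)
            if m:
                thread_id = int(m.group(1))
            continue
        if not line:
            continue
        if not line.endswith(DELIMITER):
            buffer.append(line)
            continue
        buffer.append(line[:-len(DELIMITER)].strip())
        stmt = " ".join(part for part in buffer if part)
        buffer = []
        if stmt == "BEGIN":
            current = (thread_id, [])
        elif stmt == "COMMIT":
            if current is not None:
                transactions.append(current)
            current = None
        elif stmt and current is not None:
            # Session bookkeeping is not part of the business logic.
            if not stmt.upper().startswith(("SET TIMESTAMP", "USE ")):
                current[1].append(stmt)
    return transactions


# Keep only the transactions issued by the thread named in the InnoDB status.
by_thread = [t for t in group_transactions(SAMPLE_BINLOG) if t[0] == 122679850]
```

Filtering the result by time window as well as thread id narrows it further, which matters here because a long-lived connection reuses the same thread id for many transactions.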

Reconstructing the case:
From all the SQL statements of the two InnoDB transactions at the scene, and from how InnoDB deadlocks form (thanks to Daxiong of the DBA team for the correction), we can reconstruct the incident roughly as follows:

Transaction 1:
UPDATE `role_fight` SET `last_update_life` = '1354089587' WHERE `role_id` = '53016'
UPDATE `role_state` SET `state` = 1 WHERE `role_id` = '53016'

Transaction 2:
UPDATE `role_state` SET `pet` = 0, `pet_level` = 0 WHERE `role_id` = '53016'
REPLACE INTO `role_fight` (`role_id`, `life_max`, `mana_max`, `attack_physical`, `attack_internal`,****) VALUES ('53016', 4967, 3291, 350, 174, ***)

These four statements are the entire cause of this deadlock.
The execution order must have been:

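| Time | Transaction 1 | Transaction 2 | Notes |
|---|---|---|---|
| 1 | begin | | |
| 2 | | begin | |
| 3 | | UPDATE `role_state` SET `pet` = 0, `pet_level` = 0 WHERE `role_id` = '53016' | Transaction 2 takes an X lock on the role_state record with role_id 53016. |
| 4 | UPDATE `role_fight` SET `last_update_life` = '1354089587' WHERE `role_id` = '53016' | | Transaction 1 takes an X lock on the role_fight record with role_id 53016. |
| 5 | | REPLACE INTO `role_fight` (`role_id`, `life_max`, `mana_max`, `attack_physical`, `attack_internal`,**) VALUES ('53016', 4967, 3291, 350, 174, *) | This is the key step. Transaction 2 tries to take an X lock on the role_fight record for this role_id, finds it locked by someone else (transaction 1), and starts waiting for that transaction to commit. |
| 6 | UPDATE `role_state` SET `state` = 1 WHERE `role_id` = '53016' | | Transaction 1 tries to take an exclusive X lock on the role_state record with role_id 53016 and finds another transaction holds it, and that transaction is itself waiting for transaction 1 to commit. MySQL immediately rolls back transaction 1. (PHP sees the deadlock error returned by MySQL, records it in the exception log, and sends a rollback command, though MySQL has already done the rollback for it.) |
| 7 | | [executed successfully] | Transaction 2 sees that the lock has been released, acquires the X lock, and the modification succeeds. |
| 8 | | commit | The PHP program sees that the previous statement completed without error and sends commit to commit the transaction. |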
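The timeline above is a cycle in a wait-for graph: transaction 2 waits for transaction 1, and transaction 1 then waits for transaction 2. A minimal Python sketch of that reasoning follows. It is a toy model, not how InnoDB is implemented: it assumes every lock is an exclusive row lock held until the end, it does not model commits, and it only reports the cycle rather than choosing which transaction to roll back. The function name and the resource labels are illustrative.

```python
def find_deadlock(schedule):
    """Replay (transaction, resource) lock requests in order.

    Returns the list of transactions forming a wait-for cycle as soon as a
    request closes one, otherwise None.
    """
    owner = {}      # resource -> transaction holding its X lock
    waits_for = {}  # blocked transaction -> transaction it is waiting for
    for txn, resource in schedule:
        holder = owner.get(resource)
        if holder is None or holder == txn:
            owner[resource] = txn  # lock granted immediately
            continue
        waits_for[txn] = holder
        # Walk the wait-for chain starting at the holder; arriving back at
        # the requesting transaction means the request closed a cycle.
        chain = [txn]
        node = holder
        while node is not None and node not in chain:
            chain.append(node)
            node = waits_for.get(node)
        if node == txn:
            return chain
    return None


# Steps 3 to 6 of the timeline above.
incident = [
    ("T2", "role_state:53016"),
    ("T1", "role_fight:53016"),
    ("T2", "role_fight:53016"),  # T2 starts waiting for T1
    ("T1", "role_state:53016"),  # T1 would wait for T2: cycle
]
cycle = find_deadlock(incident)
```

With the incident's ordering the function reports the cycle at the fourth request, the same point at which MySQL reported the deadlock.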

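There also seems to be a parameter:
What does innodb_lock_wait_timeout do? According to the MySQL reference manual, it limits how long a transaction waits for a row lock. It has nothing to do with deadlocks: as soon as MySQL detects a deadlock, it immediately rolls back one of the transactions involved, and this parameter never comes into play.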
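Both error messages end with "try restarting transaction", and the second binlog excerpt shows the program resubmitting transaction 1 after the rollback. A minimal Python sketch of that retry pattern follows. `DbError` and `run_with_retry` are hypothetical names standing in for whatever exception and wrapper the application's database layer provides; the sketch assumes the callable opens its own transaction and issues a rollback on failure, which matters for error 1205 because InnoDB by default rolls back only the statement that timed out.

```python
import time

DEADLOCK = 1213           # Deadlock found when trying to get lock
LOCK_WAIT_TIMEOUT = 1205  # Lock wait timeout exceeded


class DbError(Exception):
    """Hypothetical stand-in for a driver exception carrying a MySQL error code."""

    def __init__(self, errno, message=""):
        super().__init__(message)
        self.errno = errno


def run_with_retry(transaction, attempts=3, delay=0.0):
    """Call transaction(); run it again if MySQL reports 1213 or 1205.

    Any other error, or a retryable error on the last attempt, is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except DbError as exc:
            retryable = exc.errno in (DEADLOCK, LOCK_WAIT_TIMEOUT)
            if not retryable or attempt == attempts:
                raise
            # Back off a little longer on each attempt before retrying.
            time.sleep(delay * attempt)
```

A caller would wrap the whole unit of work, for example `run_with_retry(update_role_state)`, where `update_role_state` is an assumed function that begins, executes and commits one transaction.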

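How to avoid it:

Reduce the number of statements in each transaction.
Reorder the SQL statements so that a "deadlock" becomes a "lock wait". Waiting a moment is better than having the whole transaction rolled back and rerunning the entire flow.
Others: additions welcome.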
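To apply the reordering advice, it helps to spot transactions that touch the same tables in opposite order. The Python sketch below is one hypothetical way to check this at the table level; `conflicting_orders` is an illustrative name, and the table lists are taken from the two binlog excerpts in this article. It is a coarse check: real conflicts depend on which rows are locked, not just which tables.

```python
from itertools import combinations


def conflicting_orders(transactions):
    """Find table pairs that two transactions lock in opposite order.

    transactions maps a name to the ordered list of tables it writes.
    Returns tuples (txn_a, txn_b, first, second) where txn_a locks
    first before second but txn_b locks second before first.
    """
    conflicts = []
    for (name_a, locks_a), (name_b, locks_b) in combinations(transactions.items(), 2):
        # Tables both transactions touch, listed in txn_a's order.
        shared = [table for table in locks_a if table in locks_b]
        for first, second in combinations(shared, 2):
            if locks_b.index(first) > locks_b.index(second):
                conflicts.append((name_a, name_b, first, second))
    return conflicts


# Table order as seen in the binlog excerpts above.
observed = {
    "T1": ["role_fight", "role_state", "queue_combats_update_roles"],
    "T2": ["role_pet", "role_state", "role_fight"],
}
# The same work with T1 reordered to lock role_state first.
reordered = {
    "T1": ["role_state", "role_fight", "queue_combats_update_roles"],
    "T2": ["role_pet", "role_state", "role_fight"],
}
problems = conflicting_orders(observed)
fixed = conflicting_orders(reordered)
```

Once both transactions lock role_state before role_fight, the second one simply waits for the first to commit instead of forming a cycle.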

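About lock waits:
Reduce the number of SQL statements in each transaction to keep it small. Faster lookups and shorter query times matter just as much: we found that several of our SQL statements were not using an index, which locked far more rows than needed (effectively the whole table) and caused lock waits...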

Note:
It's year-end and I'm chasing KPIs, so please excuse the rough write-up.

Original article: webgame中Mysql Deadlock ERROR 1213 (40001)错误的排查历程. Thanks to the original author for sharing.
