My MySQL database provides storage backend services for three web applications. However, I recently encountered the error "Waiting for table metadata lock" permanently. This happens almost all the time and I don't understand why.
mysql> show processlist -> ; +------+-----------+-----------------+------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +------+-----------+-----------------+------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+ | 36 | root | localhost:33444 | bookmaker2 | Sleep | 139 | | NULL | | 37 | root | localhost:33445 | bookmaker2 | Sleep | 139 | | NULL | | 38 | root | localhost:33446 | bookmaker2 | Sleep | 139 | | NULL | | 39 | root | localhost:33447 | bookmaker2 | Sleep | 49 | | NULL | | 40 | root | localhost:33448 | bookmaker2 | Sleep | 139 | | NULL | | 1315 | bookmaker | localhost:34869 | bookmaker | Sleep | 58 | | NULL | | 1316 | root | localhost:34874 | bookmaker3 | Sleep | 56 | | NULL | | 1395 | bookmaker | localhost:34953 | bookmaker | Sleep | 58 | | NULL | | 1396 | root | localhost:34954 | bookmaker3 | Sleep | 46 | | NULL | | 1398 | root | localhost:34956 | bookmaker3 | Query | 28 | Waiting for table metadata lock | CREATE TABLE IF NOT EXISTS LogEntries ( lid INT NOT NULL AUTO_INCREMEN | | 1399 | root | localhost | NULL | Query | 0 | NULL | show processlist | +------+-----------+-----------------+------------+---------+------+---------------------------------+------------------------------------------------------------------------------------------------------+
Of course you can kill the corresponding process. However, if I restart the program that is trying to create the table structure for the database "bookmaker3", the newly created process is in Metallock again.
I can't even delete the database:
mysql> drop database bookmaker3;
This will also produce metal locks.
How to fix this problem?
Unfortunately, the accepted solution is wrong. That's absolutely correct
This is indeed (almost certainly; see below) the thing to do. But then it shows,
...And 1398 is not the connection with the lock. how so? 1398 is a connection waiting to be locked. This means it hasn't acquired the lock yet, so killing it has no effect. The process holding the lock will still hold the lock, and the next thread trying to perform some operation will therefore also stop and enter the "wait for metadata lock" in the appropriate order.
You can't guarantee that a process "waiting for a metadata lock" (WFML) won't block as well, but you can be sure that just killing the WFML process will do nothing .
The real reason is that another process is holding the lock, and more importantly,
SHOW FULL PROCESSLIST
will not directly tell you which process.One thing you can be sure of is that there are no processes marked "waiting for metadata lock". It can be said that these people are victims.
SHOW FULL PROCESSLIST
WILL Tells you if the process is doing something, yes. Usually it will work. Here, the process holding the lock does nothing and is hidden in other threads that also do nothing and report as "sleeping".If
SHOW FULL PROCESSLIST
shows you a process running DML, or in a "Send Data" state, then that is almost certainly the culprit. Other processes are waiting for it to release the lock (they can be implicit locks; the process does not need to issue LOCK TABLE at all, which actually locks differently). But a process can hold a lock while not performing any operations and be appropriately marked as "sleeping".In the OP's case, the culprit is almost certainly process 1396, which was started before process 1398, is now in
sleeping
state, and has been Lasted 46 seconds. Since 1396 has apparently done everything it needs to do (it turns out it's sleeping now, and has been doing so for 46 seconds, as far as MySQL is concerned), no threads have entered at It can hold the lock and still hold it before sleeping (otherwise 1396 would also stop).Due to MySQL's "deadlock-free" locking policy, no process can hold a lock, release the lock, and restore the lock again; therefore, lock waits are always performed by a process that still holds the lock and has never held the lock before. caused by the process. This is useful (we'll exploit this fact below) because it guarantees that the lock "queue" is ordered.
IMPORTANT: If you connect to MySQL as a restricted user, SHOW FULL PROCESSLIST
So: Ifwill
not show all processes. So the lock may be held by a process that you don't see.SHOW FULL PROCESSLIST
The rest of this answer deals with a confusing situation where a process is waitingshows you everything and shows a
Running process, then that process is probably responsible and you need to wait for it to finish what it's doing Anything (or you can kill it - at your own risk).for no apparent reason and no one seems to be doing anything.
Better
Display process list
The above can be adjusted to only show processes that are in SLEEP state, and it will sort them in descending time order anyway, so it's easier to find hung processes (which are usually
due to the ordering in "Waiting for metadata" sleep one immediately before locking"; and it is always one of more sleeps than any pending sleep time.
important things
Keep any "waiting for metadata lock" processes separate .
Quick and dirty solution, not really recommended, but quick
Kill all processes in the "sleeping" state on the same database that are older than the oldest thread in the "waiting for metadata lock" state. This is what Arnaud Amaury would do:
KILL
, re-evaluate the situation and restart the process accordingly. The waiting processes may be running now, or they may have been running briefly and are now sleeping. They may even now hold new metadata locks.Ninety-nine times out of a hundred, the thread to be killed is the youngest ## thread that is sleeping and older than the older thread waiting for the metadata lock. #the rout:
(*) The TIME sequence actually has milliseconds, or I'm told, it just doesn't show them. So while both processes have a time value of 19, the lowest process should be younger.More focused repair
Run
SHOW ENGINE INNODB STATUSand look at the "TRANSACTION" section. Among other things, you'll find something like
Now you use
SHOW FULL PROCESSLISTto check what thread id 1396 is doing with its #1701 transaction. It is most likely in a "sleeping" state. So: an active transaction with an active lock (#1701), it even made some changes because it has an undo log entry... but is currently idle.
This
is the thread you need to kill. These changes are lost. Remember that doing nothing in MySQL does not mean doing nothing in general. If you get some records from MySQL and build a CSV for FTP upload, the MySQL connection is idle during the FTP upload.Actually, if the process using MySQL and the MySQL server are on the same computer, that computer is running Linux, and you have root privileges, there is a way to find out which
processowns the requested Connect lock. This in turn allows determining (based on CPU usage, or at worst strace -ff -p pid) whether the process is
really
doing something, to help determine Whether it is safe to kill someone.Why does this happen?
I've seen this happen with web applications that use "persistent" or "pooled" MySQL connections, now usually with very little time savings: the web application instance terminates, but the connection does not , so it's locked alive... and blocked from others.
Another interesting approach I've found is, in the hypothetical above, to run a query that returns some rows, and retrieve only some of them . If the query is not set to "auto-clean" (but the underlying DBA does), it will keep the connection open and prevent a full lock on the table. I ran into this with a piece of code that was verifying that a row existed by selecting it and verifying if it had an error (it doesn't exist) or not (it must exist), but didn't actually retrieve the row .
PHP and PDO
PDO has persistent connection capabilities. This is how I make sure PDO doesn't pool the connections and close each connection. Very messy.
When opening, set options (the fourth option is new PDO()):
When disconnected:
Ask the database
If you have the latest MySQL, but not too new , as this will be deprecated , another way to find the culprit is (again you will need the permissions info mode )
Actual solution, takes time and work
Problems are often caused by this architecture:
When the web application terminates or a web application lightweight thread instance terminates, the container/connection pool may not . It is the container that keeps the connection open, so obviously the connection is not closed. As expected, MySQL does not consider the operation complete.
If the web application does not clean up after itself (no
ROLLBACK
orCOMMIT
of transactions, noUNLOCK TABLES
, etc.), then the web application starts executing Any operations will still exist and may still block everyone else.Then there are two solutions. Even worse is lowering the idle timeout. But guess what happens if you wait too long between queries (literally: "MySQL server has gone away"). You can then use mysql_ping if available (soon to be deprecated. PDO has a workaround . Alternatively you can check for the error if Reopen the connection if this occurs (that's the Python way). So - with a small performance cost - this is doable.
Better, smarter solutions aren’t that simple to implement. Try to let the script clean up after itself, making sure to retrieve all rows or release all query resources, catch all exceptions and handle them correctly, or, skip persistent connections altogether if possible . Let each instance create its own connection or use a smart pool driver (in PHP PDO, use
PDO::ATTR_PERSISTENT
set explicitly tofalse
).Alternatively (like in PHP), you can have the destructor and exception handler force the connection to be cleaned up by committing or rolling back the transaction (which should be enough), or maybe even issue an explicit table unlock and RELEASE_ALL_LOCKS(), or submit connection suicide (
KILL CONNECTION_ID()
) for good results.I don't know of a way to query existing result set resources to release them; the only way is to save these resources into a private array.