In the interview, the interviewer only needs to ask about the ACID of MySQL, and then he can immediately recite the eight-part essay (some people may not be able to answer it yet). What's even more disgusting is that some interviewers don't follow the routine and continue to ask, how does MySQL implement ACID?
I’m confused. To be honest, this question can persuade 95% of people.
Today, this article mainly discusses the implementation principle of ACID under the MySQL InnoDB
engine. It does not elaborate too much on basic knowledge such as what a transaction is and the meaning of isolation level.
As a relational database, MySQL uses the most common InnoDB engine to ensure ACID.
Let’s talk about isolation first. The first is the four isolation levels.
Isolation Level | Description |
---|---|
When a transaction has not been submitted, the changes it makes can be seen by other transactions | |
After a transaction is submitted, the changes it makes can only be seen by other transactions. Will be seen by other transactions | |
Repeatable Reading | In a transaction, the result of reading the same data is always the same, regardless of whether other transactions operate on the data and whether the transaction is committed. InnoDB default level. |
Serialization | Transactions are executed serially. Each read requires a table-level shared lock. Reading and writing will block each other. The isolation level is the highest, sacrificing the system. Concurrency. |
Different isolation levels are designed to solve different problems. That is, dirty reads, phantom reads, and non-repeatable reads.
隔离级别 | 脏读 | 不可重复读 | 幻读 |
---|---|---|---|
Read uncommitted | can appear | can appear | can appear |
Read submission | Not allowed | Can occur | Can occur |
Repeatable read | Not allowed | Not allowed | Can appear |
Serialization | Not allowed | Not allowed | Not allowed |
So different isolation levels, how is isolation achieved, and why can different things not interfere with each other? The answer is Locks and MVCC.
Let’s talk about locks first. How many locks does MySQL have?
In terms of granularity, it is table lock, page lock, and row lock. Table locks include intentional shared locks, intentional exclusive locks, self-increasing locks, etc. Row locks are implemented by each engine at the engine level. But not all engines support row locks. For example, the MyISAM engine does not support row locks.
In InnoDB transactions, row locks are implemented by locking index entries on the index. This means that InnoDB uses row-level locks only when data is retrieved through index conditions, otherwise table locks are used. Row-level locks are also divided into two types: shared locks and exclusive locks, as well as intention shared locks and intention exclusive locks that need to be obtained before locking.
select...lock in share mode
Lock. insert, update, delete, for update
Lock. Row locks are added when needed, but they are not released immediately when they are no longer needed, but until the end of the transaction. Only then released. This is the two-phase lock protocol.
The lock on a single row record will always lock the index record.
Gap lock, think about the reason for phantom reading. In fact, row lock can only lock rows, but when inserting a new record, what needs to be updated is the "" between records. gap". So add gap lock to solve phantom reading.
Gap Lock Record Lock, left open and closed.
A general introduction to the lower lock can be seen. With locks, when a transaction is writing data, other transactions cannot obtain the write lock and cannot write data. This ensures the isolation between transactions to a certain extent. But as mentioned earlier, after adding a write lock, why can other transactions also read data? Isn’t it that the read lock cannot be obtained? ?
As mentioned earlier, with the lock, the current transaction cannot modify the data without a write lock, but it can still be read, and when reading, even if the row of data has been modified and submitted by other transactions, You can still read the same value repeatedly. This is MVCC, multi-version concurrency control, Multi-Version Concurrency Control.
The storage format of row records in Innodb has some additional fields: DATA_TRX_ID and DATA_ROLL_PTR.
undo log
. ##ReadView is created at the beginning of each SQL and has several important attributes:#undo log: Record the log before the data is modified, which will be discussed in detail later.
Now start the query, a select comes over, and a row of data is found.
DATA_TRX_ID
With locks and MVCC, transaction isolation Sex is resolved. Let me expand here. Does the default RR level solve phantom reading? Phantom reading usually targets INSERT, and non-repeatability targets UPDATE.
Thing 1 | Thing 2 |
---|---|
begin | |
- | insert into dept(name) values("A") |
- | commit |
update dept set name="B" | |
commit |
We expected it to be
id name 1 A 2 B
actually it was
id name 1 B 2 B
In fact, the isolation level of MySQL repeatable read does not completely solve the problem of phantom reading, but solves it. Phantom reading problem when reading data. There is still a phantom read problem for modification operations, which means that MVCC is not thorough in solving phantom reads.
Let’s talk about atomicity. As mentioned earlier, undo log rolls back the log. Isolation MVCC actually relies on it to achieve, as does atomicity. The key to achieving atomicity is to be able to undo all successfully executed SQL statements when the transaction is rolled back.
When a transaction modifies the database, InnoDB will generate the corresponding undo log; if the transaction execution fails or rollback is called, causing the transaction to be rolled back, the information in the undo log can be used to roll back the data to How it looked before modification. Undo log is a logical log, which records information related to SQL execution. When a rollback occurs, InnoDB will do the opposite of the previous work based on the contents of the undo log:
Take the update operation as an example: when a transaction executes an update, the undo log generated will contain the primary key of the modified row (so as to know which rows have been modified) and which columns have been modified. , the values of these columns before and after modification and other information, you can use this information to restore the data to the state before the update during rollback.
Innnodb has many logs, and persistence relies on redo log.
Persistence is definitely related to writing. WAL technology is often mentioned in MySQL. The full name of WAL is Write-Ahead Logging. Its key point is to write first. log, and then write to disk. Just like doing business in a small shop, there is a pink board and an account book. When guests come, write on the pink board first, and then write down the account book when you are not busy.
redo log is this pink board. When a record needs to be updated, the InnoDB engine will first write the record to the redo log (and update the memory). At this time, the update is completed. At the appropriate time, this operation record is updated to the disk, and this update is often done when the system is relatively idle, just like what the shopkeeper does after closing.
redo log has two characteristics:
There are two stages for redo log: commit and prepare If you do not use "two-phase commit", the state of the database may be inconsistent with the state of the library restored using its log. Well, let's go here first and look at the other one.
InnoDB also provides a cache. The Buffer Pool contains the mapping of some data pages on the disk as a buffer for accessing the database:
The use of Buffer Pool greatly improves the efficiency of reading and writing data, but it also brings new problems: if MySQL goes down, the modified data in the Buffer Pool has not been refreshed at this time. to the disk will result in data loss, and the durability of the transaction cannot be guaranteed.
So I added redo log. When the data is modified, in addition to modifying the data in the Buffer Pool, the operation will also be recorded in the redo log;
When the transaction is submitted, the fsync interface will be called to flush the redo log.
If MySQL goes down, you can read the data in the redo log and restore the database when restarting.
The redo log uses WAL (Write-ahead logging, write-ahead log). All modifications are first written to the log and then updated to the Buffer Pool, ensuring that the data will not be lost due to MySQL downtime, thus Durability requirements are met. And there are two advantages to doing this:
Speaking of this, you may wonder that there is also a bin log that is also used for write operations and is used for data recovery. What is the difference?
For the statement update T set c=c 1 where ID=2;
Why write redo log first?
Consistency is the ultimate goal pursued by the transaction. The atomicity and durability mentioned in the previous question Security and isolation are actually to ensure the consistency of the database state. Of course, the above are all guarantees at the database level, and the implementation of consistency also requires guarantees at the application level.
That is, your business. For example, the purchase operation only deducts the user's balance without reducing the inventory. It is definitely impossible to ensure that the status is consistent.
We are all familiar with MySQL, and we also know what ACID is, but how is MySQL’s ACID implemented?
Sometimes, just like you know that there are undo log and redo log, but you may not know why there are them. When you know the purpose of the design, it will become clearer.
The above is the detailed content of Interviewer: How does MySQL implement ACID?. For more information, please follow other related articles on the PHP Chinese website!