Redis Log: Tips for Quick Recovery-Redis-php.cn

Redis Log: Tips for Quick Recovery

“
It’s right to be independent, and it’s right to integrate into the circle. The key is to figure out what you want. Life, what price are you willing to pay for this.
”

We usually use Redis as a cache to improve read response performance. Once Redis goes down, all the data in the memory will be lost. If you access it directly now A large amount of database traffic hitting MySQL may cause more serious problems.

In addition, the performance of slowly reading from the database to Redis will inevitably be faster than getting it from Redis, which will also cause the response to slow down.

In order to achieve fast recovery without fear of downtime, Redis has designed two major killers, namely AOF (Append Only FIle) logs and RDB snapshots.

When learning a technology, you usually only come into contact with scattered technical points, without establishing a complete knowledge framework and architecture system in your mind, and without a systematic view. This will be very difficult, and it will appear that you can do it at first glance, but then you will forget it and be confused.

Let’s understand Redis thoroughly together and master the core principles and practical skills of Redis in depth. Build a complete knowledge framework and learn to organize the entire knowledge system from a global perspective.

This article is hardcore, I suggest you save it, like it, calm down and read it, I believe you will gain a lot.

The previous article analyzed the core data structure, IO model, thread model of Redis, and used appropriate data encoding according to different data. Deeply grasp the reasons why it is really fast!

Recommended (free): redis

##This article will focus on the following points:

How to recover quickly after a downtime?
If the machine is down, how can Redis avoid data loss?
What is RDB memory snapshot?
AOF log implementation mechanism
What is copy-on-write technology?
….

The knowledge points involved are as shown in the figure:

Redis Log: Tips for Quick Recovery

Redis Log: The secret to fearless downtime and fast recovery

Redis Panorama

The panorama can be expanded around two dimensions, namely:

Application dimension: cache usage, cluster usage, clever use of data structures

System dimension: can be classified into three high

High performance: thread model, network IO Model, data structure, persistence mechanism;
High availability: master-slave replication, sentinel cluster, Cluster sharded cluster;
High scalability ：Load Balancing

The Redis series of chapters revolves around the following mind map, and let’s explore the secrets of Redis’s high-performance and persistence mechanism.

Redis Log: Tips for Quick Recovery

Understand Redis

Have a panoramic view and master the system view.

The system view is actually crucial. To a certain extent, when solving problems, having a system view means that you can locate and solve problems in a well-founded and organized manner.

RDB memory snapshot allows quick recovery from downtime

“

65 Brother: Redis is down for some reason, which will cause all traffic to be interrupted. When I got to the backend MySQL, I immediately restarted Redis, but its data was stored in the memory. Why was there still no data after the restart? How to prevent the data from being lost after restarting?
"

65 Brother, don't worry, "Code "Brother Byte" will take you step by step to deeply understand how to quickly recover after Redis crashes.

Redis data is stored in memory. Is it possible to consider writing the data in memory to disk? When Redis restarts, the data saved on the disk is quickly restored to the memory, so that normal services can be provided after the restart.

65 Brother: I thought of a solution. Each time a "write" operation is performed to operate the memory, it is written to the disk at the same time.
"

This solution has a fatal problem: Each write instruction not only writes to the memory but also to the disk. The performance of the disk is too slow compared to the memory, which will cause the performance of Redis to be greatly reduced.

Memory Snapshot

65 Brother: How to avoid this simultaneous writing problem?
”

We usually use Redis as a cache, so even if Redis does not save all the data, it can still be obtained through the database, so Redis will not save all the data. Redis data persistence uses " RDB data snapshot" method to achieve rapid recovery from downtime.

“

65 Brother: So what is an RDB memory snapshot?
”

During the process of Redis executing the "write" command, the memory data will continue to change. The so-called memory snapshot refers to the status data of the data in Redis memory at a certain moment.

It’s like time is frozen at a certain moment. When we take pictures, we can completely record the moment of a certain moment through photos.

Redis is similar to this, which is to capture the data at a certain moment in the form of a file and write it to the disk. This snapshot file is called an RDB file, and RDB is the abbreviation of Redis DataBase.

Redis executes RDB memory snapshots regularly, so that it is not necessary to write to the disk every time the "write" command is executed. It only needs to be written to the disk when the memory snapshot is executed. It not only ensures that it is fast but not broken, it also achieves durability and can recover quickly from downtime.

Redis Log: Tips for Quick Recovery

RDB memory snapshot

When doing data recovery, directly read the RDB file into the memory to complete the recovery.

“
65 Brother: Which data should be snapshotted? Or how often should snapshots be taken? This will affect the execution efficiency of the snapshot.
”

65 Brother, that’s good. Start thinking about data efficiency. In "Redis Core: The Fast and Unbreakable Secret" we know that its single-threaded model determines that we should try our best to avoid operations that will block the main thread and avoid RDB file generation from blocking the main thread.

Generate RDB strategy

Redis provides two instructions for generating RDB files:

save: main thread execution , will block;
bgsave: Call the glibc function fork to generate a sub-process for writing RDB files, and snapshot persistence is completely handled by the sub-process. , the parent process continues to process client requests and generates the default configuration of the RDB file.

"
65 Brother: When taking a "snapshot" of the memory data, can the memory data still be modified? That is, can the write command be processed normally?
”

First of all, we need to make it clear that avoiding blocking and being able to handle write operations during RDB file generation are not the same thing. Although the main thread is not blocked, in order to ensure the consistency of the snapshot data at that time, It can only process read operations and cannot modify the data of the snapshot being executed.

Obviously, Redis does not allow writing operations to be suspended in order to generate RDB.

"
65 Brother: So how does Redis process write requests and generate RDB files at the same time?
”

Redis uses the operating system’s multi-process copy-on-write technology COW (Copy On Write) to achieve snapshot persistence. This mechanism is very interesting and few people know about it. Multi-process COW is also an authentication program. An important indicator of the breadth of employee knowledge.

Redis will call the function of glibc during persistence fork to generate a child process. The snapshot persistence is completely handled by the child process, and the parent process continues Process client requests.

When the child process is just created, it shares the code segment and data segment in the memory with the parent process. At this time, you can imagine the parent-child process as a conjoined baby, sharing the body.

This is the mechanism of the Linux operating system. In order to save memory resources, they are shared as much as possible. At the moment when the process is separated, there is almost no obvious change in the growth of memory.

bgsave The child process can share all the memory data of the main thread, read the data of the main thread and write it to the RDB file.

After executing the SAVE command or BGSAVEWhen the command creates a new RDB file, the program will check the keys in the database, and expired keys will not be saved to the newly created RDB file.

When the main thread executes the write command to modify the data time, a copy of this data will be made, bgsave The sub-process reads this copy data and writes it to the RDB file, so the main thread can directly modify the original data.

Redis Log: Tips for Quick Recovery

Copy-on-write technology ensures data modification during the snapshot period

This not only ensures the integrity of the snapshot, but also allows the main thread to modify the data at the same time, avoiding the impact on normal business.

Redis will use bgsave to take a snapshot of all data in the current memory. This operation is completed by the child process in the background, which allows the main thread to modify the data at the same time.

“
65 Brother: Can we execute the RDB file every second? In this way, even if there is a downtime, we will lose up to 1 second of data.
”

Executing full data snapshots too frequently has two serious performance overheads:

Frequently generate RDB files and write them to disk, causing excessive disk pressure. It will appear that the previous RDB has not been executed yet, and the next one starts to be generated again, falling into an infinite loop.
fork out of the bgsave sub-process will block the main thread. The larger the memory of the main thread, the more blocked it will be. The longer the time.

Advantages and Disadvantages

The recovery speed of snapshots is fast, but the frequency of generating RDB files is difficult to control. If the frequency is too low, the data lost in downtime will be relatively large. Too much; too fast, which will consume additional overhead.

RDB uses binary data compression to write to disk, with small file size and fast data recovery speed.

In addition to RDB full snapshots, Redis also The AOF post-write log is designed. Next, let’s talk about what an AOF log is.

AOF post-write log to avoid data loss during downtime

AOF log storage is the sequential command sequence of the Redis server, and the AOF log only records the command records that modify the memory.

Assuming that the AOF log records all modified instruction sequences since the creation of the Redis instance, then the memory of the current Redis instance can be restored by sequentially executing all instructions on an empty Redis instance, that is, "replaying" The state of the data structure.

Comparison between pre-write and post-write logs

Write Ahead Log (WAL): Write the modified data to the log file before actually writing the data , fault recovery is guaranteed.

For example, the redo log (redo log) in the MySQL Innodb storage engine is a data log that records modifications. Before actually modifying the data, the modification log is recorded and the modified data is executed.

Post-write log: First execute the "write" command request, write the data into the memory, and then record the log.

Redis Log: Tips for Quick Recovery

AOF post-write command

Log format

When Redis receives the "set key MageByte" command, After the data is written to the memory, Redis will write the AOF file in the following format.

『*3』: Indicates that the current command is divided into three parts. Each part starts with "$ number", followed by the specific "command, key, value".
"Number": Indicates the number of bytes occupied by this part of the command, key, and value. For example, "$3" means that this part contains 3 bytes, which is the "set" command.

Redis Log: Tips for Quick Recovery

AOF log format

“
65 Brother: Why does Redis use post-write logging?
”

Post-writing logs avoid additional checking overhead and do not require syntax checking of executed commands. If you use write-ahead logging, you need to check whether the syntax is correct first. Otherwise, the log records wrong commands, and an error will occur when using log recovery.

In addition, recording the log after writing will not block the execution of the current "write" command.

“
65 Brother: So with AOF, is it foolproof?
”

Silly boy, it’s not that simple. If Redis has just finished executing the command and crashes before recording the log, the data related to the command may be lost.

Also, AOF avoids the blocking of the current command, but may bring the risk of blocking to the next command. The AOF log is executed by the main thread. During the process of writing the log to the disk, if the disk pressure is high, the writing to the disk will be very slow, causing subsequent "write" instructions to be blocked.

Have you found out? These two problems are related to disk writeback. If you can reasonably control the timing of writing the AOF log back to the disk after the "write" command is executed, the problem will be solved.

Write back strategy

In order to improve the writing efficiency of the file, when the user calls the write function to write some data to the file, the operating system usually The written data is temporarily stored in a memory buffer. The data in the buffer is not actually written to the disk until the buffer space is filled up or the specified time limit is exceeded.

Although this approach improves efficiency, it also brings security issues to the written data, because if the computer shuts down, the written data stored in the memory buffer will be lost.

To this end, the system provides two synchronization functions, fsync and fdatasync, which can force the operating system to immediately write the data in the buffer to the hard disk. , thereby ensuring the security of written data.

AOF configuration items provided by RedisappendfsyncThe writeback strategy directly determines the efficiency and security of the AOF persistence function.

always: Synchronous writeback, the contents of the aof_buf buffer will be flushed to the AOF file immediately after the write command is executed.
everysec: Write back every second. After the write command is executed, the log will only be written to the AOF file buffer, and the buffer content will be synchronized to the disk every second.
no: Under the control of the operating system, after the write execution is completed, the log is written to the AOF file memory buffer, and the operating system decides when to flush it to the disk.

There is no best-of-both-worlds strategy, we need to make a trade-off between performance and reliability.

always Synchronous writeback can ensure that data is not lost, but each "write" command needs to be written to the disk, resulting in the worst performance.

everysecWrite back every second, avoiding the performance overhead of synchronous write back. In the event of a downtime, data written to the disk may be lost for one second. This is a compromise between performance and reliability. compromise.

no Operating system control, after executing the write command, write the AOF file buffer and then execute the subsequent "write" command. The performance is the best, but a lot of data may be lost.

“
65 Brother: How should I choose a strategy?
”

We can choose the writeback strategy according to the system’s requirements for high performance and high reliability. To summarize: if you want to obtain high performance, choose the No strategy; if you want to obtain high reliability guarantee, Just choose the Always policy; if you allow a little data loss but want performance to be greatly affected, then choose the Everysec policy.

Advantages and Disadvantages

Advantages: The log is only recorded when the execution is successful, avoiding the overhead of instruction syntax checking. At the same time, the current "write" instruction will not be blocked.

Disadvantages: Since AOF records the content of each instruction, please see the log format above for the specific format. Every command needs to be executed during fault recovery. If the log file is too large, the entire recovery process will be very slow.

In addition, the file system also has restrictions on file size. Files that are too large cannot be saved. As the file becomes larger, the appending efficiency will also become lower.

The log is too large: AOF rewriting mechanism

“
65 Brother: What should I do if the AOF log file is too large?
”

AOF pre-write log records each "write" command operation. It will not cause performance loss like RDB full snapshot, but the execution speed is not as fast as RDB. At the same time, too large log files will also cause performance problems. For a real man like Redis who only wants to be fast, he absolutely cannot tolerate problems caused by too large logs.

So, Redis has designed a killer "AOF rewriting mechanism". Redis provides the bgrewriteaof instruction to slim down the AOF log.

The principle is to open a sub-process to traverse the memory and convert it into a series of Redis operation instructions, which are serialized into a new AOF log file. After the serialization is completed, the incremental AOF log that occurred during the operation is appended to the new AOF log file. After the appending is completed, the old AOF log file is immediately replaced, and the slimming work is completed.

"
65 Brother: Why can the AOF rewriting mechanism reduce the size of the log file?
"

The rewriting mechanism has a "multiple to one" function, which converts old logs into Multiple instructions became one instruction after rewriting.

As shown below:

Redis Log: Tips for Quick Recovery

AOF rewriting mechanism (error correction: 3 instructions were recorded before rewriting)

“
65 Brother: After rewriting, the AOF log became smaller, and finally the operation log of the latest data of the entire database was flushed to the disk. Will rewriting block the main thread?
"

　code As mentioned above, the AOF log is written back by the main thread. The AOF rewriting process is actually completed by the background sub-process bgrewriteaof to prevent blocking the main thread.

Rewriting process

Different from the AOF log being written back by the main thread, the rewriting process is completed by the background sub-process bgrewriteaof. This is also to avoid blocking the main thread and causing database performance to decrease. .

In general, there are two logs in total, a memory data copy, which are the old AOF log, the new AOF rewrite log and the Redis data copy.

Redis will record the "write" command operations received during the rewriting process to the old AOF buffer and the AOF rewrite buffer at the same time, so that the rewrite log also saves the latest operations. After all operation records of the copied data are rewritten, the latest operations recorded in the rewrite buffer will also be written to the new AOF file.

Every time AOF is rewritten, Redis will first perform a memory copy to traverse the data and generate rewrite records; use two logs to ensure that the newly written data will not be lost during the rewrite process. and maintain data consistency.

Redis Log: Tips for Quick Recovery

AOF rewriting process

“
65 Brother: AOF rewriting also has a rewriting log, why does it not share the log using AOF itself? What?
”

This is a good question for the following two reasons:

One reason is that when parent and child processes write the same file, competition problems will inevitably occur. , controlling competition means affecting the performance of the parent process.
If the AOF rewrite process fails, the original AOF file is equivalent to being contaminated and cannot be restored. Therefore, Redis AOF rewrites a new file. If the rewriting fails, just delete the file directly. It will not affect the original AOF file. After the rewriting is completed, just replace the old file.

Redis 4.0 Hybrid Log Model

When restarting Redis, we rarely use rdb to restore the memory state because a large amount of data will be lost. We usually use AOF log replay, but the performance of AOF log replay is much slower than RDB, so when the Redis instance is large, it takes a long time to start.

In order to solve this problem, Redis 4.0 brings a new persistence option - hybrid persistence. Store the contents of the rdb file together with the incremental AOF log file. The AOF log here is no longer the full log, but the incremental AOF log that occurred during the period from the beginning of persistence to the end of persistence. Usually this part of the AOF log is very small.

So when Redis restarts, you can load the rdb content first, and then replay the incremental AOF log, which can completely replace the previous AOF full file replay, and the restart efficiency is greatly improved.

So RDB memory snapshots are executed at a slightly slower frequency, using AOF logs to record all "write" operations that occurred during the two RDB snapshots.

In this way, snapshots do not need to be executed frequently. At the same time, because AOF only needs to record the "write" instructions that occur between two snapshots, it does not need to record all operations to avoid excessive file size.

Summary

Redis designed bgsave and copy-on-write to avoid the impact on read and write instructions during snapshot execution. Frequent snapshots will put pressure on the disk and fork blocks the main thread.

Redis has designed two major features to achieve rapid recovery from downtime without data loss.

To prevent the log from being too large, an AOF rewriting mechanism is provided. According to the latest data status of the database, the data writing operation is generated as a new log, and is completed in the background without blocking the main thread.

Integrating AOF and RDB provides a new persistence strategy and hybrid log model in Redis 4.0. When Redis restarts, you can first load the rdb content, and then replay the incremental AOF log, which can completely replace the previous AOF full file replay, and the restart efficiency is greatly improved.

Finally, regarding the choice of AOF and RDB, "Code Byte" has three suggestions:

When data cannot be lost, the mixed use of memory snapshots and AOF It is a good choice;
If minute-level data loss is allowed, you can only use RDB;
If you only use AOF, priority is given Use the everysec configuration option because it strikes a balance between reliability and performance.

After two series of Redis articles, readers should have an overall understanding of Redis.

The above is the detailed content of Redis Log: Tips for Quick Recovery. For more information, please follow other related articles on the PHP Chinese website!