How to achieve fast recovery and persistence without fear of downtime in Redis

How to achieve fast recovery and persistence without fear of downtime in Redis? The following article will take you through it, I hope it will be helpful to you!

It is right to be independent, and it is also right to integrate into the circle. The key is to figure out what kind of life you want and what price you are willing to pay for it.

We usually use Redis as a cache to improve read response performance. Once Redis goes down, all data in memory is lost. If requests then go straight to the database and a large amount of traffic hits MySQL, it may cause even more serious problems.

In addition, reloading the data from the database into Redis is inevitably much slower than reading it from Redis, so responses slow down as well.

To achieve fast recovery without fear of downtime, Redis has two killer features: the AOF (Append Only File) log and RDB snapshots.

When learning a technology, you usually encounter only scattered technical points, without building a complete knowledge framework and architecture in your mind, and so without a systematic view. This makes learning hard: you feel you understand something at first glance, but soon forget it and end up confused.

The previous article analyzed Redis's core data structures, IO model, and thread model, and how it chooses an appropriate encoding for different data, to get to the real reasons why it is so fast.

This article will focus on the following points:

How to quickly recover after a downtime?
If the machine is down, how can Redis avoid data loss?
What is RDB memory snapshot?
AOF log implementation mechanism
What is copy-on-write technology?
….

[Figure: the knowledge points covered in this article]

Redis Panorama

The panorama can be expanded around two dimensions, which are:

Application dimension: cache usage, cluster usage, clever use of data structures

System dimension: can be summarized as the "three highs"

High performance: thread model, network IO model, data structures, persistence mechanism;
High availability: master-slave replication, Sentinel cluster, Cluster sharding;
High scalability: load balancing

The chapters of the Redis series revolve around the following mind map. This time, let's explore the secrets of Redis's high performance and its persistence mechanism.

Have a panoramic view and master the system view.

The system view is actually crucial. To a certain extent, when solving problems, having a system view means that you can locate and solve problems in a well-founded and organized manner.

RDB memory snapshot, allowing quick recovery from downtime

65 Brother: Redis went down for some reason, so all the traffic hit the backend MySQL. I restarted Redis immediately, but because its data is stored in memory, there was still no data after the restart. How can I prevent data loss on restart?

Don't worry, 65 Brother. "Code Byte" will take you step by step through how to recover quickly after Redis crashes.

Redis data is stored in memory. Is it possible to consider writing the data in memory to disk? When Redis restarts, the data saved on the disk is quickly restored to the memory, so that normal services can be provided after the restart.

65 Brother: I thought of a solution. Each time a "write" operation is performed to operate the memory, it is written to the disk at the same time.

This solution has a fatal problem: every write command writes not only to memory but also to disk. Disk is far slower than memory, so Redis performance would drop sharply.

Memory Snapshot

65 Brother: How to avoid this simultaneous writing problem?

We usually use Redis as a cache, so even if Redis has not saved all the data, the data can still be fetched from the database. Redis therefore does not have to persist every single write. For persistence, Redis uses "RDB data snapshots" to achieve fast recovery from downtime.

65 Brother: So what is RDB memory snapshot?

As Redis executes "write" commands, the data in memory keeps changing. A memory snapshot is the state of the data in Redis memory at a particular moment.

It’s like time is frozen at a certain moment. When we take pictures, we can completely record the moment of a certain moment through photos.

Redis does something similar: it captures the data at a particular moment as a file and writes it to disk. This snapshot file is called an RDB file; RDB is short for Redis DataBase.

Redis takes RDB memory snapshots periodically, so it does not have to write to disk every time a "write" command is executed; it writes to disk only when a snapshot is taken. This keeps Redis fast while also providing persistence and quick recovery from downtime.

When recovering data, Redis reads the RDB file directly into memory to complete the recovery.

65 Brother: What data should be snapshotted? Or how often to take snapshots? This will affect the execution efficiency of the snapshot.

Good, 65 Brother, you are starting to think about efficiency. In the previous article we learned that Redis's single-threaded model means we should avoid operations that block the main thread as much as possible, so RDB file generation must not block the main thread either.

Generate RDB strategy

Redis provides two commands for generating RDB files:

save: executed by the main thread, so it blocks;
bgsave: calls the glibc function fork to create a child process that writes the RDB file. Snapshot persistence is handled entirely by the child process while the parent process continues to serve client requests. This is the default way of generating RDB files.

65 Brother: When taking a "snapshot" of the memory data, can the memory data still be modified? That is, can the write command be processed normally?

First of all, we need to be clear that avoiding blocking and being able to handle writes during RDB generation are not the same thing. Even if the main thread is not blocked, a naive way to keep the snapshot consistent would be to let it serve only reads and forbid any modification of the data being snapshotted.

Obviously, suspending writes just to generate an RDB file is not acceptable to Redis.

65 Brother: How can Redis process write requests and generate RDB files at the same time?

Redis uses the operating system's multi-process copy-on-write (COW) mechanism to implement snapshot persistence. This mechanism is very interesting and few people know it well; understanding multi-process COW is also a good indicator of the breadth of a programmer's knowledge.

During persistence, Redis calls the glibc function fork to create a child process. Snapshot persistence is handled entirely by the child process, and the parent process continues to serve client requests.

When the child process has just been created, it shares the code segment and data segment in memory with the parent process. You can picture the parent and child processes as conjoined twins sharing one body.

This is how the Linux operating system works: to save memory, pages are shared as much as possible. At the moment the processes separate, memory usage barely grows.

The bgsave child process shares all of the main thread's in-memory data; it reads that data and writes it to the RDB file.

When the SAVE or BGSAVE command creates a new RDB file, the program checks the keys in the database, and expired keys are not saved to the new RDB file.

When the main thread executes a write command, the operating system copies the affected memory page and the main thread modifies its own copy, while the bgsave child process keeps reading the page as it was at fork time and writes it to the RDB file.

This not only ensures the integrity of the snapshot, but also allows the main thread to modify the data at the same time, avoiding the impact on normal business.
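
The visible effect of fork-based snapshotting can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Redis is implemented: the function names are hypothetical, JSON stands in for the RDB format, and the example needs a Unix-like system because it calls os.fork. The page-level copying itself is done by the kernel and is not visible in the code; what the sketch shows is that the child keeps seeing the data as it was at fork time while the parent goes on writing.

```python
import json
import os
import tempfile


def snapshot_with_fork(data: dict, path: str) -> int:
    """Fork a child that dumps `data` to `path`; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child process: its view of memory is frozen at the moment of fork.
        try:
            with open(path, "w") as f:
                json.dump(data, f)
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup code
    return pid


def demo():
    """Take a snapshot, modify the data in the parent, and compare the two."""
    data = {"key": "old"}
    path = os.path.join(tempfile.mkdtemp(), "dump.json")
    pid = snapshot_with_fork(data, path)
    data["key"] = "new"   # the parent keeps serving "write" commands
    os.waitpid(pid, 0)    # wait until the child has finished the snapshot
    with open(path) as f:
        snapshot = json.load(f)
    return snapshot, data


if __name__ == "__main__":
    print(demo())  # ({'key': 'old'}, {'key': 'new'})
```

The snapshot holds the value from before the parent's write, even though the parent was never paused.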

Redis will use bgsave to take a snapshot of all data in the current memory. This operation is completed by the child process in the background, which allows the main thread to modify the data at the same time.

65 Brother: Can we take an RDB snapshot every second? That way, even if there is a crash, at most 1 second of data is lost.

Taking full snapshots too frequently has two serious performance costs:

Frequently generating RDB files and writing them to disk puts excessive pressure on the disk. The previous snapshot may not have finished when the next one starts, creating a vicious circle.
Forking the bgsave child process blocks the main thread. The more memory the main process uses, the longer the blocking lasts.

Advantages and Disadvantages

Snapshots recover quickly, but the frequency of generating RDB files is hard to tune. If it is too low, more data is lost when the machine crashes; if it is too high, it incurs extra overhead.

RDB files are compressed binary data written to disk, so they are small and fast to load.

In addition to RDB full snapshots, Redis also provides the AOF post-write log. Next, let's talk about what the AOF log is.

AOF post-write log to avoid data loss during downtime

The AOF log stores the sequence of commands executed by the Redis server, in order. It records only the commands that modify memory.

Assuming the AOF log records every modifying command since the Redis instance was created, the in-memory state of the current instance can be restored by executing all of those commands in order on an empty Redis instance, that is, by "replaying" the log.

Comparison between pre-write and post-write logs

Write-ahead log (WAL): before the data is actually written, the modification is first written to a log file, which guarantees fault recovery.

For example, the redo log in MySQL's InnoDB storage engine is a log of data modifications: the modification is recorded in the log first, and only then is the data actually modified.

Post-write log: first execute the "write" command and write the data into memory, then record the log.
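
The post-write order and the idea of replaying a log can be sketched with a toy store. Everything here is hypothetical (the TinyStore class, its two supported commands, and the in-memory list that stands in for the AOF file); it is a minimal sketch of the ordering, not Redis's implementation. Because the log entry is appended only after the command has executed, a malformed command never reaches the log.

```python
class TinyStore:
    """Toy key-value store that logs a command only after it has run."""

    def __init__(self):
        self.data = {}
        self.log = []  # stands in for the AOF file

    def execute(self, command: str) -> None:
        parts = command.split()
        # Executing the command doubles as validation: bad input raises here.
        if len(parts) == 3 and parts[0].lower() == "set":
            self.data[parts[1]] = parts[2]
        elif len(parts) == 2 and parts[0].lower() == "del":
            self.data.pop(parts[1], None)
        else:
            raise ValueError(f"unsupported or malformed command: {command!r}")
        # Post-write: this line is reached only if the command succeeded.
        self.log.append(command)

    @classmethod
    def replay(cls, log):
        """Rebuild a store by running every logged command on an empty one."""
        store = cls()
        for command in log:
            store.execute(command)
        return store


store = TinyStore()
store.execute("set key MageByte")
try:
    store.execute("set key")  # malformed: rejected before it can be logged
except ValueError:
    pass

restored = TinyStore.replay(store.log)
print(store.log)      # ['set key MageByte']
print(restored.data)  # {'key': 'MageByte'}
```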

Log format

When Redis receives the "set key MageByte" command and writes the data to memory, it writes the AOF file in the following format.

"*3": indicates that the current command has three parts. Each part starts with "$" followed by a number, followed by the actual command, key, or value of that part.
"$" followed by a number: indicates how many bytes that part (command, key, or value) occupies. For example, "$3" means the part contains 3 bytes, which here is the "set" command.

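The format described above can be reproduced with a short encoder. This is a sketch under the assumption that each record is laid out the way the Redis serialization protocol lays out a command (a "*" count line, then a "$" length line and the bytes for every part, each terminated by CRLF); the function name encode_command is made up for illustration, and a real AOF file may contain additional records that this sketch ignores.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command as '*<parts>' followed by '$<bytes>' + data per part."""
    chunks = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        raw = arg.encode("utf-8")
        # The length is counted in bytes, not characters.
        chunks.append(b"$" + str(len(raw)).encode() + b"\r\n" + raw + b"\r\n")
    return b"".join(chunks)


print(encode_command("set", "key", "MageByte"))
# b'*3\r\n$3\r\nset\r\n$3\r\nkey\r\n$8\r\nMageByte\r\n'
```
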
65 Brother: Why does Redis use post-write logging?

A post-write log avoids extra checking overhead: there is no need to syntax-check commands before logging them, because only commands that executed successfully are logged. With a write-ahead log, the syntax would have to be checked first; otherwise the log could record invalid commands and recovery from the log would fail.

In addition, because the log is recorded after the write, logging does not block the execution of the current "write" command.

65 Brother: So with AOF, is it foolproof?

Silly boy, it’s not that simple. If Redis has just finished executing the command and crashes before recording the log, the data related to the command may be lost.

Also, AOF avoids the blocking of the current command, but may bring the risk of blocking to the next command. The AOF log is executed by the main thread. During the process of writing the log to the disk, if the disk pressure is high, the writing to the disk will be very slow, causing subsequent "write" instructions to be blocked.

Have you found out? These two problems are related to disk writeback. If you can reasonably control the timing of writing the AOF log back to the disk after the "write" command is executed, the problem will be solved.

Write back strategy

To improve file write efficiency, when a program calls the write function to write data to a file, the operating system usually stores the data temporarily in a memory buffer and only writes the buffer to disk once it fills up or a time limit is exceeded.

Although this improves efficiency, it puts the written data at risk: if the computer goes down, the data still sitting in the memory buffer is lost.

For this reason, the system provides two synchronization functions, fsync and fdatasync, which force the operating system to write the buffered data to disk immediately, ensuring the written data is safe.

The writeback strategy set by Redis's AOF configuration item appendfsync directly determines the efficiency and safety of AOF persistence.

always: Synchronous write back, the content in the aof_buf buffer will be written to the AOF file immediately after the write command is executed.
everysec: Write back every second. After the write command is executed, the log will only be written to the AOF file buffer, and the buffer content will be synchronized to the disk every second.
no: Under the control of the operating system, after the write execution is completed, the log is written to the AOF file memory buffer, and the operating system decides when to flush it to the disk.

There is no best-of-both-worlds strategy, we need to make a trade-off between performance and reliability.

always: synchronous writeback keeps data loss to a minimum, but every "write" command has to reach the disk, so performance is the worst.

everysec: writing back every second avoids the performance overhead of synchronous writeback. If the machine goes down, up to about one second of writes that have not yet reached the disk may be lost. It is a compromise between performance and reliability.

no: the operating system controls the flush. After a write command is executed, the log goes into the AOF file buffer and the next "write" command proceeds immediately. Performance is the best, but a lot of data may be lost.
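
The difference between the three strategies comes down to when fsync is called after write. The following sketch makes that visible by counting fsync calls. It is a simplification with hypothetical names (AofWriter, count_fsyncs): the "everysec" branch checks the elapsed time on the write path instead of using a background task, and the clock is injectable so the example runs without actually waiting.

```python
import os
import tempfile
import time


class AofWriter:
    """Append-only writer with the three writeback policies."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.clock = clock
        self.last_sync = clock()
        self.fsync_calls = 0

    def append(self, record: bytes) -> None:
        os.write(self.fd, record)  # lands in the OS buffer, not yet on disk
        if self.policy == "always":
            self._sync()
        elif self.policy == "everysec" and self.clock() - self.last_sync >= 1.0:
            self._sync()
        # "no": never call fsync; the operating system flushes when it likes

    def _sync(self) -> None:
        os.fsync(self.fd)  # force the buffered data to disk
        self.fsync_calls += 1
        self.last_sync = self.clock()

    def close(self) -> None:
        os.close(self.fd)


def count_fsyncs(policy: str, writes: int = 5, step: float = 0.4) -> int:
    """Append `writes` records `step` simulated seconds apart; count fsyncs."""
    now = [0.0]
    fd, path = tempfile.mkstemp(suffix=".aof")
    os.close(fd)
    writer = AofWriter(path, policy, clock=lambda: now[0])
    try:
        for _ in range(writes):
            now[0] += step
            writer.append(b"*3\r\n$3\r\nset\r\n$3\r\nkey\r\n$8\r\nMageByte\r\n")
    finally:
        writer.close()
        os.remove(path)
    return writer.fsync_calls


for policy in ("always", "everysec", "no"):
    print(policy, count_fsyncs(policy))
# always 5
# everysec 1
# no 0
```

Five writes spread over two simulated seconds cost five fsync calls under "always", one under "everysec", and none under "no", which is the performance and reliability trade-off in miniature.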

65 Brother: Then how should I choose a strategy?

We can choose the writeback strategy based on the system's requirements for performance and reliability. To summarize: if you want the highest performance, choose the no strategy; if you want the strongest reliability guarantee, choose the always strategy; if you can tolerate a little data loss but do not want performance to suffer much, choose the everysec strategy.

Advantages and Disadvantages

Advantages: The log is recorded only after successful execution, avoiding the overhead of instruction syntax checking. At the same time, the current "write" command will not be blocked.

Disadvantages: AOF records every command (see the log format above), and every command has to be re-executed during fault recovery. If the log file is too large, the whole recovery process becomes very slow.

In addition, the file system also has restrictions on file size. Files that are too large cannot be saved. As the file becomes larger, the appending efficiency will also become lower.

The log is too large: AOF rewriting mechanism

65 Brother: What should I do if the AOF log file is too large?

The AOF post-write log records every "write" command. It does not incur the performance cost of a full RDB snapshot, but recovery from it is not as fast as from RDB, and an oversized log file also causes performance problems. For something as obsessed with speed as Redis, problems caused by an oversized log are absolutely intolerable.

So, Redis has designed a killer "AOF rewriting mechanism". Redis provides the bgrewriteaof instruction to slim down the AOF log.

The principle is to open a sub-process to traverse the memory and convert it into a series of Redis operation instructions, which are serialized into a new AOF log file. After the serialization is completed, the incremental AOF log that occurred during the operation is appended to the new AOF log file. After the appending is completed, the old AOF log file is immediately replaced, and the slimming work is completed.

65 Brother: Why can the AOF rewriting mechanism shrink the log file?

The rewrite mechanism has a "many-to-one" effect: multiple commands in the old log become a single command after the rewrite.

For example, three LPUSH commands on the same list become a single LPUSH command after the AOF rewrite. For keys that have been modified many times, the reduction is even more pronounced.
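
The "many-to-one" effect can be sketched as follows. The helper names (apply, replay, rewrite) are hypothetical, only LPUSH is modelled, and the exact commands a real Redis rewrite emits are not something this sketch tries to mirror; it only shows that a log regenerated from the current state can be shorter than the history that produced it while rebuilding the same data.

```python
from collections import deque


def apply(state: dict, command: list) -> None:
    """Apply one ['LPUSH', key, v1, v2, ...] command to `state`."""
    op, key, *values = command
    if op.upper() != "LPUSH":
        raise ValueError(f"only LPUSH is modelled here, got {op!r}")
    items = state.setdefault(key, deque())
    for value in values:
        items.appendleft(value)  # LPUSH inserts at the head of the list


def replay(log: list) -> dict:
    """Rebuild the state by applying every command in order."""
    state = {}
    for command in log:
        apply(state, command)
    return state


def rewrite(state: dict) -> list:
    """Emit one LPUSH per key that rebuilds the key's current list."""
    # LPUSH reverses the push order, so push tail-first to restore the list.
    return [["LPUSH", key, *reversed(items)] for key, items in state.items()]


old_log = [["LPUSH", "list", "a"], ["LPUSH", "list", "b"], ["LPUSH", "list", "c"]]
new_log = rewrite(replay(old_log))
print(new_log)                              # [['LPUSH', 'list', 'a', 'b', 'c']]
print(replay(new_log) == replay(old_log))   # True
```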

65 Brother: After the rewrite, the AOF log is smaller, and in the end the operation log for the latest data of the whole database is flushed to disk. Will the rewrite block the main thread?

As "Code Byte" mentioned above, the AOF log is written back by the main thread. The AOF rewrite, however, is carried out by a background child process started by bgrewriteaof, to avoid blocking the main thread.

The rewriting process

Unlike the AOF log, which is written back by the main thread, the rewrite is carried out by a background child process started by bgrewriteaof. This avoids blocking the main thread and degrading database performance.

In short, the rewrite involves one copy of the in-memory data and two logs: the old AOF log and the new AOF rewrite log.

Redis records the "write" commands received during the rewrite in both the old AOF buffer and the AOF rewrite buffer, so that the rewrite log also contains the latest operations. Once all the records for the copied data have been rewritten, the latest operations held in the rewrite buffer are also written to the new AOF file.

Every time the AOF is rewritten, Redis first makes a memory copy (via fork), which the child process traverses to generate the rewrite records; the two logs ensure that data written during the rewrite is not lost and that the data stays consistent.
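
The interplay of the memory copy and the two logs can be sketched in a single process. The class and method names below are hypothetical, only SET is modelled, and a plain dict copy stands in for the view of memory that a forked child would have; it is a sketch of the bookkeeping, not of Redis's actual code.

```python
class RewritingStore:
    """Toy store showing how writes during a rewrite reach both logs."""

    def __init__(self):
        self.data = {}
        self.aof = []            # the current ("old") AOF log
        self.rewrite_buf = None  # AOF rewrite buffer, active only during a rewrite
        self._snapshot = None    # stands in for the child's forked view of memory

    def set(self, key, value):
        self.data[key] = value
        record = ("SET", key, value)
        self.aof.append(record)              # the old log keeps receiving writes
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(record)  # and so does the rewrite buffer

    def start_rewrite(self):
        # A real fork shares pages copy-on-write instead of copying eagerly.
        self._snapshot = dict(self.data)
        self.rewrite_buf = []

    def finish_rewrite(self):
        # One record per key from the snapshot, then the writes that arrived
        # while the rewrite was running.
        new_aof = [("SET", k, v) for k, v in self._snapshot.items()]
        new_aof.extend(self.rewrite_buf)
        self.aof = new_aof  # the new file replaces the old one
        self.rewrite_buf = None
        self._snapshot = None


def replay(log):
    """Rebuild the data by applying every SET record in order."""
    data = {}
    for _, key, value in log:
        data[key] = value
    return data


store = RewritingStore()
store.set("a", "1")
store.set("a", "2")
store.set("b", "1")
store.start_rewrite()
store.set("a", "3")  # arrives while the rewrite is in progress
store.finish_rewrite()

print(store.aof)                        # [('SET', 'a', '2'), ('SET', 'b', '1'), ('SET', 'a', '3')]
print(replay(store.aof) == store.data)  # True
```

The write that arrived during the rewrite survives because it was recorded in the rewrite buffer, and the old log stayed usable until the moment of replacement.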

65 Brother: The AOF rewrite has its own rewrite log. Why doesn't it share the AOF log itself?

This is a good question. There are two reasons:

One reason is that having the parent and child processes write the same file inevitably creates contention, and managing that contention would hurt the parent process's performance.
The other is that if the AOF rewrite fails, the original AOF file would be contaminated and could not be used for recovery. So the rewrite writes to a new file: if it fails, the new file is simply deleted without affecting the original AOF file, and once it succeeds, the new file replaces the old one.

Redis 4.0 Hybrid Log Model

When restarting Redis, we rarely use RDB alone to restore the memory state, because a large amount of data would be lost. We usually replay the AOF log instead, but AOF replay is much slower than loading an RDB file, so a large Redis instance takes a long time to start.

To solve this problem, Redis 4.0 introduced a new persistence option: hybrid persistence. It stores the RDB content together with an incremental AOF log. The AOF part is no longer the full log, but only the incremental log of operations that occurred between the start and the end of the persistence run, which is usually very small.

So when Redis restarts, you can load the rdb content first, and then replay the incremental AOF log, which can completely replace the previous AOF full file replay, and the restart efficiency is greatly improved.
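
The two-step recovery can be sketched like this. The function name recover and the tuple-based log format are hypothetical, and a dict stands in for the RDB content; the point is only the order of operations: bulk-load the snapshot first, then replay the short incremental log on top of it.

```python
def recover(snapshot: dict, incremental_log: list) -> dict:
    """Load the snapshot, then replay the incremental log on top of it."""
    data = dict(snapshot)                   # step 1: bulk-load the RDB part
    for op, key, *rest in incremental_log:  # step 2: replay the short AOF tail
        if op == "SET":
            data[key] = rest[0]
        elif op == "DEL":
            data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {op!r}")
    return data


rdb_part = {"a": "1", "b": "2"}
aof_tail = [("SET", "a", "3"), ("DEL", "b"), ("SET", "c", "4")]
print(recover(rdb_part, aof_tail))  # {'a': '3', 'c': '4'}
```

Only three commands are replayed here, however many writes originally produced the snapshot, which is why this restart path is faster than replaying a full AOF log.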

So RDB memory snapshots can be taken less frequently, with the AOF log recording all the "write" operations that occur between two snapshots.

In this way, snapshots do not need to be taken frequently. At the same time, because AOF only needs to record the "write" commands that occur between two snapshots, it does not have to record all operations, which keeps the file from growing too large.

Summary

Redis uses bgsave and copy-on-write so that taking a snapshot does not interfere with read and write commands. Even so, frequent snapshots put pressure on the disk, and fork blocks the main thread.

Redis provides two major features, RDB snapshots and the AOF log, to recover quickly from downtime and avoid data loss.

To keep the log from growing too large, Redis provides the AOF rewrite mechanism: based on the latest state of the database, it generates the write operations for that data as a new log, and it does so in the background without blocking the main thread.

Redis 4.0 combines AOF and RDB into a new persistence strategy, the hybrid log model. When Redis restarts, it first loads the RDB content and then replays the incremental AOF log, which completely replaces the previous full AOF replay and greatly improves restart efficiency.

Finally, regarding the choice of AOF and RDB, "Code Byte" has three suggestions:

When data must not be lost, using memory snapshots and AOF together is a very good choice;
If minute-level data loss is acceptable, you can use RDB alone;
If you use only AOF, prefer the everysec option, because it strikes a balance between reliability and performance.

Original address: https://juejin.cn/post/6961735998547951653

Author: Code Brother Byte
