This article describes the shortcomings of Redis's two persistence modes (RDB and AOF) that showed up during stress testing. We hope it is a useful reference.
1. RDB persistence mode defects
1. Problem description:
We ran a stress test with 200 concurrent connections writing continuously to Redis. After 4 hours, a large number of interface calls began to fail with the following error ("系统异常" means "system exception"):
{"data":{"sendResult":null},"base":{"returncode":"99999","returndesc":"系统异常:MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error."},"qrybase":{"total":0,"count":0,"start":0}}
2. Cause analysis:
At first we took the error message to mean that the disk was full, but a check showed that 42% of the disk space was still free.
So, as the error message suggests, we enabled Redis logging and continued the stress test. The interface still reported errors, but the Redis log now contained the following message:
Can't save in background: fork: Cannot allocate memory
This points to memory rather than disk. Checking the memory used by the Redis main process showed that it occupied nearly 55% of the machine's 4 GB of memory, about 2.2 GB.
Specific reason: to avoid blocking the main process while data is saved to disk, Redis forks a child process, and the child writes the snapshot. The child shares memory pages with the parent through copy-on-write, but in the worst case, if every page is modified during the save, it needs as much memory as the parent. If the main process uses 2.2 GB, the kernel therefore has to be prepared to commit up to another 2.2 GB at fork time. With 4 GB of memory that is not possible, so under the default overcommit policy the fork is refused and the snapshot cannot be saved to disk.
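The arithmetic behind this failure can be sketched as follows. This is a minimal illustration, assuming the 4 GB machine and the roughly 55% memory share reported above. The function name is hypothetical, and the doubled figure is the worst case the kernel has to allow for, not memory that the forked child consumes up front.

```python
def fork_worst_case(total_gb: float, redis_share: float) -> dict:
    """Estimate the worst-case memory commitment when Redis forks to save a snapshot."""
    parent_gb = total_gb * redis_share   # memory held by the Redis main process
    worst_case_gb = parent_gb * 2        # parent plus a fully diverged child
    return {
        "parent_gb": round(parent_gb, 2),
        "worst_case_gb": round(worst_case_gb, 2),
        "fits_in_ram": worst_case_gb <= total_gb,
    }


if __name__ == "__main__":
    # Figures from the stress test: 4 GB of memory, about 55% used by Redis.
    print(fork_worst_case(total_gb=4.0, redis_share=0.55))
```

With these figures the worst case is 4.4 GB, more than the 4 GB available, which matches the failed fork seen in the Redis log.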
3. Mitigation plan (cannot fundamentally solve the problem):
3.1 In the redis.conf file, set stop-writes-on-bgsave-error to no (the default value is yes). With the default, Redis rejects write commands once a bgsave snapshot operation fails, so every subsequent write returns an error. Setting the value to no lets write operations continue, but the snapshots still fail, so the data is not being persisted.
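Because this setting hides snapshot failures from clients, it helps to watch the persistence status separately. Below is a minimal sketch, assuming the input is the text returned by redis-cli INFO persistence. The function names are hypothetical; the two field names (rdb_last_bgsave_status and aof_last_write_status) are fields that Redis reports in that section.

```python
def parse_info(info_text: str) -> dict:
    """Parse 'key:value' lines from Redis INFO output, skipping section headers."""
    fields = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def persistence_problems(info_text: str) -> list:
    """Return human-readable warnings for failed RDB or AOF operations."""
    fields = parse_info(info_text)
    problems = []
    if fields.get("rdb_last_bgsave_status", "ok") != "ok":
        problems.append("last RDB background save failed")
    if fields.get("aof_last_write_status", "ok") != "ok":
        problems.append("last AOF write failed")
    return problems


if __name__ == "__main__":
    # Sample text in the shape of `redis-cli INFO persistence` output.
    sample = "# Persistence\r\nrdb_last_bgsave_status:err\r\naof_last_write_status:ok\r\n"
    print(persistence_problems(sample))
```

Running a check like this on a schedule is one way to notice that snapshots have stopped working even though writes still succeed.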
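3.2 Modify the kernel parameter vm.overcommit_memory using any of the three methods below (root permissions required):
(1) Edit /etc/sysctl.conf, set vm.overcommit_memory=1, then run sysctl -p to apply the configuration file.
(2) sysctl vm.overcommit_memory=1
(3) echo 1 > /proc/sys/vm/overcommit_memory
Methods (2) and (3) take effect immediately but do not survive a reboot.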
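To confirm which overcommit policy is currently active, the value can be read back from /proc. The following is a minimal sketch for Linux; the function names and the wording of the descriptions are our own, while the meanings of the values 0, 1 and 2 follow the kernel's documented overcommit modes.

```python
OVERCOMMIT_MODES = {
    0: "heuristic: the kernel may refuse allocations it judges excessive",
    1: "always overcommit: allocation requests are never refused",
    2: "strict: commits are limited by swap plus a share of RAM",
}


def describe_overcommit(raw: str) -> str:
    """Translate the content of /proc/sys/vm/overcommit_memory into a description."""
    value = int(raw.strip())
    return OVERCOMMIT_MODES.get(value, "unknown mode")


def read_overcommit(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read the current setting from the kernel (Linux only)."""
    with open(path) as f:
        return describe_overcommit(f.read())


if __name__ == "__main__":
    try:
        print(read_overcommit())
    except OSError:
        print("overcommit setting not available on this system")
```

With the value set to 1, the fork for a background save is no longer refused up front, but the machine can still run out of memory if many pages are modified while the snapshot is being written.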
2. AOF persistence mode defects
1. Problem 1 description:
AOF mode was enabled on all Redis master and slave nodes. With 200 concurrent connections writing continuously to Redis, a large number of interface calls began to fail after 15 minutes, and the Linux virtual server hosting Redis hung.
The interface error is as follows:
{"data":null,"base":{"returndesc":"系统异常","returncode":"999999"},"qrybase":null}
The Biz (dubbo) interface error is as follows:
2015-06-05 11:28:28.760 [DubboServerHandler-X.X.X.X:20882-thread-173] ERROR - error while validate jedis!
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
2. Cause analysis:
The dubbo error shows that the interface API timed out while calling Redis. The system logs and I/O monitoring indicate that the cause is an I/O bottleneck: system I/O was too busy.
The system log also shows I/O blocked for more than 120 seconds, after which the system's protection mechanism caused the machine to hang.
Summary
The test results show that the most obvious flaw of AOF mode is that I/O becomes a performance bottleneck under heavy write load, which can make the service unavailable.
3. Mitigation plan (cannot fundamentally solve the problem)
Edit /etc/sysctl.conf and add the following configuration:
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
Then run sysctl -p to apply the configuration file.
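To get a feel for what these two ratios mean in practice, the thresholds can be worked out by hand. The sketch below is an approximation under stated assumptions: the kernel applies the percentages to available (free plus reclaimable) memory rather than to total RAM, and the 4096 MB figure is only an example value, not a measurement from the AOF test machine. The function name is hypothetical.

```python
def dirty_thresholds_mb(available_mb: float, background_ratio: int, dirty_ratio: int):
    """Return (background_mb, blocking_mb) for the given dirty-page ratios.

    background_mb: dirty data at which background writeback starts.
    blocking_mb:   dirty data at which writing processes are throttled.
    """
    background_mb = available_mb * background_ratio / 100
    blocking_mb = available_mb * dirty_ratio / 100
    return background_mb, blocking_mb


if __name__ == "__main__":
    # Example: roughly 4 GB of available memory with the settings above.
    background, blocking = dirty_thresholds_mb(4096, 5, 10)
    print(f"background writeback starts at about {background:.1f} MB")
    print(f"writers are throttled at about {blocking:.1f} MB")
```

Lower ratios make the kernel flush dirty pages to disk earlier and in smaller batches, which shortens the long I/O stalls but does not remove the underlying I/O bottleneck.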
Problem 2 description:
In either AOF mode or RDB (snapshot) mode, once the persistence file (.aof or .rdb) grows beyond 80% of system memory, the Redis process is killed by the system, making the service unavailable.
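A simple check against this 80% threshold might look like the sketch below. It only encodes the observation described above; the function names are hypothetical, the file path is supplied by the caller, and reading total memory through os.sysconf is an assumption that holds on Linux.

```python
import os
import sys


def exceeds_memory_share(file_bytes: int, total_memory_bytes: int, share: float = 0.8) -> bool:
    """Return True if the persistence file is larger than `share` of system memory."""
    return file_bytes > total_memory_bytes * share


def total_memory_bytes() -> int:
    """Total physical memory, as reported by sysconf (Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    # Usage: python check_size.py /path/to/persistence/file
    if len(sys.argv) == 2:
        size = os.path.getsize(sys.argv[1])
        if exceeds_memory_share(size, total_memory_bytes()):
            print("warning: persistence file exceeds 80% of system memory")
        else:
            print("persistence file is below the 80% threshold")
```

Such a check could run from a scheduled job so that capacity is extended before the threshold is reached.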
Summary
These problems show that system memory capacity must be planned in advance when using Redis, because once Redis crashes, a large amount of data can be lost and cannot be recovered.