This article will summarize and share with you some Redis high-frequency interview questions, which have certain reference value. Friends in need can refer to them. I hope it will be helpful to everyone.
From the interviewer’s perspective, the purpose of this question is to test your level of knowledge about caching, and Combined with caching capabilities to process business and improve architecture. This question obviously allows you to express yourself freely and gives you the opportunity to lead the interviewer to the knowledge points you are most familiar with, so you should seize this opportunity as much as possible and give the interviewer a good impression. If you can talk about this question well, you can communicate in depth for several hours. If it is just one session, you can basically solve it easily. [Related recommendations: Redis Video Tutorial]
But don’t start chatting to death, the chat is too shallow, then basically just go back and wait for notifications ……
For example, the following answer:
Many people will say that this answer is correct! That's right, that's right, but there is always a feeling that the opportunities given to you are useless.
At this moment, the interviewer is thinking something like this:
If you don’t want to resist the interviewer’s Eighteen Dragon Subduing Palms, you should take the initiative to arouse the interviewer’s interest, and take the lead in improving your own pattern (horizontal breadth and depth), and Speak out as much as possible about what you know.
For example, the following answer method:
My understanding of this is roughly like this interviewer! !
High performance: A big criterion for high performance is fast response time. Redis is based on memory storage and has fast CPU access speed. In addition, Redis's extreme optimization of data structure, internal thread model and network I/O model design determine that Redis is a high-performance storage database. The only disadvantage is that memory is relatively expensive and resources are usually limited. Therefore, the cache architecture for very large data should be designed reasonably. We usually do not store excessively large data in Redis, because this will cause Redis performance to decrease.
High concurrency: Common indicators of high concurrency include response time (Response Time), throughput (Throughput), query rate per second QPS (Query Per Second) and number of concurrent users. Although Redis is a single process Single-threaded model, but Redis is indeed a powerful tool in high-concurrency business scenarios. Currently, Redis's QPS can reach 100,000 or even 1 million levels. This is absolutely high concurrency.
High availability: Redis high availability is mainly reflected in master-slave replication, sentinel (sentinel mode) and Cluster (cluster mode)
My understanding of this is roughly like this interviewer! !
The single thread of Redis refers to the use of a single thread to execute command operations. After the release of Redis6. These points:
My understanding of this is roughly like this interviewer! !
Redis has 5 basic data types: String, List, Hash, Set, and ZSet; in addition, there are three special data types: Bitmaps, Geospatial, and HyperLogLog
Data type | Simple description | Usage scenarios |
---|---|---|
String | String (string) is the simplest and most widely used data structure of Redis. Its interior is a character array. String (string) is a dynamic string that allows modification; its structural implementation is similar to ArrayList in Java (an initial array of size 10 is constructed by default). This is the idea of redundant allocation of memory, also known as pre- Allocation; this idea can reduce the performance consumption caused by expansion. When the size of string (string) reaches the expansion threshold, the string (string) will be expanded. There are three main situations for string (string) expansion: 1. The length is less than 1MB, and the expansion will be twice the original size; length = length * 2 2. The length is greater than 1MB, and increases by 1MB after expansion; length = length 1MB 3. The maximum length of the string is 512MB | Cache, counter, distributed lock, etc. |
List | Redis’ list is equivalent to LinkedList in Java language. It is a doubly linked list data structure (but this structure design is more clever, which will be introduced later). Supports sequential traversal. The insertion and deletion operations of the linked list structure are fast, with a time complexity of O(1), but the query is slow, with a time complexity of O(n). Redis's list (list) is not a simple one. LinkedList, but quicklist - "quick list", quicklist is a two-way list composed of multiple ziplists (compressed lists); | linked list, asynchronous queue, Weibo follower timeline list... |
Hash | Redis's hash (dictionary) is equivalent to HashMap in Java language. It is an unordered dictionary distributed according to hash values. The internal elements are based on key-value pairs. way to store. The implementation of hash (dictionary) is also consistent with the structure of HashMap (JDK1.7) in Java. Its data structure is also a two-dimensional structure composed of an array linked list. The node elements are hashed on the array. If a hash collision occurs, the linked list is used. Concatenated on array nodes. The value stored in hash (dictionary) in Redis can only be string values. In addition, the expansion is different from HashMap in Java. HashMap in Java is completed in one go when expanding its capacity, while Redis takes into account that its core access is a single-threaded performance issue, and in order to pursue high performance, it adopts a progressive rehash strategy. Progressive rehash means that it is not completed once, but is completed multiple times, so the old hash structure needs to be retained. Therefore, the hash (dictionary) in Redis will have two hash structures, the old and the new. After the rehash is completed, it is the old hash. Only after all the values are moved to the new hash, the new hash will completely replace the previous hash in function. | User information, Hash table... |
Set | The set (set) of Redis is equivalent to the HashSet in the Java language, and its internal keys Value pairs are unordered and unique. It implements a special dictionary internally where all values are null. After the last element in the collection is removed, the data structure is automatically deleted and the memory is reclaimed. | Duplication removal function, likes, dislikes, mutual friends... |
ZSet | zset (ordered set) is the most frequently asked question in Redis data structure. It is similar to the combination of SortedSet and HashMap in the Java language. On the one hand, it uses set to ensure the uniqueness of the internal value, and on the other hand, it sorts by the score (weight) of the value. This sorting function is implemented through Skip List. After the last element value of zset (ordered set) is removed, the data structure is automatically deleted and the memory is recycled. | Fans list, student performance sorting, traffic ranking, click ranking... |
Bitmaps | Bitmaps are called bitmaps, strictly It is not a data type. The bottom layer of Bitmaps is a string (key-value) byte array. We can use ordinary get/set to directly obtain and set the contents of the bitmap, or we can treat the byte array as a "bit array" through the bitmap operations getbit/setbit provided by Redis. Each cell of the "bit array" of Bitmaps can only store 0 and 1. The subscript of the array is called an offset in Bitmaps. When setting Bitmaps, a new string will be automatically generated if the key does not exist. If the set offset exceeds the range of the existing content, the bit array will be automatically extended by zero. | Employee punch-in... |
Geospatial | Geospatial is the geographical location GEO module added by Redis after version 3.2 | WeChat nearby people, order online "nearby restaurants"... … |
HyperLogLog | HyperLogLog is an algorithm used to do cardinality statistics. It provides an inexact deduplication and counting scheme (this inaccuracy is not very inexact), The standard error is 0.81%, which is an acceptable error range for UV statistics. The advantage of HyperLogLog is that when the number or volume of input elements is very large, the storage space for cardinality calculation is fixed. In Redis, each HyperLogLog key only costs 12KB of memory, and nearly 2^64 different bases can be calculated. However, HyperLogLog can only count the size of the cardinality (that is, the size of the data set and the number of sets). It cannot store the element itself and cannot store the element itself like a set collection, which means that it cannot return elements. | Cardinal statistics such as UV etc. |
My understanding of this is roughly like this interviewer! !
The data structure of Redis can set the expiration time (TTL) of the key through EXPIRE key seconds. We are also accustomed to thinking that the Redis key will be automatically deleted when it expires. Obviously, this idea is not correct. The design of Redis takes into account comprehensive factors such as performance/memory and designs a set of expiration strategies.
Active deletion (lazy deletion) refers to when the key is accessed time, first check whether the key has expired, and if it has expired, delete it actively.
Passive deletion (periodic strategy) refers to the Redis server testing the expiration time of the key regularly and randomly. If it expires, it will be deleted passively. The existence of passive deletion is essential because there are some keys that have expired and are no longer accessed forever. If they rely on active deletion, they will occupy memory permanently.
In order to ensure high-performance services, Redis passively deletes expired keys and adopts a greedy strategy/probabilistic algorithm. By default, it scans every 10 seconds. The specific strategy is as follows:
1. Randomly select 20 keys from the expiration dictionary (a collection of keys with expiration time set) and check whether they have expired
2. Delete the expired keys
In addition, when designing the Redis cache architecture, developers must pay attention to avoid as much as possible ( Prohibited) Setting a large number of keys to the same expiration time, because combined with passive deletion, it can be seen that when Redis passively deletes expired keys, it will cause the service to be temporarily unavailable; if there are a large number of keys that expire at the same time, this will lead to three steps of passive deletion of keys. Looping multiple times will cause the Redis service to freeze, which is unacceptable in large traffic projects.
Therefore, in order to avoid this situation, some keys whose allowed expiration time does not need to be very precise must be set to a more random expiration time, so that the lag time can be reduced.
#My understanding of this is roughly like this interviewer! !
In distributed scenarios, our common distributed lock solutions include (if you know how to do it, you can bring out the other two types here, but if you don’t, don’t screw yourself! ):
Distributed lock implemented based on database lock mechanism
Distributed lock implemented based on Zookeeper
The solution for Redis to implement distributed lock is as follows.
If Redis is in a stand-alone environment, We can implement distributed locks through the atomic instructions provided by Redis
set key value [EX seconds] [PX milliseconds ] [NX | Unlocking is allowed, but Redis does not provide such a function. We can only handle it through Lua scripts, because Lua scripts can ensure the atomic execution of multiple instructions. Finally, we have to consider the lock timeout issue. If the client never releases the lock, it will definitely not work. Therefore, the lock can only guarantee that it will not be unlocked by other clients within the specified timeout time range. It will be automatically released after the timeout. This kind of The situation is difficult. We can optimize it like this:
will increase A new problem, such as sentinel's one-master and multiple-slave environment, may be that the client applies for a lock on the master node, but the synchronization is not completed and the master node is down. At this time, the lock on the newly elected master node is invalid. The handling of this situation should be considered in this way. First of all, direct Redis master-slave synchronization cannot solve the problem of data loss in any case. So we consider changing the lock application from one Redis to multiple stand-alone Redis lock applications. Only most of the applications are successful. This idea is RedLock.
RedLock uses multiple Redis instances. There is no master-slave relationship between each instance and they are independent of each other. When locking, the client sends locking instructions to all nodes. If more than half of the nodes are set successfully, Lock successfully. When releasing the lock, you need to send del instructions to all nodes to release the lock.
Although the red lock solves the problem of master-slave synchronization, it brings new complex problems:
Therefore In RedLock, it is necessary to calculate the minimum validity period of the requested lock. Assume that the client applies for the lock successfully, the time when the first key is successfully set is TF, the time when the last key is successfully set is TL, the lock timeout is TTL, and the clock difference between different processes is CLOCK_DIFF, then the minimum validity of the lock The duration is:
TIME = TTL - (TF- TL) - CLOCK_DIFF
Using Redis to implement distributed locks, it is inseparable from server downtime and other unavailability issues , the same is true for RedLock here. Even if multiple servers apply for locks, we must also consider the processing after the server goes down. The official recommendation is to use AOF persistence processing.
However, AOF persistence can only be restarted and restored for normal SHUTDOWN instructions. However, if there is a power outage, the lock data from the last persistence to the power outage may be lost. When the server restarts , distributed lock semantic errors may occur. Therefore, in order to avoid this situation, the official recommendation is that after the Redis service is restarted, the Redis service will be unavailable within a maximum client TTL (no lock application service is provided). This can indeed solve the problem, but it is obvious that this will definitely affect the performance of the Redis server. , and when this happens to most nodes, the system will become globally unavailable.
#My understanding of this is roughly like this interviewer! !
Redis is very fast. A large part of the reason is that Redis data is stored in memory. Since it is in memory, when the server goes down or is powered off, the data will All are lost, so Redis provides two mechanisms to ensure that all Redis data will not be lost due to failures. This mechanism is called the persistence mechanism of Redis.
Redis has two persistence mechanisms:
RDB(Redis DataBase) refers to writing a snapshot of the data set in memory to disk within a specified time interval. RDB is a memory snapshot (a binary sequence of memory data (in the form of persistence), each time a snapshot is generated from Redis to fully back up the data.
Advantages:
Disadvantages:
RDB triggered rules are divided into two categories, namely manual triggering and automatic triggering:
Automatic triggering:
Configuration triggering Rule
shutdown trigger
Manual trigger:
save
bgsave
AOF(Append Only File) is to save all instructions that modify the memory (write operations ) is recorded in an independent log file, and the data is restored by executing the Redis command in the AOF file during restart. AOF can solve the real-time problem of data persistence and is the mainstream persistence solution in the current Redis persistence mechanism (we will talk about hybrid persistence after 4.0 later).
Advantages:
Disadvantages:
The AOF log exists in the form of a file. When the program writes to the AOF log file, the content is actually written to the file descriptor allocated by the kernel. In a memory buffer, the kernel will then asynchronously flush the data in the buffer to disk. If the server crashes before the data in the buffer has time to be flushed back to disk, the data will be lost.
Therefore, Redis calls fsync(int fid) provided by glibc of the Linux operating system to force the contents of the specified file from the kernel buffer back to the disk to ensure that the data in the buffer will not be lost. However, this is an IO operation, which is very slow compared to the performance of Redis, so it cannot be executed frequently.
There are three configurations for refreshing the buffer in the Redis configuration file:
appendfsync always
Every Redis write operation is written to the AOF log. In theory, the Linux operating system cannot handle this configuration because the concurrency of Redis far exceeds the maximum refresh frequency provided by the Linux operating system, even if there are relatively few Redis write operations. In this case, this configuration is also very performance-intensive because it involves IO operations, so this configuration will basically not be refreshed once per second. The data in the buffer is transferred to the AOF file. The default strategy in this Redis configuration file is compatible with the compromise between performance and data integrity. With this configuration, theoretically, the data is lost in about one second.
appendfsync no
The Redis process will not actively refresh the data in the buffer to the AOF file, but directly leaves it to the operating system for judgment. This operation is not recommended. The possibility of losing data is very high.
When I mentioned the shortcomings of AOF earlier, I said that AOF is a form of log append to store Redis write instructions. This will lead to a large number of redundant instruction storage, making the AOF log file very large. In this case Not only does it occupy memory, it will also cause recovery to be very slow, so Redis provides a rewriting mechanism to solve this problem. After Redis's AOF persistence mechanism performs rewriting, it saves only the minimum instruction set for restoring data. If we want to manually trigger it, we can use the following instructions bgrewriteaof
Since rewriting the AOF file will have a certain impact on the performance of Redis, automatic rewriting cannot be performed casually. Redis provides two configuration indicators for automatic AOF rewriting. Only Rewriting will only occur when these two indicators are met at the same time:
auto-aof-rewrite-percentage 100: refers to when the file’s memory reaches twice the original memory auto-aof-rewrite-min-size 64mb: refers to the minimum memory size for file rewritingIn addition, most usage scenarios after Redis 4.0 will not Use RDB or AOF alone as the persistence mechanism, but take into account the advantages of both and use them in combination.
Finally, to summarize the two, which one is better?
It is recommended to enable both If you are not sensitive to data, you can choose to use RDB aloneRedis is a key-value database based on memory storage. We know that although the memory is fast, the space is small. When the physical memory reaches the upper limit, the system will run very slowly, so we will set up Redis Maximum memory. When Redis memory reaches the set threshold, memory recycling will be triggered. Redis provides many memory elimination strategies:
noeviction:
When the memory limit is reached and the client tries An error is returned when executing a command that may cause more memory to be used. Simply put, read operations are still allowed, but new data is not allowed to be written.del (delete) requests can
.is randomly sampled from all keys with expiration times set), and is compared with the most recent access timestamp recorded in the key object, and the oldest key among the five keys is eliminated; if the memory is still not enough, continue Repeat this step.
When maxmemory_samples is set to 10 in Redis 3.0, Redis's approximate LRU algorithm is very close to the real LRU algorithm, but obviously setting maxmemory_samples to 10 consumes more CPU computing time than setting maxmemory_samples to 5, because each sample is As the sample data increases, the calculation time will also increase.
The LRU algorithm of Redis3.0 is more accurate than the LRU algorithm of Redis2.8, because Redis3.0 adds an elimination pool of the same size as maxmemory_samples. Every time a key is eliminated, It is first compared with the keys waiting to be eliminated in the elimination pool, and finally the oldest key is eliminated. In fact, the keys selected for elimination are put together and compared, and the oldest one is eliminated.
LRU has an obvious shortcoming. It cannot correctly represent the popularity of a Key. If a key has never been accessed, it was only accessed by the user a moment before the memory elimination occurred. In the LRU algorithm, this Will be considered a hot key. LFU (Least Frequently Used) is an elimination algorithm introduced in Redis 4.0. It eliminates keys by comparing the access frequency of keys. The key point is Frequently Used.
The difference between LRU and LFU:
In LFU mode, the 24-bit lru field of the Redis object header is divided into two segments for storage, with the high 16-bit storing ldt (Last Decrement Time) and the low 8-bit Store logc (Logistic Counter). The upper 16 bits are used to record the last time the counter decreased. Since there are only 8 bits, stores the Unix minute timestamp modulo 2^16. The maximum value that 16 bits can represent is 65535 (65535/24/60≈ 45.5), it will turn back in about 45.5 days (turning back means that the modulo value starts from 0 again).
The lower 8 bits are used to record the access frequency. The maximum value that 8 bits can represent is 255. Logc will definitely not be able to record the real number of Rediskey accesses. In fact, it can be seen from the name that it stores the logarithm of the number of accesses. The initial value of logc for each newly added key is 5 (LFU_INITI_VAL), which ensures that the newly added value will not be selected and eliminated first; logc will be updated every time the key is accessed; in addition, logc will decay over time.
Logistic Counter will not only grow, but also decay. The rules of growth and decay can also be configured through redis.conf.
My understanding of this is roughly like this interviewer! !
Cache breakdown:
It means that a hotspot key that is accessed very frequently is in a centralized high concurrent access situation. When this key is invalid In an instant, a large number of requests penetrated the cache, directly requested the database, and directly passed through Redis.
Solution:
Cache penetration:
refers to data that does not exist in the cache or database being requested. This situation is usually caused by hackers. If you do not take good defense, it can easily lead to the database being killed by requests. For example, if a hacker uses a negative ID to query one of your tables, our ID is usually not set to a negative number.
Solution:
Cache avalanche:
Cache avalanche occurs when a large number of caches fail at the same time, which will cause the database to crash instantly (high concurrency scenario), and In this case, if the cache is not restored, the database will be useless and will continue to be crashed.
Solution:
At this point in the answer, the interviewer's face showed a long-lost smile. We only had to take this trick, and this interview would be successful.
Of course, this knowledge point cannot be explained clearly in a few sentences, so I suggest that you read this article and you can easily hold it
《Redis Distributed—— Complete master-slave replication, Sentinel, and clustering》
This article is reproduced from: https://juejin.cn/post/7019088999792934926
Author: Li Ziba
For more programming-related knowledge, please visit: Introduction to Programming! !
The above is the detailed content of Sharing of Redis high-frequency interview questions in 2023 (with answer analysis). For more information, please follow other related articles on the PHP Chinese website!