This article compiles 20 classic Redis interview questions. I hope it helps you.
Redis, whose full name is Remote Dictionary Server, is an open-source key-value database written in ANSI C. It supports networking, is memory-based with optional persistence, and provides APIs in many languages.
Unlike MySQL, Redis keeps its data in memory. Its reads and writes are very fast, handling more than 100,000 read/write operations per second, so Redis is widely used for caching. It is also often used for distributed locks. In addition, Redis supports transactions, persistence, Lua scripts, LRU-driven eviction, and various cluster solutions.
2. Talk about the basic data structure types of Redis
String (string)
Introduction: String is the most basic Redis data type. It is binary safe, so it can store images or serialized objects. A single value can be at most 512 MB. Common commands: get key, etc.
Application scenarios: shared sessions, distributed locks, counters, rate limiting.
C-language strings are plain char arrays, while Redis wraps them in SDS (simple dynamic string). The sds source is as follows:
struct sdshdr {
    unsigned int len;   // length of the used portion of buf
    unsigned int free;  // number of unused bytes in buf
    char buf[];         // the actual bytes
}
Why does Redis choose the SDS structure? Isn't the native C char[] good enough?
Hash (hash)
Introduction: in Redis, the hash type means the value itself is a field-value (k-v) structure. Common command: hget key field.
Internal encodings: ziplist (compressed list), hashtable (hash table)
Application scenarios: caching user information, etc.
List (list)
Introduction: the list type stores multiple ordered strings; a list can hold up to 2^32 - 1 elements. Common command: lrange key start end.
Internal encodings: ziplist (compressed list), linkedlist (linked list). Common list usage patterns:
- lpush + lpop = Stack
- lpush + rpop = Queue
- lpush + ltrim = Capped Collection
- lpush + brpop = Message Queue
Set (set)
Introduction: the set type also stores multiple string elements, but duplicate elements are not allowed. Common command: smembers key.
Internal encodings: intset (integer set), hashtable (hash table)
Zset (sorted set)
Introduction: an ordered set in which each element carries a score used for ranking. Common commands: zadd key score member [score member ...], zrank key member.
Internal encodings: ziplist (compressed list), skiplist (skip list)
Why is Redis so fast
We all know that memory reads and writes are much faster than disk. Compared with MySQL, which stores data on disk, the memory-based Redis avoids disk I/O entirely, so it is far faster.
We also know that, to improve efficiency, MySQL indexes use the B+ tree data structure. A well-chosen data structure really can make a program faster. Let's first look at the data structures & internal encodings of Redis:
SDS simple dynamic string
- String length: SDS records the length in its len field, so Redis gets a string's length in O(1); in C the string must be traversed from the start, which is O(n).
- Space pre-allocation: the more often a string is modified, the more often memory is allocated, which costs performance. When an SDS is modified and expanded, extra unused space is allocated in advance, reducing that cost.
- Lazy space release: when an SDS is shortened, the surplus memory is not reclaimed; the free field records it. If the string grows again later, the space recorded in free is reused, avoiding a fresh allocation.
- Binary safety: Redis can store binary data. In C a string ends at the first '\0', while in SDS the len field marks the end, so embedded zero bytes are fine.
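The len/free bookkeeping described above can be sketched as a toy Python model. This is not the real C implementation, and the growth rule is simplified (real SDS doubles only below 1 MB, then grows in 1 MB steps); the class and method names are made up for illustration.

```python
class SDS:
    """Toy model of sdshdr: O(1) length, space pre-allocation, lazy release."""

    def __init__(self, data: bytes = b""):
        self.buf = bytearray(data)  # the stored bytes (binary safe: b'\x00' is fine)
        self.len = len(data)        # used length, readable in O(1)
        self.free = 0               # invariant: self.len + self.free == len(self.buf)

    def length(self) -> int:
        return self.len             # O(1), unlike C's strlen

    def append(self, data: bytes) -> None:
        need = len(data)
        if need > self.free:
            # Space pre-allocation: reserve as much spare room as the new length.
            new_len = self.len + need
            self.buf = self.buf[:self.len] + bytes(data) + bytes(new_len)
            self.len = new_len
            self.free = len(self.buf) - self.len
        else:
            self.buf[self.len:self.len + need] = data
            self.len += need
            self.free -= need

    def shorten(self, new_len: int) -> None:
        # Lazy space release: keep the memory, just account for it in free.
        self.free += self.len - new_len
        self.len = new_len

    def value(self) -> bytes:
        return bytes(self.buf[:self.len])
```

Appending after a shorten reuses the space recorded in free instead of allocating again.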
Dictionary
Redis is a K-V in-memory database, and all key-value pairs are stored in a dictionary. A dictionary works like a hash table (think of Java's HashMap): the value is found directly from the key, and thanks to the hash table's properties the lookup takes O(1) time.
Skip list
- The skip list is one of Redis's distinctive data structures. It builds multi-level indexes on top of a linked list to improve search efficiency.
- It supports node lookup in O(log N) on average and O(N) in the worst case, and can also process nodes in batches through ordered traversal.
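A minimal skip list can be sketched in Python. This is illustrative, not Redis's zskiplist: the 0.25 promotion probability matches Redis's choice, while MAX_LEVEL and the class names here are arbitrary.

```python
import random

class SkipNode:
    def __init__(self, score, value, level):
        self.score = score
        self.value = value
        self.forward = [None] * level   # one forward pointer per index level

class SkipList:
    MAX_LEVEL = 16
    P = 0.25  # probability of promoting a node one level up

    def __init__(self):
        self.level = 1
        self.head = SkipNode(float("-inf"), None, self.MAX_LEVEL)

    def _random_level(self) -> int:
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, score, value) -> None:
        # Walk down from the top level, remembering the rightmost node
        # before the insertion point at each level.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].score < score:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = SkipNode(score, value, lvl)
        for i in range(lvl):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, score):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].score < score:
                node = node.forward[i]
        node = node.forward[0]
        return node.value if node and node.score == score else None
```

The upper levels act as an express lane, which is where the average O(log N) lookup comes from.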
Redis supports multiple data types, and each basic type can be backed by several data structures. Which data structure and which encoding to use in which situation is the result of the Redis designers' analysis and optimization.
- String: if an integer is stored, int encoding is used; a non-numeric string of at most 39 bytes uses embstr; anything longer uses raw encoding.
- List: if the list has fewer than 512 elements and every element is smaller than 64 bytes (the defaults), ziplist encoding is used; otherwise linkedlist.
- Hash: if the hash has fewer than 512 elements and all values are smaller than 64 bytes, ziplist encoding is used; otherwise hashtable.
- Set: if all elements are integers and there are fewer than 512 of them, intset encoding is used; otherwise hashtable.
- Zset: when the sorted set has fewer than 128 elements and each element is smaller than 64 bytes, ziplist encoding is used; otherwise skiplist.
I/O Multiplexing
I/O multiplexing lets a single thread efficiently handle many connection requests, and Redis uses epoll as its I/O multiplexing implementation. Moreover, Redis's own event-processing model turns epoll's connect, read, write, and close into events, so no time is wasted blocking on network I/O.
What is I/O multiplexing?
- I/O: Network I/O
- Multi-channel: multiple network connections
- Multiplexing: reusing the same thread.
- I/O multiplexing is a synchronous I/O model in which one thread monitors multiple file handles. As soon as some handle is ready, the application is notified to perform the corresponding read or write; when no handle is ready, the application blocks and yields the CPU.
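The idea can be demonstrated with Python's selectors module (which chooses epoll on Linux): one thread, one selector, several connections. The socketpair setup below is a self-contained stand-in for real network clients, not how Redis is written.

```python
import selectors
import socket

# One thread, one selector, many connections: select() blocks until some
# registered socket is ready, then we handle only the ready ones.
sel = selectors.DefaultSelector()

# Two in-process socket pairs stand in for two client connections.
pairs = [socket.socketpair() for _ in range(2)]
for server_side, _ in pairs:
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ)

# The "clients" write on their ends of the pairs.
pairs[0][1].send(b"ping-1")
pairs[1][1].send(b"ping-2")

# A single select() call reports every connection that became readable.
received = {}
while len(received) < 2:
    for key, _ in sel.select(timeout=1):
        received[key.fileobj] = key.fileobj.recv(1024)

for server_side, client_side in pairs:
    sel.unregister(server_side)
    server_side.close()
    client_side.close()
```

No thread ever busy-waits on an idle connection; the selector wakes the loop only when there is work to do.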
Single-threaded model
Redis builds its own VM mechanism instead of calling general-purpose system functions like an ordinary program, which would waste time moving and requesting memory.
What is the virtual memory mechanism of Redis?
The virtual memory mechanism temporarily swaps rarely accessed data (cold data) out of memory to disk, freeing precious memory for data that needs to be accessed (hot data). The VM feature thus separates hot and cold data: hot data stays in memory while cold data is saved to disk, avoiding the slow access caused by insufficient memory.
Let's first look at a common cache usage pattern: when a read request comes in, check the cache first; on a cache hit, return directly; on a miss, query the database, write the value back into the cache, and then return.
Read cache
Cache penetration: querying data that definitely does not exist. The cache misses, the database is queried and finds nothing, so nothing is written back to the cache; as a result, every request for this nonexistent data goes to the database, putting pressure on it.
In plain terms: when a read request arrives and neither the cache nor the database holds the value, every query for that value penetrates through to the database. That is cache penetration.
Cache penetration is generally caused by the following situations:
How to avoid cache penetration? Generally there are three methods.
Bloom filter principle: it consists of a bit array initialized to zeros and N hash functions. To add a key, run the N hash functions on it and set the N resulting positions in the bit array to 1. To check a key, look at those same positions: if they are all 1, the Bloom filter judges that the key exists (possibly a false positive); if any is 0, the key definitely does not exist.
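The principle above fits in a few lines of Python. This is a sketch: salted SHA-1 stands in for the k hash functions (real implementations use faster hashes such as murmur), and the sizes are arbitrary.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)  # bit array, initially all 0

    def _positions(self, key: str):
        # Derive k positions from salted digests (one simple choice).
        for i in range(self.k):
            h = hashlib.sha1(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)  # set bit to 1

    def might_contain(self, key: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

For the cache-penetration use case, a request whose key fails might_contain can be rejected without ever touching the database.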
Cache avalanche: a large batch of cached data expires at the same time while query volume is huge, so requests hit the database directly, causing excessive pressure or even downtime.
Cache breakdown: a hotspot key expires at some moment, and exactly at that moment a large number of concurrent requests arrive for that key and are all sent to the database.
Cache breakdown looks a bit like cache avalanche. The difference: an avalanche means the database is overloaded or even down because many keys expired at once, while breakdown is a burst of concurrent requests for a single key reaching the database. Breakdown can be seen as a subset of avalanche; some articles put the distinction this way: breakdown targets one hot key's cache, while avalanche targets many keys.
There are two solutions:
What is a hot key? In Redis, we call keys that are accessed very frequently hotspot keys.
If a request for a hotspot key is sent to the server host, due to a particularly large request volume, it may cause insufficient host resources or even downtime, thus affecting normal services.
How is the hotspot Key generated? There are two main reasons:
- The data consumed by users is much greater than the data produced, such as flash sales, hot news and other scenarios where more reading and less writing are required.
- Request sharding is too concentrated and exceeds the capacity of a single Redis server. For example, when a key with a fixed name hashes to one server, a surge of access to it can exceed that machine's limit and cause the hot key problem.
So how to identify hot keys in daily development?
- Determine which hot keys are based on experience;
- Collect and report statistics on the client;
- Report at the service proxy layer.
How to solve the hot key problem?
- Redis cluster expansion: add shard copies to balance read traffic;
- Distribute hot keys to different servers;
- Use a second-level cache, i.e., a JVM local cache, to reduce read requests to Redis.
When we set a key, we can give it an expiration time, for example expire key 60, which specifies that the key expires after 60 seconds. How does Redis handle the key once the 60 seconds are up? Let's first introduce several expiration strategies:
Scheduled expiration
Each key with an expiration time gets its own timer, and the key is cleared the moment it expires. This strategy removes expired data immediately and is very memory-friendly; however, it consumes a lot of CPU to process expirations, hurting cache response time and throughput.
Lazy expiration
The key is checked for expiration only when it is accessed, and cleared if it has expired. This saves the most CPU, but it is very unfriendly to memory: in the extreme case, many expired keys are never accessed again, so they are never cleared and keep occupying memory.
Periodic expiration
At regular intervals, Redis scans a certain number of keys in the expires dictionaries of a certain number of databases and clears the expired ones it finds. This strategy is a compromise between the previous two: by tuning the scan interval and the time limit of each scan, CPU and memory usage can be balanced for different situations.
The expires dictionary holds the expiration data of every key that has an expiration time set: the key is a pointer into the key space (all the keys stored in the Redis server), and the value is that key's expiration time as a millisecond-precision UNIX timestamp.
Redis uses both of these strategies together: lazy expiration plus periodic expiration.
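The combination of the two strategies can be sketched in Python. This is a toy model with an injectable clock, not Redis internals; names like ExpiringStore and periodic_expire are made up for illustration.

```python
import random
import time

class ExpiringStore:
    """Toy cache combining lazy expiration with periodic sampled scans."""

    def __init__(self, clock=time.monotonic):
        self.data = {}
        self.expires = {}   # key -> absolute expiry timestamp, like the expires dict
        self.clock = clock  # injectable for testing

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is not None:
            self.expires[key] = self.clock() + ttl

    def get(self, key):
        # Lazy expiration: check the deadline only when the key is accessed.
        deadline = self.expires.get(key)
        if deadline is not None and self.clock() >= deadline:
            self._remove(key)
            return None
        return self.data.get(key)

    def periodic_expire(self, sample_size=20):
        # Periodic expiration: sample some keys that carry a TTL, purge expired ones.
        now = self.clock()
        sample = random.sample(list(self.expires), min(sample_size, len(self.expires)))
        for key in sample:
            if now >= self.expires[key]:
                self._remove(key)

    def _remove(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)
```

A key that is read lazily expires on access, while keys nobody reads are eventually caught by the periodic sweep.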
But what if periodic deletion misses many expired keys, and lazy deletion never touches them either? Expired keys pile up in memory and can blow it up. Or business volume grows, keys multiply, memory simply runs out, and the ops engineer forgot to add more. Does Redis just die? No! Redis protects itself with 8 memory eviction policies:
- volatile-lru: when memory cannot hold new writes, evict keys that have an expiration time set, using the LRU (least recently used) algorithm;
- allkeys-lru: when memory cannot hold new writes, evict among all keys using the LRU algorithm;
- volatile-lfu: added in 4.0; when memory cannot hold new writes, evict keys that have an expiration time set, using the LFU algorithm;
- allkeys-lfu: added in 4.0; when memory cannot hold new writes, evict among all keys using the LFU algorithm;
- volatile-random: when memory cannot hold new writes, evict randomly among keys that have an expiration time set;
- allkeys-random: when memory cannot hold new writes, evict randomly among all keys;
- volatile-ttl: when memory cannot hold new writes, evict among keys that have an expiration time set according to TTL, earliest-expiring first;
- noeviction: the default; when memory cannot hold new writes, new write operations return an error.
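The allkeys-lru idea can be sketched with an ordered dictionary. Note that real Redis uses an approximate LRU (it samples a few keys rather than tracking exact recency); this exact-LRU sketch, with its invented LruCache name, only illustrates the eviction rule.

```python
from collections import OrderedDict

class LruCache:
    """Sketch of allkeys-lru-style eviction: when full, drop the least recently used key."""

    def __init__(self, max_keys: int):
        self.max_keys = max_keys
        self.data = OrderedDict()  # iteration order == recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)   # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.max_keys:
            evicted, _ = self.data.popitem(last=False)  # least recently used
            return evicted
        return None
```

A volatile-lru variant would simply restrict the eviction candidates to keys that carry a TTL.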
The moment we mention Redis, caching comes to mind; large and mid-sized websites at home and abroad all depend on it. Used well, a cache of hot data not only speeds up site access but also reduces pressure on the database. On top of that, compared with memcached, Redis offers rich data structures and persistence mechanisms such as RDB and AOF.
Today's Internet applications feature all kinds of leaderboards: monthly sales rankings on e-commerce sites, gift rankings in social apps, voting rankings in mini programs, and so on. The zset data type provided by Redis can implement these complex leaderboards.
For example, a daily ranking of users by likes on their uploaded videos could be designed like this:
zadd user:ranking:2021-03-03 Jay 3
zincrby user:ranking:2021-03-03 Jay 1
zrem user:ranking:2021-03-03 John
zrevrange user:ranking:2021-03-03 0 2
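The four commands above can be mimicked in plain Python to check the leaderboard logic. This Leaderboard class is a toy stand-in, not a Redis client; its method names merely mirror the commands.

```python
class Leaderboard:
    """In-memory stand-in for zadd / zincrby / zrem / zrevrange."""

    def __init__(self):
        self.scores = {}  # member -> score

    def zadd(self, member, score):
        self.scores[member] = score

    def zincrby(self, member, delta):
        self.scores[member] = self.scores.get(member, 0) + delta
        return self.scores[member]

    def zrem(self, member):
        self.scores.pop(member, None)

    def zrevrange(self, start, stop):
        # Highest score first, like ZREVRANGE key start stop.
        ranked = sorted(self.scores.items(), key=lambda kv: -kv[1])
        return ranked[start:stop + 1]
```

The real zset keeps this ranking incrementally (skiplist plus hash) instead of re-sorting on every read, which is why it scales to large leaderboards.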
Websites and apps frequently need counters: video play counts, product page views, and so on. These counts usually must be real-time, with +1 on every play or view; under heavy concurrency this challenges a traditional relational database. Redis supports counting natively with excellent performance, making it a prime choice for counter systems.
If a distributed web service keeps each user's session on its own server, a user may have to log in again after every refresh, which is clearly a problem. Instead, Redis can manage sessions centrally: every update or lookup of login information goes straight to Redis.
Almost every Internet company uses distributed deployment. Distributed services run into the problem of concurrent access to the same resource, as in flash sales or decrementing stock when ordering, and distributed locks are the usual answer.
Likes/dislikes, followers, mutual friends/favorites, push feeds, pull-to-refresh and the like are essential features of social sites. Their traffic is usually heavy, and traditional relational databases are a poor fit for this kind of data; the data structures Redis provides implement these features relatively easily.
Message queues are must-have middleware for large websites: ActiveMQ, RabbitMQ, Kafka and other popular systems are used mainly for decoupling, traffic peak shaving, and asynchronous handling of latency-tolerant work. Redis offers publish/subscribe and blocking list operations, which are enough for a simple message queue, though it cannot compare with professional message middleware.
Bitmaps are used in scenarios with hundreds of millions of records: check-ins for hundreds of millions of users, counting distinct logins, whether a user is online, and so on. Tencent has a billion users; how do you check within milliseconds whether one of them is online? Don't even think of creating a key per user and recording status one by one (work out the memory needed and it is terrifying, and there are many similar requirements). The right tools here are the setbit, getbit, and bitcount commands. The principle: build a sufficiently long array in Redis in which each element can only be 0 or 1, and let the array index represent the user ID (which must be a number). This array, hundreds of millions of bits long, then records state through its indexes and element values (0 and 1).
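The setbit/getbit/bitcount idea can be sketched over a plain byte array. This is a sketch of the semantics only; the bit order within a byte here (least significant bit first) differs from Redis's internal layout, and the sizes are illustrative.

```python
class Bitmap:
    """Sketch of setbit / getbit / bitcount semantics over a byte array."""

    def __init__(self, num_bits: int):
        self.bits = bytearray((num_bits + 7) // 8)

    def setbit(self, offset: int, value: int) -> None:
        byte, bit = divmod(offset, 8)
        if value:
            self.bits[byte] |= 1 << bit
        else:
            self.bits[byte] &= ~(1 << bit)

    def getbit(self, offset: int) -> int:
        byte, bit = divmod(offset, 8)
        return (self.bits[byte] >> bit) & 1

    def bitcount(self) -> int:
        return sum(bin(b).count("1") for b in self.bits)

# 10 million users fit in ~1.2 MB: one bit per user id.
online = Bitmap(10_000_000)
online.setbit(15, 1)         # user 15 comes online
online.setbit(9_999_999, 1)  # the highest user id comes online
```

One bit per user is why the memory cost stays tiny even at hundreds of millions of users.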
Redis is a memory-based non-relational K-V database. Precisely because it is memory-based, if the Redis server goes down, the data is lost. To avoid data loss, Redis provides persistence: saving the data to disk.
Redis provides RDB and AOF two persistence mechanisms. Its persistent file loading process is as follows:
RDB is to save the memory data to the disk in the form of a snapshot.
What is a snapshot? You can understand it this way, take a picture of the data at the current moment, and then save it.
RDB persistence means that when the configured number of write operations occurs within the configured time window, Redis writes a snapshot of the in-memory data set to disk. It is Redis's default persistence method. When the operation completes, a dump.rdb file is generated in the configured directory; when Redis restarts, it restores the data by loading that dump.rdb file. The main RDB trigger mechanisms are the following:
Advantages of RDB
Disadvantages of RDB
AOF (append only file) persistence records every write operation as a log appended to a file; on restart, the commands in the AOF file are re-executed to restore the data. It mainly solves the timeliness problem of data persistence. It is not enabled by default.
The workflow of AOF is as follows:
Advantages of AOF
Disadvantages of AOF
When we use Redis in a project, we definitely will not deploy it as a single point, because once that single instance goes down the service becomes unavailable. To achieve high availability, the usual practice is to replicate the data onto multiple servers so that if one fails the others keep serving. Redis has three deployment modes for high availability: master-slave mode, sentinel mode, and cluster mode.
In master-slave mode, Redis is deployed on multiple machines: a master node handles read and write operations, while slave nodes handle reads only. The slaves' data comes from the master; the underlying mechanism is master-slave replication.
Master-slave replication includes full replication and incremental replication. Generally, when a slave connects to the master for the first time (or believes it is the first time), full replication is used. The full replication process is as follows:
After version 2.8 of redis, psync has been used to replace sync, because the sync command consumes system resources and psync is more efficient.
After the slave is fully synchronized with the master, if the data on the master is updated again, incremental replication will be triggered.
When data changes on the master node, the replicationFeedSlaves() function is triggered, and every subsequent command executed on the master is propagated to the slaves through replicationFeedSlaves(). Before running this function, the master checks whether the user's command actually updated data; only if it did, and the slave list is not empty, is the function executed. Its job is to send the command the user executed to all slave nodes so that they execute it too. The process is as follows:
Sentinel mode: a Sentinel system composed of one or more Sentinel instances monitors all Redis master and slave nodes. When a monitored master enters the offline state, a slave under it is automatically promoted to the new master. However, a single sentinel process monitoring a Redis node can itself fail (the single-point problem), so multiple sentinels are used to monitor the Redis nodes, with continuous communication between the sentinels.
Sentinel mode. Simply put, Sentinel mode has three functions. What is the failover process?
Suppose the master goes down and Sentinel 1 detects it first. The system does not fail over immediately: Sentinel 1 merely considers the master unavailable, which is called subjective offline. When subsequent sentinels also detect that the master is unavailable and their number reaches a quorum, the master is marked objectively offline, and the sentinels hold a vote; the sentinel chosen by the vote initiates the failover operation. After the switch succeeds, each sentinel uses publish/subscribe to make the slaves it monitors follow the new master. The whole process is transparent to the client. The working mode of Sentinel is as follows:
Under normal circumstances, each Sentinel will send an INFO command to all Masters and Slaves it knows once every 10 seconds.
When the master is marked objectively offline by Sentinel, the frequency at which Sentinel sends the INFO command to all slaves of the offline master changes from once every 10 seconds to once per second.
If not enough Sentinels agree that the master is offline, its objective offline status is removed; if the master replies validly to a Sentinel's PING command, its subjective offline status is removed.
Sentinel mode is built on top of master-slave mode. It achieves read/write separation, switches over automatically, and gives higher availability. However, every node stores the same data, which wastes memory and makes online scaling hard. Hence the Cluster mode, added in Redis 3.0, which implements distributed storage in Redis: data is sharded, so each Redis node stores different content, solving the online scaling problem. It also provides replication and failover capabilities.
Cluster cluster node communication
A Redis cluster consists of multiple nodes. How do the nodes communicate? Through the Gossip protocol!
The Redis Cluster cluster communicates through the Gossip protocol. The nodes continuously exchange information. The information exchanged includes node failure, new node joining, master-slave node change information, slot information, etc. Commonly used gossip messages are divided into four types: ping, pong, meet, and fail.
- meet message: Notify new nodes to join. The message sender notifies the receiver to join the current cluster. After the meet message communication is completed normally, the receiving node will join the cluster and perform periodic ping and pong message exchanges.
- Ping message: The most frequently exchanged message in the cluster. Each node in the cluster sends ping messages to multiple other nodes every second, which is used to detect whether the nodes are online and exchange status information with each other.
- pong message: When receiving a ping or meet message, it will reply to the sender as a response message to confirm the normal communication of the message. The pong message internally encapsulates its own status data. A node can also broadcast its own pong message to the cluster to notify the entire cluster to update its status.
- fail message: When a node determines that another node in the cluster is offline, it will broadcast a fail message to the cluster. After receiving the fail message, other nodes will update the corresponding node to the offline state.
In particular, each node communicates with the others over the cluster bus, using a special port number: the client-facing port plus 10000. For example, if a node's port is 6379, the port it uses to talk to other nodes is 16379. Node-to-node communication uses a special binary protocol.
Hash slot algorithm
Since the storage is distributed, does Cluster use consistent hashing as its distribution algorithm? No: it uses the hash slot algorithm.
Slot algorithm: the entire database is divided into 16384 slots, and every key-value pair entering Redis is hashed on its key and assigned to one of these 16384 slots. The mapping used is fairly simple: compute a 16-bit value with the CRC16 algorithm, then take it modulo 16384. Each key in the database belongs to one of the 16384 slots, and every node in the cluster handles a share of them.
Each node in the cluster is responsible for part of the hash slots. For example, if the cluster has nodes A, B, and C, each takes roughly 16384/3 slots: node A covers slots 0~5460, node B covers 5461~10922, and node C covers 10923~16383.
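The slot computation can be reproduced exactly: the Redis Cluster spec uses the CRC16/XMODEM variant, plus a "hash tag" rule so that related keys can be forced into one slot. The sketch below implements both; the function names are invented.

```python
def crc16_xmodem(data: bytes) -> int:
    # CRC16/XMODEM: polynomial 0x1021, initial value 0, no reflection.
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Hash tag rule: if the key contains {...} with a non-empty interior,
    # only the part inside the braces is hashed.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384
```

This is why naming keys {user}:profile and {user}:orders keeps them on the same node, which multi-key operations in a cluster require.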
Redis Cluster Cluster
In a Redis Cluster, the nodes covering all 16384 slots must be working normally. If a node fails, the slots it is responsible for fail with it, and the whole cluster cannot work.
Therefore, to ensure high availability, Cluster introduces master-slave replication: one master node has one or more slave nodes. When the other masters ping a master node A and more than half of them time out, node A is considered down. If a master goes down, one of its slaves is enabled in its place.
Each Redis node holds two things: one is the slots, whose values range over 0~16383; the other is cluster, which can be thought of as a cluster-management plugin. When a key arrives, Redis computes a 16-bit value from it with the CRC16 algorithm and takes the result modulo 16384. In this way, each key corresponds to a hash slot numbered between 0 and 16383; that value is used to find the node responsible for the slot, and the request automatically jumps to that node for the access operation.
Although the data is stored separately on different nodes, to the client the whole Cluster appears as a single entity. A client connects to any node, and it looks just like operating a single-instance Redis. When the key a client operates on is not assigned to that node, Redis returns a redirection to the correct node, a bit like a browser's 302 redirect.
Failover
Redis Cluster implements high availability: when a node inside the cluster fails, failover keeps the cluster serving requests normally.
The cluster discovers failures via ping/pong messages. This involves subjective offline and objective offline.
Subjective offline: one node considers another node unavailable, i.e., offline. This is not a final verdict; it only represents one node's opinion and may be a misjudgment.
Subjective offline
Objective offline: a node is marked as genuinely offline once multiple nodes in the cluster agree that it is unavailable, reaching a consensus. If the failed node is a master holding slots, a failover must be performed for it.
The process is as follows:
Objective offline
Failure recovery: once a failure is discovered, if the offline node is a master, one of its slaves must be chosen to replace it so the cluster stays highly available. The process is as follows:
A distributed lock is a lock implementation that controls concurrent access to a shared resource by different processes in a distributed system. Business scenarios such as flash-sale orders and red-envelope grabs all need distributed locks, and our projects often use Redis to implement them.
Here are several implementations of a Redis distributed lock; let's discuss whether each has problems.
if (jedis.setnx(key, lock_value) == 1) {  // acquire the lock
    expire(key, 100);                     // set the expiration time
    try {
        // business logic
    } catch (Exception e) {
    } finally {
        jedis.del(key);                   // release the lock
    }
}
If the process crashes or is restarted for maintenance right after setnx succeeds but before expire runs, the lock becomes "immortal" and no other thread can ever acquire it. So a distributed lock must not be implemented this way.
long expires = System.currentTimeMillis() + expireTime; // current time + the chosen expiration interval
String expiresStr = String.valueOf(expires);
// If the lock does not exist yet, acquiring it succeeds
if (jedis.setnx(key, expiresStr) == 1) {
    return true;
}
// The lock already exists: read its expiration time
String currentValueStr = jedis.get(key);
// If that expiration time is earlier than the current system time, the lock has expired
if (currentValueStr != null && Long.parseLong(currentValueStr) < System.currentTimeMillis()) {
    // The lock has expired: read the previous expiration time and set the new one
    // (see the official docs for Redis's getSet command if it is unfamiliar)
    String oldValueStr = jedis.getSet(key, expiresStr);
    if (oldValueStr != null && oldValueStr.equals(currentValueStr)) {
        // Under concurrency, only the thread whose getSet returns the old value acquires the lock
        return true;
    }
}
// In all other cases, acquiring the lock fails
return false;
I have seen developers implement distributed locks this way, but the scheme has these drawbacks:
With jedis.getSet(), only one client ends up acquiring the lock, but that client's lock expiration time can then be overwritten by other clients.
10.3: set with extended arguments (set ex px nx) (note the possible problems)
if ("OK".equals(jedis.set(key, lock_value, "NX", "EX", 100))) { // acquire the lock
    try {
        // business logic
    } catch (Exception e) {
    } finally {
        jedis.del(key); // release the lock
    }
}
This scheme may still have the following problem:
if ("OK".equals(jedis.set(key, uni_request_id, "NX", "EX", 100))) { // acquire the lock
    try {
        // business logic
    } catch (Exception e) {
    } finally {
        // release the lock only if the current thread acquired it
        if (uni_request_id.equals(jedis.get(key))) {
            jedis.del(key); // release the lock
        }
    }
}
The problem here: checking that the lock was acquired by the current thread and releasing it are not one atomic operation. By the time jedis.del() runs, the lock may no longer belong to this client, and it would release someone else's lock.
A Lua script is generally used instead. The Lua script is as follows:
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
This approach is fairly good, and in most cases it is usable as-is. But there remains the problem of the lock expiring and being released before the business logic finishes (in practice, estimating the processing time usually makes this a non-issue).
So a distributed lock may expire while the business is still running. Some think you can simply set a longer expiration time. But consider instead: could the thread holding the lock start a scheduled daemon thread that periodically checks whether the lock still exists and, if so, extends its expiration time, preventing early release?
The open-source framework Redisson solves exactly this distributed lock problem. Let's look at how Redisson works underneath:
As soon as a thread acquires the lock, a "watch dog" background thread starts; every 10 seconds it checks whether the thread still holds the lock, and if so it keeps extending the lock key's TTL. This is how Redisson solves the "lock expired but business not finished" problem.
Redis is usually deployed as a cluster. Suppose the master dies while master-slave replication is in progress: what problems can the Redis distributed lock run into? Look at this flow:
If thread 1 acquires the lock on the Redis master but the lock key has not yet been synced to the slave, and at that very moment the master fails, a slave is promoted to master. Thread 2 can then acquire a lock on the same key even though thread 1 already holds it; the lock's safety is gone.
To solve this, Redis author antirez proposed an advanced distributed lock algorithm: Redlock. Its core idea:
Deploy multiple Redis masters so they will not all go down at once. These masters are completely independent, with no data replication between them. On each of them, acquire and release the lock using the same method as on a single Redis instance.
Assume there are currently 5 Redis masters, running on 5 servers.
The implementation steps of Redlock are as follows:
- 1. Get the current time in milliseconds.
- 2. Request the lock from the 5 masters in sequence. The client sets a network connection and response timeout smaller than the lock's auto-release time (if the lock auto-releases after 10 seconds, the timeout is usually 5-50 ms; assume 50 ms here). If a master times out, skip it and try the next master as soon as possible.
- 3. The client subtracts the start time (recorded in step 1) from the current time to get the time spent acquiring the lock. The lock counts as acquired if and only if a majority of masters (N/2+1, here 5/2+1 = 3 nodes) granted it and the time spent is less than the lock's validity time. (As in the figure above, 10 s > 30 ms + 40 ms + 50 ms + 40 ms + 50 ms.)
- If the lock was acquired, its real validity time changes: the time spent acquiring it must be subtracted.
- If acquisition fails (fewer than N/2+1 masters granted the lock, or the acquisition time exceeded the validity time), the client unlocks on all masters (even those where locking did not succeed, to catch any stragglers).
Simplified, the steps are:
What is delayed double deletion? The flow chart is as follows:
Delayed double deletion process
How long should the sleep be? Is it always 1 second?
The sleep time = the time the read path needs to fetch its data + a few hundred milliseconds. This ensures the read request has finished, so the write request can delete any stale cache data that the read request may have written back.
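The write path just described can be sketched in Python. This is a toy model (dicts stand in for MySQL and Redis, and the during_sleep hook simulates a racing stale read); all names and the delay value are illustrative.

```python
import time

db = {"stock": 100}      # stands in for MySQL
cache = {"stock": 100}   # stands in for Redis

def read(key):
    # Ordinary cache-aside read: on a miss, refill the cache from the database.
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

def write_delayed_double_delete(key, value, delay_s=0.01, during_sleep=None):
    cache.pop(key, None)       # 1. delete the cache
    db[key] = value            # 2. update the database
    if during_sleep:           # test hook: a concurrent read writing stale data back
        during_sleep()
    time.sleep(delay_s)        # 3. sleep ≈ read latency + a few hundred ms
    cache.pop(key, None)       # 4. delete again, purging any stale value
```

The second delete is the whole point: it removes the stale value a racing read may have written between steps 1 and 2.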
This solution is decent. Only during the sleep window (say, just 1 second) can dirty data exist, and most businesses will accept that. But what if the second cache deletion fails? Cache and database may still end up inconsistent, right? How about also giving the key a natural expire time and letting it age out, with the business accepting inconsistency until then? Or is there a better solution?
Because the second deletion in delayed double delete can fail and leave the data inconsistent, you can optimize with this approach: if a delete fails, retry it a few more times to ensure the cache deletion succeeds. So we introduce a delete-cache retry mechanism.
Delete-cache retry process
A retry mechanism works, but it intrudes heavily into the business code. It can be optimized further: eliminate keys asynchronously via the database's binlog.
Take MySQL as an example:
Redis's adoption of multi-threading does not abandon the single-threaded model: Redis still uses a single thread to process client requests, and multi-threading only handles reading/writing network data and protocol parsing. Command execution remains single-threaded.
The reason: Redis's performance bottleneck is network I/O rather than CPU. Multi-threading improves the efficiency of I/O reads and writes and thereby Redis's overall performance.
Redis implements its transaction mechanism through a group of commands such as MULTI, EXEC, and WATCH. A transaction executes multiple commands at once, and all commands in it are serialized. During transaction execution, the queued commands run serially in order, and command requests from other clients are never inserted into the sequence.
In short, a Redis transaction is a sequential, one-time, exclusive execution of a series of commands in a queue.
The process of Redis transaction execution is as follows:
Redis is a K-V in-memory database that uses a global hash table to hold all key-value pairs. The hash table is made up of multiple hash buckets; each entry in a bucket holds a key pointer and a value pointer, where *key points to the actual key and *value points to the actual value.
The hash table lookup speed is very fast, somewhat similar to HashMap in Java, which allows us to quickly find key-value pairs in O(1) time complexity. First, calculate the hash value through the key, find the corresponding hash bucket location, then locate the entry, and find the corresponding data in the entry.
What is a hash collision?
Hash conflict: different keys produce the same hash value and therefore land in the same hash bucket.
In order to resolve hash conflicts, Redis uses chain hashing. Chained hashing means that multiple elements in the same hash bucket are stored in a linked list, and they are connected in turn using pointers.
Some readers may still have a question: elements on a collision chain can only be found by following pointers one by one. As more data is inserted into the hash table, collisions become more frequent, the collision chains grow longer, and query efficiency degrades.
To maintain efficiency, Redis rehashes the hash table, adding hash buckets to reduce collisions. To make rehashing efficient, Redis keeps two global hash tables by default: one currently in use (the main table) and one reserved for expansion (the backup table). During a rehash, entries are migrated progressively, a few buckets at a time, rather than all at once, so the main thread is not blocked by one large copy.
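The core of a rehash, migrating every entry into a larger table, can be sketched as follows. This is a simplified one-shot version (the function name is an assumption for illustration); real Redis spreads this work out progressively across many requests.

```python
# Sketch of rehashing: move every entry from the old table into a new
# table with twice as many buckets, re-deriving each bucket index.
# Real Redis performs this migration progressively to avoid blocking.
def rehash(old_buckets):
    new_buckets = [[] for _ in range(len(old_buckets) * 2)]  # double the size
    for chain in old_buckets:
        for key, value in chain:
            # recompute the bucket index against the new, larger table
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets

old = [[("a", 1), ("b", 2)], [("c", 3)]]   # 2 buckets, one chain of length 2
new = rehash(old)
print(len(new))   # 4
```

After the migration completes, the roles of the two tables are swapped and the old one is freed for the next expansion.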
Yes, Redis provides two commands to generate an RDB snapshot: save and bgsave. save blocks the main thread while the snapshot is written, whereas bgsave forks a child process that writes the snapshot in the background.
RESP, short for Redis Serialization Protocol, is a serialization protocol designed specifically for Redis. It actually appeared as early as Redis 1.2, but only became the standard Redis communication protocol in Redis 2.0.
RESP's main advantages are a simple implementation, fast parsing, and good human readability.
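As a concrete illustration, a client sends each command as a RESP array of bulk strings: `*<count>` announces the array, and each argument is sent as `$<length>` followed by its bytes. A minimal encoder (an illustrative sketch, not an official client library):

```python
# Encode a Redis command in RESP: an array header "*<n>\r\n" followed by
# each argument as a bulk string "$<len>\r\n<data>\r\n".
def encode_resp(*args):
    parts = [f"*{len(args)}\r\n"]                    # array header: element count
    for arg in args:
        data = str(arg)
        parts.append(f"${len(data)}\r\n{data}\r\n")  # bulk string for each arg
    return "".join(parts)

print(repr(encode_resp("SET", "mykey", "hello")))
# '*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$5\r\nhello\r\n'
```

The length prefixes are what make parsing fast: the server never has to scan for delimiters inside the data, it reads exactly the announced number of bytes.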
To deal with the cache penetration problem, we can use a Bloom filter. What is a Bloom filter?
A Bloom filter is a data structure that occupies very little space. It consists of a long binary vector (a bit array) and a set of hash functions, and is used to test whether an element is in a set. Its space efficiency and query time far exceed those of general algorithms; its disadvantages are a certain false-positive rate and difficulty deleting elements.
What is the principle of a Bloom filter? Suppose we have a set A containing n elements. Using k hash functions, map each element of A to different positions in a bit array B of length a, and set the bits at those positions to 1. To check an element, map it through the same k hash functions: if the bits at all k positions are 1, the element is likely to belong to set A; if any bit is 0, it definitely does not belong to A.
Let's take a simple example. Suppose set A has 3 elements, {d1, d2, d3}, and there is one hash function, Hash1. We map each element of A into an array B that is 16 bits long.
We first map d1: assuming Hash1(d1) = 2, we set the slot with subscript 2 in array B to 1. We then map d2: assuming Hash1(d2) = 5, we also set the slot with subscript 5 to 1.
Then we map d3: assuming Hash1(d3) also equals 2, it sets the slot with subscript 2 to 1 as well.
Therefore, to confirm whether an element dn is in set A, we only need to compute the index Hash1(dn). If the bit there is 0, dn is definitely not in A. What if the bit is 1? Then dn may be in A, but is not guaranteed to be: as we just saw, d1 and d3 both map to index 2, so a 1 at that index could have been set by a different element. This is the Bloom filter's drawback: hash collisions produce false positives, so a "present" judgment can be wrong.
How can we reduce this error?
We add another hash function, Hash2. Assuming Hash2(d1) = 6 and Hash2(d3) = 8, the two elements no longer collide on all of their positions. Since an element is only judged present when every one of its hashed bits is 1, adding hash functions lowers the false-positive rate.
Even with some error, the Bloom filter is attractive: it does not store the complete data, it merely computes positions with a series of hash functions and fills a binary vector. When the data volume is very large, a Bloom filter trades a very small error rate for a huge saving in storage space, which is quite cost-effective.
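The whole idea fits in a short sketch. This is a minimal illustrative implementation (the class name, salt scheme, and parameters are assumptions for this example, not a production library): k hash functions set k bits on add, and a lookup reports "possibly present" only when all k bits are 1.

```python
import hashlib

# Minimal Bloom filter sketch: k hashes set k bits per element;
# membership returns True ("possibly present") only if all k bits
# are 1, and False means "definitely absent".
class BloomFilter:
    def __init__(self, size=64, k=3):
        self.size = size
        self.k = k
        self.bits = [0] * size            # the binary vector

    def _positions(self, item):
        # derive k indices by hashing the item with k different salts
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1            # set every mapped bit

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("d1")
bf.add("d2")
print(bf.might_contain("d1"))    # True
```

Because bits are shared between elements, an unrelated item can still map onto all-1 positions, which is exactly the false positive discussed above; and since bits cannot safely be cleared (another element may rely on them), plain Bloom filters do not support deletion.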
There are open-source libraries implementing Bloom filters, such as Google's Guava and Twitter's Algebird, which can be used out of the box; you can also design your own on top of Redis's built-in Bitmaps.
The above is the detailed content of Summary of 20 classic Redis interview questions and answers (share). For more information, please follow other related articles on the PHP Chinese website!