The "Golden September, Silver October" hiring season is almost here. This article shares 20 classic Redis interview questions. I hope it helps!
Redis, whose full English name is Remote Dictionary Server, is an open-source, network-capable, memory-based key-value database written in ANSI C, with optional log-style persistence, and it provides APIs in many languages.
Unlike MySQL, Redis stores its data in memory. Its reads and writes are very fast: it can handle more than 100,000 read/write operations per second. Redis is therefore widely used as a cache, and it is also often used for distributed locks. In addition, Redis supports transactions, persistence, Lua scripts, LRU eviction, and various cluster solutions.
2. The basic data structure types of Redis
String (string)
Introduction: String is the most basic Redis data type. It is binary safe and can store images or serialized objects; a single value can be at most 512 MB. Common commands: set key value, get key, etc.
Application scenarios: shared sessions, distributed locks, counters, rate limiting.
In C, a string is a null-terminated char array, while Redis wraps strings in SDS (simple dynamic string). The sds source code is as follows:
struct sdshdr{
unsigned int len; // length of buf that is in use
unsigned int free; // number of unused bytes in buf
char buf[]; // array that holds the actual string data
}
Why does Redis choose the SDS structure instead of C's native char[]?
Hash (hash)
Introduction: in Redis, the hash type means that the v (value) is itself a key-value (field-value) structure.
Common commands: hset key field value, hget key field
Internal encodings: ziplist (compressed list), hashtable (hash table)
Application scenarios: caching user information, etc.
List (list)
Introduction: the list type stores multiple ordered strings; a list can hold up to 2^32 - 1 elements. Common command: lrange key start end.
Internal encodings: ziplist (compressed list), linkedlist (linked list). Typical list usage patterns:
- lpush + lpop = Stack (stack)
- lpush + rpop = Queue (queue)
- lpush + ltrim = Capped Collection (bounded collection)
- lpush + brpop = Message Queue (blocking message queue)
Set (set)
Introduction: the set type stores multiple unordered, unique string elements.
Common commands: sadd key element [element ...], smembers key
Internal encodings: intset (integer set), hashtable (hash table)
Zset (sorted set)
Introduction: a sorted set of unique string elements, each ordered by an associated score.
Common commands: zadd key score member [score member ...], zrank key member
Internal encodings: ziplist (compressed list), skiplist (skip list)
We all know that memory reads and writes are much faster than disk reads and writes. Redis stores its data in memory, while MySQL stores its data on disk, so Redis avoids the cost of disk I/O.
We know that to improve efficiency, MySQL chose the B+ tree data structure for its indexes. A reasonable data structure can make your application faster. Let's first look at Redis's data structures and internal encodings:
- String length: Redis gets a string's length in O(1) (from the len field), whereas a C string must be traversed from the start, which is O(n);
- Space pre-allocation: the more often a string is modified, the more often memory must be allocated, which costs performance. When an SDS is modified and expanded, it allocates extra unused space in advance, reducing this cost.
- Lazy space release: when an SDS is shortened, the surplus memory is not reclaimed immediately; free records the surplus space, and later modifications can use the recorded space directly instead of allocating again.
- Binary safety: Redis can store binary data. A C string ends at the first '\0', whereas SDS uses the len attribute to mark the end of the string.
Redis is a key-value in-memory database, and all key-value pairs are stored in a dictionary. A dictionary is a hash table (like Java's HashMap), so a value can be fetched directly by its key; a hash table lookup takes O(1) time.
- The skip list is one of Redis's distinctive data structures. It builds multi-level indexes on top of a linked list to improve search efficiency.
- A skip list supports node lookup in O(log N) on average and O(N) in the worst case, and can also process nodes in batches through sequential operations.
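To make the multi-level index idea concrete, here is a minimal, illustrative skip list in Java. It is only a sketch of the principle, not Redis's actual zskiplist (which also stores scores, spans, and backward pointers); all names here are made up for illustration:

```java
import java.util.Random;

// Minimal skip list sketch: each node carries forward pointers at several levels,
// so a search can skip over many nodes before dropping to the bottom level.
class SkipList {
    static final int MAX_LEVEL = 8;

    static class Node {
        int value;
        Node[] forward; // forward[i] = next node at level i
        Node(int value, int level) { this.value = value; this.forward = new Node[level]; }
    }

    final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL);
    final Random rnd = new Random(42); // fixed seed so the sketch is deterministic
    int level = 1;

    // Each extra level is kept with probability 1/2, giving O(log N) expected search.
    int randomLevel() {
        int lvl = 1;
        while (lvl < MAX_LEVEL && rnd.nextBoolean()) lvl++;
        return lvl;
    }

    void insert(int value) {
        Node[] update = new Node[MAX_LEVEL];
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {
            while (x.forward[i] != null && x.forward[i].value < value) x = x.forward[i];
            update[i] = x; // rightmost node before the insertion point at level i
        }
        int lvl = randomLevel();
        if (lvl > level) {
            for (int i = level; i < lvl; i++) update[i] = head;
            level = lvl;
        }
        Node n = new Node(value, lvl);
        for (int i = 0; i < lvl; i++) {
            n.forward[i] = update[i].forward[i];
            update[i].forward[i] = n;
        }
    }

    boolean contains(int value) {
        Node x = head;
        for (int i = level - 1; i >= 0; i--)
            while (x.forward[i] != null && x.forward[i].value < value) x = x.forward[i];
        x = x.forward[0];
        return x != null && x.value == value;
    }
}
```

The search descends from the highest index level; only when it can go no further right does it drop a level, which is exactly the "multi-level index over a linked list" described above.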
Redis supports multiple data types, and each basic type may be backed by several data structures. When to use which data structure and which encoding is the result of the Redis designers' analysis and optimization.
- String: if the value is a number, int encoding is used; for non-numeric values of at most 39 bytes, embstr is used; above 39 bytes, raw encoding is used.
- List: if the list has fewer than 512 elements and every element is smaller than 64 bytes (the defaults), ziplist encoding is used; otherwise linkedlist encoding is used.
- Hash: if the hash has fewer than 512 elements and all values are smaller than 64 bytes, ziplist encoding is used; otherwise hashtable encoding is used.
- Set: If the elements in the set are all integers and the number of elements is less than 512, use intset encoding, otherwise use hashtable encoding.
- Zset: when the sorted set has fewer than 128 elements and every element is smaller than 64 bytes, ziplist encoding is used; otherwise skiplist (skip list) encoding is used.
I/O multiplexing
I/O multiplexing lets a single thread efficiently handle many connection requests, and Redis uses epoll as its I/O multiplexing implementation. Moreover, Redis's own event-handling model turns the connections, reads, writes, and closes in epoll into events, so it does not waste time on network I/O.
What is I/O multiplexing?
- I/O: Network I/O
- Multi-channel: multiple network connections
- Multiplexing: reusing the same thread.
- I/O multiplexing is actually a synchronous I/O model: one thread monitors multiple file handles. As soon as a handle is ready, the application is notified to perform the corresponding read or write; when no handle is ready, the application blocks and yields the CPU.
Single-threaded model
Redis builds its own VM mechanism rather than calling system functions the way an ordinary program would, so it avoids spending time on memory moves and requests.
What is the virtual memory mechanism of Redis?
The virtual memory mechanism temporarily swaps infrequently accessed data (cold data) from memory to disk, freeing valuable memory for data that does need to be accessed (hot data). The VM feature thus separates hot and cold data: hot data stays in memory while cold data goes to disk, avoiding the slow access caused by insufficient memory.
Let's first look at a common way of using a cache: when a read request arrives, check the cache first; on a hit, return the cached value directly. On a miss, query the database, write the database value back to the cache, and then return.
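This read path can be sketched with two plain maps standing in for Redis and MySQL (the class and field names are made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal cache-aside read path: check cache, fall back to DB, backfill cache.
class CacheAside {
    final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    final Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    String read(String key) {
        String value = cache.get(key);
        if (value != null) return value;          // cache hit: return directly
        value = db.get(key);                      // cache miss: query the database
        if (value != null) cache.put(key, value); // backfill the cache, then return
        return value;
    }
}
```

Note that when the key exists in neither map, nothing is written back, which is precisely the gap that cache penetration exploits.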
Cache penetration: querying data that definitely does not exist. The cache misses, so the database is queried; the data is not found, so nothing is written to the cache. As a result, every request for this non-existent data goes to the database, putting pressure on it.
In plain terms: when a read request arrives and neither the cache nor the database holds the value, every query for that value penetrates through to the database. That is cache penetration.
Cache penetration is generally caused by the following situations:
How to avoid cache penetration? Generally there are three methods.
Bloom filter principle: it consists of a bit array initialized to 0 and N hash functions. To add a key, compute its N hash values and set the corresponding N positions in the bit array to 1. On lookup, if all of those positions are 1, the Bloom filter judges that the key may exist; if any position is 0, the key definitely does not exist.
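Here is a minimal sketch of that principle in Java, using a BitSet and k derived hash functions. The class name and the hashing scheme (the Kirsch-Mitzenmacher double-hashing trick) are illustrative assumptions, not Redis's or any particular library's implementation:

```java
import java.util.BitSet;

// Minimal Bloom filter: k hash functions set k bits on add; a lookup checks them all.
class BloomFilter {
    final BitSet bits;
    final int size;
    final int hashCount;

    BloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th hash from two base hashes: h(i) = h1 + i * h2.
    int hash(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // force odd so the stride is nonzero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) bits.set(hash(key, i));
    }

    // false => key definitely absent; true => key MAY exist (false positives possible)
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++)
            if (!bits.get(hash(key, i))) return false;
        return true;
    }
}
```

Placing such a filter in front of the cache lets requests for keys that were never stored be rejected without touching the database; the trade-off is a tunable false-positive rate.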
Cache avalanche: a large batch of cached data expires at the same time while the query volume is huge, so all requests go straight to the database, overloading it or even bringing it down.
Cache breakdown: a hotspot key expires at some point in time, and exactly at that moment there are a large number of concurrent requests for that key, so a flood of requests hits the DB.
Cache breakdown looks similar to cache avalanche. The difference is that a cache avalanche overloads or even brings down the database, while cache breakdown is "merely" a burst of concurrent requests reaching the DB; breakdown can be seen as a subset of avalanche. Some articles instead distinguish them by scope: breakdown targets a single hot key's cache, while an avalanche involves many keys.
There are two solutions:
What is a hot key? In Redis, keys with a very high access frequency are called hot keys.
If requests for a hot key all land on one server host, the sheer request volume may exhaust the host's resources or even bring it down, affecting normal service.
How is the hotspot Key generated? There are two main reasons:
- The data users consume is much greater than the data produced, e.g. flash sales, breaking news, and other read-heavy, write-light scenarios.
- Request sharding is concentrated and exceeds the capacity of a single Redis server, e.g. a fixed key name whose hash lands on one server; the instantaneous traffic exceeds the machine's limit and causes a hot key problem.
So how to identify hot keys in daily development?
- Judge which keys are hot based on experience;
- collect statistics in the client and report them;
- collect and report statistics at the service proxy layer.
How to solve the hot key problem?
- Redis cluster expansion: add shard replicas to balance the read traffic;
- distribute the hot key across different servers;
- use a second-level cache, i.e. a JVM-local cache, to reduce Redis read requests.
When we set a key, we can give it an expiration time, e.g. expire key 60, meaning the key expires after 60 seconds. How does Redis handle the key after those 60 seconds? Let's first go through the expiration strategies:
Timed expiration: each key with an expiration time gets its own timer, and the key is removed the instant it expires. This clears expired data immediately and is very memory-friendly, but it consumes a lot of CPU to process expirations, hurting cache response time and throughput.
Lazy expiration: a key's expiry is checked only when the key is accessed, and it is removed if found expired. This saves the most CPU, but it is very unfriendly to memory: in the extreme case, many expired keys are never accessed again, so they are never cleared and keep occupying memory.
Periodic expiration: at fixed intervals, a certain number of keys in the expires dictionaries of a certain number of databases are scanned, and the expired ones are cleared. This strategy is a compromise between the first two: by tuning the scan interval and the time limit of each scan, the CPU/memory trade-off can be balanced for different situations.
The expires dictionary stores the expiration data of every key that has an expiration time set: the key is a pointer to a key in the key space, and the value is the millisecond-precision UNIX timestamp at which that key expires. (The key space refers to all keys stored in the Redis cluster.)
Redis uses both of the lazy expiration and periodic expiration strategies.
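The lazy half of that strategy can be sketched as follows. The class and method names are made up, and the current time is passed in explicitly (an assumption made here for determinism) instead of reading the system clock:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of lazy expiration: the expiry deadline is only checked when a key is read.
class LazyExpiryCache {
    static class Entry {
        final String value;
        final long expireAt; // absolute deadline in milliseconds
        Entry(String value, long expireAt) { this.value = value; this.expireAt = expireAt; }
    }

    final Map<String, Entry> dict = new HashMap<>();

    // Store a value with a time-to-live, like SET key value EX ttl.
    void setex(String key, String value, long ttlMillis, long now) {
        dict.put(key, new Entry(value, now + ttlMillis));
    }

    String get(String key, long now) {
        Entry e = dict.get(key);
        if (e == null) return null;
        if (now >= e.expireAt) {    // expired: purge lazily, on access
            dict.remove(key);
            return null;
        }
        return e.value;
    }
}
```

The memory downside is visible in the sketch: a key that is never read again also never gets purged, which is why the periodic scan exists alongside it.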
But what if periodic deletion misses many expired keys and lazy deletion never kicks in? Expired keys would pile up and blow up memory. Or sometimes, as the business grows, Redis keys multiply, memory simply runs out, and the ops guy forgets to add more. Would Redis just crash? No! Redis protects itself with 8 memory eviction policies:
- volatile-lru: when memory cannot hold newly written data, evict among keys with an expiration set, using the LRU (Least Recently Used) algorithm;
- allkeys-lru: when memory cannot hold newly written data, evict among all keys using the LRU algorithm;
- volatile-lfu: added in version 4.0; when memory cannot hold newly written data, evict among keys with an expiration set, using the LFU algorithm;
- allkeys-lfu: added in version 4.0; when memory cannot hold newly written data, evict among all keys using the LFU algorithm;
- volatile-random: when memory cannot hold newly written data, evict randomly among keys with an expiration set;
- allkeys-random: when memory cannot hold newly written data, evict randomly among all keys;
- volatile-ttl: when memory cannot hold newly written data, evict keys with an expiration set in order of expiration time, earliest first;
- noeviction: the default policy; when memory cannot hold newly written data, new write operations report an error.
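As a sketch of what the LRU policies mean, Java's access-ordered LinkedHashMap gives an exact LRU in a few lines. Note this is only an analogy: real Redis implements an approximate, sampling-based LRU rather than an exact linked list:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU sketch in the spirit of allkeys-lru: when over capacity,
// the least-recently-used entry is evicted.
class LruCache extends LinkedHashMap<String, String> {
    final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() moves the entry to the tail
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > capacity; // evict the eldest (least recently used) entry
    }
}
```

A `get` counts as a "use", so recently read keys survive eviction while stale ones are dropped first.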
When we mention Redis, we naturally think of caching: medium and large websites at home and abroad all depend on it. Using the cache sensibly, e.g. caching hotspot data, not only speeds up the site but also relieves pressure on the database. Compared with memcached, Redis also offers rich data structures and persistence mechanisms such as RDB and AOF, which are among its strongest features.
Today's Internet applications have all kinds of leaderboards: monthly sales rankings on e-commerce sites, gift rankings in social apps, voting rankings in mini programs, and so on. The zset data type provided by Redis can implement these complex rankings.
For example, for users who upload videos every day, a likes leaderboard could be designed like this:
zadd user:ranking:2021-03-03 Jay 3 (add Jay with 3 likes)
zincrby user:ranking:2021-03-03 Jay 1 (Jay receives one more like)
zrem user:ranking:2021-03-03 John (remove John from the board)
zrevrange user:ranking:2021-03-03 0 2 (show the top 3 by likes)
Major websites and apps often need counters, e.g. play counts for short videos or page views on e-commerce sites. These counts are usually required in real time, incrementing by 1 on every play or view; at high concurrency this is a challenge for a traditional relational database. Redis supports counting natively with excellent performance, making it an important choice for counter systems.
If a distributed web service stores each user's session on its own server, the user may have to log in again after a single refresh, which is clearly a problem. Instead, Redis can manage sessions centrally: every update or query of login state goes directly to Redis.
Almost every Internet company uses distributed deployment, and distributed services run into the problem of concurrent access to the same resource, e.g. flash sales or decrementing stock on order placement.
Likes/dislikes, followers, mutual friends/interests, push notifications, pull-to-refresh and so on are must-have features of social sites. Their traffic is usually large, and traditional relational databases are not well suited to storing this kind of data; the data structures Redis provides make these features relatively easy to implement.
Message queues are essential middleware for large websites, with popular options such as ActiveMQ, RabbitMQ, and Kafka, mainly used for decoupling, traffic peak shaving, and asynchronous processing of low-real-time business. Redis offers publish/subscribe and blocking-queue features that can implement a simple message queue system, though it cannot compare with professional message middleware.
Useful for scenarios with hundreds of millions of records, e.g. sign-ins in a system with hundreds of millions of users, deduplicated login counts, or whether a given user is online. With Tencent's billion users, how do you check within a few milliseconds whether a particular user is online? Don't even think of creating one key per user and recording them one by one (work out the memory required and it is terrifying, and there are many similar requirements). The answer is bit operations: the setbit, getbit, and bitcount commands. The principle: Redis builds a sufficiently long array whose elements can only be 0 or 1, and the array index represents the user id (which must be a number). This array of hundreds of millions of bits, through its indexes and element values (0 and 1), forms a memory system.
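The setbit/getbit/bitcount idea can be sketched with java.util.BitSet; the class is made up for illustration, and its method names merely mirror the Redis commands:

```java
import java.util.BitSet;

// Sketch of the bitmap technique: user ids index into one big bit array,
// so one bit per user records online/offline (or signed-in/not).
class OnlineBitmap {
    final BitSet bits = new BitSet();

    void setbit(int userId, boolean online) { bits.set(userId, online); }

    boolean getbit(int userId) { return bits.get(userId); }

    int bitcount() { return bits.cardinality(); } // how many users are marked online
}
```

One bit per user means a hundred million users fit in roughly 12 MB, which is why this beats one key per user by orders of magnitude.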
Redis is an in-memory, non-relational key-value database. Since it is memory-based, data would be lost if the Redis server went down. To avoid losing data, Redis provides persistence, i.e. saving the data to disk.
Redis offers two persistence mechanisms, RDB and AOF. The loading flow of the persistence files is as follows:
RDB saves the in-memory data to disk in the form of snapshots.
What is a snapshot? Think of it as taking a photo of the data at the current instant and saving it.
RDB persistence means that, within a specified time interval and after a specified number of write operations, a snapshot of the in-memory data set is written to disk; it is Redis's default persistence method. When it runs, a dump.rdb file is generated in the configured directory, and on restart Redis restores the data by loading this dump.rdb file. The main RDB triggers are:
Advantages of RDB
Disadvantages of RDB
AOF (append only file) persistence records every write operation as a log appended to a file; on restart, the commands in the AOF file are re-executed to recover the data. It mainly solves the timeliness of data persistence and is disabled by default.
The workflow of AOF is as follows:
Advantages of AOF
Disadvantages of AOF
Redis has three main deployment modes: master-slave mode, sentinel mode, and cluster mode.
9.1 Master-slave mode
In master-slave mode, Redis is deployed on multiple machines: one master node handles reads and writes, while the slave nodes handle reads only, with their data coming from the master. The underlying implementation is the master-slave replication mechanism.
Master-slave replication includes full replication and incremental replication. Generally, when a slave connects to the master for the first time (or what is considered the first time), full replication is used. The full replication process is as follows:
psync has replaced sync, because the sync command consumes system resources while psync is more efficient.
After the slave is fully synchronized with the master, further data updates on the master trigger incremental replication.
When data is added or removed on the master node, the replicationFeedSlaves() function is triggered: every subsequent command executed on the master is propagated to the slaves through replicationFeedSlaves(). Before executing this function, the master checks whether the user's command actually updates data; only if it does, and the slave list is not empty, is the function executed. Its job is to send the command the user executed to every slave node for them to execute as well. The process is as follows:
Sentinel mode: a Sentinel system composed of one or more Sentinel instances monitors all Redis master and slave nodes. When a monitored master enters the offline state, the system automatically promotes one of that master's slaves to be the new master. However, a single sentinel process monitoring Redis nodes can itself fail (the single-point problem), so multiple sentinels are used to monitor the nodes, and the sentinels also communicate with one another.
Simply put, sentinel mode serves three functions:
What is the failover process?
Suppose the master server goes down and Sentinel 1 detects it first. The system does not fail over immediately: Sentinel 1 merely believes, subjectively, that the master is unavailable, which is called subjective offline. When subsequent sentinels also detect that the master is unavailable and their number reaches a threshold, this is called objective offline, and the sentinels hold a vote; the winning sentinel initiates the failover. After the switch succeeds, each sentinel uses the publish/subscribe mechanism to switch the slaves it monitors over to the new master. The whole process is transparent to the client.
The working mode of Sentinel is as follows:
Each Sentinel sends a PING command once per second to the Master, the Slaves, and the other Sentinel instances it knows about.
If an instance's last valid reply to PING is older than the value of the down-after-milliseconds option, the instance is marked by Sentinel as subjectively offline.
If a Master is marked subjectively offline, all Sentinels monitoring it confirm once per second that it really has entered the subjectively offline state.
When enough Sentinels (at least the number specified in the configuration file) confirm within the specified time range that the Master is indeed subjectively offline, the Master is marked as objectively offline.
Under normal circumstances, each Sentinel will send an INFO command to all Masters and Slaves it knows once every 10 seconds.
When the Master is marked objectively offline by Sentinel, the frequency at which Sentinel sends INFO commands to all slaves of the offline Master is raised from once every 10 seconds to once per second.
If not enough Sentinels agree that the Master is offline, the Master's objectively-offline status is removed; if the Master returns a valid reply to a Sentinel's PING command, its subjectively-offline status is removed.
Sentinel mode builds on master-slave mode: it achieves read/write separation, can fail over automatically, and gives higher availability. However, every node stores the same data, which wastes memory, and online scaling is difficult. Hence Cluster mode was born: added in Redis 3.0, it implements distributed storage for Redis by sharding the data, so that each node stores different content, solving the online-scaling problem. It also provides replication and failover capabilities.
A Redis cluster consists of multiple nodes. How do the nodes communicate? Through the Gossip protocol!
Redis Cluster nodes communicate via the Gossip protocol, continuously exchanging information: node failures, new nodes joining, master-slave changes, slot information, and so on. The common gossip messages come in four types: ping, pong, meet, and fail.
- meet message: notifies a new node to join. The sender invites the receiver to join the current cluster; after the meet exchange completes normally, the receiving node joins the cluster and begins periodic ping/pong message exchanges.
- Ping message: The most frequently exchanged message in the cluster. Each node in the cluster sends ping messages to multiple other nodes every second, which is used to detect whether the nodes are online and exchange status information with each other.
- pong message: When receiving a ping or meet message, it will reply to the sender as a response message to confirm the normal communication of the message. The pong message internally encapsulates its own status data. A node can also broadcast its own pong message to the cluster to notify the entire cluster to update its status.
- fail message: When a node determines that another node in the cluster is offline, it will broadcast a fail message to the cluster. After receiving the fail message, other nodes will update the corresponding node to the offline state.
In addition, each node communicates with the others over a cluster bus, using a special port: the client-facing port plus 10000. For example, if a node's service port is 6379, the port it uses to talk to other nodes is 16379. Inter-node communication uses a special binary protocol.
Since the storage is distributed, does the Cluster use consistent hashing as its distribution algorithm? No: it uses the hash slot algorithm.
Slot algorithm: the whole database is divided into 16384 slots, and every key-value pair entering Redis is hashed by its key and assigned to one of these 16384 slots. The mapping is simple: compute a 16-bit value with the CRC16 algorithm, then take it modulo 16384. Each key in the database belongs to one of the 16384 slots, and every node in the cluster handles a portion of them.
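The key-to-slot mapping can be sketched directly. The CRC16 variant below is the XMODEM one described in the Redis Cluster specification (polynomial 0x1021, initial value 0); note that real Redis also honors hash tags ({...} in key names), which this sketch omits:

```java
// Sketch of the cluster key-to-slot mapping: slot = CRC16(key) mod 16384.
class HashSlot {
    // CRC16/XMODEM: MSB-first, polynomial 0x1021, initial value 0x0000.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep it a 16-bit value
            }
        }
        return crc;
    }

    static int slot(String key) {
        return crc16(key.getBytes()) % 16384;
    }
}
```

The Redis Cluster spec gives CRC16("123456789") = 0x31C3 as the reference check value, which makes the mapping easy to verify.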
Each node in the cluster is responsible for a share of the hash slots. For example, with three nodes A, B, and C, each holds about 16384/3 slots: roughly, A covers slots 0-5460, B covers 5461-10922, and C covers 10923-16383.
In a Redis Cluster, the nodes covering all 16384 slots must be working normally. If a node fails, the slots it is responsible for fail too, and the whole cluster stops working.
Therefore, to ensure high availability, the Cluster introduces master-slave replication: each master node has one or more slave nodes. When the other masters ping a master A and more than half of them time out, A is considered down; if a master goes down, one of its slaves is promoted.
Each Redis node holds two things: the slots, whose values range over 0-16383, and cluster, which can be thought of as a cluster-management plugin. When a key arrives, Redis computes a 16-bit value with the CRC16 algorithm and takes the result modulo 16384; in this way each key corresponds to a hash slot numbered 0-16383. That value locates the node owning the slot, and the request automatically jumps to that node for the operation.
Although the data is stored on different nodes, the cluster appears as a whole to the client: connecting to any node looks the same as operating a single Redis instance. When the key the client operates on is not assigned to the connected node, Redis returns a redirection instruction that ultimately points to the correct node, a bit like a browser's 302 redirect.
Redis Cluster achieves high availability: when a node in the cluster fails, failover keeps the cluster serving requests normally.
The cluster discovers faults through ping/pong messages. This involves subjective offline and objective offline.
Subjective offline: one node considers another node unavailable, i.e. offline. This is not a final fault verdict; it represents only one node's opinion and may be a misjudgment.
Objective offline: the node is genuinely offline; multiple nodes in the cluster agree that it is unavailable and reach a consensus. If the failed node is a master holding slots, a failover must be performed for it.
The process is as follows:
Fault recovery: after a fault is discovered, if the offline node is a master, one of its slaves must be chosen to replace it, ensuring the cluster's high availability. The process is as follows:
A distributed lock is a lock implementation that controls how different processes in a distributed system access shared resources. Business scenarios such as flash-sale orders and grabbing red envelopes all need distributed locks, and in our projects Redis is often used for this.
Here are several ways of implementing a Redis distributed lock; let's discuss whether each has problems.
if (jedis.setnx(key, lock_value) == 1) { // acquire the lock
    expire(key, 100);                    // set the expiration time
    try {
        // do something — business logic
    } catch (Exception e) {
    } finally {
        jedis.del(key);                  // release the lock
    }
}
If the process crashes or restarts for maintenance right after setnx succeeds but before expire sets the expiration time, the lock becomes "immortal": no other thread can ever acquire it. So a distributed lock must not be implemented this way.
long expires = System.currentTimeMillis() + expireTime; // system time + expiration time
String expiresStr = String.valueOf(expires);
// if the lock does not exist yet, locking succeeds
if (jedis.setnx(key, expiresStr) == 1) {
    return true;
}
// if the lock already exists, fetch its expiration time
String currentValueStr = jedis.get(key);
// if that expiration time is earlier than the current system time, the lock has expired
if (currentValueStr != null && Long.parseLong(currentValueStr) < System.currentTimeMillis()) {
    // lock expired: getSet the new expiration time and read back the previous one
    // (if you are unfamiliar with redis's getSet command, see the official docs)
    String oldValueStr = jedis.getSet(key, expiresStr);
    if (oldValueStr != null && oldValueStr.equals(currentValueStr)) {
        // under concurrency, only the thread whose old value matches the current value may lock
        return true;
    }
}
// in all other cases, locking fails
return false;
I have seen developers implement distributed locks this way, but this scheme has these drawbacks:
- with jedis.getSet(), only one client ends up locking successfully, but that client's lock expiration time may be overwritten by other clients.

if (jedis.set(key, lock_value, "NX", "EX", 100) == 1) { // acquire the lock
    try {
        // do something — business logic
    } catch (Exception e) {
    } finally {
        jedis.del(key); // release the lock
    }
}
This scheme may still have the following problem:
if (jedis.set(key, uni_request_id, "NX", "EX", 100) == 1) { // acquire the lock
    try {
        // do something — business logic
    } catch (Exception e) {
    } finally {
        // only release the lock if it was set by the current thread
        if (uni_request_id.equals(jedis.get(key))) {
            jedis.del(key); // release the lock
        }
    }
}
Here, checking that the lock belongs to the current thread and releasing it are not one atomic operation. By the time jedis.del() runs, the lock may no longer belong to the current client, so it could release someone else's lock.
A Lua script is generally used instead. The Lua script is as follows:
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
This approach is fairly good, and in normal situations this implementation is good enough. But it still has the problem that the lock may expire and be released before the business logic finishes (in practice, estimating the processing time usually makes this a non-issue).
A distributed lock may expire and be released before the business finishes. Some would say: just set a somewhat longer expiration time. But consider this instead: give the thread that acquires the lock a scheduled daemon thread that periodically checks whether the lock still exists and, if so, extends its expiration time, preventing the lock from being released early.
The open-source framework Redisson solves exactly this distributed-lock problem. Let's look at how Redisson works under the hood:
As soon as a thread acquires the lock, a "watch dog" background thread starts; every 10 seconds it checks whether thread 1 still holds the lock, and if so keeps extending the lock key's time to live. This is how Redisson solves the "lock expired but business not finished" problem.
Redis is usually deployed as a cluster. Suppose the master dies during master-slave synchronization: what problems can the Redis distributed lock run into? Look at this flow chart:
If thread 1 acquires the lock on the Redis master node, but the lock key has not yet been synchronized to the slave, and at that very moment the master fails, a slave is promoted to master. Thread 2 can then acquire a lock for the same key even though thread 1 already holds it, and the lock's safety is gone.
To solve this problem, Redis's author antirez proposed an advanced distributed lock algorithm: Redlock. Its core idea is this:
Deploy multiple Redis masters to ensure they do not all go down at the same time. These masters are completely independent of one another, with no data synchronization between them. At the same time, make sure locks are acquired and released on these masters with the same method used on a single Redis instance.
Assume there are 5 Redis master nodes, running as Redis instances on 5 servers.
The implementation steps of RedLock are as follows:
- 1. Get the current time in milliseconds.
- 2. Request the lock from the 5 master nodes in turn. The client sets a network connection and response timeout smaller than the lock's expiration time (if the lock auto-expires in 10 seconds, the timeout is usually 5-50 milliseconds; say 50 ms). On timeout, skip that master and try the next one as soon as possible.
- 3. The client computes the time spent acquiring the lock as the current time minus the start time recorded in step 1. The lock is acquired successfully if and only if a majority of the Redis masters (N/2 + 1; here 5/2 + 1 = 3 nodes) granted the lock and the time spent is less than the lock's expiration time.
- 4. If the lock was obtained, the key's real validity time changes: the time spent acquiring the lock must be subtracted from it.
- 5. If acquisition fails (the lock was not obtained on at least N/2 + 1 masters, or the acquisition time exceeded the validity time), the client must unlock on all master nodes (even those where locking did not succeed, so that nothing slips through the net).
The simplified steps are:
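The quorum-and-timing check at the heart of RedLock can be sketched as follows. Each BooleanSupplier stands in for one master's lock attempt (a deliberate simplification; real code would issue SET key value NX PX ttl over the network and measure elapsed time itself):

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Sketch of RedLock's success condition: a majority of masters granted the lock,
// AND the total acquisition time left some validity time on the clock.
class RedLockSketch {
    static boolean tryLock(List<BooleanSupplier> masters, long ttlMillis, long elapsedMillis) {
        int acquired = 0;
        for (BooleanSupplier master : masters) {
            if (master.getAsBoolean()) acquired++; // this master granted the lock
        }
        int quorum = masters.size() / 2 + 1;       // N/2 + 1, e.g. 3 out of 5
        return acquired >= quorum && elapsedMillis < ttlMillis;
    }
}
```

Both conditions matter: a quorum alone is not enough if acquiring it took longer than the lock's validity, because by then the earliest-granted locks may already have expired.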
What is delayed double deletion? The flow chart is as follows:
- Delete the cache first;
- then update the database;
- sleep for a while (for example, 1 second) and delete the cache again.
How long should this sleep be? Is it always 1 second?
The sleep time = the time the read path takes to fetch the data + a few hundred milliseconds. This ensures the read request has finished, so the write request can delete any cached dirty data that the read request may have left behind.
This solution is decent: dirty data can exist only during the sleep window (say, up to 1 second), which most businesses accept. But what if the second cache deletion fails? The cache and database could still end up inconsistent, right? One option is to give the key a natural expire time and let it expire on its own, with the business tolerating inconsistency until then. Or is there a better solution?
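Here is a sketch of the delayed double delete write path, with maps standing in for Redis and MySQL; the names are made up and the sleep is shortened for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of delayed double deletion: delete cache, update DB, sleep, delete again.
class DelayedDoubleDelete {
    final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    final Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    void write(String key, String value) {
        cache.remove(key);  // 1. delete the cache first
        db.put(key, value); // 2. update the database
        try {
            Thread.sleep(50); // 3. sleep roughly "read time + a few hundred ms"
        } catch (InterruptedException ignored) {
            Thread.currentThread().interrupt();
        }
        cache.remove(key);  // 4. delete again, evicting dirty data that a concurrent
                            //    read may have backfilled during the window
    }
}
```

The second delete is the whole point: without it, a read that started before the DB update could write the stale value back into the cache after step 1.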
Because delayed double deletion may fail at the second cache-deletion step and leave the data inconsistent, you can optimize: if the deletion fails, retry it several times to make sure the cache deletion succeeds. So we introduce a cache-deletion retry mechanism:
- A write request updates the database;
- the cache deletion fails for some reason;
- the failed key is put on a message queue;
- a consumer reads from the message queue and obtains the key to delete;
- the consumer retries the cache-deletion operation.
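The steps above can be sketched with an in-process queue standing in for the message queue (all names are illustrative, and the "failure" is injected via a flag for the sake of the sketch):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of the delete-retry mechanism: failed cache deletions are enqueued
// and retried later by a consumer.
class DeleteRetry {
    final Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    final Deque<String> retryQueue = new ArrayDeque<>(); // stand-in for a message queue

    // deleteSucceeds simulates whether the cache delete worked this time
    void deleteWithRetryQueue(String key, boolean deleteSucceeds) {
        if (deleteSucceeds) {
            cache.remove(key);
        } else {
            retryQueue.addLast(key); // enqueue the key whose deletion failed
        }
    }

    // consumer side: drain the queue and retry the deletions
    void consume() {
        while (!retryQueue.isEmpty()) {
            cache.remove(retryQueue.pollFirst());
        }
    }
}
```

In a real system the consumer would also re-enqueue keys whose retry fails again, with a cap on attempts; the binlog-based approach mentioned next moves this logic out of the business code entirely.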
The cache-deletion retry mechanism works, but it intrudes heavily into the business code. It can be optimized further: eliminate keys asynchronously via the database's binlog.
Take MySQL as an example.
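Conceptually, a subscriber tails MySQL's binlog (in real deployments this is a tool such as Canal) and deletes the matching cache keys when rows change, so the business code never touches cache invalidation. The event format below is invented purely for illustration:

```python
cache = {"user:1": "stale", "user:2": "ok"}

def handle_binlog_event(event):
    # event shaped like {"table": "user", "pk": 1, "op": "UPDATE"};
    # on UPDATE/DELETE, evict the corresponding cache key
    if event["op"] in ("UPDATE", "DELETE"):
        cache.pop(f'{event["table"]}:{event["pk"]}', None)

# simulate consuming a stream of row-change events
for ev in [{"table": "user", "pk": 1, "op": "UPDATE"}]:
    handle_binlog_event(ev)

print(cache)  # {'user:2': 'ok'}
```

Deletion retries can still be layered on top by requeueing events whose eviction fails.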
Redis's use of multi-threading does not completely abandon the single-threaded model. Redis still uses a single thread to execute commands and process client requests; multiple threads are used only for network reading/writing and protocol parsing.
The reason is that Redis's performance bottleneck lies in network I/O rather than CPU. Using multiple threads improves the efficiency of I/O reading and writing, and thus the overall performance of Redis.
Redis implements its transaction mechanism through a set of commands such as MULTI, EXEC, and WATCH. A transaction supports executing multiple commands at once, and all commands in a transaction are serialized. During execution, the queued commands run serially in order, and command requests from other clients are never inserted into the transaction's command sequence.
In short, a Redis transaction is a sequential, one-time, exclusive execution of a series of commands in a queue.
The process of Redis transaction execution is as follows:
| Command | Description |
|---|---|
| EXEC | Execute all commands within the transaction block. |
| DISCARD | Cancel the transaction and abandon all commands within the transaction block. |
| MULTI | Mark the beginning of a transaction block. |
| UNWATCH | Cancel the WATCH command's monitoring of all keys. |
| WATCH | Monitor keys; if a key is changed by another command before the transaction executes, the transaction is interrupted. |
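The queue-then-replay behaviour of MULTI/EXEC can be imitated with a small in-process toy (this is a teaching sketch of the semantics, not the real protocol; the `MiniTx` class and its methods are invented names):

```python
class MiniTx:
    def __init__(self, store):
        self.store = store
        self.queue = None                  # None = not inside MULTI

    def multi(self):
        self.queue = []                    # MULTI: start buffering commands

    def set(self, key, value):
        if self.queue is not None:
            self.queue.append(("SET", key, value))
            return "QUEUED"                # not executed yet, only queued
        self.store[key] = value
        return "OK"

    def discard(self):
        self.queue = None                  # DISCARD: drop queued commands

    def exec(self):
        results = []
        for op, key, value in self.queue:  # EXEC: run serially, in order
            self.store[key] = value
            results.append("OK")
        self.queue = None
        return results

store = {}
tx = MiniTx(store)
tx.multi()
print(tx.set("a", 1))   # QUEUED -- nothing written yet
print(store)            # {}
print(tx.exec())        # ['OK']
print(store)            # {'a': 1}
```

Real Redis clients expose the same idea, e.g. redis-py's `pipeline(transaction=True)`.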
As a K-V in-memory database, Redis uses a global hash table to store all key-value pairs. The hash table consists of multiple hash buckets; each entry in a bucket stores pointers, where *key points to the actual key and *value points to the actual value.
Hash table lookup is very fast, somewhat like HashMap in Java: key-value pairs can be found in O(1) time. First compute the hash of the key to locate the corresponding bucket, then locate the entry and find the data it points to.
What is a hash collision?
Hash conflict: different keys produce the same hash value and therefore map to the same hash bucket.
To resolve hash conflicts, Redis uses chained hashing: multiple elements in the same hash bucket are stored in a linked list, connected one after another by pointers.
Some readers may still wonder: elements on a collision chain can only be found by following pointers one by one. As more data is inserted into the hash table, there are more collisions, the collision chains grow longer, and query efficiency drops.
To maintain efficiency, Redis rehashes the hash table, adding hash buckets to reduce collisions. To make rehashing efficient, Redis keeps two global hash tables by default: one in current use (the main table) and one for expansion (the backup table).
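The chaining and rehash ideas above can be sketched as a toy dict (a simplification of Redis's internals: each "linked list" is a Python list, and the rehash here moves everything at once, whereas real Redis rehashes *incrementally* between its two tables):

```python
class ChainedDict:
    def __init__(self, size=4):
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))        # chain on collision
        self.count += 1
        if self.count > len(self.buckets): # load factor > 1: rehash
            self._rehash()

    def get(self, key):
        for k, v in self._bucket(key):     # walk the collision chain
            if k == key:
                return v
        return None

    def _rehash(self):
        old = self.buckets
        self.buckets = [[] for _ in range(len(old) * 2)]  # double buckets
        for bucket in old:
            for k, v in bucket:
                self.buckets[hash(k) % len(self.buckets)].append((k, v))

d = ChainedDict()
for i in range(10):
    d.set(f"k{i}", i)
print(d.get("k7"))   # 7
```

Doubling the bucket count shortens the average chain, which is exactly why rehashing restores O(1) lookups.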
Yes. Redis provides two commands to generate an RDB file: save and bgsave. save blocks the main thread until the snapshot is written, while bgsave forks a child process to write the snapshot so the main thread can keep serving requests.
RESP, short for Redis Serialization Protocol, is a serialization protocol designed specifically for Redis. It actually appeared as early as Redis 1.2, but only in Redis 2.0 did it become the standard for the Redis communication protocol.
RESP mainly has the advantages of simple implementation, fast parsing speed, and good readability.
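To see how simple RESP is, here is a minimal encoder for the request side of the protocol: a client command is sent as an array (`*<n>`) of bulk strings (`$<len>`), each terminated by CRLF. (The function name is made up; the wire format is RESP as specified.)

```python
def encode_resp_command(*args):
    """Encode a client command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]           # array header: element count
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        # bulk string: $<byte-length>\r\n<payload>\r\n
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)

print(encode_resp_command("SET", "name", "redis"))
# b'*3\r\n$3\r\nSET\r\n$4\r\nname\r\n$5\r\nredis\r\n'
```

Length-prefixed frames are why parsing is fast: the reader never scans the payload for delimiters.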
To deal with the cache penetration problem, we can use a Bloom filter. What is a Bloom filter?
A Bloom filter is a data structure that occupies very little space. It consists of a long binary vector and a set of hash mapping functions, and is used to test whether an element is in a set. Its space efficiency and query time are far better than general algorithms; the disadvantages are a certain false positive rate and the difficulty of deletion.
What is the principle of a Bloom filter? Suppose we have a set A with n elements. Using k hash functions, map each element of A to different positions in a bit array B, and set the bits at those positions to 1. If an element to be checked maps, through the same k hash functions, to k positions whose bits are all 1, the element probably belongs to set A; otherwise, it definitely does not belong to A.
Let’s take a simple example. Suppose set A has 3 elements, namely {d1, d2, d3}, and there is 1 hash function, Hash1. Now map each element of A into an array B of length 16 bits.
We first map d1: assuming Hash1(d1) = 2, we set the cell with index 2 in array B to 1. We then map d2: assuming Hash1(d2) = 5, we also set the cell with index 5 in array B to 1.
Next we map d3: assuming Hash1(d3) also equals 2, it likewise marks the cell at index 2 as 1.
So, to check whether an element dn is in set A, we only need to compute Hash1(dn). If the bit at the resulting index is 0, the element is definitely not in set A. If it is 1, the element may be an element of A, because, as you can see, d1 and d3 both mapped to the same index. This is the Bloom filter's drawback: hash collisions cause false positives, so its judgments carry some error.
How to reduce this error?
We add another hash mapping function, Hash2. Assuming Hash2(d1) = 6 and Hash2(d3) = 8, d1 and d3 no longer conflict.
Even so, some error remains, but notice that the Bloom filter does not store the complete data; it merely computes positions through a series of hash mapping functions and fills in a binary vector. When the data volume is very large, the Bloom filter saves a great deal of storage space at the cost of a very small error rate, which is quite cost-effective.
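The scheme described above fits in a few dozen lines. In this sketch, `hashlib.sha256` with per-function salts stands in for the k independent hash functions (the class and parameter names are invented; real libraries size m and k from the target false positive rate):

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits=1024, k_hashes=3):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray(m_bits // 8)   # the m-bit binary vector

    def _positions(self, item):
        # derive k "independent" hash positions by salting one strong hash
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)      # set the bit to 1

    def might_contain(self, item):
        # all k bits set -> "probably in the set"; any 0 -> definitely not
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

bf = BloomFilter()
for word in ("d1", "d2", "d3"):
    bf.add(word)
print(bf.might_contain("d1"))      # True
print(bf.might_contain("absent"))  # almost certainly False
```

Note that the filter stores only bits, never the elements themselves, which is where the space saving comes from.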
There are open-source libraries implementing Bloom filters, such as Google's Guava library and Twitter's Algebird library, which are easy to pick up; it is also possible to design your own implementation based on Redis's own Bitmaps.
The above is the detailed content of Summary and sharing of 20 classic interview questions about Redis (with answer analysis). For more information, please follow other related articles on the PHP Chinese website!