Distributed locks are usually implemented on top of a database, Redis, or Zookeeper.
In actual development, Redis and Zookeeper are the most commonly used, so this article only covers these two.
Before discussing this issue, let us first look at a business scenario:
System A is an e-commerce system, currently deployed on a single machine. It exposes an interface for users to place orders, but the inventory must be checked first to make sure it is sufficient before an order is placed for the user.
Since the system has a certain degree of concurrency, the inventory of the goods is saved in Redis in advance, and the inventory in Redis is updated when a user places an order.
The system architecture at this time is as follows:
But this produces a problem: suppose that at a certain moment the inventory of a certain product is 1, and two requests arrive at the same time. One of the requests has executed step 3 in the above figure and updated the inventory in the database to 0, but has not yet executed step 4.
The other request reaches step 2 and finds that the inventory is still 1, so it continues to step 3.
The result is that 2 items are sold, but in fact there is only 1 item in stock.
Obviously something is wrong! This is a typical inventory oversold problem.
At this point, we can easily think of a solution: use a lock around steps 2, 3, and 4, so that another thread can only enter step 2 after they have all completed.
According to the above figure, when executing step 2, use synchronized or ReentrantLock provided by Java to lock, and then release the lock after step 4 is executed.
In this way, the three steps 2, 3, and 4 are "locked", and multiple threads can only be executed serially.
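The single-machine approach described above can be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the class name LocalLockOrderService is hypothetical, and a plain int field stands in for the inventory that the article keeps in Redis and the database.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical single-JVM order service; the int field is an assumed stand-in
// for the inventory stored in Redis and the database.
class LocalLockOrderService {
    private final ReentrantLock lock = new ReentrantLock();
    private int stock;

    LocalLockOrderService(int initialStock) {
        this.stock = initialStock;
    }

    // Returns true if the order was placed, false if the item is sold out.
    boolean placeOrder() {
        lock.lock();                // taken before step 2
        try {
            if (stock <= 0) {       // step 2: check the inventory
                return false;
            }
            stock--;                // steps 3 and 4: update the inventory
            return true;
        } finally {
            lock.unlock();          // released after step 4
        }
    }

    int remainingStock() {
        lock.lock();
        try {
            return stock;
        } finally {
            lock.unlock();
        }
    }
}
```

Because the check and the update happen while the lock is held, concurrent threads in the same JVM cannot both see an inventory of 1.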
But the good times did not last long, the concurrency of the entire system soared, and one machine could no longer handle it. Now we need to add a machine, as shown below:
After adding the machine, the system becomes as shown in the picture above, my God!
Suppose the requests from two users arrive at the same time but land on different machines. These two requests can then execute at the same time, and the inventory oversold problem occurs again.
Why? Because the two A systems in the picture above run in two different JVMs, the locks they add are only valid for threads in their own JVM and have no effect on threads in other JVMs.
Therefore, the problem here is: the native lock mechanism provided by Java fails in a multi-machine deployment scenario.
This is because the locks added by the two machines are not the same lock (two locks in different JVMs).
Then, as long as we ensure that the locks added by the two machines are the same, won’t the problem be solved?
At this point, it is time for distributed locks to make their grand appearance. The idea of distributed locks is:
Provide a global, unique "thing" for acquiring locks across the entire system. Whenever a system needs to lock, it asks this "thing" for a lock, so that different systems all treat it as the same lock.
As for this "thing", it can be Redis, Zookeeper, or a database.
The text description is not very intuitive, let’s look at the picture below:
Through the above analysis, we know that in the inventory oversold scenario, when the system is deployed in a distributed way, Java's native lock mechanism cannot guarantee thread safety, so we need a distributed lock solution.
So, how to implement distributed locks? Then read on!
The above analyzed why distributed locks should be used. Now let's look specifically at how distributed locks should be implemented.
The most common solution is to use Redis for distributed locks.
The idea of using Redis for distributed locks is roughly this: set a value in Redis to indicate that the lock is held, and then delete the key when the lock is released.
The specific code is as follows:
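    // Acquire the lock
    // NX: succeed only if the key does not exist (the command fails if the key already exists)
    // PX: specify the expiration time in milliseconds
    SET anyLock unique_value NX PX 30000

    // Release the lock by executing a Lua script
    // Releasing the lock involves two commands (GET and DEL), which are not atomic on their own,
    // so we rely on Redis's Lua script support: Redis executes a Lua script atomically
    if redis.call("get",KEYS[1]) == ARGV[1] then
        return redis.call("del",KEYS[1])
    else
        return 0
    end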
There are several important points in this method:
Be sure to use the SET key value NX PX milliseconds command.
If you instead set the value first and then set the expiration time in a second command, the two steps are not atomic. The client may crash before the expiration time is set, which causes a deadlock (the key exists permanently).
The value must be unique.
This is because, when unlocking, the key is deleted only after verifying that the stored value matches the value that was set when locking.
This avoids the following situation: suppose A acquires the lock with an expiration time of 30s. After 35s the lock has already expired automatically, and B may have acquired it in the meantime. When A then goes to release the lock, the value check ensures that client A cannot delete B's lock.
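The two points above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not a Redis client: FakeRedisLockStore is a hypothetical class that only mimics the semantics of SET ... NX PX and of the compare-and-delete Lua script, and the current time is passed in explicitly so that the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a single Redis node (assumption: not a real client).
class FakeRedisLockStore {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();

    // Models "SET key value NX PX ttlMillis": value and expiry are written in one atomic step.
    synchronized boolean setIfAbsent(String key, String value, long ttlMillis, long nowMillis) {
        Entry current = data.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false; // the key exists and has not expired yet
        }
        data.put(key, new Entry(value, nowMillis + ttlMillis));
        return true;
    }

    // Models the release script: delete the key only if the stored value matches.
    synchronized boolean deleteIfValueMatches(String key, String expectedValue, long nowMillis) {
        Entry current = data.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis) {
            data.remove(key); // nothing to release: the key is missing or already expired
            return false;
        }
        if (!current.value.equals(expectedValue)) {
            return false; // the lock now belongs to another client
        }
        data.remove(key);
        return true;
    }
}
```

In real code each client would typically generate its value with something like UUID.randomUUID().toString(), so that no two lock holders ever share a value.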
In addition to how the client implements the distributed lock, you also need to consider how Redis is deployed.
Redis has 3 deployment modes: stand-alone, master-slave with sentinel, and Redis cluster.
The disadvantage of using Redis for distributed locks in stand-alone mode is the single point of failure: as soon as the Redis node fails, locking no longer works.
In master-slave mode, only one node is locked when the lock is taken. Even if high availability is achieved through sentinel, a lock can be lost if the master node fails and a master-slave switch occurs.
With these issues in mind, the author of Redis proposed the RedLock algorithm. Its meaning is roughly as follows:
Assume that Redis is deployed as a Redis cluster with a total of 5 master nodes. A lock is obtained through the following steps:
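As the algorithm is usually described, the client records a start time, tries to take the lock on each master in turn, and treats the lock as acquired only if a majority of the nodes (3 of 5 here) granted it and the time spent doing so is shorter than the lock's validity time. Otherwise it releases the lock on every node. The sketch below only illustrates that majority rule. InMemoryLockNode and RedLockSketch are hypothetical names, the nodes are in-memory stand-ins rather than Redis connections, and per-key expiration and retries are left out for brevity.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for one independent Redis master (assumption).
class InMemoryLockNode {
    private final boolean reachable;
    private final Map<String, String> keys = new HashMap<>();

    InMemoryLockNode(boolean reachable) {
        this.reachable = reachable;
    }

    // Models SET key value NX on this node; an unreachable node never grants the lock.
    synchronized boolean tryLock(String key, String value) {
        if (!reachable) {
            return false;
        }
        return keys.putIfAbsent(key, value) == null;
    }

    // Models the compare-and-delete release: removes the key only if the value matches.
    synchronized void unlock(String key, String value) {
        keys.remove(key, value);
    }

    synchronized boolean holds(String key) {
        return keys.containsKey(key);
    }
}

class RedLockSketch {
    // Returns true only if a majority of nodes granted the lock within the validity time.
    static boolean tryAcquire(List<InMemoryLockNode> nodes, String key, String value, long ttlMillis) {
        long start = System.currentTimeMillis();
        int granted = 0;
        for (InMemoryLockNode node : nodes) {
            if (node.tryLock(key, value)) {
                granted++;
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        int quorum = nodes.size() / 2 + 1;
        if (granted >= quorum && elapsed < ttlMillis) {
            return true;
        }
        release(nodes, key, value); // failed: undo whatever was acquired
        return false;
    }

    static void release(List<InMemoryLockNode> nodes, String key, String value) {
        for (InMemoryLockNode node : nodes) {
            node.unlock(key, value);
        }
    }
}
```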
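However, this algorithm is still quite controversial. It may still have a number of problems, and it cannot guarantee that the locking process is always correct.
In addition, besides implementing a Redis distributed lock yourself on top of the native API of a Redis client, you can also use an open-source framework: Redisson.
Redisson is an enterprise-grade open-source Redis client that also provides distributed lock support. I highly recommend it. Why?
Recall what was said above: if you write your own code to set a value in Redis, it is set with the command SET anyLock unique_value NX PX 30000.
The timeout set here is 30s. If the business logic has not finished after 30s, the key expires and other threads may acquire the lock.
In that case, the second thread comes in before the first thread has finished its business logic, and a thread-safety problem occurs. So we would have to maintain this expiration time ourselves, which is too much trouble.
Let's see how Redisson handles this. First, get a feel for how pleasant Redisson is to use: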
    Config config = new Config();
    config.useClusterServers()
        .addNodeAddress("redis://192.168.31.101:7001")
        .addNodeAddress("redis://192.168.31.101:7002")
        .addNodeAddress("redis://192.168.31.101:7003")
        .addNodeAddress("redis://192.168.31.102:7001")
        .addNodeAddress("redis://192.168.31.102:7002")
        .addNodeAddress("redis://192.168.31.102:7003");

    RedissonClient redisson = Redisson.create(config);

    RLock lock = redisson.getLock("anyLock");
    lock.lock();
    lock.unlock();
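It is that simple: we only need the lock and unlock methods of its API to get a distributed lock, and it takes care of many details for us:
Redisson executes its lock operations through Lua scripts, and Redis executes Lua scripts atomically.
Redisson sets a default expiration time of 30s on the lock key. What if a client holds a lock for longer than 30s?
Redisson has the concept of a watchdog: after you acquire the lock, it resets the key's timeout to 30s every 10 seconds.
This way, even if the lock is held for a long time, the key does not expire and no other thread can acquire the lock in the meantime.
Redisson's "watchdog" logic ensures that no deadlock occurs.
(If the machine goes down, the watchdog goes away with it. The key's expiration time is then no longer extended, so the key expires automatically after 30s and other threads can acquire the lock.)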
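The watchdog idea described above can be sketched with a scheduled task that pushes the expiry forward every third of the lease time (10s for a 30s lease). This is only an illustration of the mechanism, not Redisson's implementation: WatchdogLease is a hypothetical class, and an in-memory timestamp stands in for the TTL of the Redis key.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical lease whose expiry is kept alive by a watchdog task (assumption:
// the timestamp below plays the role of the lock key's TTL in Redis).
class WatchdogLease implements AutoCloseable {
    private final AtomicLong expiresAtMillis = new AtomicLong();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "lock-watchdog");
                thread.setDaemon(true);
                return thread;
            });
    private final ScheduledFuture<?> renewal;

    // leaseMillis must be at least 3 so that the renewal period is positive.
    WatchdogLease(long leaseMillis) {
        expiresAtMillis.set(System.currentTimeMillis() + leaseMillis);
        long period = leaseMillis / 3;
        // Every period, reset the expiry to "now + lease", like the watchdog does for the key.
        renewal = scheduler.scheduleAtFixedRate(
                () -> expiresAtMillis.set(System.currentTimeMillis() + leaseMillis),
                period, period, TimeUnit.MILLISECONDS);
    }

    boolean isHeld() {
        return System.currentTimeMillis() < expiresAtMillis.get();
    }

    // Simulates the holder going away: renewals stop and the lease expires on its own.
    @Override
    public void close() {
        renewal.cancel(false);
        scheduler.shutdownNow();
    }
}
```

While the watchdog runs, isHeld() stays true well past the original lease. Once close() is called, nothing renews the lease and it lapses after at most one lease period.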
Here is a brief excerpt of its implementation code:
    // Locking logic
    private <T> RFuture<Long> tryAcquireAsync(long leaseTime, TimeUnit unit, final long threadId) {
        if (leaseTime != -1) {
            return tryLockInnerAsync(leaseTime, unit, threadId, RedisCommands.EVAL_LONG);
        }
        // Run a Lua script that sets the key and its expiration time
        RFuture<Long> ttlRemainingFuture = tryLockInnerAsync(
                commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(),
                TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_LONG);
        ttlRemainingFuture.addListener(new FutureListener<Long>() {
            @Override
            public void operationComplete(Future<Long> future) throws Exception {
                if (!future.isSuccess()) {
                    return;
                }

                Long ttlRemaining = future.getNow();
                // lock acquired
                if (ttlRemaining == null) {
                    // Watchdog logic
                    scheduleExpirationRenewal(threadId);
                }
            }
        });
        return ttlRemainingFuture;
    }

    <T> RFuture<T> tryLockInnerAsync(long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
        internalLockLeaseTime = unit.toMillis(leaseTime);

        return commandExecutor.evalWriteAsync(getName(), LongCodec.INSTANCE, command,
                "if (redis.call('exists', KEYS[1]) == 0) then " +
                    "redis.call('hset', KEYS[1], ARGV[2], 1); " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return nil; " +
                "end; " +
                "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                    "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                    "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                    "return nil; " +
                "end; " +
                "return redis.call('pttl', KEYS[1]);",
                Collections.<Object>singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
    }

    // The watchdog eventually calls this method
    private void scheduleExpirationRenewal(final long threadId) {
        if (expirationRenewalMap.containsKey(getEntryName())) {
            return;
        }

        // This task runs after a delay of 10s (internalLockLeaseTime / 3)
        Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
            @Override
            public void run(Timeout timeout) throws Exception {
                // This resets the key's expiration time to 30s
                RFuture<Boolean> future = renewExpirationAsync(threadId);
                future.addListener(new FutureListener<Boolean>() {
                    @Override
                    public void operationComplete(Future<Boolean> future) throws Exception {
                        expirationRenewalMap.remove(getEntryName());
                        if (!future.isSuccess()) {
                            log.error("Can't update lock " + getName() + " expiration", future.cause());
                            return;
                        }

                        if (future.getNow()) {
                            // reschedule itself
                            // Calls this method again, so the expiration keeps being extended in a loop
                            scheduleExpirationRenewal(threadId);
                        }
                    }
                });
            }
        }, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);

        if (expirationRenewalMap.putIfAbsent(getEntryName(), new ExpirationEntry(threadId, task)) != null) {
            task.cancel();
        }
    }
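The Lua script in tryLockInnerAsync has three branches: take the lock if the key does not exist, increment a per-thread counter if the caller already holds it (re-entry), and otherwise return the remaining time to live. The following in-memory model mirrors those three branches for illustration. ReentrantLockModel is a hypothetical class, the current time is passed in explicitly, and nothing here talks to Redis.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the lock script's three branches (assumption: the map
// plays the role of the Redis hash stored under the lock key).
class ReentrantLockModel {
    private final Map<String, Integer> holders = new HashMap<>();
    private long expiresAtMillis;

    // Returns null when the lock is acquired (like the script's "return nil"),
    // otherwise the remaining time to live in milliseconds (like "pttl").
    synchronized Long tryLock(String ownerId, long leaseMillis, long nowMillis) {
        if (!holders.isEmpty() && expiresAtMillis <= nowMillis) {
            holders.clear();                              // the key has expired
        }
        if (holders.isEmpty()) {                          // exists == 0
            holders.put(ownerId, 1);                      // hset
            expiresAtMillis = nowMillis + leaseMillis;    // pexpire
            return null;
        }
        if (holders.containsKey(ownerId)) {               // hexists == 1
            holders.merge(ownerId, 1, Integer::sum);      // hincrby
            expiresAtMillis = nowMillis + leaseMillis;    // pexpire
            return null;
        }
        return expiresAtMillis - nowMillis;               // pttl
    }

    synchronized int holdCount(String ownerId) {
        return holders.getOrDefault(ownerId, 0);
    }
}
```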
In addition, Redisson also supports the RedLock algorithm,
and its usage is also very simple:
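    RedissonClient redisson = Redisson.create(config);
    // Note: RedLock is meant to combine locks held on independent Redis nodes,
    // so in practice each RLock would normally come from a separate RedissonClient.
    RLock lock1 = redisson.getFairLock("lock1");
    RLock lock2 = redisson.getFairLock("lock2");
    RLock lock3 = redisson.getFairLock("lock3");
    RedissonRedLock multiLock = new RedissonRedLock(lock1, lock2, lock3);
    multiLock.lock();
    multiLock.unlock();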
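Summary:
This section analyzed a concrete approach to using Redis as a distributed lock, along with some of its limitations, and then introduced Redisson, a Redis client framework. Redisson is what I recommend: it leaves you far fewer details to care about than writing the implementation yourself.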
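Among the common distributed lock implementations, Zookeeper can also be used to implement distributed locks in addition to Redis.
Before introducing how Zookeeper (abbreviated as zk below) implements distributed locks, here is a rough introduction to what zk is:
Zookeeper is a centralized service that provides configuration management, distributed coordination, and naming.
The zk model is as follows: zk contains a series of nodes called znodes. As in a file system, each znode represents a directory, and znodes have the following features: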
Ordered node: If there is currently a parent node named /lock, we can create child nodes under it. Zookeeper provides an optional ordering feature: for example, we can create a child node "/lock/node-" and mark it as ordered, and Zookeeper will then automatically append a monotonically increasing integer sequence number when it generates the child node.
In other words, the first child node created is /lock/node-0000000000, the next one is /lock/node-0000000001, and so on.
Ephemeral (temporary) node: The client can create an ephemeral node. Zookeeper automatically deletes the node after the session ends or times out.
Event monitoring: When reading data, we can set a watch on the node at the same time. When the node's data or structure changes, Zookeeper notifies the client. Zookeeper currently has the following four events: node created, node deleted, node data changed, and child nodes changed.
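For example, if the node obtained by the current thread is /lock/003 and the full node list is [/lock/001, /lock/002, /lock/003], an event listener is added to the node /lock/002.
When the lock is released, the node with the next sequence number is woken up and re-executes step 3 to check whether its own sequence number is now the smallest.
For example, when /lock/001 is released, /lock/002 is notified of the event. The node set is now [/lock/002, /lock/003], so /lock/002 is the node with the smallest sequence number and acquires the lock.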
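The decision each client makes in the example above (am I the smallest node, and if not, which node do I watch?) can be sketched as a small pure function. This is an illustrative sketch only: SequentialLockSketch and nodeToWatch are hypothetical names, no Zookeeper client is involved, and it assumes that all child names share a prefix and carry zero-padded sequence numbers, so that plain string ordering matches sequence ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class SequentialLockSketch {
    // Given every child name under the lock's parent node and our own child name,
    // returns Optional.empty() if ours is the smallest (we hold the lock),
    // otherwise the name of the node just before ours, which we should watch.
    static Optional<String> nodeToWatch(List<String> children, String ourNode) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // assumes zero-padded sequence numbers
        int index = sorted.indexOf(ourNode);
        if (index < 0) {
            throw new IllegalArgumentException("our node is not among the children: " + ourNode);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

Watching only the immediate predecessor means that a release wakes up a single waiter instead of every client waiting on the lock.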
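The whole process is as follows:
That is the concrete implementation idea. The code itself is fairly complex, so it is not posted here.
Curator is an open-source Zookeeper client that also provides a distributed lock implementation.
It is also fairly simple to use: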
    InterProcessMutex interProcessMutex = new InterProcessMutex(client,"/anyLock");
    interProcessMutex.acquire();
    interProcessMutex.release();
The core source code of its distributed lock implementation is as follows:
    private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception {
        boolean haveTheLock = false;
        boolean doDelete = false;
        try {
            if ( revocable.get() != null ) {
                client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
            }

            while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock ) {
                // Get the sorted list of all current child nodes
                List<String> children = getSortedChildren();
                // Get the name of the current node
                String sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash
                // Check whether the current node is the smallest node
                PredicateResults predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
                if ( predicateResults.getsTheLock() ) {
                    // Lock acquired
                    haveTheLock = true;
                } else {
                    // Lock not acquired: register a watcher on the node just before the current node
                    String previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();
                    synchronized(this) {
                        Stat stat = client.checkExists().usingWatcher(watcher).forPath(previousSequencePath);
                        if ( stat != null ) {
                            if ( millisToWait != null ) {
                                millisToWait -= (System.currentTimeMillis() - startMillis);
                                startMillis = System.currentTimeMillis();
                                if ( millisToWait <= 0 ) {
                                    doDelete = true;    // timed out - delete our node
                                    break;
                                }
                                wait(millisToWait);
                            } else {
                                wait();
                            }
                        }
                    }
                    // else it may have been deleted (i.e. lock released). Try to acquire again
                }
            }
        } catch ( Exception e ) {
            doDelete = true;
            throw e;
        } finally {
            if ( doDelete ) {
                deleteOurPath(ourPath);
            }
        }
        return haveTheLock;
    }
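In fact, the underlying principle of Curator's distributed lock is much the same as what was analyzed above. Here is a diagram that describes the principle in detail:
Summary:
This section introduced how Zookeeper implements distributed locks and the basic usage of zk's open-source client, and briefly explained the implementation principle.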
Having looked at the two distributed lock implementations, this section discusses the respective advantages and disadvantages of the Redis and zk solutions.
As for the distributed lock of redis, it has the following shortcomings:
But on the other hand, using Redis to implement distributed locks is very common in many enterprises, and in most cases you will not encounter the so-called "extremely complex scenarios".
So using Redis as a distributed lock is also a good solution. The most important point is that Redis has high performance and can support acquiring and releasing locks under high concurrency.
For zk distributed locks:
But zk also has its shortcomings: if many clients frequently acquire and release locks, the pressure on the zk cluster becomes considerable.
Summary:
In summary, both redis and zookeeper have their advantages and disadvantages. We can use these issues as reference factors when making technology selection.