In distributed concurrent systems, keeping database and cache data consistent is a genuinely hard technical problem. If a complete, industrial-grade distributed transaction solution existed, database-cache consistency would be solved easily; in reality, distributed transactions remain immature.
On database-cache consistency schemes there are many competing voices, centering on two questions:
Should the database be operated first, or the cache first?
Should the cache be updated or deleted?
In a concurrent system's dual-write scenario (database plus cache), the two operations are obviously not performed as one atomic step. To pursue higher concurrency, the first operation completes synchronously and the second is performed asynchronously.
As a mature, industrial-grade data storage solution, a relational database has a complete transaction-processing mechanism. Once the data reaches the disk, it can responsibly be said that, barring hardware failure, the data will not be lost.
A cache is nothing more than data held in memory: once the service restarts, all cached data is gone. Since it is called a cache, be prepared for its loss at all times. Redis does have a persistence mechanism, but can it guarantee 100% durability? Redis persists data to disk asynchronously. A cache is a cache and a database is a database; they are two different things, and using a cache as a database is extremely dangerous.
From the perspective of data safety, the database is therefore operated first, and the cache is operated asynchronously afterwards while responding to the user request.
Updating versus deleting the cache corresponds to the eager (full) style versus the lazy style. From a thread-safety standpoint, deleting the cache is the easier operation to get right. If query performance is acceptable under cache deletion, deleting the cache is preferred.
Although updating the cache can improve query efficiency, the dirty data produced by concurrent threads is troublesome to handle, and solving it properly introduces additional message middleware such as MQ, so it is not recommended unless necessary.
The key to understanding the problems caused by thread concurrency is to first understand system interrupts. When the operating system schedules tasks, interrupts can occur at any time; this is the origin of thread data inconsistency. Take a 4-core, 8-thread CPU as an example: at most 8 threads execute at the same moment, yet the operating system manages far more than 8 threads, so threads proceed in an interleaved, seemingly parallel manner.
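The lost updates caused by such interleaving can be demonstrated in a few lines: two threads increment a shared counter with the non-atomic `count++`, and a context switch between the read and the write silently drops increments. This is a self-contained illustration, not part of the article's order service.

```java
// Demonstrates lost updates from thread interleaving: count++ is a
// read-modify-write sequence, and a context switch between the read
// and the write loses increments.
public class LostUpdateDemo {
    private static int count = 0;

    // Runs two threads that each increment the shared counter 100_000 times.
    static int race() throws InterruptedException {
        count = 0;
        Runnable work = () -> { for (int i = 0; i < 100_000; i++) count++; };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        // Usually prints less than 200000: some increments were lost.
        System.out.println(race());
    }
}
```

Making the increment atomic (`AtomicInteger`, or a lock around the update) removes the loss; the same reasoning applies to the database-plus-cache write sequences discussed below.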
In a non-concurrent environment, there is nothing wrong with the following approach to querying data: first query the cache; if the cached data does not exist, query the database, update the cache, and return the result.
```java
public BuOrder getOrder(Long orderId) {
    String key = ORDER_KEY_PREFIX + orderId;
    BuOrder buOrder = RedisUtils.getObject(key, BuOrder.class);
    if (buOrder != null) {
        return buOrder;
    }
    BuOrder order = getById(orderId);
    RedisUtils.setObject(key, order, 5, TimeUnit.MINUTES);
    return order;
}
```
In a high-concurrency environment, however, this has a serious flaw: when the cache expires, a flood of query requests pours in and all of them hit the DB in an instant. Database connection resources may be exhausted and clients receive 500 errors; in severe cases the database is overloaded and the service goes down.
Therefore, in a concurrent environment the code above needs to be modified to use distributed locks. When a flood of requests arrives, only the thread that acquires the lock gets to query the database; the remaining threads block. Once the data is queried and the cache updated, the lock is released; the waiting threads re-check the cache, find the data, and respond directly from the cache.
Since distributed locks come up here: should we use the equivalent of a table lock or a row lock? Use a distributed row-level lock (one lock per record) to increase concurrency, and use a double-check mechanism so that threads waiting on the lock can return results quickly.
```java
@Override
public BuOrder getOrder(Long orderId) {
    /* If the cache misses, take a distributed lock and rebuild the cache */
    String key = ORDER_KEY_PREFIX + orderId;
    BuOrder order = RedisUtils.getObject(key, BuOrder.class);
    if (order != null) {
        return order;
    }
    String orderLock = ORDER_LOCK + orderId;
    RLock lock = redissonClient.getLock(orderLock);
    if (lock.tryLock()) {
        /* Double check: another thread may have filled the cache already */
        order = RedisUtils.getObject(key, BuOrder.class);
        if (order != null) {
            LockOptional.ofNullable(lock).ifLocked(RLock::unlock);
            return order;
        }
        BuOrder buOrder = getById(orderId);
        RedisUtils.setObject(key, buOrder, 5, TimeUnit.MINUTES);
        LockOptional.ofNullable(lock).ifLocked(RLock::unlock);
    }
    return RedisUtils.getObject(key, BuOrder.class);
}
```
In a concurrent environment, the following code may cause data inconsistency (data being overwritten). Database-level optimistic locking can solve the overwrite problem, but the invalid update traffic still flows to the database.
```java
public Boolean editOrder(BuOrder order) {
    /* Update the database */
    updateById(order);
    /* Delete the cache */
    RedisUtils.deleteObject(OrderServiceImpl.ORDER_KEY_PREFIX + order.getOrderId());
    return true;
}
```
As analyzed above, database optimistic locking solves the overwrite problem in concurrent updates. However, once the same row has been modified, its version number changes, and subsequent concurrent requests flowing to the database are invalid traffic. The primary strategy for reducing database pressure is to intercept that invalid traffic before it reaches the database.
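The version-number mechanism can be sketched in memory. This is a simplified stand-in for a SQL statement of the form `UPDATE bu_order SET ..., version = version + 1 WHERE order_id = ? AND version = ?`; the class and method names are illustrative, not from the article's code.

```java
import java.util.concurrent.atomic.AtomicReference;

// In-memory sketch of database optimistic locking: an update is applied
// only if the version the writer read is still current; otherwise the
// writer has lost the race and its request is "invalid traffic".
public class OptimisticLockSketch {
    // (value, version) pair standing in for a database row
    record Row(String value, long version) {}

    private final AtomicReference<Row> row = new AtomicReference<>(new Row("init", 0));

    // Returns true if the update was applied, false if the version is stale.
    boolean update(String newValue, long expectedVersion) {
        Row current = row.get();
        if (current.version() != expectedVersion) {
            return false; // another writer already advanced the version
        }
        return row.compareAndSet(current, new Row(newValue, expectedVersion + 1));
    }

    public static void main(String[] args) {
        OptimisticLockSketch db = new OptimisticLockSketch();
        System.out.println(db.update("A", 0)); // true: first writer wins
        System.out.println(db.update("B", 0)); // false: version already advanced
    }
}
```

The second call is exactly the invalid traffic the article describes: it reaches the "database" only to be rejected, which is why the next step is to block it earlier with a lock.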
Distributed locks can make concurrent traffic access the database in an orderly manner. Given that optimistic locking is already used at the database level, the second and subsequent threads to acquire the lock would only send invalid traffic to the database.
Threads adopt a timeout-and-exit strategy when acquiring the lock: a thread waiting for the lock times out and exits quickly, responds to the user promptly, and the client retries the update operation.
```java
public Boolean editOrder(BuOrder order) {
    String orderLock = ORDER_LOCK + order.getOrderId();
    RLock lock = redissonClient.getLock(orderLock);
    try {
        /* If the lock is not acquired within the timeout, fail fast and let the client retry */
        if (lock.tryLock(1, TimeUnit.SECONDS)) {
            /* Update the database */
            updateById(order);
            /* Delete the cache */
            RedisUtils.deleteObject(OrderServiceImpl.ORDER_KEY_PREFIX + order.getOrderId());
            /* Release the lock */
            LockOptional.ofNullable(lock).ifLocked(RLock::unlock);
            return true;
        }
    } catch (InterruptedException e) {
        /* Restore the interrupt flag instead of swallowing the exception */
        Thread.currentThread().interrupt();
    }
    return false;
}
```
The code above uses a utility class that encapsulates lock handling, available via the following dependency:

```xml
<dependency>
    <groupId>xin.altitude.cms</groupId>
    <artifactId>ucode-cms-common</artifactId>
    <version>1.4.3.2</version>
</dependency>
```
LockOptional performs subsequent operations based on the state of the lock.
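The real `LockOptional` lives in ucode-cms-common and its implementation may differ; the following is only a hypothetical sketch of the Optional-style API used at the call sites above, built on the standard `Lock` interface.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical sketch of a LockOptional-style helper: wraps a possibly-null
// lock and runs an action only when the lock is present. The real class in
// ucode-cms-common may add checks such as RLock.isHeldByCurrentThread().
public class LockOptionalSketch<T extends Lock> {
    private final T lock;

    private LockOptionalSketch(T lock) {
        this.lock = lock;
    }

    public static <T extends Lock> LockOptionalSketch<T> ofNullable(T lock) {
        return new LockOptionalSketch<>(lock);
    }

    // Runs the action (typically Lock::unlock) only if a lock exists.
    public void ifLocked(Consumer<T> action) {
        if (lock != null) {
            action.accept(lock);
        }
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        // Mirrors the call sites above: unlock only if the lock reference exists.
        LockOptionalSketch.ofNullable(lock).ifLocked(Lock::unlock);
        System.out.println(lock.isLocked()); // false
    }
}
```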
Next, we discuss whether updating the database first and then deleting the cache has a concurrency problem. Consider the following interleaving:
(1) The cache has just expired
(2) Request A queries the database and gets an old value
(3) Request B writes the new value to the database
(4) Request B deletes the cache
(5) Request A writes the old value it read into the cache
The crux of this concurrency problem is that step (5) happens after steps (3) and (4). Given the unpredictable timing of operating-system interrupts, this interleaving can indeed occur.
In practice, writing to Redis takes far less time than writing to the database, so the probability of this interleaving is low, but it can still happen.
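The five steps can be replayed deterministically with two in-memory maps standing in for the database and the cache. This is an illustrative sketch, not the article's code.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of steps (1)-(5): the cache ends up holding the
// stale value even though the database holds the new one.
public class StaleCacheReplay {
    // Returns {databaseValue, cacheValue} after the interleaving.
    static String[] replay() {
        Map<Long, String> db = new HashMap<>();
        Map<Long, String> cache = new HashMap<>();
        db.put(1L, "old");

        // (1) the cache entry for key 1 has just expired: cache is empty
        // (2) request A misses the cache and reads the old value from the DB
        String aValue = db.get(1L);
        // (3) request B writes the new value to the DB
        db.put(1L, "new");
        // (4) request B deletes the cache (a no-op here, it is already empty)
        cache.remove(1L);
        // (5) request A writes its stale read back into the cache
        cache.put(1L, aValue);

        return new String[] { db.get(1L), cache.get(1L) };
    }

    public static void main(String[] args) {
        String[] r = replay();
        // Database and cache now disagree until the entry expires or is rewritten.
        System.out.println(r[0] + " vs " + r[1]); // prints "new vs old"
    }
}
```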
(1) Add a cache expiration time
Giving the cache an expiration time bounds how long dirty data can live: it persists at most until the entry expires. Dirty data may appear again with the next concurrent update, so it exists intermittently rather than permanently.
(2) Updates and queries share a row lock
When updates and queries share a distributed row lock, the problem above no longer exists. While a read request holds the lock, write requests block (and fail fast on timeout), ensuring that step (5) completes before step (3) can begin.
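A single-JVM sketch of this idea, with a per-key `ReentrantLock` standing in for the distributed Redisson `RLock` keyed by order id (class and method names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Both the read-repopulate path and the update-delete path take the SAME
// per-key lock, so step (5) can no longer interleave between (3) and (4).
public class SharedRowLockSketch {
    private final Map<Long, String> db = new ConcurrentHashMap<>();
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    private final Map<Long, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(Long id) {
        return rowLocks.computeIfAbsent(id, k -> new ReentrantLock());
    }

    String read(Long id) {
        String cached = cache.get(id);
        if (cached != null) return cached;
        ReentrantLock lock = lockFor(id);
        lock.lock();
        try {
            // Double check, then repopulate: no writer can run between these steps.
            cached = cache.get(id);
            if (cached != null) return cached;
            String value = db.get(id);
            if (value != null) cache.put(id, value);
            return value;
        } finally {
            lock.unlock();
        }
    }

    void update(Long id, String value) {
        ReentrantLock lock = lockFor(id);
        lock.lock();
        try {
            db.put(id, value);
            cache.remove(id);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        SharedRowLockSketch s = new SharedRowLockSketch();
        s.update(1L, "v1");
        System.out.println(s.read(1L)); // v1 (repopulated from the db)
        s.update(1L, "v2");
        System.out.println(s.read(1L)); // v2 (stale entry deleted under the lock)
    }
}
```

The trade-off is reduced read concurrency on hot keys, which is why the article treats this as one option among several rather than the default.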
(3) Delayed cache deletion
Use RabbitMQ to delete the cache again after a delay, removing the impact of step (5). Because this is done asynchronously, it has almost no impact on performance.
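The article's mechanism is a RabbitMQ delayed message; the "delayed double delete" idea itself can be sketched without MQ using a scheduler (an illustrative stand-in, not the article's implementation): delete the cache right after the database write, then delete it again after a delay so that a stale value written back by step (5) is also removed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Delayed double delete with a scheduler standing in for a RabbitMQ
// delayed message: the second delete sweeps away any stale write-back.
public class DelayedDoubleDelete {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Called right after the database write: delete now, and again later.
    void afterDbUpdate(Long id, long delayMillis) {
        cache.remove(id); // first delete
        scheduler.schedule(() -> cache.remove(id), delayMillis, TimeUnit.MILLISECONDS);
    }

    void cachePut(Long id, String value) { cache.put(id, value); }
    String cacheGet(Long id) { return cache.get(id); }
    void shutdown() { scheduler.shutdown(); }

    public static void main(String[] args) throws InterruptedException {
        DelayedDoubleDelete d = new DelayedDoubleDelete();
        d.afterDbUpdate(1L, 50);
        d.cachePut(1L, "stale");  // simulates step (5): a reader writes back an old value
        Thread.sleep(200);        // wait past the scheduled second delete
        System.out.println(d.cacheGet(1L)); // null: the stale entry was removed
        d.shutdown();
    }
}
```

With RabbitMQ the schedule call is replaced by publishing a delayed message whose consumer performs the second delete, which also survives an application crash between the database write and the in-process delete.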
The database has a transaction mechanism to guarantee that its operation succeeds or rolls back, and a single Redis command is atomic, but a database write combined with a Redis delete is not atomic. Concretely, the database operation may succeed and then the application crashes before the Redis cache is deleted; the same problem occurs when the connection to the Redis service times out.
If a cache expiration time is set, the dirty data exists until the cache expires. If no expiration time is set, the dirty data exists until the next time the data is modified (the database has changed but the cache has not been updated).
Before operating the database, write a delayed cache-deletion message to RabbitMQ, then perform the database operation and the cache deletion. Whether or not the code-level cache deletion succeeds, the MQ consumer deletes the cache as a backstop.
The above is the detailed content of the database and cache data consistency scheme for Java concurrent programming, originally published on the PHP Chinese website.