This article shares techniques for using Redis together with a local cache: an introduction to the cache types, the scenarios each one suits and how to use it, and a practical case at the end. I hope it is helpful.
As we all know, the main purpose of caching is to speed up access and relieve database pressure. The most commonly used cache is a distributed cache such as redis. For most concurrency scenarios, and for small and medium-sized companies whose traffic is not that high, redis basically solves the problem. Under really high traffic, however, you may have to bring in a local cache, such as guava's LoadingCache or Kuaishou's open-source ReloadableCache.
This part introduces the usage scenarios and limitations of redis, guava's LoadingCache, and Kuaishou's open-source ReloadableCache. By the end you should know which cache to use in which business scenario, and why.
Broadly speaking, redis is used wherever user traffic is too high, to speed up access and ease database pressure. Broken down further, the situations divide into single-node problems and non-single-node problems.
Suppose a page gets many visits, but visitors are not accessing the same resource. The user-details page, for example, is visited heavily, yet each user's data is different. In that case, clearly only a distributed cache will do; with redis, the key is the user's unique identifier and the value is the user information.
Cache breakdown caused by redis
One thing to note: an expiration time must be set, and it must not be the same point in time for every key. Say users have an activity page that shows the prizes they have won during an activity. A careless developer might set every user's cached data to expire exactly when the activity ends, which will make all of those keys vanish at the same moment and send the full wave of requests to the database at once.
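One simple remedy is to add random jitter to each key's TTL so that expirations are spread out. Below is a minimal sketch using the Jedis client; the key layout and TTL numbers are illustrative assumptions, not from the original article.

import redis.clients.jedis.Jedis;

import java.util.concurrent.ThreadLocalRandom;

public class ActivityCacheWriter {

    private static final int BASE_TTL_SECONDS = 3600;  // base lifetime of one hour
    private static final int MAX_JITTER_SECONDS = 600; // spread expirations over ten minutes

    private final Jedis jedis = new Jedis("localhost", 6379);

    /** Cache one user's activity data with a jittered TTL so keys never expire together. */
    public void cacheUserActivity(String userId, String activityJson) {
        int ttl = BASE_TTL_SECONDS + ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS);
        jedis.setex("activity:user:" + userId, ttl, activityJson);
    }
}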
Hot key problem
The single-node problem is the concurrency problem of a single redis node: the same key always lands on the same node of the redis cluster, so if that key is accessed too heavily, that one node is at risk under the concurrency. Such a key is called a hot key.
If all users access the same resource, for example the homepage of the Xiao Ai app shows (initially) the same content to everyone and the server returns the same big json to h5, then obviously it must be cached. First, consider whether redis is feasible. Redis has the single-point problem: if traffic is too large, all user requests for that key reach the same redis node, and you must evaluate whether that node can withstand it. Our rule is that once a single key's qps reaches the thousands, the single-point problem must be solved (even though redis claims a single node can withstand qps in the hundred-thousands), and the most common solution is a local cache. The Xiao Ai homepage is clearly below 100 qps, so redis is fine there.
For the hot-key problem above, the most direct approach is a local cache, such as the guava LoadingCache you are probably most familiar with. A local cache, though, requires the business to accept a certain amount of dirty data: if you update the homepage, the local cache does not follow; it only reloads according to its expiration policy. In our scenario that is completely fine, because once the homepage is pushed from the back office it will not be changed again, and even if it does change there is no harm. Set a write expiry of half an hour and let the cache reload after that; dirty data over such a short window is acceptable.
Cache breakdown caused by LoadingCache
A local cache is strongly tied to its machine: although the code sets a half-hour expiry, each machine starts at a different time, so the caches are loaded and expire at different times, and the machines will not all hit the database at the same moment. Breakdown can still happen on a single machine, though. With 10 machines at 1,000 qps each, the instant one machine's cache expires, those 1,000 requests may hit the database together. This problem is easy to solve yet easy to overlook: when setting up LoadingCache, use its load-miss method instead of checking cache.getIfPresent() == null and then querying the db yourself. The former adds a JVM-level lock per key, guaranteeing that only one request goes to the database, which solves the problem neatly.
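To make the contrast concrete, here is a small sketch under assumed names (loadFromDb is an illustrative stand-in for a real db query); it shows the breakdown-prone getIfPresent pattern next to the load-miss style:

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class LoadMissDemo {

    private final LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(1000L)
            .build(new CacheLoader<String, String>() {
                @Override
                public String load(String key) {
                    return loadFromDb(key); // runs under a per-key lock: one loader, others wait
                }
            });

    // Anti-pattern: on a miss, every concurrent request queries the db.
    public String getBreakdownProne(String key) {
        String v = cache.getIfPresent(key);
        if (v == null) {
            v = loadFromDb(key); // N concurrent misses -> N db queries
            cache.put(key, v);
        }
        return v;
    }

    // Preferred load-miss style: LoadingCache serializes loads for the same key.
    public String getSafe(String key) {
        return cache.getUnchecked(key);
    }

    private String loadFromDb(String key) { // illustrative stand-in for a real db query
        return "value-for-" + key;
    }
}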
However, if the real-time requirement is high, say activities are run frequently for a while and the activity page must update in near real time, meaning activity information configured by operations in the back office has to appear on the C side almost immediately, then LoadingCache is definitely not enough.
For real-time requirements that LoadingCache cannot meet, consider ReloadableCache, a local-caching framework open-sourced by Kuaishou. Its biggest feature is support for updating the cache on multiple machines at the same time. Suppose we modify the homepage and the request hits machine A: A reloads its ReloadableCache, then sends out a notification, and the other machines listening on the same zk node receive it and update their caches as well. The general requirement for this cache is that the full data set be loaded into local memory, so if the data set is too large it will put pressure on gc and this approach cannot be used. Since the Xiao Ai homepage has a status, and generally only the homepages in the online status are needed, ReloadableCache can load just the online ones.
The three types of cache have now been introduced. To summarize:

redis: key-diverse data at scale, such as per-user data; mind hot keys and staggered expirations.
guava's LoadingCache: hot-key traffic where a bounded amount of dirty data is acceptable.
Kuaishou's ReloadableCache: hot-key traffic that needs near-real-time updates, provided the full data set fits comfortably in memory.
Although every kind of local cache holds a JVM-level lock to prevent breakdown, accidents always happen in unexpected ways. To be on the safe side, you can use a two-level cache: local cache, then redis, then db.
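As a rough illustration of that read path, here is a hedged sketch combining the two caches already discussed (host, port, TTLs, and the helper names are assumptions):

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import redis.clients.jedis.Jedis;

import java.time.Duration;

/** Sketch of a two-level read path: local cache -> redis -> db. */
public class TwoLevelCache {

    // a JedisPool would be used in practice; a bare client keeps the sketch short
    private final Jedis jedis = new Jedis("localhost", 6379);

    private final LoadingCache<String, String> local = CacheBuilder.newBuilder()
            .maximumSize(10_000L)
            .expireAfterWrite(Duration.ofMinutes(30))
            .build(new CacheLoader<String, String>() {
                @Override
                public String load(String key) {
                    String v = jedis.get(key);      // level 2: redis
                    if (v == null) {
                        v = loadFromDb(key);        // level 3: db
                        jedis.setex(key, 1800, v);  // backfill redis
                    }
                    return v;                       // returning backfills the local cache
                }
            });

    public String get(String key) {
        return local.getUnchecked(key); // level 1: local, with a per-key load lock
    }

    private String loadFromDb(String key) { // illustrative stand-in for a real db query
        return "db-value-for-" + key;
    }
}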
I won't say much about using redis itself; many readers are surely more familiar with its api than I am.
LoadingCache is provided by guava, but there are two points to note. First, if you build the cache without a CacheLoader, you must supply the loader on each call via

V get(K key, Callable<? extends V> loader)

Alternatively, pass the loader at build time with build(CacheLoader<? super K1, V1> loader), after which you can simply call get(key). Second, prefer this load-miss style over checking getIfPresent() == null and then querying the db yourself, which can cause cache breakdown.

LoadingCache<String, String> cache = CacheBuilder.newBuilder()
        .maximumSize(1000L)
        .expireAfterAccess(Duration.ofHours(1L)) // expires after this long without being accessed
        .expireAfterWrite(Duration.ofHours(1L))  // expires after this long without being written
        .build(new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                // how data is loaded on a miss, usually a db query
                return key + " world";
            }
        });

String value = cache.get("hello"); // returns "hello world"
To use ReloadableCache, first import the third-party dependency:
<dependency>
    <groupId>com.github.phantomthief</groupId>
    <artifactId>zknotify-cache</artifactId>
    <version>0.1.22</version>
</dependency>
You then need to read the documentation, otherwise it will not work out of the box. If you are interested, you can also write one yourself.
public interface ReloadableCache<T> extends Supplier<T> {

    /** Get the cached data. */
    @Override
    T get();

    /**
     * Notify all machines to update their caches.
     * Note: if the local cache has not been initialized, this method will not
     * initialize it and reload.
     * To initialize the local cache, call {@link ReloadableCache#get()} first.
     */
    void reload();

    /**
     * Update this machine's local copy of the cache.
     * Note: if the local cache has not been initialized, this method will not
     * initialize and refresh it.
     * To initialize the local cache, call {@link ReloadableCache#get()} first.
     */
    void reloadLocal();
}
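The library does the cross-machine notification through a zookeeper node. Purely as an illustration of that pattern (this is not the library's actual implementation, and it assumes the zk node already exists), a minimal sketch with Apache Curator might look like this:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.cache.NodeCache;

import java.util.function.Supplier;

/** Sketch of the notify-and-reload pattern: every instance watches the same zk node. */
public class ZkReloadableCache<T> implements Supplier<T> {

    private final Supplier<T> loader; // loads the full data set, e.g. from the db
    private volatile T snapshot;      // the local copy served to readers

    public ZkReloadableCache(CuratorFramework client, String zkPath, Supplier<T> loader) throws Exception {
        this.loader = loader;
        this.snapshot = loader.get(); // initial full load
        NodeCache watcher = new NodeCache(client, zkPath);
        // nodeChanged fires on every instance whenever the node's data is touched
        watcher.getListenable().addListener(() -> snapshot = loader.get());
        watcher.start();
    }

    @Override
    public T get() {
        return snapshot;
    }

    /** Call this after modifying the source data; all listening instances then reload. */
    public static void notifyReload(CuratorFramework client, String zkPath) throws Exception {
        client.setData().forPath(zkPath, Long.toString(System.currentTimeMillis()).getBytes());
    }
}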
Cache breakdown, cache penetration, and cache avalanche really are eternal problems, and under large traffic they genuinely need to be considered.
Cache breakdown: simply put, a cache entry expires and a large number of requests hit the database at the same moment. Many solutions to breakdown have already been given above.
Cases 1 and 2 were handled above, as I said; mainly look at case 3: the business can use redis but not a local cache, for example because the data set is too large and the real-time requirement is high. Then, when the cache expires, you have to find a way to ensure that only a small number of requests hit the database. A distributed lock comes to mind naturally, and it is feasible in theory, but in practice there are hidden dangers. Many of us implement distributed locks with redis plus lua and then poll in a while loop; under heavy traffic, that volume of requests makes redis itself a liability and ties up too many business threads. Merely introducing a distributed lock already increases complexity, and our principle is not to use one if we can avoid it.
So can we design an rpc service that works like a distributed lock but is more reliable? When the get method is called, this rpc service ensures that requests for the same key land on the same node, locks there with synchronized, and then loads the data. Kuaishou provides a framework called cacheSetter for this. A simplified version is given below, and it is easy to write one yourself.
import com.google.common.collect.Lists;
import org.apache.commons.collections4.CollectionUtils;

import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;

/**
 * An rpc service that loads caches in a distributed way. If multiple machines are
 * deployed, callers should consistent-hash by id so that requests for the same id
 * hit the same machine.
 */
public abstract class AbstractCacheSetterService implements CacheSetterService {

    private final ConcurrentMap<String, CountDownLatch> loadCache = new ConcurrentHashMap<>();

    private final Object lock = new Object();

    @Override
    public void load(Collection<String> needLoadIds) {
        if (CollectionUtils.isEmpty(needLoadIds)) {
            return;
        }
        CountDownLatch latch;
        Collection<CountDownLatch> loadingLatchList;
        synchronized (lock) {
            loadingLatchList = excludeLoadingIds(needLoadIds);
            needLoadIds = Collections.unmodifiableCollection(needLoadIds);
            latch = saveLatch(needLoadIds);
        }
        System.out.println("needLoadIds:" + needLoadIds);
        try {
            if (CollectionUtils.isNotEmpty(needLoadIds)) {
                loadCache(needLoadIds);
            }
        } finally {
            release(needLoadIds, latch);
            block(loadingLatchList);
        }
    }

    /**
     * Wait on the latches of ids that other threads are currently loading.
     * @param loadingLatchList the CountDownLatches of the ids being loaded elsewhere
     */
    protected void block(Collection<CountDownLatch> loadingLatchList) {
        if (CollectionUtils.isEmpty(loadingLatchList)) {
            return;
        }
        System.out.println("block:" + loadingLatchList);
        loadingLatchList.forEach(l -> {
            try {
                l.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        });
    }

    /**
     * Release the lock.
     * @param needLoadIds ids whose locks should be released
     * @param latch the CountDownLatch through which waiting threads are released
     */
    private void release(Collection<String> needLoadIds, CountDownLatch latch) {
        if (CollectionUtils.isEmpty(needLoadIds)) {
            return;
        }
        synchronized (lock) {
            needLoadIds.forEach(id -> loadCache.remove(id));
        }
        if (latch != null) {
            latch.countDown();
        }
    }

    /**
     * Load the cache, e.g. query the db by id and write the result into redis.
     * @param needLoadIds ids whose caches need loading
     */
    protected abstract void loadCache(Collection<String> needLoadIds);

    /**
     * Bind a CountDownLatch to the ids about to be loaded. Later requests for the
     * same id find the latch in the map and await until this thread finishes loading.
     * @param needLoadIds ids this thread will actually load
     * @return the shared CountDownLatch
     */
    protected CountDownLatch saveLatch(Collection<String> needLoadIds) {
        if (CollectionUtils.isEmpty(needLoadIds)) {
            return null;
        }
        CountDownLatch latch = new CountDownLatch(1);
        needLoadIds.forEach(loadId -> loadCache.put(loadId, latch));
        System.out.println("loadCache:" + loadCache);
        return latch;
    }

    /**
     * Find which ids are already being loaded; threads holding the same ids must wait.
     * @param ids ids whose caches need loading
     * @return the CountDownLatches of the ids that are already loading
     */
    private Collection<CountDownLatch> excludeLoadingIds(Collection<String> ids) {
        List<CountDownLatch> loadingLatchList = Lists.newArrayList();
        Iterator<String> iterator = ids.iterator();
        while (iterator.hasNext()) {
            String id = iterator.next();
            CountDownLatch latch = loadCache.get(id);
            if (latch != null) {
                loadingLatchList.add(latch);
                iterator.remove();
            }
        }
        System.out.println("loadingLatchList:" + loadingLatchList);
        return loadingLatchList;
    }
}
Business implementation
import java.util.Collection;

public class BizCacheSetterRpcService extends AbstractCacheSetterService {
    @Override
    protected void loadCache(Collection<String> needLoadIds) {
        // query the db for these ids
        // then write the results into the cache (e.g. redis)
    }
}
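On the caller side, the flow on a cache miss would look roughly like this; a sketch under assumed names (the "user:" key layout and the bare Jedis client are illustrative):

import redis.clients.jedis.Jedis;

import java.util.Collections;

public class UserInfoReader {

    private final Jedis jedis = new Jedis("localhost", 6379);
    private final CacheSetterService cacheSetterService; // rpc stub, consistent-hashed by id

    public UserInfoReader(CacheSetterService cacheSetterService) {
        this.cacheSetterService = cacheSetterService;
    }

    public String getUserInfo(String userId) {
        String cached = jedis.get("user:" + userId);
        if (cached == null) {
            // blocks until exactly one thread has loaded this id's cache
            cacheSetterService.load(Collections.singletonList(userId));
            cached = jedis.get("user:" + userId); // now served from redis
        }
        return cached;
    }
}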
Cache penetration: simply put, the requested data does not exist in the database at all, so every such invalid request passes straight through to the database.
The solution is also very simple: the method that fetches data from the db (getByKey(K key)) must cache a default value on a miss.
For example, I have a prize pool capped at 10,000. When a user completes a task, money is paid out, recorded via redis, and written to a table. The user can watch the pool's remaining balance in real time on the task page. At the start of the activity the pool is obviously untouched: neither redis nor the db has any payout record, so every request has to check the db. In that case, when nothing is found in the db, a default value of 0 should still be written to the cache.
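A minimal sketch of caching the default, assuming a hypothetical PrizePoolDao and a Jedis client (the key layout and TTL are illustrative, not from the original article):

import redis.clients.jedis.Jedis;

public class PrizePoolQuery {

    private static final String ISSUED_KEY_PREFIX = "prize:issued:"; // hypothetical key layout

    private final Jedis jedis;
    private final PrizePoolDao prizePoolDao; // hypothetical DAO

    public PrizePoolQuery(Jedis jedis, PrizePoolDao prizePoolDao) {
        this.jedis = jedis;
        this.prizePoolDao = prizePoolDao;
    }

    /** Returns the amount already issued from the pool, caching a default of 0 on a db miss. */
    public long getIssuedAmount(String poolId) {
        String cached = jedis.get(ISSUED_KEY_PREFIX + poolId);
        if (cached != null) {
            return Long.parseLong(cached);
        }
        Long fromDb = prizePoolDao.sumIssuedAmount(poolId); // null when nothing was issued yet
        long amount = fromDb == null ? 0L : fromDb;         // default value instead of "not found"
        jedis.setex(ISSUED_KEY_PREFIX + poolId, 300, Long.toString(amount)); // cache even the 0
        return amount;
    }
}

interface PrizePoolDao {
    Long sumIssuedAmount(String poolId);
}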
Cache avalanche: a large number of cache entries expire at once and the requests all hit the db. These are all business caches, so in the end it comes down to how the code is written: stagger the expiration times, for example with random jitter as in the earlier sketch, so that entries do not expire en masse.