This article covers how to use Redis to compute the intersection, union, and complement (set difference) of data. When all of the calculation is done in JVM memory, large inputs can easily exhaust the heap and cause an OOM (out-of-memory) error. Let's take a look; I hope it is helpful.
Recommended learning: Redis video tutorial
Today we will simulate the following scenario. We have several text files on the local disk. Each file stores a large number of 32-character strings that serve as unique user identifiers, one user per line. With a very large number of users every day, we may need to compute the intersection, union, or complement of these user sets at work. The simplest way is to use Java collections, for example HashSet. The limitation is that the JVM usually runs with a limited heap, so if all of the calculation happens in JVM memory, it can easily fail with an OOM error caused by insufficient memory. So today we introduce a more scalable and flexible way to perform these operations: using Redis to compute the intersection, union, and complement of data.
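For reference, the in-memory approach described above can be sketched with plain `java.util` sets. The class name `InMemorySetOps` and the sample IDs are illustrative and not part of the original article. Every element lives on the JVM heap, which is what becomes a problem once the files grow large.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not from the article: set algebra done entirely on the JVM heap.
class InMemorySetOps {

    // Elements present in both a and b.
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a); // copy so the inputs stay untouched
        result.retainAll(b);
        return result;
    }

    // Elements present in a or in b.
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Elements of a that are not in b.
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("u1", "u2", "u3");
        Set<String> b = Set.of("u2", "u3", "u4");
        System.out.println(intersection(a, b).size()); // 2
        System.out.println(union(a, b).size());        // 4
        System.out.println(difference(a, b));          // [u1]
    }
}
```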
Redis version: Redis 6.0.6
Jedis version: 4.2.2
Hutool (utility library) version: 5.8.0.M3
pom file:
<dependencies>
    <dependency>
        <groupId>redis.clients</groupId>
        <artifactId>jedis</artifactId>
        <version>4.2.2</version>
    </dependency>
    <dependency>
        <groupId>cn.hutool</groupId>
        <artifactId>hutool-all</artifactId>
        <version>5.8.0.M3</version>
    </dependency>
</dependencies>
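import cn.hutool.core.io.FileUtil;
import cn.hutool.crypto.SecureUtil;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class RedisCalculateUtils {
    // Input files
    static String oneFileString = "/Users/tmp/test-1.txt";
    static String twoFileString = "/Users/tmp/test-2.txt";
    // Output files for the difference, intersection, and union
    static String diffFileString = "/Users/tmp/diff-test.txt";
    static String interFileString = "/Users/tmp/inter-test.txt";
    static String unionFileString = "/Users/tmp/union-test.txt";
    // Redis keys for the input sets and the result sets
    static String oneFileCacheKey = "oneFile";
    static String twoFileCacheKey = "twoFile";
    static String diffFileCacheKey = "diffFile";
    static String interFileCacheKey = "interFile";
    static String unionFileCacheKey = "unionFile";

    // The methods shown in the following snippets belong inside this class.
}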
/**
 * Initialize the test data and write it to the two input files.
 */
public static void writeFile() {
    File oneFile = new File(oneFileString);
    List<String> fs = new ArrayList<>(10000);
    // File one: MD5 of 10000..14999 (5000 IDs)
    for (int i = 10000; i < 15000; i++) {
        String s = SecureUtil.md5(String.valueOf(i));
        fs.add(s);
    }
    FileUtil.writeUtf8Lines(fs, oneFile);

    File twoFile = new File(twoFileString);
    fs.clear();
    // File two: MD5 of 12000..19999 (8000 IDs), overlapping file one on 12000..14999
    for (int i = 12000; i < 20000; i++) {
        String s = SecureUtil.md5(String.valueOf(i));
        fs.add(s);
    }
    FileUtil.writeUtf8Lines(fs, twoFile);
}
/**
 * Read the file data and write it to Redis.
 */
public static void writeCache() {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        // Queue the SADD commands in a pipeline and send them in one round trip per file
        Pipeline p = jedis.pipelined();
        List<String> oneFileStringList = FileUtil.readLines(oneFileString, "UTF-8");
        for (String s : oneFileStringList) {
            p.sadd(oneFileCacheKey, s);
        }
        p.sync();

        List<String> twoFileStringList = FileUtil.readLines(twoFileString, "UTF-8");
        for (String s : twoFileStringList) {
            p.sadd(twoFileCacheKey, s);
        }
        p.sync();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
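`FileUtil.readLines` loads a whole file into a list before anything is sent to Redis, so very large files put the same pressure on the heap that this article is trying to avoid. One possible variation is to stream the file and hand the lines over in fixed-size batches. The sketch below uses only the JDK. The class `BatchedLineReader`, the method `forEachBatch`, and the batch size are assumptions for illustration. The consumer is where the `p.sadd(...)` calls followed by `p.sync()` from `writeCache()` would go.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not from the article: reads a file line by line
// and passes the lines on in batches of at most batchSize.
class BatchedLineReader {

    static void forEachBatch(Path file, int batchSize, Consumer<List<String>> sink) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    sink.accept(new ArrayList<>(batch)); // hand over a copy, then reuse the buffer
                    batch.clear();
                }
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(new ArrayList<>(batch)); // flush the final partial batch
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("ids", ".txt");
        Files.write(tmp, List.of("a", "b", "c", "d", "e"), StandardCharsets.UTF_8);
        // In writeCache() the sink would call p.sadd(key, s) for each line and then p.sync().
        forEachBatch(tmp, 2, batch -> System.out.println(batch));
        Files.delete(tmp);
    }
}
```

With this shape, only one batch of lines is held in memory at a time, regardless of the file size.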
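/**
 * Compute the difference between the set at oneKey and the set at twoKey,
 * and store the result at threeKey.
 * @param oneKey   key of the set to subtract from
 * @param twoKey   key of the set being subtracted
 * @param threeKey key of the set that receives the result
 */
public static void diff(String oneKey, String twoKey, String threeKey) {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        // SDIFFSTORE returns the number of elements in the stored result
        long result = jedis.sdiffstore(threeKey, oneKey, twoKey);
        System.out.println("Size of the difference of oneKey and twoKey: " + result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}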
/**
 * Write the computed difference to the specified file.
 */
public static void writeDiffToFile() {
    File diffFile = new File(diffFileString);
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        Set<String> result = jedis.smembers(diffFileCacheKey);
        FileUtil.writeUtf8Lines(result, diffFile);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
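/**
 * Compute the intersection of several keys and store it at a new key.
 * @param cacheKeyArray  keys of the sets to intersect
 * @param destinationKey key of the set that receives the result
 */
public static void inter(String[] cacheKeyArray, String destinationKey) {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        // SINTERSTORE returns the number of elements in the stored result
        long result = jedis.sinterstore(destinationKey, cacheKeyArray);
        System.out.println("Size of the intersection of cacheKeyArray: " + result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}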
/**
 * Write the computed intersection to the specified file.
 */
public static void writeInterToFile() {
    File interFile = new File(interFileString);
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        Set<String> result = jedis.smembers(interFileCacheKey);
        FileUtil.writeUtf8Lines(result, interFile);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
/**
 * Compute the union of several keys and store it at a new key.
 * @param cacheKeyArray  keys of the sets to unite
 * @param destinationKey key of the set that receives the result
 */
public static void union(String[] cacheKeyArray, String destinationKey) {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        // SUNIONSTORE returns the number of elements in the stored result
        long result = jedis.sunionstore(destinationKey, cacheKeyArray);
        System.out.println("Size of the union of cacheKeyArray: " + result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
/**
 * Write the computed union to the specified file.
 */
public static void writeUnionToFile() {
    File unionFile = new File(unionFileString);
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        Set<String> result = jedis.smembers(unionFileCacheKey);
        FileUtil.writeUtf8Lines(result, unionFile);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
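As a sanity check on the demo data, the sizes that `diff`, `inter`, and `union` should report can be worked out without Redis. The sketch below rebuilds the two ID sets using the JDK's `MessageDigest` in place of Hutool's `SecureUtil.md5`, an assumption made only to keep the example dependency-free. The class name `ExpectedCounts` is illustrative and not part of the article.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical check, not from the article: recomputes the expected result sizes in memory.
class ExpectedCounts {

    // MD5 of the decimal string of i, as 32 lowercase hex characters.
    static String md5Hex(int i) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(String.valueOf(i).getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is a required algorithm on the Java platform
        }
    }

    // IDs for i in [from, to), matching the loops in writeFile().
    static Set<String> ids(int from, int to) {
        Set<String> set = new HashSet<>();
        for (int i = from; i < to; i++) {
            set.add(md5Hex(i));
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> one = ids(10000, 15000);
        Set<String> two = ids(12000, 20000);

        Set<String> diff = new HashSet<>(one);
        diff.removeAll(two);
        Set<String> inter = new HashSet<>(one);
        inter.retainAll(two);
        Set<String> union = new HashSet<>(one);
        union.addAll(two);

        System.out.println(diff.size());  // 2000  (10000..11999)
        System.out.println(inter.size()); // 3000  (12000..14999)
        System.out.println(union.size()); // 10000 (10000..19999)
    }
}
```

If the Redis-based methods print different sizes, the most likely causes are stale data left under the input keys from an earlier run or input files that were not regenerated.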
Example:
key1 = {a,b,c,d}
key2 = {c}
key3 = {a,c,e}
SDIFF key1 key2 key3 = {b,d}
The SDIFFSTORE command is similar to SDIFF. The difference is that, instead of returning the result set to the client, it saves the result in the destination set and returns the number of elements in it.
If the destination set already exists, it is overwritten.
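With more than two keys, SDIFF subtracts every later set from the first one. A minimal in-memory sketch of that rule (plain Java rather than Jedis; the class and method names are illustrative) reproduces the example above:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of SDIFF semantics, not the Jedis API.
class SdiffModel {

    // Members of the first set that appear in none of the later sets.
    static Set<String> sdiff(List<Set<String>> sets) {
        Set<String> result = new TreeSet<>(sets.get(0)); // sorted copy for stable output
        for (Set<String> other : sets.subList(1, sets.size())) {
            result.removeAll(other);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> key1 = Set.of("a", "b", "c", "d");
        Set<String> key2 = Set.of("c");
        Set<String> key3 = Set.of("a", "c", "e");
        System.out.println(sdiff(List.of(key1, key2, key3))); // [b, d]
    }
}
```

The order of the keys matters: swapping `key1` and `key3` would give `{e}` instead.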
Example:
key1 = {a,b,c,d}
key2 = {c}
key3 = {a,c,e}
SINTER key1 key2 key3 = {c}
The SINTERSTORE command is similar to SINTER, except that it does not return the result set directly but saves it in the destination set.
If the destination set already exists, it is overwritten.
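The overwrite behaviour and the numeric return value can be modelled with an ordinary map standing in for the Redis keyspace. This is a plain-Java sketch with illustrative names, not the Jedis API. It also treats a missing key as an empty set, which is how Redis handles nonexistent keys in set commands.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of SINTERSTORE semantics, not the Jedis API.
class SinterstoreModel {

    // Stand-in for the Redis keyspace: key -> set members.
    final Map<String, Set<String>> keyspace = new HashMap<>();

    // Intersects the sets at keys, replaces whatever was stored at destination,
    // and returns the size of the stored result.
    long sinterstore(String destination, String... keys) {
        Set<String> result = new HashSet<>(keyspace.getOrDefault(keys[0], Collections.emptySet()));
        for (int i = 1; i < keys.length; i++) {
            // A key that is absent behaves like an empty set, so the result becomes empty
            result.retainAll(keyspace.getOrDefault(keys[i], Collections.emptySet()));
        }
        keyspace.put(destination, result); // the old contents of destination are discarded
        return result.size();
    }

    public static void main(String[] args) {
        SinterstoreModel redis = new SinterstoreModel();
        redis.keyspace.put("key1", Set.of("a", "b", "c", "d"));
        redis.keyspace.put("key2", Set.of("c"));
        redis.keyspace.put("key3", Set.of("a", "c", "e"));
        redis.keyspace.put("dest", Set.of("stale"));

        System.out.println(redis.sinterstore("dest", "key1", "key2", "key3")); // 1
        System.out.println(redis.keyspace.get("dest"));                        // [c]
    }
}
```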
Example:
key1 = {a,b,c,d}
key2 = {c}
key3 = {a,c,e}
SUNION key1 key2 key3 = {a,b,c,d,e}
The SUNIONSTORE command is similar to SUNION. The difference is that the result set is not returned but is stored in the destination set.
If the destination set already exists, it is overwritten.
The above is the detailed content of Detailed examples of how Redis implements intersection, union and complement of data. For more information, please follow other related articles on the PHP Chinese website!