How to implement Redis using HyperLogLog

Table of Contents

1. Overview

2. What is the cardinality?

3. Commands

3.1 PFADD

3.2 PFCOUNT

3.3 PFMERGE

WBOY

May 26, 2023 pm 05:41 PM

redis hyperloglog

1. Overview

Redis added the HyperLogLog data structure in version 2.8.9. It is used for cardinality estimation. Its advantage is that even when the number of input elements is very large, the space needed to estimate the cardinality stays small and essentially constant.

In Redis, each HyperLogLog key needs only about 12 KB of memory to estimate the cardinality of nearly 2^64 distinct elements. This is in sharp contrast to a set, which consumes more memory the more elements it holds. However, because HyperLogLog only estimates the cardinality from the input elements and does not store the elements themselves, it cannot return individual input elements the way a set can.
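As a rough sanity check on the 12 KB figure, the arithmetic can be sketched in Python. The register count and register width below (16384 registers of 6 bits each, the layout of Redis's dense encoding) are not stated in this article, so treat them as assumptions of the sketch.

```python
# Back-of-the-envelope check of the 12 KB figure.
# Assumption: 16384 registers of 6 bits each (dense encoding).
REGISTERS = 16384          # 2**14 registers
BITS_PER_REGISTER = 6      # each register holds a small run-length value

dense_bytes = REGISTERS * BITS_PER_REGISTER // 8
print(dense_bytes)          # 12288
print(dense_bytes / 1024)   # 12.0
```

The size depends only on the number of registers, not on how many elements have been added, which is why the memory cost stays flat as the data grows.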

2. What is the cardinality?

The cardinality of a data set is the number of distinct elements it contains. For example, for the data set {1, 3, 5, 7, 5, 7, 8}, the set of distinct elements is {1, 3, 5, 7, 8}, so the cardinality is 5. Cardinality estimation means quickly computing an approximate cardinality within an acceptable error range.
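For comparison, the exact cardinality of the example data set can be computed with an ordinary Python set. This is the memory-hungry approach that HyperLogLog avoids, since the set has to keep every distinct element.

```python
# Exact cardinality: store every distinct element, then count them.
data = [1, 3, 5, 7, 5, 7, 8]

distinct = set(data)          # duplicates 5 and 7 collapse
cardinality = len(distinct)

print(sorted(distinct))       # [1, 3, 5, 7, 8]
print(cardinality)            # 5
```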

3. Commands

HyperLogLog is used through three commands: PFADD, PFCOUNT and PFMERGE. They are introduced one by one below.

3.1 PFADD

Earliest available version: 2.8.9. Time complexity: O(1) for each element added.

The PFADD command adds one or more elements to the HyperLogLog stored at the key given as the first argument. It returns 1 if the cardinality estimate changed as a result of the command, and 0 otherwise, so the return value tells you whether the estimate was affected. If the key does not exist, an empty HyperLogLog is created first (internally, a Redis string with a specific length and encoding). The command can also be called with only the key and no elements. In that case, if the key already exists, nothing happens and 0 is returned; if the key does not exist, a new empty HyperLogLog is created and 1 is returned.

(1) Syntax format:

PFADD key element [element ...]

(2) Return value:

Integer: 1 if at least one internal register of the HyperLogLog was altered, otherwise 0.

(3) Example:

127.0.0.1:6379> PFADD hll a b c d e f g
(integer) 1
127.0.0.1:6379> pfcount hll
(integer) 7
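The return-value rules above can be modelled with a small in-memory sketch. This is not how Redis is implemented: `ToyHLL`, `store`, `pfadd` and `_hash64` are hypothetical names invented for illustration, SHA-1 is a stand-in for the hash Redis actually uses, and the register layout (2^14 registers indexed by the low 14 bits of a 64-bit hash) is an assumption.

```python
import hashlib

P = 14                       # index bits (assumed), so 2**14 = 16384 registers
M = 1 << P

def _hash64(element: str) -> int:
    # Stand-in 64-bit hash: first 8 bytes of SHA-1 (not the hash Redis uses).
    return int.from_bytes(hashlib.sha1(element.encode()).digest()[:8], "big")

class ToyHLL:
    def __init__(self):
        self.registers = [0] * M

    def add(self, element: str) -> bool:
        """Update one register; return True only if its value was raised."""
        h = _hash64(element)
        index = h & (M - 1)          # low P bits select the register
        rest = h >> P                # remaining 50 bits
        rank = 1                     # 1-based position of the first 1 bit
        while rank <= 64 - P and not (rest & 1):
            rest >>= 1
            rank += 1
        if rank > self.registers[index]:
            self.registers[index] = rank
            return True
        return False

store = {}                   # hypothetical keyspace: key -> ToyHLL

def pfadd(key: str, *elements: str) -> int:
    created = key not in store
    hll = store.setdefault(key, ToyHLL())
    changed = False
    for element in elements:         # visit every element, no short-circuit
        if hll.add(element):
            changed = True
    return 1 if (created or changed) else 0

print(pfadd("hll", "a", "b", "c", "d", "e", "f", "g"))  # 1: key was created
print(pfadd("hll", "a", "b"))                           # 0: no register raised
print(pfadd("empty"))                                   # 1: empty structure created
print(pfadd("empty"))                                   # 0: key exists, nothing to do
```

Re-adding an element always hashes to the same register with the same rank, so it can never raise a register. That is why repeated elements return 0.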

3.2 PFCOUNT

Earliest available version: 2.8.9. Time complexity: O(1) with a very small average constant when called with a single key; O(N), where N is the number of keys, with a much larger constant when called with multiple keys.

The PFCOUNT command returns the estimated cardinality (that is, the approximate number of distinct elements) of a HyperLogLog. It returns 0 if the key does not exist. When called with multiple keys, it returns the estimated cardinality of the union of the given HyperLogLogs, computed by merging them into a temporary HyperLogLog. HyperLogLog counts the unique elements of a set using a small, constant amount of memory: each HyperLogLog uses only 12 KB plus a few bytes for the key itself.

(1) Syntax format:

PFCOUNT key [key ...]

(2) Return value:

Integer: the cardinality estimate of the specified HyperLogLog. If multiple keys are given, the cardinality estimate of their union.

(3) Example:

127.0.0.1:6379> PFADD hll foo bar zap
(integer) 1
127.0.0.1:6379> PFADD hll zap zap zap
(integer) 0
127.0.0.1:6379> PFADD hll foo bar
(integer) 0
127.0.0.1:6379> PFCOUNT hll
(integer) 3
127.0.0.1:6379> PFADD some-other-hll 1 2 3
(integer) 1
127.0.0.1:6379> PFCOUNT some-other-hll
(integer) 3
127.0.0.1:6379> PFCOUNT hll some-other-hll
(integer) 6
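To make the single-key and multi-key behaviour more concrete, here is a minimal sketch of how an estimate can be derived from registers. It uses the classic HyperLogLog estimator with linear counting for small sets, which is not necessarily the estimator Redis uses. The function names, the SHA-1 stand-in hash and the 2^14 register layout are all assumptions made for illustration.

```python
import hashlib
import math

P = 14
M = 1 << P                            # 16384 registers (assumed layout)
ALPHA = 0.7213 / (1 + 1.079 / M)      # bias constant of the classic estimator

def registers_for(elements):
    """Build a register array from an iterable of strings."""
    regs = [0] * M
    for e in elements:
        h = int.from_bytes(hashlib.sha1(e.encode()).digest()[:8], "big")
        index, rest = h & (M - 1), h >> P
        rank = 1
        while rank <= 64 - P and not (rest & 1):
            rest >>= 1
            rank += 1
        regs[index] = max(regs[index], rank)
    return regs

def estimate(regs):
    raw = ALPHA * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if raw <= 2.5 * M and zeros:
        # Small cardinalities: linear counting over empty registers.
        return round(M * math.log(M / zeros))
    return round(raw)

def pfcount(*register_arrays):
    # Multi-key form: merge into a temporary array, then estimate the union.
    temp = [max(column) for column in zip(*register_arrays)]
    return estimate(temp)

hll = registers_for(["foo", "bar", "zap"])
other = registers_for(["1", "2", "3"])
print(pfcount(hll), pfcount(other), pfcount(hll, other))
```

The three printed values should come out close to 3, 3 and 6, mirroring the transcript above. They are estimates rather than guaranteed exact counts.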

(4) Limitation:

The results returned by HyperLogLog are approximate rather than exact, with a standard error of about 0.81%.

As a side effect, calling this command may modify the HyperLogLog, because 8 bytes are used to store the most recently computed cardinality. So, technically speaking, PFCOUNT is a write command.
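The 0.81% figure is consistent with the usual HyperLogLog error formula, 1.04 / sqrt(m), where m is the number of registers. The sketch below assumes m = 16384, a value this article does not state explicitly.

```python
import math

registers = 16384                          # assumed register count (2**14)
standard_error = 1.04 / math.sqrt(registers)

print(f"{standard_error:.4%}")             # 0.8125%

# What that means for a true cardinality of one million:
true_count = 1_000_000
print(round(true_count * standard_error))  # 8125
```

In other words, for a million distinct elements an estimate that is off by several thousand is normal, not a bug.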

(5) Performance issues

Even though processing a dense HyperLogLog theoretically takes a fairly long time, PFCOUNT is still fast when only one key is specified. This is because PFCOUNT caches the last computed cardinality, and that value rarely needs to be recomputed, since in most cases PFADD does not update any register. As a result, hundreds of operations per second are possible.

When PFCOUNT is called with multiple keys, the HyperLogLogs are merged on the fly, which is slow. More importantly, the computed cardinality of the union cannot be cached. In this mode a PFCOUNT call can take on the order of milliseconds, so it should not be overused.

Note that the single-key and multi-key forms of this command have different semantics and different performance characteristics.
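The single-key caching behaviour described above can be illustrated with a toy model. `CachedCounter` and its methods are hypothetical, and the "estimator" is a deliberate placeholder. The point is only the invalidation rule: the cached value is dropped when a register is actually raised, and reused otherwise.

```python
class CachedCounter:
    """Toy model: registers plus a cached result reused until a register changes."""

    def __init__(self, size=16):
        self.registers = [0] * size
        self.cached = None           # None means stale: recompute on next count
        self.recomputations = 0      # instrumentation for this demo only

    def update(self, index, rank):
        # Like PFADD: most calls do not raise a register, so the cache survives.
        if rank > self.registers[index]:
            self.registers[index] = rank
            self.cached = None

    def count(self):
        if self.cached is None:
            self.recomputations += 1
            # Placeholder for a real estimator (NOT a HyperLogLog estimate).
            self.cached = sum(1 for r in self.registers if r)
        return self.cached

c = CachedCounter()
c.update(3, 2)
c.count()
c.count()                 # served from the cache
c.update(3, 1)            # lower rank: register unchanged, cache still valid
c.count()
c.update(5, 4)            # register raised: cache invalidated
c.count()
print(c.recomputations)   # 2
```

A multi-key count has no single structure in which to keep such a cached value, which matches the observation that the union result cannot be cached.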

3.3 PFMERGE

Earliest available version: 2.8.9. Time complexity: O(N), N is the number of HyperLogLogs to be merged.

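The PFMERGE command merges multiple HyperLogLogs into a single HyperLogLog. The cardinality estimate of the merged HyperLogLog approximates the cardinality of the union of the sets observed by all the given HyperLogLogs. The result is saved to the specified destination key; if that key already exists, it is treated as one of the sources and its contents are included in the merge.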

(1) Syntax format:

PFMERGE destkey sourcekey [sourcekey ...]

(2) Return value:

Returns OK.

(3) Example:

127.0.0.1:6379> PFADD hll1 foo bar zap a
(integer) 1
127.0.0.1:6379> PFADD hll2 a b c foo
(integer) 1
127.0.0.1:6379> PFMERGE hll3 hll1 hll2
OK
127.0.0.1:6379> PFCOUNT hll3
(integer) 6
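Conceptually, merging HyperLogLogs amounts to taking the register-wise maximum of the sources. The sketch below shows this on tiny, made-up register arrays; real structures have far more registers, and `merge_registers` is a hypothetical helper rather than a Redis API.

```python
def merge_registers(*sources):
    """Register-wise maximum of equally sized register arrays."""
    return [max(column) for column in zip(*sources)]

hll1 = [0, 3, 1, 0, 2, 0, 0, 1]    # made-up register values for illustration
hll2 = [2, 1, 1, 0, 0, 4, 0, 1]

hll3 = merge_registers(hll1, hll2)
print(hll3)                         # [2, 3, 1, 0, 2, 4, 0, 1]

# Merging is order-independent and idempotent:
print(merge_registers(hll2, hll1) == hll3)   # True
print(merge_registers(hll3, hll1) == hll3)   # True
```

Because the maximum is taken per register, elements present in both sources (such as foo and a in the example above) are not counted twice.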
