


A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis
This article will introduce you to the dictionary, hash algorithm and ReHash principle in Redis. I hope it will be helpful to you!
Dictionaries in Redis are widely used to implement various functions of Redis, including databases and hash keys.
The underlying implementation of the dictionary is a hash table. Each dictionary has two hash tables, one is used normally and the other is used when rehash is used to expand the space. [Related recommendations: Redis video tutorial]
Dictionary structure definition
typedef struct dict { // 类型特定函数 dictType *type; // 私有数据 void *privdata; // 哈希表,两个元素 dictht ht[2] // rehash时记录的索引下标,当没有rehash时,值为-1 int rehashidx; } dict;
==When rehash is performed, rehashidx migrates each index The entry data will be 1;==
Among them, the structure of the hash table dicttht is defined as:
typedef struct dictht { // 哈希表数组 dictEntry **table; // 哈希表大小 unsigned long size; // 哈希表大小掩码,用于计算索引值 unsigned long sizenask; // 该哈希表已有节点的数量 unsigned long uesd; } dictht;
Among them, table is an array, and each element of the array points to a pointer of type dictEntry. The dictEntry type stores a key-value pair.
It can also be seen here that the nodes of the hash table are connected by a linked list to solve the hash conflict problem, which is the chain address method.
Hash conflict and hash algorithm
In order to achieve fast access from key to value, Redis uses a hash table to save all key-value pairs. Keycorresponds to the Key set by Redis, and valuecorresponds not to the value itself, but to the pointer pointing to the specific value. The biggest advantage of using a hash table is that you can quickly find key-value pairs with O(1) time complexity. But since it is a hash table, there will inevitably be a hash conflict problem.
Hash conflict means that when the hash value of two keys and the hash bucket are calculated, they happen to fall on the same hash bucket.
The way Redis solves hash conflicts is to use chain hashing, that is, zipper method. When multiple elements point to the same hash bucket, a linked list is used to save the corresponding data in the same hash bucket, and they are connected in turn using pointers.
Hash algorithm
When a new key-value pair is added to the dictionary, the program needs to first calculate the hash value and index based on the key-value pair. value, and then place the hash table node containing the new key-value pair at the specified index of the hash table array based on the index value.
reHash process
There is a load factor in the hash table to control the number of key-value pairs saved in the hash table. This requires a rehash operation to complete. Among them, the calculation formula of the load factor is:
// 负载因子 = 哈希表已保存的节点数量 / 哈希表大小 load_factor = ht[0].used / ht[0].size
The conditions for hash table expansion and contraction are as follows:
- The server is not currently executing the BGSAVE command or the BGREWRITEAOF command, and the hash table The load factor is greater than or equal to 1;
- The server is currently executing the BGSAVE command or the BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 5;
One of the above conditions is met, The rehash process will be executed.
If the server is executing BGSAVE or BGREWRITEAOF, Redis will create a child process of the current server process
The rehash process is roughly divided into three steps:
Allocate larger space to hash table 2, for example twice the current size of hash table 1;
Remap the data in hash table 1 and copy it to the hash In Table 2;
Release the space of hash table 1;
The size of the allocated space in the first step is determined by the current rehash Determined by the operation type and the number of key-value pairs in the current hash table.
-
When an expansion operation is performed, the allocated space size is the first 2^n value that is greater than or equal to (the number of key-value pairs in the hash table * 2);
Assume that the current number of key-value pairs is 4, then 4 * 2 = 8, because 8 is exactly equal to 2^3, which is exactly equal to the first value equal to 2^n, so the expansion space is 8;
If a shrink operation is performed, the allocated space size is the first 2^n value that is greater than or equal to (the number of key-value pairs in the hash table);
Progressive reHash
When there are a large number of hash tables, if all the data is copied at once, it is likely to affect the server. Therefore, Redis performs rehash in multiple times, which is progressive rehash.
To put it simply, in the second step, Redis still processes client requests normally. When processing a request, it starts from the first index position in hash table 1 and moves this index along the way. All entries elements at the position are copied to hash table 2; when the next request is made, the entries at the next index position are copied.
This cleverly allocates the cost of a large number of copies at one time to the process of processing multiple requests, avoiding time-consuming operations and ensuring fast access to data.
Hash table operations during rehash
When performing a progressive rehash operation, dictionary deletion, search, update and other operations will be performed in two hash tables. For example, if you want to find a key in the dictionary, you will first query it in the original table. If it cannot find it, you will query it in the new table.
And the dictionary addition operation will only be saved in the new table.
For more programming-related knowledge, please visit: Introduction to Programming! !
The above is the detailed content of A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



Redis cluster mode deploys Redis instances to multiple servers through sharding, improving scalability and availability. The construction steps are as follows: Create odd Redis instances with different ports; Create 3 sentinel instances, monitor Redis instances and failover; configure sentinel configuration files, add monitoring Redis instance information and failover settings; configure Redis instance configuration files, enable cluster mode and specify the cluster information file path; create nodes.conf file, containing information of each Redis instance; start the cluster, execute the create command to create a cluster and specify the number of replicas; log in to the cluster to execute the CLUSTER INFO command to verify the cluster status; make

How to clear Redis data: Use the FLUSHALL command to clear all key values. Use the FLUSHDB command to clear the key value of the currently selected database. Use SELECT to switch databases, and then use FLUSHDB to clear multiple databases. Use the DEL command to delete a specific key. Use the redis-cli tool to clear the data.

Using the Redis directive requires the following steps: Open the Redis client. Enter the command (verb key value). Provides the required parameters (varies from instruction to instruction). Press Enter to execute the command. Redis returns a response indicating the result of the operation (usually OK or -ERR).

To read a queue from Redis, you need to get the queue name, read the elements using the LPOP command, and process the empty queue. The specific steps are as follows: Get the queue name: name it with the prefix of "queue:" such as "queue:my-queue". Use the LPOP command: Eject the element from the head of the queue and return its value, such as LPOP queue:my-queue. Processing empty queues: If the queue is empty, LPOP returns nil, and you can check whether the queue exists before reading the element.

Using Redis to lock operations requires obtaining the lock through the SETNX command, and then using the EXPIRE command to set the expiration time. The specific steps are: (1) Use the SETNX command to try to set a key-value pair; (2) Use the EXPIRE command to set the expiration time for the lock; (3) Use the DEL command to delete the lock when the lock is no longer needed.

The best way to understand Redis source code is to go step by step: get familiar with the basics of Redis. Select a specific module or function as the starting point. Start with the entry point of the module or function and view the code line by line. View the code through the function call chain. Be familiar with the underlying data structures used by Redis. Identify the algorithm used by Redis.

Redis data loss causes include memory failures, power outages, human errors, and hardware failures. The solutions are: 1. Store data to disk with RDB or AOF persistence; 2. Copy to multiple servers for high availability; 3. HA with Redis Sentinel or Redis Cluster; 4. Create snapshots to back up data; 5. Implement best practices such as persistence, replication, snapshots, monitoring, and security measures.

Use the Redis command line tool (redis-cli) to manage and operate Redis through the following steps: Connect to the server, specify the address and port. Send commands to the server using the command name and parameters. Use the HELP command to view help information for a specific command. Use the QUIT command to exit the command line tool.
