How to Use Gaussian Redis to Implement Secondary Indexes
1. Background
When people think of indexes, they usually think of databases, but Gaussian Redis can implement secondary indexes too. Secondary indexes in Gaussian Redis are generally built on zset. Gaussian Redis offers higher stability and lower cost than open source Redis, so using Gaussian Redis zset for business secondary indexes can deliver both performance and cost benefits.
The essence of an index is to use an ordered structure to speed up queries. Numeric and character indexes can therefore be implemented easily with the zset structure of Gaussian Redis.
• Numeric index: the zset is sorted by score.
• Character index: when all scores are equal, the zset is sorted lexicographically by member.
Let's look at two classic business scenarios and see how to use Gaussian Redis to build a stable and reliable secondary index system.
2. Scenario 1: Dictionary completion
When you type a query into a browser, it usually suggests searches with the same prefix, ranked by likelihood. This scenario can be implemented with the secondary index capability of Gaussian Redis.
2.1 Basic Solution
The simplest method is to add every user query to the index. To offer completion suggestions, run a range query with ZRANGEBYLEX. To reduce the number of results returned, Gaussian Redis also supports the LIMIT option.
• Add the user's search term banana to the index:
ZADD myindex 0 banana:1
• Suppose the user types "bit" in the search form, and we want to suggest search keywords that start with "bit":
ZRANGEBYLEX myindex "[bit" "[bit\xff"
That is, ZRANGEBYLEX performs a range query from the string the user has typed so far to the same string followed by a trailing byte of 255 (\xff). This returns all strings that have the user's input as a prefix.
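The prefix range can be tried out without a server. The following is a minimal sketch in which a sorted Ruby array stands in for the zset; the helper names index_add and complete are hypothetical and are not part of any Redis client.

```ruby
# A sorted array of byte strings stands in for a zset whose members all
# share score 0, so ordering is purely lexicographic (an assumption made
# for illustration; a real deployment would send ZADD and ZRANGEBYLEX).

# Rough equivalent of: ZADD myindex 0 member
def index_add(index, member)
  m = member.b # compare as raw bytes
  pos = index.bsearch_index { |x| x >= m } || index.size
  index.insert(pos, m) unless index[pos] == m
  index
end

# Rough equivalent of: ZRANGEBYLEX myindex "[prefix" "[prefix\xff" LIMIT 0 limit
def complete(index, prefix, limit = 10)
  lo = prefix.b
  hi = lo + "\xff".b
  index.select { |m| m >= lo && m <= hi }
       .first(limit)
       .map { |m| m.dup.force_encoding(Encoding::UTF_8) }
end

index = []
%w[banana bit bitcoin bite bitter bat].each { |w| index_add(index, w) }
puts complete(index, "bit").inspect
```

Because every member that starts with "bit" sorts between "bit" and "bit\xff", the range query never has to scan unrelated members.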
2.2 Dictionary completion related to frequency
In practice, we usually want completion entries sorted by how often they occur, with entries that are no longer popular dropped over time and new input picked up automatically. We can still use the zset structure of Gaussian Redis for this, but the index must store not only the search terms but also the frequency associated with each term.
• Add the user's search term banana to the index
• Check whether banana already exists (the returned entry must start with banana: to count as a match):
ZRANGEBYLEX myindex "[banana:" + LIMIT 0 1
• If banana does not exist, add banana:1, where 1 is the frequency:
ZADD myindex 0 banana:1
• If banana exists, increment the frequency. Suppose ZRANGEBYLEX myindex "[banana:" + LIMIT 0 1 returns banana:1, so the current frequency is 1:
1) Delete the old entry:
ZREM myindex banana:1
2) Re-add the entry with the frequency increased by one:
ZADD myindex 0 banana:2
Note that because concurrent updates are possible, the three commands above should be sent in a Lua script, so that reading the old count, deleting the old entry, and re-adding the entry with the incremented count happen atomically.
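The read, delete, and re-add sequence can be sketched as a single function. This is an in-memory illustration only: a sorted Ruby array stands in for the zset, record_search is a hypothetical name, and nothing here provides the atomicity that the Lua script gives on a real server.

```ruby
# Members look like "term:count" and all scores are 0, so a sorted array
# of strings stands in for the zset (an assumption for illustration).
def record_search(index, term)
  prefix = "#{term}:"
  # Like ZRANGEBYLEX myindex "[term:" + LIMIT 0 1: first member >= prefix.
  existing = index.find { |m| m >= prefix }
  if existing && existing.start_with?(prefix)
    count = existing.delete_prefix(prefix).to_i
    index.delete(existing)            # ZREM myindex term:count
    index << "#{term}:#{count + 1}"   # ZADD myindex 0 term:(count+1)
  else
    index << "#{term}:1"              # ZADD myindex 0 term:1
  end
  index.sort!
end

index = []
3.times { record_search(index, "banana") }
record_search(index, "ban")
puts index.inspect
```

The start_with? check matters: searching for "ban" finds "banana:3" as the first member at or after "ban:", but that entry belongs to a different term, so a new "ban:1" entry is added instead.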
If the user types "banana" in the search form, we want to suggest related search keywords. Fetch the candidates with ZRANGEBYLEX, then sort them by frequency.
ZRANGEBYLEX myindex "[banana:" + LIMIT 0 10
1) "banana:123"
2) "banaooo:1"
3) "banned user:49"
4) "banning:89"
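Sorting by frequency happens on the client after the range query returns. A minimal sketch, assuming the "term:count" member format used above (rank_by_frequency is a hypothetical helper name):

```ruby
# Split each "term:count" member at the last colon and sort by count,
# highest first. The input mirrors the ZRANGEBYLEX reply shown above.
def rank_by_frequency(members)
  members
    .map { |m| term, _, count = m.rpartition(":"); [term, count.to_i] }
    .sort_by { |_, count| -count }
end

reply = ["banana:123", "banaooo:1", "banned user:49", "banning:89"]
rank_by_frequency(reply).each { |term, count| puts "#{term} (#{count})" }
```

Splitting at the last colon keeps terms that contain spaces or other punctuation intact.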
• Use a streaming algorithm to purge rarely used entries: pick one of the returned entries at random, subtract one from its frequency, and add it back with the updated frequency. If the new frequency is 0, remove the entry from the index instead.
• If the frequency of the randomly selected entry is 1, such as banaooo:1:
ZREM myindex banaooo:1
• If the frequency of the randomly selected entry is greater than 1, such as banana:123:
ZREM myindex banana:123
ZADD myindex 0 banana:122
Over the long term, the index will include popular searches and automatically adapt if popular searches change over time.
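The decay step can be sketched in the same in-memory style. Again a sorted array stands in for the zset, and decay_one is a hypothetical name; on a real server the delete and re-add would need to run atomically, as with the increment.

```ruby
# Pick one "term:count" entry at random, lower its count by one, and
# drop it entirely when the count reaches 0.
def decay_one(index, rng = Random.new)
  return index if index.empty?
  victim = index.sample(random: rng)
  term, _, count = victim.rpartition(":")
  index.delete(victim)                               # ZREM myindex term:count
  new_count = count.to_i - 1
  index << "#{term}:#{new_count}" if new_count > 0   # ZADD myindex 0 term:(count-1)
  index.sort!
end

puts decay_one(["banaooo:1"]).inspect
puts decay_one(["banana:123"]).inspect
```

Entries that keep being searched are incremented faster than random decay lowers them, while entries nobody searches for eventually reach 0 and disappear.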
3. Scenario 2: Multidimensional Index
Gaussian Redis supports not only single-dimension queries but also retrieval over multidimensional data. For example, find people whose age is between 50 and 55 and whose salary is between 70,000 and 85,000. Encoding the two-dimensional data as one-dimensional data and storing it in a Gaussian Redis zset is an important way to implement a multidimensional secondary index.
Consider the two-dimensional index visually. The space contains sample points represented as coordinates (x, y), where the maximum value of both x and y is 400. The query is a rectangle in this space: we want all points with x between 50 and 100 and y between 100 and 300.
3.1 Data encoding
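Suppose the data point to insert is x = 75 and y = 200.
1) Pad with zeros (the maximum value is 400, so pad to 3 digits):
x = 075
y = 200
2) Interleave the digits: take the leftmost digit of x, then the leftmost digit of y, then the next digit of each, and so on, to create a single code:
027050
If we replace the last two digits with 00 and 99, giving 027000 to 027099, and map back to x and y, we get:
x = 70-79
y = 200-209
So a two-dimensional query for x = 70-79 and y = 200-209 maps, through this encoding, to a one-dimensional query from 027000 to 027099, which is easy to run against the zset structure of Gaussian Redis.
Likewise, we can do the same with the last four, six, and so on digits to cover larger ranges.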
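The decimal encoding can be sketched in a few lines. The helper names interleave_digits and digit_range are hypothetical, and the width of 3 reflects the maximum value of 400 assumed in the text.

```ruby
# Zero-pad both values to the same width and alternate their digits,
# starting with the leftmost digit of x.
def interleave_digits(x, y, width = 3)
  xs = x.to_s.rjust(width, "0").chars
  ys = y.to_s.rjust(width, "0").chars
  xs.zip(ys).flatten.join
end

# Replace the last n interleaved digits (n even) with 0s and with 9s to
# get the one-dimensional range that covers the surrounding box.
def digit_range(code, n)
  head = code[0...(code.length - n)]
  [head + "0" * n, head + "9" * n]
end

code = interleave_digits(75, 200)
puts code
puts digit_range(code, 2).inspect
```

Replacing two digits frees one digit of x and one digit of y, so each step widens the box by a factor of 10 on both axes.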
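3) Use binary
Representing the data in binary gives finer granularity: each time we replace one more bit per variable, the side of the search range only doubles. In binary notation each variable needs at most 9 bits (enough for values up to 400), so we get:
x = 75 -> 001001011
y = 200 -> 011001000
After interleaving: 000111000011001010
Let's see what ranges we get when we replace the last 2, 4, 6, 8, ... bits of the interleaved representation with 0s and 1s:
2 bits: x between 74 and 75, y between 200 and 201 (range = 2)
4 bits: x between 72 and 75, y between 200 and 203 (range = 4)
6 bits: x between 72 and 79, y between 200 and 207 (range = 8)
8 bits: x between 64 and 79, y between 192 and 207 (range = 16)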
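The binary encoding and the box covered by each replacement step can be checked with a short sketch. The helper names interleave_bits and covered are hypothetical, and the 9-bit width follows the assumption that values do not exceed 400.

```ruby
# Alternate the bits of x and y, starting with the most significant bit of x.
def interleave_bits(x, y, bits = 9)
  xb = x.to_s(2).rjust(bits, "0").chars
  yb = y.to_s(2).rjust(bits, "0").chars
  xb.zip(yb).flatten.join
end

# Box covered when the last n interleaved bits (n even) are free, which
# frees the lowest n/2 bits of x and of y.
def covered(x, y, n)
  mask = (1 << (n / 2)) - 1
  { x: (x & ~mask)..(x | mask), y: (y & ~mask)..(y | mask) }
end

puts interleave_bits(75, 200)
[2, 4, 6, 8].each do |n|
  box = covered(75, 200, n)
  puts "#{n} bits: x #{box[:x]}, y #{box[:y]}"
end
```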
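3.2 Adding a new element
Suppose the data point to insert is x = 75 and y = 200.
After binary interleaving, x = 75 and y = 200 encode to 000111000011001010:
ZADD myindex 0 000111000011001010
3.3 Querying
Query: all points with x between 50 and 100 and y between 100 and 300.
Replacing N bits of the index gives a search box with side length 2^(N/2). So we take the smaller dimension of the search box, find the power of 2 closest to it, split the search area into squares of that size, and query each square with ZRANGEBYLEX. Here the smaller dimension is 50 (the x range), and the closest power of 2 is 64 = 2^6, which is why the sample code is called with an exponent of 6.
Here is sample code: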
def spacequery(x0,y0,x1,y1,exp)
    bits=exp*2
    x_start = x0/(2**exp)
    x_end = x1/(2**exp)
    y_start = y0/(2**exp)
    y_end = y1/(2**exp)
    (x_start..x_end).each{|x|
        (y_start..y_end).each{|y|
            x_range_start = x*(2**exp)
            x_range_end = x_range_start | ((2**exp)-1)
            y_range_start = y*(2**exp)
            y_range_end = y_range_start | ((2**exp)-1)
            puts "#{x},#{y} x from #{x_range_start} to #{x_range_end}, y from #{y_range_start} to #{y_range_end}"

            # Turn it into interleaved form for ZRANGEBYLEX query.
            # We assume we need 9 bits for each integer, so the final
            # interleaved representation will be 18 bits.
            xbin = x_range_start.to_s(2).rjust(9,'0')
            ybin = y_range_start.to_s(2).rjust(9,'0')
            s = xbin.split("").zip(ybin.split("")).flatten.compact.join("")
            # Now that we have the start of the range, calculate the end
            # by replacing the specified number of bits from 0 to 1.
            e = s[0..-(bits+1)]+("1"*bits)
            puts "ZRANGEBYLEX myindex [#{s} [#{e}"
        }
    }
end

spacequery(50,100,100,300,6)
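With an exponent of 6 the squares cover x from 0 to 127 and y from 64 to 319, which is larger than the requested box, so the range queries can return points outside it. The source does not show how to handle this; the following is a sketch of one possible client-side filter, with decode and in_box? as hypothetical helper names.

```ruby
# Split an interleaved bit string back into x (even positions) and
# y (odd positions).
def decode(code)
  pairs = code.chars.each_slice(2).to_a
  x = pairs.map(&:first).join.to_i(2)
  y = pairs.map(&:last).join.to_i(2)
  [x, y]
end

# Keep only members that really fall inside the requested box.
def in_box?(code, x0, y0, x1, y1)
  x, y = decode(code)
  x.between?(x0, x1) && y.between?(y0, y1)
end

members = ["000111000011001010", "000111101011000000"]
members.each do |m|
  puts "#{decode(m).inspect} inside: #{in_box?(m, 50, 100, 100, 300)}"
end
```

The second member decodes to x = 120, y = 200: it sits inside one of the 64 by 64 squares that get queried but outside the requested x range, so the filter drops it.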