Detailed explanation of redis sharding

Dec 06, 2019 pm 05:13 PM
redis

Sharding (also called partitioning) is the process of splitting your data across multiple Redis instances so that each instance holds only a subset of the keys. The first part of this article introduces the concept of sharding, and the second part covers the options available for sharding Redis.

What sharding can do

Sharding in Redis has two main goals:

1. It allows much larger databases by using the combined memory of many computers. Without sharding, you are limited to the amount of memory a single machine can support.

2. It allows computing power to scale across multiple cores or multiple servers, and network bandwidth to scale across multiple servers or multiple network adapters.

Sharding Basics

There are many different criteria for sharding. Suppose we have 4 Redis instances, R0, R1, R2, and R3, and many keys representing users, such as user:1, user:2, and so on. There are different ways to decide which instance stores a given key. In other words, there are many different ways to map a key to a specific Redis server.

One of the simplest ways to shard is range partitioning, which maps ranges of objects to specific Redis instances. For example, users with IDs from 0 to 10000 go to instance R0, users with IDs from 10001 to 20000 go to instance R1, and so on.

This approach works and is actually used in practice. However, it has the disadvantage of requiring a table that maps ranges to instances. This table needs to be managed, and every type of object needs its own table, so range sharding is often not advisable in Redis because it is much less efficient than the alternative sharding approaches.
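As a minimal sketch of the range table described above, the following Python snippet routes user keys with a sorted list of upper bounds. The bounds for R2 and R3, and the function names, are assumptions made for illustration; the article only gives the ranges for R0 and R1.

```python
from bisect import bisect_left

# Hypothetical range table for "user" keys: each entry is the highest ID
# (inclusive) served by the instance at the same position in INSTANCES.
UPPER_BOUNDS = [10000, 20000, 30000, 40000]
INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_user(user_id: int) -> str:
    """Return the instance whose ID range contains user_id."""
    if user_id < 0 or user_id > UPPER_BOUNDS[-1]:
        raise ValueError(f"no range configured for user ID {user_id}")
    # bisect_left finds the first upper bound that is >= user_id.
    return INSTANCES[bisect_left(UPPER_BOUNDS, user_id)]


def instance_for_key(key: str) -> str:
    """Route a key of the form 'user:<id>' using the range table."""
    object_name, _, raw_id = key.partition(":")
    if object_name != "user":
        # Every object type would need its own table, as noted above.
        raise ValueError(f"no range table for object type {object_name!r}")
    return instance_for_user(int(raw_id))
```

The need to keep UPPER_BOUNDS up to date, and to add a similar table for each object type, is the management cost the text refers to.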

An alternative to range sharding is hash partitioning. This scheme works with any key; it does not require the key to be in the form object_name:<id>. It is as simple as this:

1. Use a hash function (for example, the crc32 hash function) to convert the key name into a number. For example, if the key is foobar, crc32(foobar) will output something like 93024922.

2. Apply a modulo operation to this number to turn it into a number between 0 and 3, so that it can be mapped to one of the 4 Redis instances. 93024922 modulo 4 equals 2, so the key foobar should be stored in the R2 instance. Note: the modulo operation returns the remainder of a division and is implemented as the % operator in many programming languages.
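The two steps above can be sketched in Python with the standard library's zlib.crc32. The instance names follow the article; the function name is an assumption, and the value 93024922 is the article's illustrative number rather than the actual CRC32 of foobar.

```python
import zlib

INSTANCES = ["R0", "R1", "R2", "R3"]


def instance_for_key(key: str) -> str:
    """Map a key to an instance with CRC32 followed by modulo."""
    # zlib.crc32 works on bytes and returns an unsigned 32-bit integer.
    key_hash = zlib.crc32(key.encode("utf-8"))
    return INSTANCES[key_hash % len(INSTANCES)]


# The article's sample hash value, used only to check the arithmetic.
assert 93024922 % 4 == 2
```

Because the function depends only on the key and the number of instances, every client that uses the same instance list routes a given key to the same place without any shared table.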

There are many other ways to shard, but these two examples should give you the idea. An advanced form of hash sharding is called consistent hashing, and it is implemented by some Redis clients and proxies.
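To illustrate the idea behind consistent hashing, here is a small hash ring in Python. It is a sketch under assumptions, not the algorithm used by any particular Redis client or proxy: the class name, the use of MD5, and the choice of 100 virtual nodes per instance are all illustrative.

```python
import bisect
import hashlib


class HashRing:
    """Tiny consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point, node) tuples
        for node in nodes:
            self.add(node)

    @staticmethod
    def _point(value: str) -> int:
        # MD5 is used here only as a well-distributed hash, not for security.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        # Each node owns several points so that load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._point(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point clockwise from the key's point, wrapping at the end.
        index = bisect.bisect_left(self._ring, (self._point(key), ""))
        return self._ring[index % len(self._ring)][1]
```

The useful property is that removing a node only remaps the keys that lived on that node, and adding a node only pulls keys onto the new node; all other keys keep their placement.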

Different implementations of sharding

Sharding can be the responsibility of different parts of the software stack.

1. Client-side partitioning means that the client directly selects the correct node on which to write or read a given key. Many Redis clients implement client-side sharding.

2. Proxy-assisted partitioning means that our client sends requests to a proxy that understands the Redis protocol, instead of sending requests directly to the Redis instance. The proxy makes sure that each request is forwarded to the correct Redis instance according to the configured sharding scheme and returns the response to the client. Twemproxy, a proxy for Redis and Memcached, implements proxy-assisted sharding.

3. Query routing means that you can send your query to a random instance, and this instance will ensure that your query is forwarded to the correct node. Redis Cluster implements a hybrid form of query routing with the help of clients (requests are not forwarded directly from one Redis instance to another, but the client receives a redirect to the correct node).
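The redirect mentioned in the last item can be sketched as follows. The MOVED and ASK error formats come from the Redis Cluster specification rather than from this article, and the helper below is a hypothetical illustration of how a client might interpret such an error; it assumes the leading "-" of the error reply has already been stripped by the protocol parser.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    kind: str   # "MOVED" or "ASK"
    slot: int   # hash slot the key belongs to
    host: str
    port: int


def parse_redirect(error_message: str) -> Optional[Redirect]:
    """Parse a Redis Cluster redirection error, or return None for other errors."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    # rpartition splits on the last ':' so hosts containing ':' stay intact.
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))
```

For example, parse_redirect("MOVED 3999 127.0.0.1:6381") tells the client to retry the command against 127.0.0.1 on port 6381. A cluster-aware client would then reissue the command there itself, which is why the text calls this a hybrid between query routing and client-side sharding.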

Disadvantages of sharding

Some features of Redis do not play well with sharding:

1. Operations involving multiple keys are generally not supported. For example, you cannot perform an intersection on keys mapped on two different Redis instances (there is actually a way to do it, but not directly).

2. Transactions involving multiple keys cannot be used.

3. The granularity of sharding is the key, so a data set stored under a single huge key, such as a very large sorted set, cannot be sharded.

4. When sharding is used, data processing becomes more complex. For example, you need to process multiple RDB/AOF files. When backing up data, you need to aggregate persistent files from multiple instances and hosts.

5. Adding and removing capacity can be complicated. For example, Redis Cluster can add and remove nodes at runtime and rebalance data transparently, but other approaches, such as client-side sharding and proxies, do not support this. However, a technique called presharding helps in this regard.

Data storage or caching

Although sharding is conceptually the same whether Redis is used as a data store or as a cache, there is an important limitation when it is used as a data store. When Redis is used as a data store, a given key must always map to the same Redis instance. When Redis is used as a cache, it is not a big problem if a given node is unavailable and a different node is used instead; changing the key-to-instance mapping as we wish improves the availability of the system (that is, its ability to answer our queries).

Consistent hashing implementations are often able to switch to other nodes if the preferred node for a given key is unavailable. Similarly, if you add a new node, some data will start to be stored in this new node.

The main concepts here are as follows:

1. If Redis is used as a cache, it is easy to use consistent hashing to achieve scaling up and down.

2. If Redis is used as a data store, a fixed key-to-node mapping is used, so the number of nodes must be fixed and cannot change. Otherwise, when nodes are added or removed, you need a system that can rebalance keys between nodes. Redis Cluster can do this, and it has been available as a production-ready feature since Redis 3.0.
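To see why a fixed mapping ties a data store to a fixed number of nodes, the following sketch counts how many hash values land on a different node when simple modulo hashing goes from 4 to 5 nodes. The function name and the use of consecutive integers as stand-ins for evenly distributed key hashes are assumptions for illustration.

```python
def moved_fraction(hashes, old_nodes: int, new_nodes: int) -> float:
    """Fraction of hash values whose `hash % nodes` result changes."""
    hashes = list(hashes)
    moved = sum(1 for h in hashes if h % old_nodes != h % new_nodes)
    return moved / len(hashes)


# Evenly spread sample hashes standing in for real key hashes.
sample_hashes = range(20_000)
print(moved_fraction(sample_hashes, 4, 5))  # 0.8
```

With 4 and 5 nodes, a hash keeps its node only when its remainder modulo 20 is 0, 1, 2, or 3, so 16 out of every 20 values move. For a cache this means a burst of misses; for a data store it means most keys would be looked up on a node that does not hold them.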

Pre-sharding

We already know that sharding has a problem: unless we use Redis as a cache, adding and removing nodes is tricky, and it is much simpler to use a fixed key-to-instance mapping.

However, data storage needs may be changing all the time. Today I can live with 10 Redis nodes (instances), but tomorrow I may need 50 nodes.

Because Redis has a relatively small memory footprint and is lightweight (an idle instance only uses 1MB of memory), a simple solution is to start many instances from the beginning. Even if you start with just one server, you can decide on day one to live in a distributed world and use sharding to run multiple Redis instances on a single server.

You can choose a large number of instances from the beginning. For example, 32 or 64 instances will satisfy most users and provide enough room for future growth.

This way, as your data storage needs grow and you need more Redis servers, all you have to do is move instances from one server to another. When you add the first additional server, you move half of the Redis instances from the first server to the second, and so on.
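A minimal sketch of presharding, assuming 64 instances and CRC32 modulo hashing: the key-to-instance mapping never changes, and only the table that records where each instance runs is updated when a server is added. The host names, the port numbering, and the function names are hypothetical.

```python
import zlib

NUM_INSTANCES = 64  # fixed on day one and never changed


def instance_for_key(key: str) -> int:
    """Fixed key-to-instance mapping: CRC32 modulo the instance count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_INSTANCES


# Day one: all 64 instances run on one server (hypothetical host name),
# each listening on its own port.
placement = {i: ("server-a", 6379 + i) for i in range(NUM_INSTANCES)}


def add_second_server(current: dict) -> dict:
    """Return a new placement with the upper half of the instances moved."""
    updated = dict(current)
    for i in range(NUM_INSTANCES // 2, NUM_INSTANCES):
        updated[i] = ("server-b", current[i][1])
    return updated


def address_for_key(key: str, current: dict) -> tuple:
    """Look up the (host, port) of the instance that owns the key."""
    return current[instance_for_key(key)]
```

After add_second_server, a key still belongs to the same instance number as before; only the host on which that instance runs may differ, so no keys need to be rehashed.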

Using Redis replication, you can move data with little or no downtime:

1. Start an empty instance on your new server.

2. Move the data by configuring the new instance as a slave (replica) of the source instance.

3. Stop your clients.

4. Update the client configuration with the new server IP address of the moved instance.

5. Send the SLAVEOF NO ONE command to the slave node on the new server.

6. Start your clients with the updated configuration.

7. Finally, shut down the instances that are no longer used on the old server.
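The Redis commands involved in the steps above can be listed with a small helper. This is only a sketch of the command sequence for one instance: the function name and addresses are hypothetical, and using SHUTDOWN for the last step is an assumption, since the text only says the old instance is closed. On Redis 5 and later, REPLICAOF is the newer name for SLAVEOF.

```python
def migration_plan(old_host: str, old_port: int, new_host: str, new_port: int):
    """Return (target instance, command) pairs for moving one instance."""
    new_instance = f"{new_host}:{new_port}"
    old_instance = f"{old_host}:{old_port}"
    return [
        # Step 2: the new, empty instance starts replicating the source.
        (new_instance, f"SLAVEOF {old_host} {old_port}"),
        # Step 5: once clients are stopped and reconfigured, promote it.
        (new_instance, "SLAVEOF NO ONE"),
        # Step 7: shut down the instance that is no longer used (assumed command).
        (old_instance, "SHUTDOWN"),
    ]
```

Each command would be sent to the listed instance, for example with redis-cli, in the order shown, with the client stop and restart from steps 3, 4, and 6 happening between the first and last entries.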

Redis sharding implementation

Redis Cluster is the preferred method for automatic sharding and high availability. It has been production-ready since Redis 3.0 and, with clients that support it widely available, it is the de facto standard for Redis sharding.

Redis Cluster is a hybrid of query routing and client-side sharding.

Twemproxy is a proxy developed by Twitter that supports Memcached ASCII and Redis protocols. It is single-threaded, written in C, and runs very fast. It is an open source project licensed under the Apache 2.0 license.

Twemproxy supports automatic sharding across multiple Redis instances, with optional node ejection if a node is unavailable (this changes the mapping of keys to instances, so you should use this feature only when Redis is used as a cache).

Twemproxy is not a single point of failure, because you can start multiple proxies and have your clients connect to the first one that accepts the connection.
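Connecting to the first proxy that accepts the connection can be sketched with plain TCP sockets. The function name and the one-second timeout are assumptions; a real Redis client library would normally handle this through its own connection settings.

```python
import socket


def connect_to_first_proxy(proxies, timeout: float = 1.0) -> socket.socket:
    """Return a TCP connection to the first proxy that accepts it."""
    last_error = None
    for host, port in proxies:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            # Refused or timed out: remember why and try the next proxy.
            last_error = exc
    raise ConnectionError(f"no proxy accepted the connection: {last_error}")
```

The caller passes a list of (host, port) pairs for the running proxies, in order of preference, and receives a connected socket or a ConnectionError if none of them is reachable.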

An alternative to Twemproxy is to use a client that implements client-side sharding through consistent hashing or other similar algorithms. There are several Redis clients that support consistent hashing, such as redis-rb and Predis.
