
Redis sharding (distributed cache)

Mar 19, 2019 03:01 PM
Distributed cache

Partitioning is the process of splitting your data across multiple Redis instances, so that each instance only contains a subset of all keys.


1 Why sharding is useful

Redis’ sharding has two main goals:

• Allows the combined memory of many computers to be used to support larger databases. Without sharding, you are limited to the amount of memory a single machine can support.

• Allows scaling computing power to multiple cores or multiple servers, and network bandwidth to multiple servers or multiple network adapters.

2 Sharding Basics

There are many different sharding criteria.

Suppose we have 4 Redis instances R0, R1, R2, R3, and many keys representing users, such as user:1, user:2, and so on. There are different ways to choose which instance a specific key is stored in; in other words, there are many different ways to map a key to a particular Redis server.

One of the simplest ways to perform sharding is range partitioning, which maps ranges of objects to specific Redis instances. For example, I could decide that users with IDs from 0 to 10000 go to instance R0, while users with IDs from 10001 to 20000 go to instance R1.
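A minimal sketch of this idea in PHP (the instance addresses and ID ranges here are illustrative assumptions, not part of any particular deployment):

<?php
// Hypothetical range table: each entry maps an inclusive user-ID range to one instance.
$ranges = [
    ['min' => 0,     'max' => 10000, 'instance' => 'tcp://10.0.0.1:6379'], // R0
    ['min' => 10001, 'max' => 20000, 'instance' => 'tcp://10.0.0.2:6379'], // R1
];

// Pick the instance responsible for a given user ID, or null if no range matches.
function instanceForUserId(array $ranges, int $userId): ?string
{
    foreach ($ranges as $range) {
        if ($userId >= $range['min'] && $userId <= $range['max']) {
            return $range['instance'];
        }
    }
    return null;
}

echo instanceForUserId($ranges, 4200), PHP_EOL;  // tcp://10.0.0.1:6379 (R0)
echo instanceForUserId($ranges, 15000), PHP_EOL; // tcp://10.0.0.2:6379 (R1)

The range table in this sketch is exactly the mapping table the next paragraphs complain about: it has to be maintained by hand and grows with every object type you shard.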

This approach works and is actually used in practice; however, it has the disadvantage of requiring a table that maps ranges to instances.

This table needs to be managed, and each type of object needs its own table, so range sharding is often not advisable in Redis because it is much less efficient than other sharding alternatives.

An alternative to range sharding is hash partitioning.

This mode works with any key and does not require the key to be in the form object_name:<id>. It is as simple as this:

• Use a hash function (for example, the crc32 hash function) to convert the key name to a number. For example, if the key is foobar, crc32(foobar) will output something like 93024922.

• Take this number modulo 4 to turn it into a number between 0 and 3, so that it can be mapped to one of my 4 Redis instances. 93024922 modulo 4 equals 2, so I know my key foobar should be stored on the R2 instance. Note: the modulo operation returns the remainder of a division, and is implemented with the % operator in many programming languages.
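A minimal PHP sketch of this hash-and-modulo step; PHP's built-in crc32() returns a 32-bit checksum that works for this purpose, and the four instance addresses are assumptions used only for illustration:

<?php
// Four hypothetical Redis instances R0..R3.
$instances = [
    'tcp://10.0.0.1:6379', // R0
    'tcp://10.0.0.2:6379', // R1
    'tcp://10.0.0.3:6379', // R2
    'tcp://10.0.0.4:6379', // R3
];

// Hash the key name to a number, then take it modulo the number of instances.
function instanceForKey(string $key, array $instances): string
{
    $hash  = crc32($key);               // key name -> 32-bit number
    $index = $hash % count($instances); // number -> 0..3
    return $instances[$index];
}

echo instanceForKey('foobar', $instances), PHP_EOL; // always the same instance for this key

The exact numeric value depends on the CRC-32 variant used, but the important property is that the same key always maps to the same instance.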

There are many other ways to shard, but these two examples should give you the idea. An advanced form of hash sharding is called consistent hashing and is implemented by some Redis clients and proxies.

3 Sharding Implementation (Theory)

Sharding can be handled by different parts of the software stack.

• Client-side partitioning

The client directly selects the correct node to write to or read from for a given key. Many Redis clients implement client-side partitioning.

• Proxy-assisted partitioning

Instead of sending requests directly to the Redis instances, the client sends them to a proxy that understands the Redis protocol.

The proxy forwards our requests to the correct Redis instance according to the configured sharding scheme, and returns the responses to the client.

Twemproxy, a proxy for Redis and Memcached, implements proxy-assisted sharding.

• Query routing

You can send your query to a random instance, which will then make sure to forward it to the correct node.

Redis Cluster implements a hybrid form of query routing with the help of the client (the request is not forwarded directly from one Redis instance to another; instead, the client receives a redirect to the correct node).

4 Disadvantages of sharding

Some features of Redis do not play well with sharding:

• Operations involving multiple keys are generally not supported. For example, you cannot perform an intersection of keys mapped to two different Redis instances (in fact there is a way to do it, but not directly; see the sketch after this list).

• Transactions involving multiple keys cannot be used.

• The granularity of sharding is the key, so you cannot shard a data set consisting of a single huge key, such as a very large sorted set.

• When sharding is used, handling data becomes more complex; for example, you have to deal with multiple RDB/AOF files, and backing up your data requires aggregating the persistence files from multiple instances and hosts.

• Adding and removing capacity is also complex. For example, Redis Cluster can dynamically add and remove nodes at runtime to support transparent rebalancing of data, but other approaches, such as client-side sharding and proxies, do not support this feature. However, a technique called pre-sharding can help here.
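As an example of the first limitation, an intersection across shards can still be computed on the client, at extra cost: fetch each set from the instance that holds it and intersect them in application code. A rough sketch, assuming the Predis client and two sets that happen to live on different instances:

<?php
require 'vendor/autoload.php'; // assumes Predis is installed via Composer

$r0 = new Predis\Client('tcp://10.0.0.1:6379');
$r1 = new Predis\Client('tcp://10.0.0.2:6379');

// SINTER cannot run across instances, so pull both sets and intersect them locally.
$membersA = $r0->smembers('tags:php');   // set stored on R0
$membersB = $r1->smembers('tags:redis'); // set stored on R1

$intersection = array_values(array_intersect($membersA, $membersB));
print_r($intersection);

This works, but it moves both sets over the network and does the work in the application, which is why such operations are better kept on a single instance when possible.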

5 Storage OR Cache

Although the sharding concept is the same whether Redis is used as a data store or as a cache, there is an important limitation when it is used as a data store. When Redis is used as a data store, a given key must always map to the same Redis instance. When Redis is used as a cache, it is not a big problem if a given node is unavailable and a different node is used instead: changing the mapping of keys to instances as we wish improves the availability of the system (that is, the system's ability to answer our queries).

Consistent hashing implementations are often able to switch to other nodes if the preferred node for a given key is unavailable. Similarly, if you add a new node, some data will start to be stored in this new node.

The main concepts here are as follows:

• If Redis is used as a cache, scaling up and down with consistent hashing is easy (a minimal ring sketch follows this list).

• If Redis is used as a store, a fixed key-to-node mapping is used, so the number of nodes must be fixed and cannot change. Otherwise, when nodes are added or removed, you need a system that can rebalance keys between nodes, and currently only Redis Cluster can do this.
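To make the consistent-hashing idea concrete, here is a toy consistent-hash ring in PHP built on crc32() with virtual nodes. It is an illustrative sketch, not a production implementation or any particular client's exact algorithm:

<?php
class ConsistentHashRing
{
    private array $ring = [];      // ring position => node address
    private array $positions = []; // sorted ring positions
    private int $replicas;

    public function __construct(int $replicas = 64)
    {
        $this->replicas = $replicas; // virtual nodes per physical node, to smooth the distribution
    }

    public function addNode(string $node): void
    {
        for ($i = 0; $i < $this->replicas; $i++) {
            $this->ring[crc32("$node#$i")] = $node;
        }
        $this->positions = array_keys($this->ring);
        sort($this->positions);
    }

    public function removeNode(string $node): void
    {
        for ($i = 0; $i < $this->replicas; $i++) {
            unset($this->ring[crc32("$node#$i")]);
        }
        $this->positions = array_keys($this->ring);
        sort($this->positions);
    }

    // Walk clockwise from the key's hash to the first node position on the ring.
    public function nodeForKey(string $key): string
    {
        $hash = crc32($key);
        foreach ($this->positions as $position) {
            if ($position >= $hash) {
                return $this->ring[$position];
            }
        }
        return $this->ring[$this->positions[0]]; // wrap around to the start of the ring
    }
}

$ring = new ConsistentHashRing();
$ring->addNode('tcp://10.0.0.1:6379');
$ring->addNode('tcp://10.0.0.2:6379');
$ring->addNode('tcp://10.0.0.3:6379');
echo $ring->nodeForKey('user:1000'), PHP_EOL;

The useful property for caching is that adding or removing a node only remaps the keys that fall on that node's ring positions, while everything else keeps hitting the same instance.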

6 Pre-sharding

We already know one problem with sharding: unless we use Redis as a cache, adding and removing nodes is tricky, and it is much simpler to use a fixed key-to-instance mapping.

However, data storage needs may be changing all the time. Today I can live with 10 Redis nodes (instances), but tomorrow I may need 50 nodes.

Because Redis has a fairly small memory footprint and is lightweight (an idle instance only uses 1MB of memory), a simple solution is to start many instances from the beginning. Even if you start with just one server, you can decide on day one to live in a distributed world and use sharding to run multiple Redis instances on a single server.

You can choose a large number of instances from the beginning. For example, 32 or 64 instances will satisfy most users and provide enough room for future growth.

This way, when your data storage needs grow and you need more Redis servers, all you have to do is move instances from one server to another. When you add the first additional server, you need to move half of the Redis instances from the first server to the second, and so on.
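A sketch of the idea in PHP, assuming 64 pre-started instances whose addresses live in a single configuration array; the key-to-instance mapping never changes, and moving an instance to a new server only means editing its address here:

<?php
// Hypothetical list of 64 pre-started instances. Initially they all run on one host,
// on ports 7000..7063; later, individual entries can be pointed at new servers.
$instances = [];
for ($port = 7000; $port < 7064; $port++) {
    $instances[] = "tcp://10.0.0.1:$port";
}
// Example: after moving instance 32 to a second server, only its address changes:
// $instances[32] = 'tcp://10.0.0.2:7032';

// Fixed mapping: always 64 slots, regardless of how many physical servers exist.
function instanceForKey(string $key, array $instances): string
{
    return $instances[crc32($key) % count($instances)];
}

echo instanceForKey('user:1000', $instances), PHP_EOL;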

Using Redis replication, you can move data with little or no downtime (a command-level sketch follows the steps):

• Start an empty instance on your new server.

• Move your data by configuring the new instance as a slave of the source instance.

• Stop your client.

• Update the moved instance's server IP address in your configuration.

• Send the SLAVEOF NO ONE command to the slave node on the new server.

• Start your client with the new updated configuration.

• Finally, close the instances on the old server that are no longer in use.
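A rough command-level sketch of steps 2 and 5, assuming the Predis client and its generic executeRaw() helper for sending the replication commands; the hosts and ports are placeholders:

<?php
require 'vendor/autoload.php'; // assumes Predis is installed via Composer

// Step 2: make the empty instance on the new server replicate the source instance.
$newInstance = new Predis\Client('tcp://10.0.0.2:6379');
$newInstance->executeRaw(['SLAVEOF', '10.0.0.1', '6379']);

// ...wait for replication to catch up, stop the clients, update the configuration...

// Step 5: promote the new instance to master so it accepts writes on its own.
$newInstance->executeRaw(['SLAVEOF', 'NO', 'ONE']);

The same commands can of course be issued with redis-cli directly against the new instance.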

7 Sharding Implementation (Practice)

So far, we have discussed Redis sharding in theory, but what is the practical situation? What system should you use?

7.1 Redis Cluster

Redis Cluster is the preferred method for automatic sharding and high availability.

Once Redis Cluster is available, and once clients that support Redis Cluster are available for your language, Redis Cluster will become the de facto standard for Redis sharding.

Redis Cluster is a hybrid of query routing and client-side sharding.
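As a connection sketch, Predis ships with Redis Cluster support; the node addresses below are assumptions, and the 'cluster' => 'redis' option tells the client to follow the cluster's redirects instead of doing its own client-side sharding:

<?php
require 'vendor/autoload.php'; // assumes Predis is installed via Composer

$cluster = new Predis\Client(
    ['tcp://10.0.0.1:7000', 'tcp://10.0.0.2:7001', 'tcp://10.0.0.3:7002'],
    ['cluster' => 'redis'] // Redis Cluster mode: honour MOVED/ASK redirects from the nodes
);

$cluster->set('user:1000', 'alice'); // routed to the node that owns this key's hash slot
echo $cluster->get('user:1000'), PHP_EOL;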

7.2 Twemproxy

Twemproxy is a proxy developed by Twitter that supports the Memcached ASCII and Redis protocols. It is single-threaded, written in C, and extremely fast. It is an open source project released under the Apache 2.0 license.

Twemproxy supports automatic sharding across multiple Redis instances, with optional node ejection when a node becomes unavailable (this changes the mapping of keys to instances, so you should only use this feature when using Redis as a cache).

It is not a single point of failure, because you can start multiple proxies and have your clients connect to the first one that accepts the connection.

Fundamentally, Twemproxy is a middle layer between the client and the Redis instance, allowing us to reliably handle our shards with minimal additional complexity. This is the currently recommended way of handling Redis sharding.
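From the application's point of view the proxy looks like a single Redis server. A connection sketch with Predis, assuming a Twemproxy instance listening locally on port 22121 (the port commonly used in its example configurations):

<?php
require 'vendor/autoload.php'; // assumes Predis is installed via Composer

// The client is unaware of the sharding; it simply talks to the proxy.
$redis = new Predis\Client('tcp://127.0.0.1:22121');

$redis->set('user:1000', 'alice'); // Twemproxy forwards this to the backend instance that owns the key
echo $redis->get('user:1000'), PHP_EOL;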

7.3 Clients that support consistent hashing

An alternative to Twemproxy is to use a client that implements client-side sharding through consistent hashing or another similar algorithm. There are several Redis clients that support consistent hashing, such as redis-rb and Predis.

Please check the complete list of Redis clients to see if there is a mature client that supports your programming language and implements consistent hashing.
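With Predis, for example, client-side sharding can be as simple as handing the client several sets of connection parameters; by default it distributes keys across them with its client-side sharding logic (the addresses below are assumptions for illustration):

<?php
require 'vendor/autoload.php'; // assumes Predis is installed via Composer

// Passing several connection parameters enables Predis's client-side sharding,
// so each key is consistently routed to one of these instances.
$client = new Predis\Client([
    'tcp://10.0.0.1:6379',
    'tcp://10.0.0.2:6379',
    'tcp://10.0.0.3:6379',
]);

$client->set('user:1000', 'alice'); // stored on whichever instance this key hashes to
echo $client->get('user:1000'), PHP_EOL;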

