Detailed explanation of Redis' high availability and high concurrency mechanism-Redis-php.cn

Detailed explanation of Redis' high availability and high concurrency mechanism

coldplay.xixi

Mar 23, 2021 am 11:04 AM

1. High concurrency mechanism

Redis executes commands on a single thread, so a stand-alone instance can handle only some tens of thousands of requests per second. To serve hundreds of thousands of concurrent requests at large data volumes, redis relies on a master-slave architecture with read/write separation.

1. Master-slave replication

This article does not dwell on how to configure redis master-slave replication; the focus is on its principle and process. Replication needs one master and one or more slaves. When a slave starts, it sends a PSYNC command to the master. If the slave is reconnecting, the master sends it only the data it is missing. If it is connecting for the first time, a full resynchronization is triggered: the master starts a background process to generate an RDB snapshot file and, in the meantime, buffers in memory the write operations it receives. Once the RDB file is generated, the master sends it to the slave, which first writes it to disk and then loads it into memory. Finally, the master sends the buffered write operations to the slave. If a master-slave network failure makes several slaves reconnect at once, the master generates only one RDB file and uses it to serve all of them.

Resuming from a breakpoint: both master and slave keep a replication offset, together with the master id, and the master keeps recently replicated data in the backlog, addressed by offset. When a slave reconnects after a network failure, the master looks up the slave's last replication offset in the backlog and continues copying from there. If the offset can no longer be found, a full resynchronization is triggered.

①The complete process of replication

(1) The slave node starts and only saves the master node's information, including its host and port; the replication process has not started yet

Where do the master host and port come from? From the slaveof configuration in redis.conf

(2) A scheduled task inside the slave node checks every second whether there is a new master node to connect to and replicate from. If it finds one, it establishes a socket network connection with the master node
(3) The slave node sends the ping command to the master node
(4) Password authentication. If the master sets requirepass, the slave node must send the masterauth password for authentication
(5) The master node performs full replication the first time and sends all data to the slave node
(6) The master node keeps receiving write commands and asynchronously replicates them to the slave node

②Core mechanisms of data synchronization

This refers to the full replication performed when the slave connects to the master for the first time, and to some of the detailed mechanisms in that process.

(1) Both master and slave maintain an offset

The master continuously accumulates an offset on its own side, and the slave does the same
The slave reports its own offset to the master every second, and the master also saves the offset of each slave

The offset is not used only for full replication. Both the master and the slave need to know the offset of their own data in order to detect how far their data has diverged.

(2) backlog

The master node has a backlog, 1MB by default
When the master node replicates data to the slave node, it also writes a copy of that data into the backlog
The backlog is mainly used for incremental replication after a full replication is interrupted
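
To make the backlog idea concrete, here is a minimal Python sketch of a fixed-size buffer addressed by offset. The class and method names (`ReplicationBacklog`, `read_from`) are hypothetical and only illustrate the behaviour described above; this is not how redis implements it internally.

```python
class ReplicationBacklog:
    """Toy fixed-size backlog: keeps only the most recent `size` bytes."""

    def __init__(self, size=1024 * 1024):  # 1MB, the default mentioned above
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0  # total bytes the master has ever written

    def append(self, data: bytes) -> None:
        self.buf += data
        self.master_offset += len(data)
        if len(self.buf) > self.size:
            # Drop the oldest bytes so only the newest `size` bytes remain.
            del self.buf[: len(self.buf) - self.size]

    @property
    def first_offset(self) -> int:
        """Offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf)

    def read_from(self, slave_offset: int):
        """Return the bytes after `slave_offset`, or None if they are gone."""
        if slave_offset < self.first_offset or slave_offset > self.master_offset:
            return None  # the caller would have to fall back to full replication
        return bytes(self.buf[slave_offset - self.first_offset:])
```

A slave whose offset is still covered by the buffer gets only the missing bytes; once the data it needs has been overwritten, `read_from` returns `None`, which corresponds to the case where incremental replication is no longer possible.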

(3) master run id

Run info server to see the master run id
Locating the master node by host and IP alone is unreliable: if the master node restarts or its data changes, the slave node should tell the instances apart by their run ids. If the run id is different, a full replication is performed.
If you need to restart redis without changing the run id, you can use the redis-cli debug reload command

(4) psync

The slave node uses psync to replicate from the master node, sending psync runid offset
The master node replies according to its own state. The reply may be FULLRESYNC runid offset, which triggers full replication, or CONTINUE, which triggers incremental replication
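
The decision the master makes on a psync request can be sketched as a small Python function. This is a simplified illustration, not the actual redis logic: the function name and parameters are hypothetical, and it assumes the master only needs to compare the run id and check whether the requested offset is still inside its backlog.

```python
def handle_psync(master_run_id, backlog_first, master_offset, req_run_id, req_offset):
    """Return the reply a master might give to `psync <runid> <offset>`.

    backlog_first: offset of the oldest byte still in the backlog (assumed).
    master_offset: the master's current replication offset.
    """
    same_master = req_run_id == master_run_id
    in_backlog = backlog_first <= req_offset <= master_offset
    if same_master and in_backlog:
        return "CONTINUE"  # incremental replication from the backlog
    # Unknown run id or offset no longer available: full replication.
    return f"FULLRESYNC {master_run_id} {master_offset}"
```

For example, a slave that connects for the first time has no run id or offset to offer (here modelled as `"?"` and `-1`), so the sketch answers with FULLRESYNC.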

③Full copy

(1) The master executes bgsave and generates an rdb snapshot file locally
(2) The master node sends the rdb snapshot file to the slave node. If copying the rdb takes longer than 60 seconds (repl-timeout), the slave node considers the replication failed; this parameter can be increased as appropriate
(3) On a machine with a Gigabit network card, about 100MB is generally transferred per second, so a 6G file is likely to take more than 60s
(4) While the master node is generating the RDB, it buffers all new write commands in memory. After the slave node has saved the rdb, the master replicates those new write commands to the slave node
(5) client-output-buffer-limit slave 256MB 64MB 60: if, during replication, the memory buffer stays above 64MB for 60 seconds, or exceeds 256MB at any one time, replication is stopped and fails
(6) After the slave node receives the rdb, it clears its own old data and loads the rdb into its own memory, while continuing to serve external requests based on the old data version
(7) If the slave node has AOF turned on, BGREWRITEAOF is executed immediately and the AOF is rewritten
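
The two thresholds in client-output-buffer-limit (step 5) can be illustrated with a short Python sketch. It assumes one buffer-size sample per second, which is a simplification; the function name and the sampling model are hypothetical.

```python
def buffer_limit_exceeded(samples_mb, hard_mb=256, soft_mb=64, soft_seconds=60):
    """Check buffer-size samples (MB, one per second) against both limits.

    Returns True if any sample exceeds the hard limit, or if the buffer
    stays above the soft limit for `soft_seconds` consecutive samples.
    """
    seconds_over_soft = 0
    for size in samples_mb:
        if size > hard_mb:
            return True  # hard limit: fail immediately
        # Count consecutive seconds above the soft limit; reset when it drops.
        seconds_over_soft = seconds_over_soft + 1 if size > soft_mb else 0
        if seconds_over_soft >= soft_seconds:
            return True
    return False
```

A short spike above 64MB is tolerated as long as the buffer drains again within the 60 seconds, whereas a single sample above 256MB ends the replication attempt at once.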

Generating the rdb, copying the rdb over the network, clearing old data on the slave, and the slave's aof rewrite are all very time-consuming

If the amount of data to copy is between 4G and 6G, a full replication is likely to take 1.5 to 2 minutes
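
The transfer-time estimate from step (3) is simple arithmetic, sketched below in Python. The 100MB per second figure is the rough Gigabit throughput quoted above, and 1G is taken as 1024MB; both are assumptions, and the function names are hypothetical.

```python
REPL_TIMEOUT_SECONDS = 60  # the repl-timeout value mentioned above


def rdb_transfer_seconds(rdb_size_mb, throughput_mb_per_s=100):
    """Estimated time to send an rdb file of the given size over the network."""
    return rdb_size_mb / throughput_mb_per_s


def exceeds_repl_timeout(rdb_size_gb):
    """True if sending the rdb alone would already take longer than repl-timeout."""
    return rdb_transfer_seconds(rdb_size_gb * 1024) > REPL_TIMEOUT_SECONDS
```

Under these assumptions a 4G file fits within the timeout, while a 6G file needs slightly more than 60 seconds for the network copy alone, before rdb generation and loading are counted.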

④Incremental replication

(1) If the master-slave network connection drops during full replication, incremental replication is triggered when the slave reconnects to the master
(2) The master takes the missing part of the data directly from its own backlog and sends it to the slave node. The default backlog is 1MB
(3) The master gets the data from the backlog based on the offset in the psync sent by the slave

⑤heartbeat

The master and slave nodes will send heartbeat information to each other

The master sends a heartbeat every 10 seconds by default, and the slave node sends a heartbeat every 1 second

⑥Asynchronous replication

Every time the master receives a write command, it first writes the data internally and then sends it asynchronously to the slave node

2. Read and write separation: the master handles write operations, and the slaves handle read queries, which reduces the query load on the master
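
On the client side, read/write separation comes down to choosing a node per command. The following Python sketch shows one possible routing rule; the class name, the round-robin choice of slave, and the small set of write commands are illustrative assumptions, and the node objects can be any connection handles.

```python
import itertools

# Illustrative subset of write commands; a real client would need the full list.
WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH", "HSET"}


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def pick(self, command):
        """Return the node that should execute `command`."""
        if command.upper() in WRITE_COMMANDS or self._slaves is None:
            return self.master
        return next(self._slaves)
```

Because replication is asynchronous, a read routed to a slave may briefly return data that is older than what was just written to the master.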

2. High availability mechanism

For high concurrency, the deployment uses one master with multiple slaves. That solves the concurrency problem, but there is still only one master: if it goes down, the system can no longer perform writes and the slaves can no longer synchronize data, so the whole system becomes unavailable. The high-availability mechanism of redis is the sentinel mechanism. Sentinel is an important component of a redis cluster and is responsible for cluster monitoring, message notification, failover, and acting as a configuration center.

(1) Cluster monitoring: monitors whether the redis master and slave processes are working normally
(2) Message notification: if a redis instance fails, the sentinel sends an alarm notification to the administrator
(3) Failover: if the master node goes down, its role is automatically transferred to a slave node
(4) Configuration center: if a failover occurs, clients are notified of the new master address

Sentinel is itself distributed: it runs as a cluster of sentinels that work together.

Declaring the master node down requires the consent of a majority of sentinels, which involves a distributed election.

The sentinel mechanism needs at least 3 instances to be robust. Suppose a test deployment has only two nodes, a master and a slave, each running one sentinel that monitors both nodes (s1 on the master node, s2 on the slave node). When the master host goes down, the sentinels must hold an election, but s1 went down together with the master, so only s2 on the slave node is left. Failover must be authorized by a majority of sentinels, and the majority parameter specifies how many sentinels that is; with two sentinels the majority is 2, so s2 alone cannot reach it and the failover cannot be carried out. That is why at least 3 nodes are needed to ensure robustness.
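
The majority argument can be checked with a few lines of Python. This sketch only models the counting rule described above (more than half of all sentinels must agree); the function names are hypothetical.

```python
def majority(total_sentinels):
    """Smallest number of sentinels that is more than half of the total."""
    return total_sentinels // 2 + 1


def can_failover(total_sentinels, alive_sentinels):
    """True if the surviving sentinels can still form a majority."""
    return alive_sentinels >= majority(total_sentinels)
```

With 2 sentinels, losing the one on the master's machine leaves 1 of the required 2, so no failover is possible; with 3 sentinels, the remaining 2 still form a majority.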

3. Data loss issues arising from high availability and high concurrency

(1) Data loss caused by asynchronous replication

Because replication from master to slave is asynchronous, some data may not yet have been copied to the slave when the master crashes, and that data is lost.
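
The loss window can be shown with a toy model in Python. `ToyMaster` is a hypothetical stand-in that acknowledges a write first and forwards it later; it is only meant to illustrate the ordering, not real redis behaviour.

```python
class ToyMaster:
    """Acknowledges writes immediately and replicates them later."""

    def __init__(self):
        self.data = {}
        self.unsent = []  # writes acknowledged but not yet sent to the slave

    def write(self, key, value):
        self.data[key] = value
        self.unsent.append((key, value))
        return "OK"  # the client gets OK before any slave has the data

    def flush_to(self, slave_data):
        """Send all pending writes to the slave (modelled as a dict)."""
        for key, value in self.unsent:
            slave_data[key] = value
        self.unsent.clear()


master, slave = ToyMaster(), {}
master.write("a", 1)
master.flush_to(slave)  # "a" reaches the slave
master.write("b", 2)    # acknowledged, but the master crashes before flushing
lost = set(master.data) - set(slave)  # keys the promoted slave never received
```

If the slave is promoted at this point, the key in `lost` was confirmed to the client but does not exist on the new master.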

(2) Data loss caused by split brain

Split brain means that the machine hosting a master suddenly drops off the normal network and can no longer reach the slave machines, although the master itself is still running.

At this point, the sentinels may conclude that the master is down, start an election, and promote one of the slaves to master.

The cluster then has two masters, which is the so-called split brain.

Although a slave has been promoted to master, clients may not have switched to the new master yet, and the data they keep writing to the old master may be lost.

When the old master recovers, it is attached to the new master as a slave, its own data is cleared, and the data is copied again from the new master.

Solution to data loss caused by asynchronous replication and split-brain

min-slaves-to-write 1
min-slaves-max-lag 10

These settings require at least 1 slave whose data replication and synchronization delay does not exceed 10 seconds

If the replication and synchronization delay of every slave exceeds 10 seconds, the master stops accepting write requests

The above two configurations can reduce data loss caused by asynchronous replication and split-brain
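
The rule expressed by the two settings can be written as a small Python check. This is a sketch of the rule as described above, with a hypothetical function name; the lag values stand for the seconds since each slave last acknowledged.

```python
def master_accepts_writes(slave_lags, min_slaves_to_write=1, min_slaves_max_lag=10):
    """True if enough slaves have acknowledged recently enough.

    slave_lags: seconds since the last ack from each connected slave.
    """
    good_slaves = sum(1 for lag in slave_lags if lag <= min_slaves_max_lag)
    return good_slaves >= min_slaves_to_write
```

For instance, with lags of 2 and 30 seconds one slave is still within the limit and writes are accepted; with lags of 11 and 30 seconds, or with no slaves connected, writes are rejected.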

(1) Reduce data loss caused by asynchronous replication

With the min-slaves-max-lag configuration, once the slaves' replication and ACK delay grows too long, the master assumes that too much data could be lost if it went down, and rejects write requests. This keeps the data loss caused by data not yet synchronized to the slaves when the master goes down within a controllable range

(2) Reduce the data loss caused by split brain

If a master is cut off by a split brain and loses its connection to the slaves, the two settings above ensure that, when it cannot keep sending data to the specified number of slaves and no slave has sent it an ack for more than 10 seconds, it directly rejects client write requests

In this way, the old master after the split brain will not accept new data from clients, which avoids data loss.

The configuration above ensures that if the master loses its connection to all slaves and finds that no slave has acked for 10 seconds, it rejects new write requests

Therefore, in a split-brain scenario, at most 10 seconds of data will be lost
