It has been a long time since I last wrote a blog post. I can find countless excuses, but in the final analysis it comes down to laziness. Today I finally took a pill for my laziness and decided to write one. What should I write about? After much thought, I settled on RocketMQ. After all, I have written more than 30 posts, but not a single proper one about MQ. This post is fairly basic and does not involve source code analysis; it is just a primer.
I think that, from a certain perspective, microservices have driven the rapid growth of MQ. Originally, a system had N modules, all strongly coupled together. With microservices, each module is its own system, and systems inevitably need to interact. There are three common ways to interact: RPC, HTTP, and MQ.
Originally, a business flow was divided into N steps, all of which had to be processed one after another before the final result could be returned to the user. With MQ, the most critical part is processed first, then a message is sent to MQ and OK is returned to the user right away. The remaining steps are processed in the background at their own pace. It is a great tool for improving the user experience.
A sudden surge of requests to an interface inevitably puts heavy pressure on the application server and the database server. With MQ, however many requests arrive, they are queued and processed in the background at a pace the servers can sustain.
RocketMQ is written in Java. It is Alibaba's open source message middleware and absorbs many of Kafka's strengths. Kafka is also a popular message middleware, but it is written in Scala, which makes it harder for Java programmers to read the source code and to do customized development. Anyone who has worked with Kafka knows that it is not easy to use well. RocketMQ is much simpler by comparison, it is backed by Alibaba, and it has been tested by many Double 11 shopping festivals. It suits Chinese Internet companies well, so many companies in China use RocketMQ.
The picture comes from gitee.com/mirrors/roc...
You can see that RocketMQ has four main components: NameServer, Broker, Producer, and Consumer.
The Producer queries the NameServer for Topic routing information at regular intervals.
The Consumer queries the NameServer for Topic routing information at regular intervals.
In fact, early versions of RocketMQ did use ZooKeeper as the registry, but it was later replaced by the current NameServer. I guess the main reason is:
Groups come in two kinds: ProducerGroup and ConsumerGroup. ConsumerGroup deserves more attention. A ConsumerGroup contains multiple Consumers.
In cluster consumption mode, the Consumers under a ConsumerGroup consume a Topic together. Each Consumer is assigned N queues, but a queue is consumed by only one Consumer. Different ConsumerGroups can consume the same Topic, and a message is consumed by every ConsumerGroup subscribed to that Topic.
There are two consumption modes: Clustering (cluster consumption) and Broadcasting (broadcast consumption).
Unlike some other MQs, which specify cluster or broadcast consumption when the message is sent, RocketMQ sets the consumption mode on the Consumer side.
The default is cluster consumption mode. In this mode, all Consumers of the ConsumerGroup jointly consume messages from a Topic, and each Consumer is responsible for the messages of N queues (N may be 1, or even 0 if no queue is assigned to it), but a queue is consumed by only one Consumer. If a Consumer dies, the other Consumers under the ConsumerGroup take over and continue consuming.
In cluster consumption mode, the consumption progress is maintained on the Broker side, and the storage path is ${ROCKET_HOME}/store/config/consumerOffset.json. The key is topicName@consumerGroupName and the value is the consumption progress, in the form queueId:offset. If there are multiple ConsumerGroups, each ConsumerGroup has its own consumption progress, so the progress must be stored separately for each group.
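To make the key and value layout concrete, here is a minimal sketch of such a progress table in plain Java. The class and method names are hypothetical and this is not the Broker's actual code; it only mirrors the topicName@consumerGroupName key and the queueId:offset value described above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the progress table; not RocketMQ's implementation.
class ConsumerOffsetSketch {
    // key: topicName@consumerGroupName, value: queueId -> offset
    private final Map<String, Map<Integer, Long>> offsetTable = new TreeMap<>();

    static String key(String topic, String group) {
        return topic + "@" + group;
    }

    // Records the progress of one queue for one ConsumerGroup.
    void commit(String topic, String group, int queueId, long offset) {
        offsetTable.computeIfAbsent(key(topic, group), k -> new TreeMap<>()).put(queueId, offset);
    }

    // Returns the stored offset, or -1 if this group has no progress for the queue.
    long query(String topic, String group, int queueId) {
        Map<Integer, Long> queues = offsetTable.get(key(topic, group));
        if (queues == null) {
            return -1L;
        }
        return queues.getOrDefault(queueId, -1L);
    }

    public static void main(String[] args) {
        ConsumerOffsetSketch table = new ConsumerOffsetSketch();
        table.commit("orderTopic", "groupA", 0, 120L);
        table.commit("orderTopic", "groupB", 0, 35L);
        // The two groups progress independently on the same queue.
        System.out.println(table.query("orderTopic", "groupA", 0));
        System.out.println(table.query("orderTopic", "groupB", 0));
    }
}
```

Because the group name is part of the key, two ConsumerGroups reading the same Topic never overwrite each other's progress.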
In broadcast consumption mode, each message is delivered to every Consumer in the ConsumerGroup.
In broadcast consumption mode, the consumption progress is maintained on the Consumer side.
We know that in cluster consumption mode, all Consumers under the ConsumerGroup jointly consume the messages of a Topic, and each Consumer is responsible for the messages of N queues. So how are the queues allocated? This is the job of the consumption queue load algorithm.
RocketMQ provides several consumption queue load algorithms, of which the two most commonly used are AllocateMessageQueueAveragely and AllocateMessageQueueAveragelyByCircle. Let's look at the difference between them.
Assume that a Topic has 16 queues, q0 to q15, and 3 Consumers, c0 to c2.
With AllocateMessageQueueAveragely, the queues are allocated as follows:
c0: q0 q1 q2 q3 q4 q5
c1: q6 q7 q8 q9 q10
c2: q11 q12 q13 q14 q15
With AllocateMessageQueueAveragelyByCircle, the queues are allocated as follows:
c0: q0 q3 q6 q9 q12 q15
c1: q1 q4 q7 q10 q13
c2: q2 q5 q8 q11 q14
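The two strategies can be sketched in a few lines of plain Java. This is my own simplified version, not RocketMQ's source: queues and Consumers are represented only by their indexes, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the two allocation ideas; not the RocketMQ classes themselves.
class QueueAllocationSketch {

    // Contiguous blocks: the first (queueCount % consumerCount) Consumers get one extra queue.
    static List<Integer> allocateAveragely(int queueCount, int consumerCount, int consumerIndex) {
        int base = queueCount / consumerCount;
        int mod = queueCount % consumerCount;
        int size = base + (consumerIndex < mod ? 1 : 0);
        int start = consumerIndex * base + Math.min(consumerIndex, mod);
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            result.add(start + i);
        }
        return result;
    }

    // Round robin: Consumer i takes queues i, i + consumerCount, i + 2 * consumerCount, ...
    static List<Integer> allocateByCircle(int queueCount, int consumerCount, int consumerIndex) {
        List<Integer> result = new ArrayList<>();
        for (int q = consumerIndex; q < queueCount; q += consumerCount) {
            result.add(q);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int c = 0; c < 3; c++) {
            System.out.println("averagely c" + c + " -> " + allocateAveragely(16, 3, c));
        }
        for (int c = 0; c < 3; c++) {
            System.out.println("by circle c" + c + " -> " + allocateByCircle(16, 3, c));
        }
    }
}
```

Calling either method with 4 queues and 5 Consumers returns an empty list for the Consumer at index 4, which is the situation where a Consumer is left without any queue.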
All Consumers under a ConsumerGroup consume messages from a Topic together, and each Consumer is responsible for the messages of N queues, but a queue cannot be consumed by more than one Consumer at the same time. What does this mean?
As you may have guessed, if a Topic has only 4 queues and there are 5 Consumers, one Consumer will not be assigned any queue. In RocketMQ, the number of queues under a Topic therefore directly determines the maximum number of Consumers that can do useful work, which means you cannot increase the consumption speed just by adding more Consumers.
Although the number of queues should be considered carefully when creating a Topic, reality is often less tidy. Even if the number of queues does not change, the number of Consumers will, for example when a Consumer goes online or offline, a Consumer crashes, or a new Consumer is added. Any expansion or contraction of the queues or of the Consumers leads to rebalancing, that is, the consumption queues are redistributed among the Consumers.
In RocketMQ, the Consumer periodically checks the number of queues of the Topic and the number of Consumers. If either changes, rebalancing is triggered.
Rebalancing is implemented internally by RocketMQ, so developers do not need to handle it themselves.
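Generally speaking, an MQ offers two ways to obtain messages: Pull, where the Consumer actively fetches messages from the Broker, and Push, where the Broker actively delivers messages to the Consumer.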
Whether it is Pull or Push, the Consumer will always interact with the Broker. The interaction methods generally include short connection, long connection, and polling.
It seems that RocketMQ supports both Pull and Push, but in fact Push is also implemented using Pull. So how does the Consumer interact with the Broker?
This is the ingenious part of RocketMQ's design. It is neither a short connection, nor a long connection, nor polling, but long polling.
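The Consumer sends a request to pull messages, and there are two possible situations. If the Broker has messages, it returns them right away and the Consumer sends the next request. If there are no messages, the Broker holds the request for a while: it responds as soon as a message arrives within that period, and otherwise responds when the hold times out, after which the Consumer sends a new request.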
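Long polling can be illustrated with a small stand-in for the Broker. This sketch uses only java.util.concurrent; the class name, the method names, and the idea of modeling the held request with BlockingQueue.poll are my own simplifications, not RocketMQ's implementation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical Broker stand-in that answers pull requests with long polling.
class LongPollingSketch {
    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>();

    void publish(String message) {
        messages.add(message);
    }

    // Returns at once if a message is ready; otherwise holds the request for up to
    // holdMillis and returns null if nothing arrived in that time.
    String pull(long holdMillis) {
        try {
            return messages.poll(holdMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LongPollingSketch broker = new LongPollingSketch();
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            broker.publish("hello");
        });
        producer.start();
        // Held by the Broker until the message shows up, well before the 5 s limit.
        System.out.println("first pull: " + broker.pull(5_000));
        // Nothing else is published, so this pull ends by timeout.
        System.out.println("second pull: " + broker.pull(100));
        producer.join();
    }
}
```

Compared with plain polling, the Consumer does not hammer the Broker with empty requests, and compared with a pure push model, the Consumer still decides when it is ready for more messages.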
RocketMQ supports transactional messages. After the Producer sends a transactional message to the Broker, the Broker stores it in the system Topic RMQ_SYS_TRANS_HALF_TOPIC, so the Consumer cannot consume this message yet.
The Broker runs a scheduled task that consumes the messages in RMQ_SYS_TRANS_HALF_TOPIC and initiates a check-back to the Producer. The check-back has three possible results: committed, rolled back, and unknown.
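The flow can be pictured with a toy model. In the sketch below two lists stand in for the half-message Topic and the real Topic, and the enum and method names are hypothetical. What happens for each result is my assumption based on how transactional messages are usually described: a commit makes the message visible to Consumers, a rollback discards it, and unknown leaves it in place to be checked again later.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of half messages; not the Broker's actual code.
class HalfMessageSketch {
    enum CheckResult { COMMIT, ROLLBACK, UNKNOWN }

    final List<String> halfTopic = new ArrayList<>(); // stands in for RMQ_SYS_TRANS_HALF_TOPIC
    final List<String> realTopic = new ArrayList<>(); // what Consumers can actually read

    void sendHalf(String message) {
        halfTopic.add(message);
    }

    // Applies the result of a check-back to one half message.
    void resolve(String message, CheckResult result) {
        if (!halfTopic.contains(message)) {
            return;
        }
        switch (result) {
            case COMMIT:
                halfTopic.remove(message);
                realTopic.add(message); // now visible to Consumers
                break;
            case ROLLBACK:
                halfTopic.remove(message); // dropped, never visible
                break;
            case UNKNOWN:
            default:
                break; // stays a half message until a later check-back
        }
    }

    public static void main(String[] args) {
        HalfMessageSketch broker = new HalfMessageSketch();
        broker.sendHalf("order-1");
        broker.sendHalf("order-2");
        broker.sendHalf("order-3");
        broker.resolve("order-1", CheckResult.COMMIT);
        broker.resolve("order-2", CheckResult.ROLLBACK);
        broker.resolve("order-3", CheckResult.UNKNOWN);
        System.out.println("visible: " + broker.realTopic);
        System.out.println("still half: " + broker.halfTopic);
    }
}
```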
A delayed message is one that, after being sent to the Broker, cannot be consumed by the Consumer immediately; it becomes consumable only after a certain period of time. RocketMQ supports only specific delay times: 1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h.
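Because only these fixed steps exist, a delay is chosen by level number rather than by an arbitrary duration. The sketch below parses the list above into a lookup table, counting levels from 1 so that level 3 means 10s. The class and method names are hypothetical; it only illustrates the mapping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that turns the fixed delay list into a level table.
class DelayLevelSketch {
    static final String LEVELS = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";

    // Parses tokens such as "30s", "5m" or "2h" into milliseconds, in list order.
    static List<Long> parseLevels(String spec) {
        List<Long> millis = new ArrayList<>();
        for (String token : spec.trim().split("\\s+")) {
            long value = Long.parseLong(token.substring(0, token.length() - 1));
            char unit = token.charAt(token.length() - 1);
            long factor;
            switch (unit) {
                case 's': factor = 1_000L; break;
                case 'm': factor = 60_000L; break;
                case 'h': factor = 3_600_000L; break;
                default: throw new IllegalArgumentException("unknown unit in " + token);
            }
            millis.add(value * factor);
        }
        return millis;
    }

    // Levels are 1-based: level 1 is the first entry of the list.
    static long delayMillis(int level) {
        List<Long> table = parseLevels(LEVELS);
        if (level < 1 || level > table.size()) {
            throw new IllegalArgumentException("level must be between 1 and " + table.size());
        }
        return table.get(level - 1);
    }

    public static void main(String[] args) {
        System.out.println("level 3 = " + delayMillis(3) + " ms");
        System.out.println("level 18 = " + delayMillis(18) + " ms");
    }
}
```

A delay that is not in the list, such as 45s, has no level of its own; the sender has to pick the nearest level that is acceptable.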
RocketMQ supports two forms of consumption: concurrent and sequential. For sequential consumption, you need to make sure that the messages that must stay in order are sent to the same queue. How do you choose the queue to send to? RocketMQ has several overloads for sending messages, one of which supports queue selection.
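A common way to keep related messages in one queue is to derive the queue index from a business key. The sketch below shows only that selection logic in plain Java; the class name, the method name, and the choice of an order ID as the key are assumptions for illustration, and in a real Producer this logic would sit inside the queue-selection callback.

```java
// Hypothetical selection logic: the same order ID always maps to the same queue index.
class OrderedQueueSelectorSketch {

    static int selectQueue(String orderId, int queueCount) {
        // floorMod keeps the index in [0, queueCount) even when hashCode() is negative.
        return Math.floorMod(orderId.hashCode(), queueCount);
    }

    public static void main(String[] args) {
        String[] steps = {"created", "paid", "shipped"};
        for (String step : steps) {
            int queue = selectQueue("order-1001", 4);
            System.out.println("order-1001 " + step + " -> queue " + queue);
        }
    }
}
```

All three steps of the order land in one queue, and since a queue is consumed by a single Consumer, they are consumed in the order they were sent. Note that the mapping changes if the number of queues changes.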
The Producer sends messages to the Broker, and the Broker needs to persist them. RocketMQ supports two persistence (disk flush) strategies: synchronous flush and asynchronous flush.
To ensure the reliability and availability of MQ, follower nodes are generally deployed in production, and they copy the master's data. RocketMQ supports two replication strategies: synchronous replication and asynchronous replication.
Whether "written" means written to PageCache or to disk depends on the configuration of the follower Broker.
RocketMQ provides three ways to send messages: synchronous, asynchronous, and one-way.
In actual development, synchronous sending is generally used. If you want to improve RocketMQ's performance, you usually adjust parameters on the Broker side, especially the disk flush strategy and the replication strategy.
When sending messages, if a MessageQueueSelector is used, the retry mechanism for sending does not take effect.
The response to sending a message can be one of the following four statuses:
public enum SendStatus {
    SEND_OK,
    FLUSH_DISK_TIMEOUT,
    FLUSH_SLAVE_TIMEOUT,
    SLAVE_NOT_AVAILABLE,
}
Every status except the first indicates a problem. To make sure the message is not lost, set the Producer parameter RetryAnotherBrokerWhenNotStoreOK to true.
If a send fails and the retry goes to the same Broker, it will most likely fail again. The clever part of RocketMQ's design is that, on retry, it automatically avoids that Broker and chooses another one. So far, however, asynchronous sending is not as smart and only retries on the same Broker, so synchronous sending is strongly recommended.
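The idea of avoiding the Broker that just failed can be sketched as a round-robin choice that skips one Broker name. This is my own simplified illustration, not the Producer's source: each queue is represented only by the name of the Broker that hosts it, and the class and method names are hypothetical.

```java
import java.util.List;

// Simplified illustration of skipping the Broker that failed on the previous attempt.
class BrokerAvoidanceSketch {

    // queueBrokers.get(i) is the Broker hosting queue i. Starting at counter, returns the
    // first queue index whose Broker differs from lastFailedBroker (null means no failure yet).
    static int selectQueue(List<String> queueBrokers, int counter, String lastFailedBroker) {
        int n = queueBrokers.size();
        for (int i = 0; i < n; i++) {
            int index = Math.floorMod(counter + i, n);
            if (lastFailedBroker == null || !queueBrokers.get(index).equals(lastFailedBroker)) {
                return index;
            }
        }
        // Every queue lives on the failed Broker, so there is nothing to avoid.
        return Math.floorMod(counter, n);
    }

    public static void main(String[] args) {
        List<String> queueBrokers = List.of("broker-a", "broker-a", "broker-b", "broker-b");
        int first = selectQueue(queueBrokers, 0, null);
        System.out.println("first attempt: queue " + first + " on " + queueBrokers.get(first));
        // Pretend the first attempt failed and retry while avoiding that Broker.
        int retry = selectQueue(queueBrokers, first + 1, queueBrokers.get(first));
        System.out.println("retry: queue " + retry + " on " + queueBrokers.get(retry));
    }
}
```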
RocketMQ provides two fault avoidance mechanisms, controlled by the parameter SendLatencyFaultEnable.
The delayed backoff mechanism looks very useful, but in most cases a Broker becomes unavailable only because it is briefly busy or the network briefly fails, and it recovers almost immediately. If delayed backoff is turned on, a Broker that is already available again is still avoided for a period of time, the other Brokers become busier, and the situation may get worse.
The Consumer has two parameters that control the degree of parallelism of consumption: ConsumeThreadMin and ConsumeThreadMax. It looks as if the number of consumer threads is ConsumeThreadMin when few messages have accumulated on the Consumer side, and new threads are started automatically as more messages accumulate, until the number of consumer threads reaches ConsumeThreadMax. But this is not the case. The Consumer holds a thread pool internally and uses an unbounded queue, so the ConsumeThreadMax parameter has no effect. In actual development, ConsumeThreadMin and ConsumeThreadMax are therefore often set to the same value.
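The effect is easy to reproduce with the JDK's own ThreadPoolExecutor, which only starts threads beyond the core size when its work queue refuses a task, and an unbounded queue never refuses. The sketch below is a standalone demonstration with hypothetical names and does not touch RocketMQ itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Demonstrates that the maximum pool size is never reached with an unbounded queue.
class UnboundedQueuePoolSketch {

    // Submits taskCount blocking tasks and reports the largest number of threads ever created.
    static int largestPoolSize(int min, int max, int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                min, max, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the thread busy until all tasks are submitted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        return largest;
    }

    public static void main(String[] args) {
        // 50 tasks pile up, yet the pool stays at the minimum of 2 threads instead of growing to 10.
        System.out.println("largest pool size: " + largestPoolSize(2, 10, 50));
    }
}
```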
If the consumption progress cannot be found, where does the Consumer start consuming? RocketMQ supports three starting points: the latest message, the earliest message, and a specified timestamp.
RocketMQ sets up a retry queue for each ConsumerGroup, with the Topic name %RETRY%consumerGroup, to hold the messages that need to be redelivered to that ConsumerGroup. A retry requires a certain delay, so RocketMQ first saves the retry message to the delay queue with the Topic name SCHEDULE_TOPIC_XXXX. A background scheduled task then, after the corresponding delay, saves it back to the %RETRY%consumerGroup retry queue.
I originally thought a primer would be easy to write, but I was too optimistic. A primer is aimed at readers who have had little contact with RocketMQ, but RocketMQ is not that simple, and a single blog post cannot get such readers started smoothly. So while writing I kept asking myself: is this point important, does it need a careful description, or can it be left out? As you can see, this article mostly introduces concepts and barely touches the API level, because once the API is involved, I estimate it would not be finished in two weeks.