Let me give you an example. We once built a MySQL binlog synchronization system. The load was very high: hundreds of millions of records had to be synchronized every day, copying the data intact from one MySQL database to another (mysql -> mysql). A common use case is a big data team that needs a synchronized copy of a MySQL database so it can run various complex operations on the data of the company's business systems.
When you add, modify, and then delete a row in MySQL, three binlog entries are produced, one for each operation. These three binlog entries are sent to MQ and then consumed and executed in sequence. At the very least you must guarantee that they are applied in order, right? Otherwise, what was originally add, modify, delete gets executed as delete, modify, add. Isn't that completely wrong?
When this data was synchronized, the row should have been deleted at the end. But because the order was wrong, the row was retained instead, and the synchronized data ended up incorrect.
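To make the failure concrete, here is a minimal sketch that replays the add/modify/delete sequence (written as insert/update/delete) against an in-memory dictionary standing in for the target table. The function name `apply_binlog` and the event tuples are hypothetical and are not a real binlog format; the sketch also assumes that updating or deleting a missing row is silently ignored.

```python
def apply_binlog(events):
    """Replay binlog-style events against a dict that stands in for a table."""
    table = {}
    for op, row_id, value in events:
        if op == "insert":
            table[row_id] = value
        elif op == "update":
            # Assumption: updating a row that does not exist is a no-op.
            if row_id in table:
                table[row_id] = value
        elif op == "delete":
            # Assumption: deleting a row that does not exist is a no-op.
            table.pop(row_id, None)
    return table


events = [
    ("insert", 1, "v1"),
    ("update", 1, "v2"),
    ("delete", 1, None),
]

in_order = apply_binlog(events)                        # row is gone at the end
reversed_order = apply_binlog(list(reversed(events)))  # row survives

print(in_order)        # {}
print(reversed_order)  # {1: 'v1'}
```

Applied in order, the table ends up empty. Applied in reverse, the delete and update find nothing to act on and the insert leaves the row behind, which is exactly the retained-row error described above.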
Let’s look at two scenarios in which messages can get out of order:
RabbitMQ: one queue, multiple consumers. For example, the producer sends three messages to RabbitMQ in the order data1/data2/data3, and they are placed into a single RabbitMQ in-memory queue. Three consumers each take one of the three messages from MQ. Consumer 2 happens to finish first and saves data2 to the database, followed by data1 and data3. Isn't that obviously out of order?
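The race in the RabbitMQ scenario can be illustrated without a broker. The sketch below is a deterministic simulation, not RabbitMQ client code: it assumes every message is handed to its own consumer at the same moment, so the order of the database writes depends only on how long each consumer takes. The timings are made up for illustration.

```python
def completion_order(messages, processing_ms):
    """Return messages in the order their consumers finish.

    Every message starts processing at time 0 on its own consumer,
    so its finish time equals its processing time.
    """
    finish_times = [(processing_ms[m], m) for m in messages]
    return [m for _, m in sorted(finish_times)]


messages = ["data1", "data2", "data3"]                   # order sent by the producer
processing_ms = {"data1": 30, "data2": 10, "data3": 50}  # hypothetical timings

written = completion_order(messages, processing_ms)
print(written)  # ['data2', 'data1', 'data3']
```

The queue delivered the messages in order, but the order in which they reach the database is decided by whichever consumer finishes first.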
Kafka: For example, we create a topic with three partitions. When the producer writes a message, it can specify a key. If we use an order id as the key, then all data related to that order is guaranteed to go to the same partition, and the data within a partition is ordered.
When a consumer reads data from the partition, it also receives it in order. Up to this point the order is preserved and nothing is scrambled. Next, however, we may start multiple threads inside the consumer to process messages concurrently. If the consumer processes messages in a single thread and each message is time-consuming, say tens of milliseconds, then only a few dozen messages can be processed per second, which is far too low a throughput. But once multiple threads run concurrently, the order may be scrambled.
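The key-based routing described for Kafka above can be sketched as a hash of the key taken modulo the number of partitions. This is only an illustration of the idea: `partition_for` is a hypothetical helper, and it uses `zlib.crc32` as a stand-in hash, whereas Kafka's own default partitioner uses a different hash function.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index; the same key always maps to the same index."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

# Events for two orders, interleaved in the order the producer sends them.
events = [
    ("order-1", "create"),
    ("order-2", "create"),
    ("order-1", "pay"),
    ("order-2", "cancel"),
    ("order-1", "ship"),
]

for key, action in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, action))

for p, contents in partitions.items():
    print(p, contents)
```

Whichever partition an order lands in, all of its events are appended there in send order, so a single-threaded reader of that partition sees them in sequence.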
Solution
RabbitMQ
Split the data across multiple queues, each with exactly one consumer. This simply means more queues, which is admittedly more trouble. Alternatively, keep a single queue with a single consumer, have that consumer line the messages up in internal in-memory queues, and then hand them off to different underlying workers for processing.
Kafka
One topic, one partition, one consumer, with single-threaded consumption inside the consumer. Single-threaded throughput is too low, so this approach is generally not used.
Create N in-memory queues and route data with the same key to the same in-memory queue. Then start N threads, each consuming exactly one in-memory queue. This preserves the order of messages that share a key.
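The N-queue, N-thread approach can be sketched with Python's standard `queue` and `threading` modules. This is a minimal illustration under stated assumptions, not production consumer code: the `dispatch` and `worker` functions, the `STOP` sentinel, and the `zlib.crc32` routing hash are all hypothetical choices, and the message list stands in for what the consumer would actually pull from the MQ.

```python
import queue
import threading
import zlib

NUM_WORKERS = 3
STOP = object()  # sentinel that tells a worker to exit


def worker(q, results, lock):
    """Consume a single in-memory queue, strictly one message at a time."""
    while True:
        item = q.get()
        if item is STOP:
            break
        key, payload = item
        with lock:  # protect the shared results dict
            results.setdefault(key, []).append(payload)


def dispatch(messages, num_workers=NUM_WORKERS):
    """Route each message to the in-memory queue chosen by its key."""
    queues = [queue.Queue() for _ in range(num_workers)]
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(q, results, lock)) for q in queues
    ]
    for t in threads:
        t.start()
    for key, payload in messages:
        idx = zlib.crc32(key.encode("utf-8")) % num_workers
        queues[idx].put((key, payload))
    for q in queues:
        q.put(STOP)
    for t in threads:
        t.join()
    return results


messages = [
    ("row-1", "insert"),
    ("row-2", "insert"),
    ("row-1", "update"),
    ("row-2", "update"),
    ("row-1", "delete"),
]
processed = dispatch(messages)
print(processed["row-1"])  # ['insert', 'update', 'delete']
print(processed["row-2"])  # ['insert', 'update']
```

Different keys can be processed in parallel on different threads, while all messages for one key pass through one queue and one thread, so their relative order is kept. The same pattern fits the single-consumer RabbitMQ variant, where the consumer plays the role of `dispatch`.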