On the one hand: On the cluster management page corresponding to the cloud database PolarDB in the Alibaba Cloud console, click one click in the diagnosis and optimization module During the diagnostic session management, it was found that the execution time of a certain update sql was very long and very frequent;
On the other hand: The business execution time began to appear continuously in the business monitoring system Alarm information is issued, and the business data of the alarm continues to increase, and some operations affect customer use.
Since the business flow involved in business operations is relatively complex, from a purely technical point of view, it is not a focused discussion, in order to better understand the problem. The reason why it happened is to use an analogy to describe the complex business as follows: There are three tables in the database, the first table is t_grandfather (grandfather’s table), and the second table is t_father (parent table), the third table t_grandson (descendant table), the DDL is as follows:
CREATE TABLE `t_grandfather ` ( `id` int(11) NOT NULL AUTO_INCREMENT, `count` int(11) NOT NULL DEFAULT 0 COMMENT '子孙后代数量', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='老爷表'; CREATE TABLE `t_father ` ( `id` int(11) NOT NULL AUTO_INCREMENT, `grandfather_id` int(11) NOT NULL COMMENT '老爷表id', PRIMARY KEY (`id`), KEY `idx_grandfather_id` (`grandfather_id`), ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='老爸表'; CREATE TABLE `t_grandson` ( `id` int(11) NOT NULL AUTO_INCREMENT, `grandfather_id` int(11) NOT NULL COMMENT '老爷表id', PRIMARY KEY (`id`), KEY `idx_grandfather_id` (`grandfather_id`), ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='孙子表';
The business logic relationship between the three tables is, first create the grandson table, and then This man takes many wives (business) and will continue to have children. When a child is born, a father's table will be generated. At the same time, the count=count 1 of the man's table will be updated, indicating that a new offspring has been added. The man's wife (business) ) When babies continue to be born, the babies born before will also have wives, and their wives will also give birth to babies. For the master, it means that he has a grandson (generating new business data). After having a grandson, It is also necessary to update the count=count 1 of the master table, which means that a new descendant has been added, and so on, the descendants are endless (business data is continuously generated), as shown in the following figure:
The logic of the ancestral code is that as long as there are new additions to the t_father table and t_grandson, t_grandfather will be updated. This logical design is not a big problem, but considering the large amount of data in the grandchild table, a very serious performance problem will arise here. The following is part of the pseudo code extracted from the business
/** * 处理 father 的业务 */ public void doFatherBusiness (){ //do fatherBusiness baba .... 此处省 // 插入 t_father 表 if (fatherMapper.inster(father)){ //update t_grandfather set count=count+1 where id= #{grandfatherId} grandfatherMapper.updateCount(father.getGrandfatherId ()) ; } } /** * 处理 grandson 的业务 */ public void doGrandsonBusiness (){ //do grandson baba .... 此处省略 // 插入 t_grandson 表 if(grandsonMapper.inster(grandson)){ //update t_grandfather set count=count+1 where id= #{grandfatherId} grandfatherMapper.updateCount(grandson.getGrandfatherId()); } }
When multiple businesses (threads) call the above method respectively, it will cause huge pressure on the update operation of the t_grandfather table, especially When updating the same ID, the competition for locks within the MySQL server is very fierce. The final performance is consistent with the background description above.
1. Temporary solution:
On the one hand, in the Alibaba Cloud console, limit the flow of SQL and block it normally. The session is forcibly killed so that the data thread is not blocked and resources are released. On the other hand, the number of nodes of the service that receives the request is reduced, with the purpose of reducing the amount of business data entering;
2 . Long-term solution
On the one hand, change the above business logic. When inserting the t_grandson table and the t_father table, the count field of the t_grandfather table will no longer be updated; on the other hand, when the count statistical requirement is needed, Switch all to other methods;
The above is the detailed content of How to solve production failures caused by Mysql update sql. For more information, please follow other related articles on the PHP Chinese website!