Resolving a MySQL Slave Replication Error
Analysis and Resolution
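The master database serves production traffic, and the slave database acts as the reporting server. The two are kept in sync through MySQL binlog-based master-slave replication.
The reports were missing all data from 10/28 up to the present (11/18). Comparing the main tables on the master and the slave showed that:
the slave has no data after 10/28, and nothing has been replicated since then.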
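1. Check the replication status on the slave (show slave status\G):
Focus on the fields highlighted in yellow in the screenshot:
Slave_IO_Running: Yes
Slave_SQL_Running: No (the SQL thread is not running; this is where the problem lies)
The field on the pink background in the screenshot is Last_Error: ....
'Duplicate entry '1169595' for key 'PRIMARY'' on query. Default database: ''. Query: 'insert into user ...
The error itself is simple: the insert violates the primary key uniqueness constraint.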
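The status check above can also be scripted. The following is a minimal sketch, assuming the vertical (\G) output of show slave status has already been captured as text; the function name parse_slave_status and the sample string are illustrative and not part of the original setup. Values that wrap onto several lines, such as a long Last_Error query, are not handled.

```python
def parse_slave_status(text):
    """Parse the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "***** 1. row *****" separator.
        if not line or line.startswith("*"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            status[key.strip()] = value.strip()
    return status


# Hypothetical sample reproducing the two fields discussed above.
sample = """
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: No
"""

status = parse_slave_status(sample)
print(status["Slave_IO_Running"], status["Slave_SQL_Running"])  # prints: Yes No
```

A Slave_SQL_Running value other than Yes is the signal to go on and read Last_Error.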
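2. Check the MySQL error log. The error log path is set in my.cnf. Open the file with vi and search for the date stamp 151028, which leads to the following entries:
According to the log, MySQL shut down abnormally on 10/28 at 1:28:55. It was restarted at 1:29 and began recovery. An I/O error was logged at 1:29:16, and at 1:30:19 an SQL error occurred and the slave SQL thread aborted (stopped working).
The MySQL log also suggests a fix:
Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000274' position 504869752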
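The binlog file and position in that message can be pulled out of the error log with a regular expression instead of being read by eye. This is a small sketch that assumes the message wording quoted above; find_stop_position is a hypothetical helper name.

```python
import re

# Matches the tail of the message MySQL writes when the SQL thread aborts.
STOP_RE = re.compile(r"We stopped at log '([^']+)' position (\d+)")


def find_stop_position(log_text):
    """Return (binlog_file, position) from the last matching message, or None."""
    matches = STOP_RE.findall(log_text)
    if not matches:
        return None
    log_file, pos = matches[-1]
    return log_file, int(pos)


line = ('Fix the problem, and restart the slave SQL thread with "SLAVE START". '
        "We stopped at log 'mysql-bin.000274' position 504869752")
print(find_stop_position(line))  # prints: ('mysql-bin.000274', 504869752)
```

The last match is used because the log may contain several such messages, and the most recent one describes the current stop.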
Following that advice, the slave was told which binlog file and position to execute from and was then restarted. The error persisted, as shown below:
The cause of the error is unchanged:
Last_SQL_Error: Error 'Duplicate entry '1169595' for key 'PRIMARY'' on query. Default database: ''. Query: 'insert into user (type,lang,ipAddr,activityStatus,extUserId,endpoint,createTime, email, userName, mobile, storageSize, tuner
)values ('normal','zh-xx','xxxx','active','913151000777430','xxx',now(),null,null,null,0,0)'
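So although the cause is known, the duplicate-key error still has to be resolved before replication can continue. That calls for a closer look at how MySQL replication and the binlog work. In principle, the slave fetches the master's binlog and executes it. An execution error like this one means the row already exists in the slave database, most likely because the position recorded in the log is inaccurate. The only way forward is to analyze the binlog itself.
Reference on viewing the binlog: "MySQL的binlog日志" (MySQL binlog logs).
The second method from that reference is used here: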
mysql> show binlog events [IN 'log_name'] [FROM pos] [LIMIT [offset,] row_count];
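Options:
IN 'log_name': the binlog file to query (defaults to the first binlog file if omitted)
FROM pos: the position to start reading from (defaults to the first position in the file if omitted)
LIMIT [offset,]: the number of events to skip (defaults to 0 if omitted)
row_count: the maximum number of events to return (defaults to all events if omitted)
The log_name and pos are already known from the error log, so the query is as follows:
The binlog records the SQL of every database operation, and each record contains one operation.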
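As an illustration of how the options combine, the sketch below assembles a show binlog events statement from its parts. The helper name show_binlog_events_sql is hypothetical. The file name and position are the ones reported in the error log, while the row count of 20 is an assumed value chosen only for the example.

```python
def show_binlog_events_sql(log_name=None, pos=None, offset=None, row_count=None):
    """Build a SHOW BINLOG EVENTS statement from the optional clauses."""
    parts = ["SHOW BINLOG EVENTS"]
    if log_name is not None:
        # The name comes from our own error log; reject anything unexpected.
        if "'" in log_name:
            raise ValueError("unexpected quote in binlog file name")
        parts.append("IN '{}'".format(log_name))
    if pos is not None:
        parts.append("FROM {:d}".format(pos))
    if offset is not None and row_count is None:
        raise ValueError("offset requires row_count")
    if row_count is not None:
        if offset is not None:
            parts.append("LIMIT {:d}, {:d}".format(offset, row_count))
        else:
            parts.append("LIMIT {:d}".format(row_count))
    return " ".join(parts) + ";"


# File and position from the error log; row_count=20 is an assumed page size.
print(show_binlog_events_sql("mysql-bin.000274", 504869752, row_count=20))
# prints: SHOW BINLOG EVENTS IN 'mysql-bin.000274' FROM 504869752 LIMIT 20;
```

Limiting the row count keeps the output readable, since the events after the stop position have to be inspected one by one.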
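The failing statement is easy to find in the binlog. The real question is which position the slave SQL thread had actually reached. The only option here is the brute-force one: take each SQL statement in the binlog and check whether its data is already present in the slave database. For example: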
Query | 1 | 504873619 | replace into content_preference(userId,contentId,playRecordId,status,createTime)
values (587658,15308,1544691,0,now())
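This statement writes one row into content_preference. Check whether the content_preference table on the slave contains a row with userId=587658 and contentId=15308. If it does, the statement has already been executed.
Keep working down the event list until you reach the first statement that has not been executed. Its position is the one to resume from.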
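The manual search described above amounts to a linear scan over the binlog events. The sketch below shows that logic under stated assumptions: events is a list of (pos, sql) pairs in binlog order, where pos is the start position of the event, and already_applied is a caller-supplied check against the slave. Both names, the positions and the statements in the example are made up for illustration.

```python
def first_unapplied_pos(events, already_applied):
    """Return the start position of the first event missing on the slave.

    events: iterable of (pos, sql) pairs in binlog order.
    already_applied: callable taking the SQL text and returning True when
        the slave already contains the effect of that statement.
    Returns None when every event has already been applied.
    """
    for pos, sql in events:
        if not already_applied(sql):
            return pos
    return None


# Made-up events; a real check would query the slave's tables instead.
events = [
    (100, "insert into t values (1)"),
    (200, "insert into t values (2)"),
    (300, "insert into t values (3)"),
]
applied_on_slave = {"insert into t values (1)", "insert into t values (2)"}

print(first_unapplied_pos(events, lambda sql: sql in applied_on_slave))  # prints: 300
```

The scan assumes the slave applied events in order with no gaps, so the first missing statement marks the resume point.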
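At this point the position up to which the slave has executed the binlog is known. Reset the slave's binlog position, start the slave, and check the status. Both threads are now running, as shown below: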
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000274',MASTER_LOG_POS=504873114;
Query OK, 0 rows affected (1.98 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: 172.17.128.15
Master_User: xxx
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000274
Read_Master_Log_Pos: 693913486
Relay_Log_File: app3-relay-bin.000002
Relay_Log_Pos: 1819098
Relay_Master_Log_File: mysql-bin.000274
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
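Once replication had caught up, the report exports were correct again and the problem was solved.
Reflection
MySQL was restarted on 10/28 because the production service hit a fault that day that could not be fixed in place, so the machine was rebooted. Nobody considered that the reporting database was on the same machine, and nobody checked the replication status after the reboot. The problem was only discovered when users ran the reports.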
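After an unexpected shutdown, MySQL does log the error on the next startup, together with the binlog file and position that replication had reached. However, if the slave was in the middle of executing binlog SQL when MySQL went down, the recorded position can be stale, so on the next start some binlog events are executed a second time. This cannot really be called a MySQL bug: the mechanism ensures the slave does not lose data, at the cost of someone having to find the correct position by hand.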
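The effect of a stale position can be reproduced with a toy model. The sketch below is not MySQL's implementation. It only assumes, as described above, that applied rows are durable while the saved position may lag behind them. All names (run_slave, save_every, crash_at) are hypothetical.

```python
class DuplicateEntry(Exception):
    """Stands in for MySQL's duplicate-key error in this toy model."""


def run_slave(table, events, saved_pos, save_every, crash_at=None):
    """Apply events[saved_pos:] to table, a dict keyed by primary key.

    The position is saved only after every save_every events, so it can lag
    behind the rows already written. If crash_at is given, the run stops
    just before that event index and unsaved progress is forgotten.
    Returns the position a restart would resume from.
    """
    pos = saved_pos
    while pos < len(events):
        if crash_at is not None and pos == crash_at:
            return saved_pos  # crash: rows are kept, the newer position is lost
        key, row = events[pos]
        if key in table:
            raise DuplicateEntry("Duplicate entry '%s' for key 'PRIMARY'" % key)
        table[key] = row
        pos += 1
        if pos % save_every == 0:
            saved_pos = pos
    return pos


events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
table = {}

# Crash after two events were applied but before the position was saved.
resume = run_slave(table, events, saved_pos=0, save_every=3, crash_at=2)

try:
    run_slave(table, events, saved_pos=resume, save_every=3)
except DuplicateEntry as exc:
    print(exc)  # prints: Duplicate entry '1' for key 'PRIMARY'

# Manually resuming from the first unapplied event (index 2) succeeds.
run_slave(table, events, saved_pos=2, save_every=3)
```

In the model, no row is lost across the crash. The cost is that the restart replays rows that already exist, which is the same trade-off the paragraph above describes.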
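Follow-up:
1. When a machine has to be rebooted, run stop slave manually and then shut MySQL down cleanly. After the reboot, run start slave manually. Replication should then resume normally.
2. Be very careful when restarting production systems. After a restart, verify that every dependent component has come back up. Adding monitoring for the MySQL slave's replication status would help.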
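A monitoring check for the slave could start from rules like the ones below. This is a sketch only: it assumes the fields of show slave status are already available as a dict of strings, and both the function name replication_alerts and the 300-second lag threshold are assumptions, not values from the incident.

```python
def replication_alerts(status, max_lag_seconds=300):
    """Return a list of human-readable problems found in a slave status dict."""
    alerts = []
    # Both threads must report Yes for replication to be working.
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            alerts.append("{} is {}".format(thread, status.get(thread)))
    if status.get("Last_Error"):
        alerts.append("Last_Error: " + status["Last_Error"])
    # Seconds_Behind_Master is NULL while the SQL thread is stopped.
    lag = status.get("Seconds_Behind_Master")
    if lag not in (None, "", "NULL") and int(lag) > max_lag_seconds:
        alerts.append("replication lag is {}s".format(int(lag)))
    return alerts


# Status resembling the broken state described in this post.
broken = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "No",
    "Last_Error": "Duplicate entry '1169595' for key 'PRIMARY'",
    "Seconds_Behind_Master": "NULL",
}
for alert in replication_alerts(broken):
    print(alert)
```

Run on a schedule, a check like this would have flagged the stopped SQL thread on the day of the reboot instead of three weeks later.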
