MySQL GTID Error Handling Summary

黄舟

Feb 13, 2017, 11:20 AM

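MySQL GTID replication evolved from traditional MySQL master-slave replication. It identifies every transaction uniquely by combining the originating server's UUID with a transaction ID. With this scheme we no longer need to track log_file and log_pos; we simply tell the slave which server is its master. This simplifies both replication setup and failover, and it is safer and more reliable than traditional replication. Because the GTIDs issued by a server are consecutive with no gaps, a conflicting transaction can be skipped on the slave by injecting an empty transaction. This article describes how to handle errors in a GTID-based master-slave architecture.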

I. GTID-related features

Configuring MySQL GTID master-slave replication
Building a GTID master-slave setup with mysqldump

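II. How to skip a conflicting transaction with GTID

    Many unpredictable situations can cause transaction conflicts in MySQL replication, leaving replication failed or stopped, at which point it has to be repaired
    In a GTID-based master-slave architecture, repairing replication mostly means resolving transaction conflicts
    GTID replication does not support skipping transactions by setting sql_slave_skip_counter as in traditional replication
    Method: inject an empty transaction to fill the gap left by the skipped transaction; this is the equivalent of (set global sql_slave_skip_counter = 1) in traditional replication
    Steps:
            stop slave;
            set gtid_next='xxxxxxx:N'; -- the GTID to assign to the next transaction, i.e. the GTID to skip
            begin;
            commit;  -- inject an empty transaction
            set gtid_next='AUTOMATIC'; -- return to automatically assigned GTIDs
            start slave; -- resume replication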
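
The steps above are easy to mistype under pressure, so it can help to generate them. The following is only a minimal sketch: the helper name skip_statements is hypothetical (it is not part of MySQL or any client library), and it only builds the statement text; running the statements on the slave is left to you.

```python
import re

# A single GTID: the source server's UUID, a colon, and a positive transaction number.
GTID_RE = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*$"
)


def skip_statements(gtid):
    """Return the statements that commit an empty transaction under `gtid`.

    Hypothetical helper for illustration; it mirrors the manual steps
    listed in this article and does not talk to a server.
    """
    if not GTID_RE.match(gtid):
        raise ValueError("not a single GTID: %r" % gtid)
    return [
        "stop slave;",
        "set gtid_next='%s';" % gtid,
        "begin;",
        "commit;",
        "set gtid_next='AUTOMATIC';",
        "start slave;",
    ]


# Example: the GTID skipped in section 3 of this article.
print("\n".join(skip_statements("1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92")))
```

Validating the GTID first guards against pasting a whole GTID set (for example a range such as 1-92) where a single GTID is expected.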


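III. Common types of GTID transaction conflicts

    1. A row is inserted on the master, and the slave reports a primary key conflict
    2. A row is updated on the master, but the slave has no matching row to update
    3. A row is deleted on the master, but the slave has no matching row to delete
    4. Using a delayed slave to recover an object accidentally deleted on the master
    5. The master's binary logs have been purged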

IV. Demonstrations

Replication topology used for the demonstrations
# mysqlrplshow --master=root:pass@192.168.1.233:3306 --discover-slaves-login=root:pass --verbose
WARNING: Using a password on the command line interface can be insecure.
# master on 192.168.1.233: ... connected.
# Finding slaves for master: 192.168.1.233:3306

# Replication Topology Graph
192.168.1.233:3306 (MASTER)
   |
   +--- 192.168.1.245:3306 [IO: Yes, SQL: Yes] - (SLAVE)
   |
   +--- 192.168.1.247:3306 [IO: Yes, SQL: Yes] - (SLAVE)

(root@192.168.1.233)[tempdb]>show slave hosts;
+-----------+---------------+------+-----------+--------------------------------------+
| Server_id | Host          | Port | Master_id | Slave_UUID                           |
+-----------+---------------+------+-----------+--------------------------------------+
|       245 | 192.168.1.245 | 3306 |       233 | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
|       247 | 192.168.1.247 | 3306 |       233 | 13a26fc1-555a-11e6-b5e0-000c292e1642 |
+-----------+---------------+------+-----------+--------------------------------------+

-- MySQL version used in the demonstrations
(root@192.168.1.233)[tempdb]>show variables like 'version';
+---------------+------------+
| Variable_name | Value      |
+---------------+------------+
| version       | 5.7.12-log |
+---------------+------------+

-- Check whether GTID is enabled
(root@192.168.1.233)[tempdb]>show variables like '%gtid_mode%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode     | ON    |
+---------------+-------+

-- On the master, the GTID-based dump threads are visible, as shown below
(root@192.168.1.233)[tempdb]>show processlist;
+----+------+-----------------------+--------+------------------+------+
| Id | User | Host                  | db     | Command          | Time |
+----+------+-----------------------+--------+------------------+------+
| 17 | repl | node245.edq.com:52685 | NULL   | Binlog Dump GTID | 2738 |
| 18 | repl | node247.edq.com:33516 | NULL   | Binlog Dump GTID | 2690 |
| 24 | root | localhost             | tempdb | Query            |    0 |
+----+------+-----------------------+--------+------------------+------+

1. The slave reports a duplicate primary key (Errno: 1062)

(root@Master)[tempdb]>create table t1 (
            -> id tinyint not null primary key,ename varchar(20),blog varchar(50));

(root@Master)[tempdb]>insert into t1 
            -> values(1,'leshami','http://blog.csdn.net/leshami');

(root@Master)[tempdb]>insert into t1 
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');

-- With binary logging turned off, the following delete is not replicated, so the row stays on the slave
(root@Master)[tempdb]>set sql_log_bin=off;

(root@Master)[tempdb]>delete from t1 where ename='robin';

(root@Master)[tempdb]>set sql_log_bin=on;

(root@Master)[tempdb]>insert into t1 
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');

-- The slave status shows an error reporting a duplicate primary key
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
Last_Errno: 1062
Last_Error: Could not execute Write_rows event on table tempdb.t1; Duplicate entry '2' for key 'PRIMARY', 
                        Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; 
                        the event's master log node233-binlog.000004, end_log_pos 4426
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
 Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-89
     Auto_Position: 1

-- One fix, shown below, is to delete the conflicting row on the slave
(root@Slave)[tempdb]>stop slave;

(root@Slave)[tempdb]>delete from t1 where ename='robin';

(root@Slave)[tempdb]>start slave;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
 78336cdc-8cfb-11e6-ba9f-000c29328504:1  -- an extra GTID appears here; it comes from the delete executed on the slave, and its UUID matches that of 192.168.1.245
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 

(root@Slave)[tempdb]>show variables like '%uuid%';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
+---------------+--------------------------------------+

2. The slave cannot find the row to be updated (Errno: 1032)

-- First, delete the leshami row on the slave
(root@Slave)[tempdb]>delete from t1 where ename='leshami';

-- Next, update the leshami row on the master
(root@Master)[tempdb]>update t1 set 
            -> blog='http://blog.csdn.net/robinson_0612' where ename='leshami';
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

-- Check the slave status
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table tempdb.t1; Can't find record in 't1',
                Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
                the event's master log node233-binlog.000004, end_log_pos 4769
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
        78336cdc-8cfb-11e6-ba9f-000c29328504:1-2

-- Use mysqlbinlog on the master to look up the binlog file and position named in the error and find the corresponding SQL statement, as shown below
-- In the decoded update, the WHERE part is the row before the update and the SET part is the row after it, so the pre-update row can be inserted into the slave
# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -A '10' 4769
#161009 13:46:34 server id 233 end_log_pos 4769 CRC32 0xb60df74e Update_rows: table id 147 flags: STMT_END_F
### UPDATE `tempdb`.`t1`
### WHERE
###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
###   @3='http://blog.csdn.net/leshami' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
### SET
###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
# at 4769
#161009 13:46:34 server id 233  end_log_pos 4800 CRC32 0xa9669811       Xid = 1749
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

(root@Slave)[tempdb]>select * from t1;
+----+-------+------------------------------------+
| id | ename | blog                               |
+----+-------+------------------------------------+
|  2 | robin | http://blog.csdn.net/robinson_0612 |
+----+-------+------------------------------------+

-- Insert the pre-update row on the slave so that the pending update can be applied
(root@Slave)[tempdb]>stop slave sql_thread;

(root@Slave)[tempdb]>insert into t1 values(1,'leshami','http://blog.csdn.net/leshami');

(root@Slave)[tempdb]>start slave sql_thread;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-3
                Auto_Position: 1

3. The slave cannot find the row to be deleted (Errno: 1032)

-- If a row is deleted on the master and the slave has no matching row, the transaction can simply be skipped
-- First, delete a row on the slave
(root@Slave)[tempdb]>delete from t1 where ename='robin';

-- Next, delete the same row on the master
(root@Master)[tempdb]>delete from t1 where ename='robin';

-- The slave reports that it cannot find the matching row
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
Last_SQL_Error: Could not execute Delete_rows event on table tempdb.t1; Can't find record in 't1',
                Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
                the event's master log node233-binlog.000004, end_log_pos 5070
Last_SQL_Error_Timestamp: 161009 15:08:06
    Master_SSL_Crl: 
Master_SSL_Crlpath: 
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
 Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
                    78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
     Auto_Position: 1

-- Skip the transaction by injecting an empty transaction
(root@Slave)[tempdb]>stop slave sql_thread;

(root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92';

(root@Slave)[tempdb]>begin;commit;

(root@Slave)[tempdb]>set gtid_next='AUTOMATIC';

(root@Slave)[tempdb]>start slave sql_thread;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-92,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
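
The GTID passed to gtid_next in this example (transaction 92) was read off by comparing Retrieved_Gtid_Set with Executed_Gtid_Set. A rough sketch of that comparison follows. The function names are hypothetical, the parser expands ranges into plain sets (fine for small demonstrations, not for very large ranges), and it assumes a single-threaded applier, so that the lowest retrieved-but-unexecuted transaction is the one that failed.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.strip().lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A lone number such as '102' is an interval of length one.
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def next_unapplied(retrieved, executed, master_uuid):
    """Return the lowest GTID from the master that was retrieved but not executed."""
    key = master_uuid.lower()
    pending = sorted(
        parse_gtid_set(retrieved).get(key, set())
        - parse_gtid_set(executed).get(key, set())
    )
    return "%s:%d" % (master_uuid, pending[0]) if pending else None


# Values taken from the slave status shown in this section.
master = "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d"
print(next_unapplied(
    master + ":76-92",
    master + ":1-91,\n 78336cdc-8cfb-11e6-ba9f-000c29328504:1-4",
    master,
))
```

The slave's own GTIDs (those under 78336cdc-...) are ignored on purpose, because only transactions originating on the master can be pending in the relay log.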


4. Using a delayed slave to recover from an accidental truncate on the master

-- Create a new table and rows on the master
(root@Master)[tempdb]>create table t2 (id tinyint not null primary key, 
        -> ename varchar(20),blog varchar(50));

(root@Master)[tempdb]>insert into t2  
            -> values(1,'leshami','http://blog.csdn.net/leshami');

(root@Master)[tempdb]>insert into t2  
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');

(root@Master)[tempdb]>select * from t2;
+----+---------+------------------------------------+
| id | ename   | blog                               |
+----+---------+------------------------------------+
|  1 | leshami | http://blog.csdn.net/leshami       |
|  2 | robin   | http://blog.csdn.net/robinson_0612 |
+----+---------+------------------------------------+

-- First, configure the slave as a delayed slave
(root@Slave)[tempdb]>stop slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>CHANGE MASTER TO MASTER_DELAY = 300;
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 300

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-99
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- Check the binlog GTID on the master
(root@Master)[tempdb]>show master status\G
*************************** 1. row ***************************
             File: node233-binlog.000004
         Position: 6970
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99
1 row in set (0.00 sec)

-- Truncate t2 on the master
(root@Master)[tempdb]>truncate table t2;
Query OK, 0 rows affected (0.03 sec)

-- Check the binlog GTID on the master again: it has gone from 99 to 100, and 100 is the transaction we need to skip
(root@Master)[tempdb]>show master status\G
*************************** 1. row ***************************
             File: node233-binlog.000004
         Position: 7121
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100
1 row in set (0.00 sec)

-- On the slave, skip the accidental truncate before the delay expires
(root@Slave)[tempdb]>stop slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100';
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>begin;commit;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>set gtid_next='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

-- GTID 100 is now marked as executed on the slave, so the truncate is never applied there and the rows of t2 on the slave remain available for restoring the master
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000004
          Read_Master_Log_Pos: 7121
               Relay_Log_File: node245-relay-bin.000003
                Relay_Log_Pos: 2982
        Relay_Master_Log_File: node233-binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
             ...........................
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- Often we do not know exactly when the table was truncated, so the GTID has to be obtained from the binlog
-- As shown below, the output contains SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'
-- 100 is the transaction number of the GTID that belongs to this truncate
# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -i \
> "truncate table t2" -A3 -B10
###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
# at 6939
#161009 18:04:58 server id 233  end_log_pos 6970 CRC32 0x71c5121c     Xid = 1775
COMMIT/*!*/;
# at 6970
#161009 18:08:42 server id 233 end_log_pos 7035 CRC32 0x00ba9437 GTID last_committed=26 sequence_number=27
SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
# at 7035
#161009 18:08:42 server id 233 end_log_pos 7121 CRC32 0x5a8b9723 Query thread_id=26 exec_time=0 error_code=0
SET TIMESTAMP=1476007722/*!*/;
truncate table t2
/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
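
Reading the GTID out of grep context by eye works for one statement but gets tedious in a large binlog. As an alternative, here is a small sketch that scans decoded mysqlbinlog text and reports the GTID in effect for each line containing a given statement. The function name gtids_for_statement is hypothetical, and the sketch assumes the text was produced by mysqlbinlog in the format shown above, where every transaction is preceded by a SET @@SESSION.GTID_NEXT line.

```python
import re

# Matches the line mysqlbinlog prints before each transaction.
GTID_NEXT_RE = re.compile(r"SET @@SESSION\.GTID_NEXT=\s*'([^']+)'")


def gtids_for_statement(binlog_text, needle):
    """Return GTIDs of transactions that contain `needle` (case-insensitive)."""
    found = []
    current = None
    for line in binlog_text.splitlines():
        match = GTID_NEXT_RE.search(line)
        if match:
            # Remember which GTID the following events belong to.
            current = match.group(1)
        elif (
            current
            and current != "AUTOMATIC"
            and needle.lower() in line.lower()
            and current not in found
        ):
            found.append(current)
    return found


# A shortened copy of the mysqlbinlog output shown in this section.
sample = """# at 6970
SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
# at 7035
SET TIMESTAMP=1476007722/*!*/;
truncate table t2
/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
"""
print(gtids_for_statement(sample, "truncate table t2"))
```

In practice the text would come from the mysqlbinlog command used above, for example by redirecting its output to a file and reading that file. The returned GTID is the one to inject as an empty transaction on the delayed slave.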


5. The master's binlog has been purged (Errno: 1236)

-- First stop the slave to simulate an unexpected slave outage
(root@Slave)[tempdb]>stop slave;
Query OK, 0 rows affected (0.08 sec)

-- Perform the following operations on the master
-- At this point gtid_purged on the master is empty
(root@Master)[tempdb]>show variables like '%gtid_purged%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_purged   |       |
+---------------+-------+

-- List the master's binlogs
(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000001 |   1362104 |
| node233-binlog.000002 |      1331 |
| node233-binlog.000003 |       217 |
| node233-binlog.000004 |      7121 |
+-----------------------+-----------+

(root@Master)[tempdb]>select * from t1;
+----+---------+------------------------------------+
| id | ename   | blog                               |
+----+---------+------------------------------------+
|  1 | leshami | http://www.php.cn/ |
|  2 | robin   | http://www.php.cn/       |
+----+---------+------------------------------------+

-- Delete the rows on the master
(root@Master)[tempdb]>delete from t1;

-- Rotate the log
(root@Master)[tempdb]>flush logs;

-- Insert a new row
(root@Master)[tempdb]>insert into t1 values(1,
    -> 'xuputi','http://blog.csdn.net/leshami');

(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000001 |   1362104 |
| node233-binlog.000002 |      1331 |
| node233-binlog.000003 |       217 |
| node233-binlog.000004 |      7513 |
| node233-binlog.000005 |       490 |
+-----------------------+-----------+

-- Purge the binlogs
(root@Master)[tempdb]>purge binary logs to 'node233-binlog.000005';
Query OK, 0 rows affected (0.01 sec)

(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000005 |       490 |
+-----------------------+-----------+

-- gtid_purged on the master now shows the purged range
(root@Master)[tempdb]>show variables like '%gtid_purged%';
+---------------+--------------------------------------------+
| Variable_name | Value                                      |
+---------------+--------------------------------------------+
| gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101 |
+---------------+--------------------------------------------+

-- Now start the slave
(root@Slave)[tempdb]>start slave;
Query OK, 0 rows affected (0.00 sec)

-- The slave status reports that required logs have been purged
(root@Slave)[tempdb]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000004
          Read_Master_Log_Pos: 7121
               Relay_Log_File: node245-relay-bin.000003
                Relay_Log_Pos: 3133
        Relay_Master_Log_File: node233-binlog.000004
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
                    ...............
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log:
                'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1,
                  but the master has purged binary logs containing GTIDs that the slave requires.'
                       ..................
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- gtid_purged on the slave; at this point it covers 1-75
(root@Slave)[tempdb]>show variables like '%gtid_purged%';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-75 |
+---------------+-------------------------------------------+

-- Stop the slave
(root@Slave)[tempdb]>stop slave;
Query OK, 0 rows affected (0.01 sec)

-- Try to skip the transactions by setting gtid_purged; as shown below, it can only be set when GLOBAL.GTID_EXECUTED is empty
(root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

-- As shown below, GTIDs have already been executed, so gtid_executed is not empty; these GTIDs are recorded in the slave's binary log
(root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
*************************** 2. row ***************************
Variable_name: gtid_executed_compression_period
        Value: 1000

-- Next, run reset master on the slave, which clears the slave's binlog
(root@Slave)[tempdb]>reset master;
Query OK, 0 rows affected (0.05 sec)

-- Check again: gtid_executed is now empty
(root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 
*************************** 2. row ***************************
Variable_name: gtid_executed_compression_period
        Value: 1000

-- Now set gtid_purged again
(root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
Query OK, 0 rows affected (0.01 sec)

-- Start the slave
(root@Slave)[tempdb]>start slave;
Query OK, 0 rows affected (0.03 sec)

-- A duplicate-key error is reported, as shown below
-- The delete transaction run on the master while the slave was stopped never reached the slave's relay log
-- The binlog holding it was then purged, and gtid_purged was set on the slave before it restarted, so the row newly inserted on the master collides with the row still present on the slave
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000005
          Read_Master_Log_Pos: 490
               Relay_Log_File: node245-relay-bin.000004
                Relay_Log_Pos: 417
        Relay_Master_Log_File: node233-binlog.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                ................
               Last_SQL_Error: Could not execute Write_rows event on table tempdb.t1; 
 Duplicate entry '1' for key 'PRIMARY', Error_code: 1062;
 handler error HA_ERR_FOUND_DUPP_KEY; the event's master log node233-binlog.000005, end_log_pos 459
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100:102
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101
                Auto_Position: 1

-- Delete the row with id 1 on the slave
(root@Slave)[tempdb]>delete from t1 where id=1;
Query OK, 1 row affected (0.05 sec)

-- Start the slave's sql_thread
(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

-- Check again: replication is back to normal
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000005
          Read_Master_Log_Pos: 490
               Relay_Log_File: node245-relay-bin.000004
                Relay_Log_Pos: 713
        Relay_Master_Log_File: node233-binlog.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

-- The example above mainly demonstrates using gtid_purged to skip transactions
-- In fact, the master and slave data are no longer consistent; whether to repair them should be decided according to actual needs
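
Error 1236 in this scenario comes down to a set difference: the master has purged GTIDs that the slave has not yet executed. The sketch below makes that check explicit before a stopped slave is restarted. The function names are hypothetical, and the inputs are assumed to be the master's gtid_purged value and the slave's gtid_executed value copied from the outputs above; ranges are expanded into plain sets, which is fine for a demonstration but not for very large ranges.

```python
def gtid_numbers(gtid_set, uuid):
    """Transaction numbers that `gtid_set` contains for the server `uuid`."""
    numbers = set()
    for part in gtid_set.replace("\n", "").split(","):
        source, *intervals = part.strip().split(":")
        if source.lower() != uuid.lower():
            continue
        for interval in intervals:
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return numbers


def purged_but_needed(master_purged, slave_executed, master_uuid):
    """Transactions the master has purged that the slave has not executed."""
    return sorted(
        gtid_numbers(master_purged, master_uuid)
        - gtid_numbers(slave_executed, master_uuid)
    )


master = "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d"
missing = purged_but_needed(
    master + ":1-101",
    master + ":1-100,\n 78336cdc-8cfb-11e6-ba9f-000c29328504:1-4",
    master,
)
# A non-empty result means the slave cannot catch up from the master's binlog.
print(missing)
```

A non-empty result identifies exactly which transactions the slave will never receive, which is also the data that has to be reconciled by hand after gtid_purged is used to move the slave forward.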


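V. Summary

1. A GTID is a global transaction ID; it simplifies deploying a master-slave architecture because the slave no longer needs to track log_file and log_pos
2. Because every transaction ID is unique, transactions held by one slave can be applied to another slave, which makes failover convenient
3. GTIDs are consecutive with no gaps, so a conflicting transaction is skipped by injecting an empty transaction under the same GTID
4. A delayed slave can be configured to guard against human error such as accidentally deleting objects on the master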

MySQL GTID是在传统的mysql主从复制的基础之上演化而来的产物，即通过UUID加上事务ID的方式来确保每一个事物的唯一性。这样的操作方式使得我们不再需要关心所谓的log_file和log_Pos，只是简单的告诉从库，从哪个服务器上去找主库就OK了。简化了主从的搭建以及failover的过程，同时比传统的复制更加安全可靠。由于GTID是连续没有空洞的，因此主从库出现数据冲突时，可以通过注入空事物的方式进行跳过。本文主要讲述GTID主从架构的错误处理方式。

一、GTID的相关特性

配置MySQL GTID 主从复制
基于mysqldump搭建gtid主从

二、GTID如何跳过事务冲突

    很多无法预料的情形导致mysql主从发生事务冲突，主从失败或停止的情形，即需要修复主从
    对于GTID方式的主从架构而言，更多的是处理事务冲突来修复主从
    GTID不支持通过传统设置sql_slave_skip_counter方法来跳过事务
    方法：通过注入空事务来填补事务空洞，等同于传统复制的(set global sql_slave_skip_counter = 1)
    步骤：
            stop slave；
            set gtid_next=&#39;xxxxxxx:N&#39;; --指定下一个事务执行的版本，即想要跳过的GTID
            begin;
            commit;  --注入一个空事物
            set gtid_next=&#39;AUTOMATIC&#39; --自动的寻找GTID事务。
            start slave; --开始同步

登录后复制

三、GTID事务冲突的几种常见类型

    1、主库新增记录，从库提示主键冲突
    2、主库对象可更新，从库无对应的对象可更新
    3、主库对象可删除，从库无对应的对象可删除
    4、通过延迟从修复主库意外删除的对象
    5、主库日志被purged的情形

登录后复制

四、示例演示

当前演示的主从架构图
# mysqlrplshow --master=root:pass@192.168.1.233:3306 --discover-slaves-login=root:pass --verboseWARNING: Using a password on the command line interface can be insecure.
# master on 192.168.1.233: ... connected.
# Finding slaves for master: 192.168.1.233:3306

# Replication Topology Graph
192.168.1.233:3306 (MASTER)   |   +--- 192.168.1.245:3306 [IO: Yes, SQL: Yes] - (SLAVE)   |   +--- 192.168.1.247:3306 [IO: Yes, SQL: Yes] - (SLAVE)(root@192.168.1.233)[tempdb]>show slave hosts;+-----------+---------------+------+-----------+--------------------------------------+| Server_id | Host          | Port | Master_id | Slave_UUID                           |
+-----------+---------------+------+-----------+--------------------------------------+|       245 | 192.168.1.245 | 3306 |       233 | 78336cdc-8cfb-11e6-ba9f-000c29328504 ||       247 | 192.168.1.247 | 3306 |       233 | 13a26fc1-555a-11e6-b5e0-000c292e1642 |
+-----------+---------------+------+-----------+--------------------------------------+--演示的mysql版本
(root@192.168.1.233)[tempdb]>show variables like &#39;version&#39;;+---------------+------------+| Variable_name | Value      |
+---------------+------------+| version       | 5.7.12-log |
+---------------+------------+--查看gtid是否开启
(root@192.168.1.233)[tempdb]>show variables like &#39;%gtid_mode%&#39;;+---------------+-------+| Variable_name | Value |
+---------------+-------+| gtid_mode     | ON    |
+---------------+-------+--主库上面可以看到基于gtid的dump线程，如下
(root@192.168.1.233)[tempdb]>show processlist;+----+------+-----------------------+--------+------------------+------+| Id | User | Host                  | db     | Command          | Time |
+----+------+-----------------------+--------+------------------+------+| 17 | repl | node245.edq.com:52685 | NULL   | Binlog Dump GTID | 2738 |
| 18 | repl | node247.edq.com:33516 | NULL   | Binlog Dump GTID | 2690 || 24 | root | localhost             | tempdb | Query            |    0 |
+----+------+-----------------------+--------+------------------+------+

登录后复制

1、从库报主键重复(Errno: 1062)

(root@Master)[tempdb]>create table t1 (
            -> id tinyint not null primary key,ename varchar(20),blog varchar(50));

(root@Master)[tempdb]>insert into t1 
            -> values(1,&#39;leshami&#39;,&#39;http://blog.csdn.net/leshami&#39;);

(root@Master)[tempdb]>insert into t1 
            -> values(2,&#39;robin&#39;,&#39;http://blog.csdn.net/robinson_0612&#39;);

(root@Master)[tempdb]>set sql_log_bin=off;

(root@Master)[tempdb]>delete from t1 where ename=&#39;robin&#39;;

(root@Master)[tempdb]>set sql_log_bin=on;

(root@Master)[tempdb]>insert into t1 
            -> values(2,&#39;robin&#39;,&#39;http://blog.csdn.net/robinson_0612&#39;);

-- 从库状态报错，提示重复的primary key
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************Last_Errno: 1062Last_Error: Could not execute Write_rows event on table tempdb.t1; Duplicate entry &#39;2&#39; for key &#39;PRIMARY&#39;, 
                        Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; 
                        the event&#39;s master log node233-binlog.000004, end_log_pos 4426
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
 Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-89
     Auto_Position: 1

-- 如下解决方案，可以通过删除重库的这条记录
(root@Slave)[tempdb]>stop slave;

(root@Slave)[tempdb]>delete from t1 where ename=&#39;robin&#39;;

(root@Slave)[tempdb]>start slave;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
 78336cdc-8cfb-11e6-ba9f-000c29328504:1  --这里多了一个GTID，注意这个是从库上执行的，这里的UUID跟IP 245的UUID一致
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 

(root@Slave)[tempdb]>show variables like &#39;%uuid%&#39;;
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
+---------------+--------------------------------------+

登录后复制

2、从库报找不到对应的被更新的记录(Errno: 1032)

--首先在从库上删除leshami这条记录
(root@Slave)[tempdb]>delete from t1 where ename=&#39;leshami&#39;;

--接下来再主库尝试更新leshami这条记录
(root@Master)[tempdb]>update t1 set 
            -> blog=&#39;http://blog.csdn.net/robinson_0612&#39; where ename=&#39;leshami&#39;;Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

-- 查看从库状态
(root@Slave)[tempdb]>show slave status \G*************************** 1. row ***************************Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table tempdb.t1; Can&#39;t find record in &#39;t1&#39;,                                Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;                            the event&#39;s master log node233-binlog.000004, end_log_pos 4769Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,        78336cdc-8cfb-11e6-ba9f-000c29328504:1-2-- 通过mysqlbinlog在主服务器上寻找报错的binglog日志文件及位置，找到对应的SQL语句，如下所示
-- update中的where后面的部分为更新前的数据，set部分为更新后的数据，因此可以将更新前的数据插入到从库# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -A &#39;10&#39; 4769#161009 13:46:34 server id 233 end_log_pos 4769 CRC32 0xb60df74e Update_rows: table id 147 flags: STMT_END_F### UPDATE `tempdb`.`t1`### WHERE###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */###   @2=&#39;leshami&#39; /* VARSTRING(20) meta=20 nullable=1 is_null=0 */###   @3=&#39;http://blog.csdn.net/leshami&#39; /* VARSTRING(50) meta=50 nullable=1 is_null=0 */### SET###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */###   @2=&#39;leshami&#39; /* VARSTRING(20) meta=20 nullable=1 is_null=0 */###   @3=&#39;http://blog.csdn.net/robinson_0612&#39; /* VARSTRING(50) meta=50 nullable=1 is_null=0 */# at 4769#161009 13:46:34 server id 233  end_log_pos 4800 CRC32 0xa9669811       Xid = 1749COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= &#39;AUTOMATIC&#39; /* added by mysqlbinlog */ /*!*/;
DELIMITER ;# End of log file/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;   

(root@Slave)[tempdb]>select * from t1;
+----+-------+------------------------------------+
| id | ename | blog                               |
+----+-------+------------------------------------+
|  2 | robin | http://www.php.cn/ |
+----+-------+------------------------------------+

(root@Slave)[tempdb]>stop slave sql_thread;

(root@Slave)[tempdb]>insert into t1 values(1,&#39;leshami&#39;,&#39;http://blog.csdn.net/leshami&#39;);

(root@Slave)[tempdb]>start slave sql_thread;

(root@Slave)[tempdb]>show slave status \G*************************** 1. row ***************************           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-3                Auto_Position: 1

登录后复制

3、从库找不到对应的被删除的记录(Errno: 1032)

-- 如果是在主库上删除记录，而从库上找不到对应的记录，则可以直接跳过该事务
-- 下面我们首选在从库上删除一条记录
(root@Slave)[tempdb]>delete from t1 where ename=&#39;robin&#39;;

-- 接下来在主库上删除该记录
(root@Master)[tempdb]>delete from t1 where ename=&#39;robin&#39;;

-- 从库端提示无法找到对应的记录
(root@Slave)[tempdb]>show slave status \G*************************** 1. row ***************************Last_SQL_Error: Could not execute Delete_rows event on table tempdb.t1; Can&#39;t find record in &#39;t1&#39;,                Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;                 the event&#39;s master log node233-binlog.000004, end_log_pos 5070Last_SQL_Error_Timestamp: 161009 15:08:06    Master_SSL_Crl: Master_SSL_Crlpath: 
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
 Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,                    78336cdc-8cfb-11e6-ba9f-000c29328504:1-4     Auto_Position: 1      -- 下面通过注入空事务来跳过
(root@Slave)[tempdb]>stop slave sql_thread;

(root@Slave)[tempdb]>set gtid_next=&#39;1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92&#39;;

(root@Slave)[tempdb]>begin;commit;

(root@Slave)[tempdb]>set gtid_next=&#39;AUTOMATIC&#39;;

(root@Slave)[tempdb]>start slave sql_thread;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-92,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
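
The GTID to skip in the examples above is read off by comparing Retrieved_Gtid_Set with Executed_Gtid_Set by eye. The following is a minimal Python sketch of that comparison, not part of the original procedure. The helper names (parse_gtid_set, pending_gtids, skip_statements) are hypothetical, and it assumes a single-threaded SQL thread that stopped on the first unapplied transaction, so the first pending GTID is the one that failed.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-91:95, uuid2:1-4' into {uuid: set of transaction numbers}.

    Intervals are expanded into plain sets, which is fine for the small
    sets in this article but wasteful for very large ranges.
    """
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def pending_gtids(retrieved, executed):
    """Return 'uuid:n' GTIDs that were retrieved but not yet executed."""
    done = parse_gtid_set(executed)
    pending = []
    for uuid, numbers in parse_gtid_set(retrieved).items():
        for n in sorted(numbers - done.get(uuid, set())):
            pending.append(f"{uuid}:{n}")
    return pending


def skip_statements(gtid):
    """Build the empty-transaction sequence used in this article for one GTID."""
    return [
        "stop slave sql_thread;",
        f"set gtid_next='{gtid}';",
        "begin; commit;",
        "set gtid_next='AUTOMATIC';",
        "start slave sql_thread;",
    ]
```

Fed the two sets from the failing status output in section 3 (Retrieved 76-92, Executed 1-91), pending_gtids returns the single GTID ending in :92, which is the one skipped above. The generated statements should still be reviewed before running them, since skipping a transaction leaves the slave's data different from the master's.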


4. Using a delayed slave to recover from an accidental truncate on the master

-- Create a table and insert rows on the master
(root@Master)[tempdb]>create table t2 (id tinyint not null primary key, 
        -> ename varchar(20),blog varchar(50));
(root@Master)[tempdb]>insert into t2  
            -> values(1,'leshami','http://blog.csdn.net/leshami');
(root@Master)[tempdb]>insert into t2  
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');
(root@Master)[tempdb]>select * from t2;
+----+---------+------------------------------------+
| id | ename   | blog                               |
+----+---------+------------------------------------+
|  1 | leshami | http://blog.csdn.net/leshami       |
|  2 | robin   | http://blog.csdn.net/robinson_0612 |
+----+---------+------------------------------------+

-- First configure the slave as a delayed slave
(root@Slave)[tempdb]>stop slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>CHANGE MASTER TO MASTER_DELAY = 300;
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 300

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-99
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- Check the binlog GTID on the master
(root@Master)[tempdb]>show master status\G
*************************** 1. row ***************************
             File: node233-binlog.000004
         Position: 6970
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99
1 row in set (0.00 sec)

-- Truncate t2 on the master
(root@Master)[tempdb]>truncate table t2;
Query OK, 0 rows affected (0.03 sec)

-- Check the binlog GTID on the master again: it has gone from 99 to 100, and 100 is the transaction number we need to skip
(root@Master)[tempdb]>show master status\G
*************************** 1. row ***************************
             File: node233-binlog.000004
         Position: 7121
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100
1 row in set (0.00 sec)

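-- Skip the accidental truncate transaction on the slave (this has to happen before the 300-second delay elapses and the slave applies it)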
(root@Slave)[tempdb]>stop slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100';
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>begin;commit;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>set gtid_next='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000004
          Read_Master_Log_Pos: 7121
               Relay_Log_File: node245-relay-bin.000003
                Relay_Log_Pos: 2982
        Relay_Master_Log_File: node233-binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
             ...........................
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- Often we do not know exactly when the table was truncated, so the GTID can be looked up in the binlog instead
-- As shown below, the output contains SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'
-- 100 is the transaction number of the GTID for this truncate
# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -i \
> "truncate table t2" -A3 -B10
###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
# at 6939
#161009 18:04:58 server id 233  end_log_pos 6970 CRC32 0x71c5121c     Xid = 1775
COMMIT/*!*/;
# at 6970
#161009 18:08:42 server id 233 end_log_pos 7035 CRC32 0x00ba9437 GTID last_committed=26 sequence_number=27
SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
# at 7035
#161009 18:08:42 server id 233 end_log_pos 7121 CRC32 0x5a8b9723 Query thread_id=26 exec_time=0 error_code=0
SET TIMESTAMP=1476007722/*!*/;
truncate table t2
/*!*/;
SET @@SESSION.GTID_NEXT= &#39;AUTOMATIC&#39; /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
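
The grep above relies on the GTID_NEXT line falling within the -B10 context window. As an alternative, here is a minimal Python sketch that scans decoded mysqlbinlog text and remembers the most recent GTID_NEXT before the matching statement. The function name gtid_for_statement is hypothetical, and binlog_text is assumed to be the text output of the mysqlbinlog command shown above.

```python
import re

# Matches lines such as:
#   SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
# The trailing 'AUTOMATIC' line does not match because it has no uuid:number.
GTID_NEXT = re.compile(r"SET @@SESSION\.GTID_NEXT\s*=\s*'([0-9a-fA-F-]+:\d+)'")


def gtid_for_statement(binlog_text, needle):
    """Return the GTID of the first transaction containing `needle`, or None.

    The match on `needle` is a case-insensitive substring test, like grep -i.
    """
    current = None
    for line in binlog_text.splitlines():
        match = GTID_NEXT.search(line)
        if match:
            current = match.group(1)  # a new transaction starts here
        elif current and needle.lower() in line.lower():
            return current
    return None
```

Called with the decoded binlog and the string "truncate table t2", this returns the GTID ending in :100 for the excerpt shown above. Only the first match is returned, so if the same statement could appear more than once the result should be checked against the timestamps in the binlog.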


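5. The master's binlog has been purged (Errno: 1236)

-- First stop the slave to simulate an unexpected slave outage
(root@Slave)[tempdb]>stop slave;
Query OK, 0 rows affected (0.08 sec)

-- Perform the following operations on the master
-- At this point gtid_purged on the master is empty
(root@Master)[tempdb]>show variables like '%gtid_purged%';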
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_purged   |       |
+---------------+-------+

-- List the master's binlogs
(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000001 |   1362104 |
| node233-binlog.000002 |      1331 |
| node233-binlog.000003 |       217 |
| node233-binlog.000004 |      7121 |
+-----------------------+-----------+

(root@Master)[tempdb]>select * from t1;
+----+---------+------------------------------------+
| id | ename   | blog                               |
+----+---------+------------------------------------+
|  1 | leshami | http://www.php.cn/                 |
|  2 | robin   | http://www.php.cn/                 |
+----+---------+------------------------------------+

-- Delete the rows on the master
(root@Master)[tempdb]>delete from t1;

-- Rotate the log
(root@Master)[tempdb]>flush logs;

-- Insert a new row
(root@Master)[tempdb]>insert into t1 values(1,
    -> 'xuputi','http://blog.csdn.net/leshami');
(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000001 |   1362104 |
| node233-binlog.000002 |      1331 |
| node233-binlog.000003 |       217 |
| node233-binlog.000004 |      7513 |
| node233-binlog.000005 |       490 |
+-----------------------+-----------+

-- Purge the binlogs
(root@Master)[tempdb]>purge binary logs to 'node233-binlog.000005';
Query OK, 0 rows affected (0.01 sec)

(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000005 |       490 |
+-----------------------+-----------+

-- The corresponding gtid_purged value is now visible
(root@Master)[tempdb]>show variables like '%gtid_purged%';
+---------------+--------------------------------------------+
| Variable_name | Value                                      |
+---------------+--------------------------------------------+
| gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101 |
+---------------+--------------------------------------------+

-- Now start the slave
(root@Slave)[tempdb]>start slave;
Query OK, 0 rows affected (0.00 sec)

-- The slave status reports that required binlogs have been purged
(root@Slave)[tempdb]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000004
          Read_Master_Log_Pos: 7121
               Relay_Log_File: node245-relay-bin.000003
                Relay_Log_Pos: 3133
        Relay_Master_Log_File: node233-binlog.000004
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
                    ...............
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log:
                'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1,
                  but the master has purged binary logs containing GTIDs that the slave requires.'
                       ..................
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- gtid_purged on the slave, currently 1-75
(root@Slave)[tempdb]>show variables like '%gtid_purged%';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-75 |
+---------------+-------------------------------------------+                

-- Stop the slave
(root@Slave)[tempdb]>stop slave;
Query OK, 0 rows affected (0.01 sec)

-- Try to skip the transactions by setting gtid_purged; as shown below, it can only be set when GLOBAL.GTID_EXECUTED is empty
(root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

-- As shown below, GTIDs have already been executed, so gtid_executed is not empty; these GTIDs are recorded in the slave's binary log
(root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
*************************** 2. row ***************************
Variable_name: gtid_executed_compression_period
        Value: 1000

-- Now run reset master on the slave, which clears the slave's binlog
(root@Slave)[tempdb]>reset master;
Query OK, 0 rows affected (0.05 sec)

-- Check again: gtid_executed is now empty
(root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 
*************************** 2. row ***************************
Variable_name: gtid_executed_compression_period
        Value: 1000

-- Now set gtid_purged again
(root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
Query OK, 0 rows affected (0.01 sec)

-- Start the slave
(root@Slave)[tempdb]>start slave;
Query OK, 0 rows affected (0.03 sec)

-- A duplicate-key error is reported, as shown below
-- This is because the delete transaction issued while the slave was stopped never reached the slave's relay log
-- In addition, the master's binlog was purged and gtid_purged was then set on the slave, so the row newly inserted on the master hits a duplicate primary key on the slave
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000005
          Read_Master_Log_Pos: 490
               Relay_Log_File: node245-relay-bin.000004
                Relay_Log_Pos: 417
        Relay_Master_Log_File: node233-binlog.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                ................
               Last_SQL_Error: Could not execute Write_rows event on table tempdb.t1; 
 Duplicate entry '1' for key 'PRIMARY', Error_code: 1062;
 handler error HA_ERR_FOUND_DUPP_KEY; the event's master log node233-binlog.000005, end_log_pos 459
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100:102
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101
                Auto_Position: 1

-- Delete the row with id 1 on the slave
(root@Slave)[tempdb]>delete from t1 where id=1;
Query OK, 1 row affected (0.05 sec)

-- Start the slave's sql_thread
(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

-- Check again: replication is running normally
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000005
          Read_Master_Log_Pos: 490
               Relay_Log_File: node245-relay-bin.000004
                Relay_Log_Pos: 713
        Relay_Master_Log_File: node233-binlog.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

-- The example above mainly demonstrates using gtid_purged to skip transactions
-- In fact the master and slave data are no longer consistent, so decide whether to repair them according to actual needs
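
Before setting gtid_purged on the slave as above, it helps to know exactly which transactions will be lost. The following minimal Python sketch lists the GTIDs that the master has purged but the slave never executed. The helper names parse_gtid_set and lost_gtids are hypothetical, and the inputs are assumed to be the gtid_purged value from the master and the gtid_executed value from the slave, copied as text.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-100, uuid2:1-4' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def lost_gtids(master_purged, slave_executed):
    """Return GTIDs purged on the master that the slave has not executed.

    A non-empty result means the slave cannot fetch these transactions
    through auto-positioning any more, so their changes have to be
    reconciled by hand (or the slave rebuilt).
    """
    have = parse_gtid_set(slave_executed)
    lost = []
    for uuid, numbers in parse_gtid_set(master_purged).items():
        lost.extend(f"{uuid}:{n}" for n in sorted(numbers - have.get(uuid, set())))
    return lost
```

With the values from this section (master gtid_purged 1-101, slave gtid_executed 1-100 plus its own 1-4), the result is the single GTID ending in :101, which is the delete from t1 that the slave missed and the reason for the later duplicate-key error.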
