
Summary of MySQL GTID Error Handling

黄舟
Published: 2017-02-13 11:20:23

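MySQL GTID replication evolved from traditional MySQL master-slave replication. Each transaction is identified by the UUID of the server that originated it plus a transaction ID, which guarantees that every transaction is unique. With this approach you no longer need to track the so-called log_file and log_pos; you only tell the slave which server is its master. This simplifies master-slave setup and failover, and it is safer and more reliable than traditional replication. Because GTIDs are continuous with no gaps, a data conflict between master and slave can be skipped by injecting an empty transaction. This article mainly covers how to handle errors in a GTID master-slave architecture.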
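To make the "UUID plus transaction ID" structure concrete, here is a minimal Python sketch that splits a single GTID into its two parts. The names `Gtid` and `parse_gtid` are hypothetical helpers invented for this illustration, not part of MySQL or any client library; the sample value is one of the GTIDs that appears in the demonstrations in this article.

```python
from typing import NamedTuple


class Gtid(NamedTuple):
    """One global transaction identifier (hypothetical helper type)."""
    server_uuid: str      # UUID of the server that originated the transaction
    transaction_id: int   # sequence number assigned on that server


def parse_gtid(text: str) -> Gtid:
    """Split 'uuid:N' into its two parts; raise ValueError on malformed input."""
    server_uuid, sep, number = text.strip().rpartition(":")
    if not sep or not server_uuid or not number.isdigit():
        raise ValueError(f"not a single GTID: {text!r}")
    return Gtid(server_uuid, int(number))


gtid = parse_gtid("1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92")
print(gtid.server_uuid)     # 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d
print(gtid.transaction_id)  # 92
```

The sketch only handles a single GTID; GTID sets such as `uuid:1-91` carry intervals and need separate parsing.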

1. GTID-related features

MySQL GTID master-slave replication configuration
Building GTID master-slave replication with mysqldump

2. How to skip transaction conflicts with GTID

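    Many unforeseen situations can cause transaction conflicts between a MySQL master and slave, leaving replication failed or stopped, so that the master-slave pair has to be repaired.

    In a GTID-based master-slave architecture, repairing replication is mostly a matter of handling transaction conflicts.

    GTID replication does not support skipping transactions with the traditional sql_slave_skip_counter setting.

    Method: fill the transaction gap by injecting an empty transaction, the equivalent of (set global sql_slave_skip_counter = 1) in traditional replication.

    Steps:

            stop slave;
            set gtid_next='xxxxxxx:N';   -- the GTID of the next transaction to execute, i.e. the GTID to skip
            begin;
            commit;                      -- inject an empty transaction
            set gtid_next='AUTOMATIC';   -- return to automatic GTID assignment
            start slave;                 -- resume replication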
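The steps above can be sketched as a small Python helper that produces the statement sequence for one GTID. This is only an illustration under stated assumptions: `empty_transaction_statements` is a hypothetical name, the function merely builds strings (it does not connect to MySQL), and the regular expression is a plausibility check added here so that a malformed value is rejected before it ends up inside a statement.

```python
import re
from typing import List

# Assumed shape of a single GTID: 8-4-4-4-12 hex UUID, a colon, a positive integer.
_GTID_RE = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*$"
)


def empty_transaction_statements(gtid: str) -> List[str]:
    """Return the statements that skip `gtid` by committing an empty transaction."""
    if not _GTID_RE.match(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        "STOP SLAVE;",
        f"SET gtid_next='{gtid}';",    # the transaction to be skipped
        "BEGIN;",
        "COMMIT;",                     # empty transaction recorded under that GTID
        "SET gtid_next='AUTOMATIC';",  # back to automatic GTID assignment
        "START SLAVE;",
    ]


for statement in empty_transaction_statements(
    "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92"
):
    print(statement)
```

Generating the statements instead of typing them reduces the chance of forgetting the final `SET gtid_next='AUTOMATIC'`, but the output should still be reviewed before it is run on a slave.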


3. Common types of GTID transaction conflicts

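    1. A row is inserted on the master, and the slave reports a primary key conflict
    2. A row is updated on the master, but the slave has no matching row to update
    3. A row is deleted on the master, but the slave has no matching row to delete
    4. Using a delayed slave to recover an object accidentally removed on the master
    5. The master's binary logs have been purged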


4. Demonstration

Master-slave topology used in this demonstration

# mysqlrplshow --master=root:pass@192.168.1.233:3306 --discover-slaves-login=root:pass --verbose
WARNING: Using a password on the command line interface can be insecure.
# master on 192.168.1.233: ... connected.
# Finding slaves for master: 192.168.1.233:3306

# Replication Topology Graph
192.168.1.233:3306 (MASTER)
   |
   +--- 192.168.1.245:3306 [IO: Yes, SQL: Yes] - (SLAVE)
   |
   +--- 192.168.1.247:3306 [IO: Yes, SQL: Yes] - (SLAVE)

(root@192.168.1.233)[tempdb]>show slave hosts;
+-----------+---------------+------+-----------+--------------------------------------+
| Server_id | Host          | Port | Master_id | Slave_UUID                           |
+-----------+---------------+------+-----------+--------------------------------------+
|       245 | 192.168.1.245 | 3306 |       233 | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
|       247 | 192.168.1.247 | 3306 |       233 | 13a26fc1-555a-11e6-b5e0-000c292e1642 |
+-----------+---------------+------+-----------+--------------------------------------+

-- MySQL version used in the demonstration
(root@192.168.1.233)[tempdb]>show variables like 'version';
+---------------+------------+
| Variable_name | Value      |
+---------------+------------+
| version       | 5.7.12-log |
+---------------+------------+

-- Check whether GTID is enabled
(root@192.168.1.233)[tempdb]>show variables like '%gtid_mode%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_mode     | ON    |
+---------------+-------+

-- On the master, the GTID-based dump threads are visible, as follows
(root@192.168.1.233)[tempdb]>show processlist;
+----+------+-----------------------+--------+------------------+------+
| Id | User | Host                  | db     | Command          | Time |
+----+------+-----------------------+--------+------------------+------+
| 17 | repl | node245.edq.com:52685 | NULL   | Binlog Dump GTID | 2738 |
| 18 | repl | node247.edq.com:33516 | NULL   | Binlog Dump GTID | 2690 |
| 24 | root | localhost             | tempdb | Query            |    0 |
+----+------+-----------------------+--------+------------------+------+


1. The slave reports a duplicate primary key (Errno: 1062)

(root@Master)[tempdb]>create table t1 (
            -> id tinyint not null primary key,ename varchar(20),blog varchar(50));

(root@Master)[tempdb]>insert into t1
            -> values(1,'leshami','http://blog.csdn.net/leshami');

(root@Master)[tempdb]>insert into t1
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');

-- With binary logging switched off, the delete below runs on the master only and is not replicated
(root@Master)[tempdb]>set sql_log_bin=off;

(root@Master)[tempdb]>delete from t1 where ename='robin';

(root@Master)[tempdb]>set sql_log_bin=on;

(root@Master)[tempdb]>insert into t1
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');

-- The slave reports an error: duplicate primary key
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
        Last_Errno: 1062
        Last_Error: Could not execute Write_rows event on table tempdb.t1; Duplicate entry '2' for key 'PRIMARY',
                    Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY;
                    the event's master log node233-binlog.000004, end_log_pos 4426
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
 Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-89
     Auto_Position: 1

-- Fix: delete the conflicting row on the slave
(root@Slave)[tempdb]>stop slave;

(root@Slave)[tempdb]>delete from t1 where ename='robin';

(root@Slave)[tempdb]>start slave;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
-- An extra GTID now appears in Executed_Gtid_Set. It comes from the delete executed on the slave itself:
-- its UUID is the server_uuid of the slave at 192.168.1.245

(root@Slave)[tempdb]>show variables like '%uuid%';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| server_uuid   | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
+---------------+--------------------------------------+


2. The slave cannot find the row to be updated (Errno: 1032)

-- First delete the leshami row on the slave
(root@Slave)[tempdb]>delete from t1 where ename='leshami';

-- Then update the leshami row on the master
(root@Master)[tempdb]>update t1 set
            -> blog='http://blog.csdn.net/robinson_0612' where ename='leshami';
Query OK, 1 row affected (0.02 sec)
Rows matched: 1  Changed: 1  Warnings: 0

-- Check the slave status
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
    Last_SQL_Errno: 1032
    Last_SQL_Error: Could not execute Update_rows event on table tempdb.t1; Can't find record in 't1',
                    Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
                    the event's master log node233-binlog.000004, end_log_pos 4769
Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
 Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
                    78336cdc-8cfb-11e6-ba9f-000c29328504:1-2

-- Use mysqlbinlog on the master to locate the failing binlog file and position and find the corresponding SQL statement, as shown below
-- In the update event, the WHERE part holds the row before the update and the SET part holds the row after it,
-- so the pre-update row can be inserted into the slave
# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -A '10' 4769
#161009 13:46:34 server id 233 end_log_pos 4769 CRC32 0xb60df74e Update_rows: table id 147 flags: STMT_END_F
### UPDATE `tempdb`.`t1`
### WHERE
###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
###   @3='http://blog.csdn.net/leshami' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
### SET
###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
# at 4769
#161009 13:46:34 server id 233  end_log_pos 4800 CRC32 0xa9669811       Xid = 1749
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

(root@Slave)[tempdb]>select * from t1;
+----+-------+------------------------------------+
| id | ename | blog                               |
+----+-------+------------------------------------+
|  2 | robin | http://blog.csdn.net/robinson_0612 |
+----+-------+------------------------------------+

-- Re-insert the pre-update row on the slave so that the pending update can be applied
(root@Slave)[tempdb]>stop slave sql_thread;

(root@Slave)[tempdb]>insert into t1 values(1,'leshami','http://blog.csdn.net/leshami');

(root@Slave)[tempdb]>start slave sql_thread;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-3
                Auto_Position: 1
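Rebuilding the missing row from the before-image can also be scripted. The Python sketch below is a rough illustration, not a general binlog parser: `insert_from_before_image` is a hypothetical helper, it assumes the decoded text of exactly one `UPDATE` row event in the `mysqlbinlog -v` layout shown in this section, and it copies the values verbatim, so a value that itself contains `/*` would be cut short.

```python
import re

SAMPLE = """\
### UPDATE `tempdb`.`t1`
### WHERE
###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
###   @3='http://blog.csdn.net/leshami' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
### SET
###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
"""


def insert_from_before_image(rows_text: str) -> str:
    """Build an INSERT that restores the pre-update row of one UPDATE event."""
    table = None
    section = None
    values = []
    for raw in rows_text.splitlines():
        line = raw.strip()
        header = re.match(r"### UPDATE (\S+)$", line)
        if header:
            table = header.group(1)
        elif line in ("### WHERE", "### SET"):
            section = line[4:]          # "WHERE" = before-image, "SET" = after-image
        elif section == "WHERE":
            column = re.match(r"###\s+@\d+=(.*?)\s*/\*", line)
            if column:
                values.append(column.group(1))
    if table is None or not values:
        raise ValueError("no UPDATE before-image found")
    return f"INSERT INTO {table} VALUES ({', '.join(values)});"


print(insert_from_before_image(SAMPLE))
```

For the event shown above this yields the same row that is inserted by hand on the slave.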


3. The slave cannot find the row to be deleted (Errno: 1032)

-- If a row is deleted on the master and the slave has no matching row, the transaction can simply be skipped
-- First delete a row on the slave
(root@Slave)[tempdb]>delete from t1 where ename='robin';

-- Then delete the same row on the master
(root@Master)[tempdb]>delete from t1 where ename='robin';

-- The slave reports that it cannot find the row
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
          Last_SQL_Error: Could not execute Delete_rows event on table tempdb.t1; Can't find record in 't1',
                          Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
                          the event's master log node233-binlog.000004, end_log_pos 5070
Last_SQL_Error_Timestamp: 161009 15:08:06
          Master_SSL_Crl:
      Master_SSL_Crlpath:
      Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
       Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
                          78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
           Auto_Position: 1

-- Skip the transaction by injecting an empty transaction
(root@Slave)[tempdb]>stop slave sql_thread;

(root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92';

(root@Slave)[tempdb]>begin;commit;

(root@Slave)[tempdb]>set gtid_next='AUTOMATIC';

(root@Slave)[tempdb]>start slave sql_thread;

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-92,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
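The GTID to skip (92 here) is the one that has been retrieved but not yet executed. As a hedged illustration, the Python sketch below derives it from the two fields of `show slave status`; `parse_gtid_set` and `pending_gtids` are hypothetical helper names, and the parser expands every interval into individual numbers, which is acceptable for small sets like these but not for very large ranges.

```python
from typing import Dict, List, Set


def parse_gtid_set(text: str) -> Dict[str, Set[int]]:
    """Parse 'uuid:1-91,uuid2:1-4' (intervals separated by ':') into sets of ints."""
    result: Dict[str, Set[int]] = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        server_uuid, *intervals = part.split(":")
        ids = result.setdefault(server_uuid, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            ids.update(range(int(first), int(last or first) + 1))
    return result


def pending_gtids(retrieved: str, executed: str) -> List[str]:
    """GTIDs present in Retrieved_Gtid_Set but missing from Executed_Gtid_Set."""
    done = parse_gtid_set(executed)
    pending = []
    for server_uuid, ids in parse_gtid_set(retrieved).items():
        for number in sorted(ids - done.get(server_uuid, set())):
            pending.append(f"{server_uuid}:{number}")
    return pending


retrieved = "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92"
executed = """1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
              78336cdc-8cfb-11e6-ba9f-000c29328504:1-4"""
print(pending_gtids(retrieved, executed))
# prints ['1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92']
```

When the SQL thread has stopped on an error, the first pending GTID is normally the failing transaction; it is still worth confirming it against the binlog before skipping it.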


4. Using a delayed slave to recover from an accidental truncate on the master

-- Create a table and insert rows on the master
(root@Master)[tempdb]>create table t2 (id tinyint not null primary key,
        -> ename varchar(20),blog varchar(50));

(root@Master)[tempdb]>insert into t2
            -> values(1,'leshami','http://blog.csdn.net/leshami');

(root@Master)[tempdb]>insert into t2
            -> values(2,'robin','http://blog.csdn.net/robinson_0612');

(root@Master)[tempdb]>select * from t2;
+----+---------+------------------------------------+
| id | ename   | blog                               |
+----+---------+------------------------------------+
|  1 | leshami | http://blog.csdn.net/leshami       |
|  2 | robin   | http://blog.csdn.net/robinson_0612 |
+----+---------+------------------------------------+

-- First configure the slave as a delayed slave
(root@Slave)[tempdb]>stop slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>CHANGE MASTER TO MASTER_DELAY = 300;
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                    SQL_Delay: 300

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-99
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- Check the binlog GTID on the master
(root@Master)[tempdb]>show master status\G
*************************** 1. row ***************************
             File: node233-binlog.000004
         Position: 6970
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99
1 row in set (0.00 sec)

-- Truncate t2 on the master
(root@Master)[tempdb]>truncate table t2;
Query OK, 0 rows affected (0.03 sec)

-- Check the binlog GTID on the master again: it has gone from 99 to 100, and 100 is the ID we need to skip
(root@Master)[tempdb]>show master status\G
*************************** 1. row ***************************
             File: node233-binlog.000004
         Position: 7121
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100
1 row in set (0.00 sec)

-- On the slave, skip the accidental truncate transaction
(root@Slave)[tempdb]>stop slave sql_thread;
Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100';
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>begin;commit;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

(root@Slave)[tempdb]>set gtid_next='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)

(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000004
          Read_Master_Log_Pos: 7121
               Relay_Log_File: node245-relay-bin.000003
                Relay_Log_Pos: 2982
        Relay_Master_Log_File: node233-binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
             ...........................
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- Often we do not know when the table was truncated, so the GTID has to be taken from the binlog
-- As shown below, the output contains SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'
-- 100 is the transaction number of the GTID that belongs to this truncate
# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -i \
> "truncate table t2" -A3 -B10
###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
# at 6939
#161009 18:04:58 server id 233  end_log_pos 6970 CRC32 0x71c5121c     Xid = 1775
COMMIT/*!*/;
# at 6970
#161009 18:08:42 server id 233 end_log_pos 7035 CRC32 0x00ba9437 GTID last_committed=26 sequence_number=27
SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
# at 7035
#161009 18:08:42 server id 233 end_log_pos 7121 CRC32 0x5a8b9723 Query thread_id=26 exec_time=0 error_code=0
SET TIMESTAMP=1476007722/*!*/;
truncate table t2
/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
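The `grep` lookup above can also be done programmatically. The following Python sketch scans decoded `mysqlbinlog` text and reports the `GTID_NEXT` value in effect when a matching statement appears. It is an illustration only: `find_gtids` is a hypothetical helper, the sample text is the excerpt from this section, and the function assumes the statement appears on a line of its own, as it does in this output.

```python
import re
from typing import List

GTID_NEXT_RE = re.compile(r"SET @@SESSION\.GTID_NEXT\s*=\s*'([^']+)'")

SAMPLE = """\
# at 6970
#161009 18:08:42 server id 233 end_log_pos 7035 CRC32 0x00ba9437 GTID last_committed=26 sequence_number=27
SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
# at 7035
#161009 18:08:42 server id 233 end_log_pos 7121 CRC32 0x5a8b9723 Query thread_id=26 exec_time=0 error_code=0
SET TIMESTAMP=1476007722/*!*/;
truncate table t2
/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
"""


def find_gtids(binlog_text: str, statement_pattern: str) -> List[str]:
    """Return the GTIDs of transactions containing a line that matches the pattern."""
    wanted = re.compile(statement_pattern, re.IGNORECASE)
    current = None          # GTID_NEXT value currently in effect
    found: List[str] = []
    for line in binlog_text.splitlines():
        marker = GTID_NEXT_RE.search(line)
        if marker:
            current = marker.group(1)
        elif (current and current != "AUTOMATIC"
              and not line.startswith("#")      # ignore mysqlbinlog comment lines
              and wanted.search(line)
              and current not in found):
            found.append(current)
    return found


print(find_gtids(SAMPLE, r"truncate\s+table\s+t2"))
# prints ['1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100']
```

In practice the text would come from the same `mysqlbinlog --base64-output=DECODE-ROWS` invocation, for example piped in on standard input.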


5. The master's binlog has been purged (Errno: 1236)

-- First stop the slave to simulate an unexpected slave outage
(root@Slave)[tempdb]>stop slave;
Query OK, 0 rows affected (0.08 sec)

-- Carry out the following operations on the master
-- At this point gtid_purged on the master is empty
(root@Master)[tempdb]>show variables like '%gtid_purged%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_purged   |       |
+---------------+-------+

-- List the master's binary logs
(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000001 |   1362104 |
| node233-binlog.000002 |      1331 |
| node233-binlog.000003 |       217 |
| node233-binlog.000004 |      7121 |
+-----------------------+-----------+

(root@Master)[tempdb]>select * from t1;
+----+---------+------------------------------------+
| id | ename   | blog                               |
+----+---------+------------------------------------+
|  1 | leshami | http://blog.csdn.net/robinson_0612 |
|  2 | robin   | http://blog.csdn.net/leshami       |
+----+---------+------------------------------------+

-- Delete the rows on the master
(root@Master)[tempdb]>delete from t1;

-- Rotate the binary log
(root@Master)[tempdb]>flush logs;

-- Insert a new row
(root@Master)[tempdb]>insert into t1 values(1,
    -> 'xuputi','http://blog.csdn.net/leshami');

(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000001 |   1362104 |
| node233-binlog.000002 |      1331 |
| node233-binlog.000003 |       217 |
| node233-binlog.000004 |      7513 |
| node233-binlog.000005 |       490 |
+-----------------------+-----------+

-- Purge the binary logs
(root@Master)[tempdb]>purge binary logs to 'node233-binlog.000005';
Query OK, 0 rows affected (0.01 sec)

(root@Master)[tempdb]>show binary logs;
+-----------------------+-----------+
| Log_name              | File_size |
+-----------------------+-----------+
| node233-binlog.000005 |       490 |
+-----------------------+-----------+

-- gtid_purged now shows the corresponding value
(root@Master)[tempdb]>show variables like '%gtid_purged%';
+---------------+--------------------------------------------+
| Variable_name | Value                                      |
+---------------+--------------------------------------------+
| gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101 |
+---------------+--------------------------------------------+

-- Now start the slave
(root@Slave)[tempdb]>start slave;
Query OK, 0 rows affected (0.00 sec)

-- The slave status reports that required logs have been purged
(root@Slave)[tempdb]>show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000004
          Read_Master_Log_Pos: 7121
               Relay_Log_File: node245-relay-bin.000003
                Relay_Log_Pos: 3133
        Relay_Master_Log_File: node233-binlog.000004
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
                    ...............
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log:
                'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1,
                  but the master has purged binary logs containing GTIDs that the slave requires.'
                       ..................
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                Auto_Position: 1

-- gtid_purged on the slave; at this point it ends at 75
(root@Slave)[tempdb]>show variables like '%gtid_purged%';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-75 |
+---------------+-------------------------------------------+

-- Stop the slave
(root@Slave)[tempdb]>stop slave;
Query OK, 0 rows affected (0.01 sec)

-- Try to skip the transactions by setting gtid_purged; the error says it can only be set when GLOBAL.GTID_EXECUTED is empty
(root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.

-- As shown below, GTIDs have already been executed, so gtid_executed is not empty; these GTIDs are recorded in the slave's binary log
(root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
               78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
*************************** 2. row ***************************
Variable_name: gtid_executed_compression_period
        Value: 1000

-- Run reset master on the slave, i.e. clear the slave's binlog
(root@Slave)[tempdb]>reset master;
Query OK, 0 rows affected (0.05 sec)

-- gtid_executed is now empty
(root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
*************************** 1. row ***************************
Variable_name: gtid_executed
        Value:
*************************** 2. row ***************************
Variable_name: gtid_executed_compression_period
        Value: 1000

-- Set gtid_purged again
(root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
Query OK, 0 rows affected (0.01 sec)

-- Start the slave
(root@Slave)[tempdb]>start slave;
Query OK, 0 rows affected (0.03 sec)

-- A duplicate-record error is reported, as shown below
-- The delete transaction issued while the slave was stopped never reached the slave's relay log
-- In addition, the binlog was purged and gtid_purged was set on the slave after it came back,
-- so the row newly inserted on the master hits a duplicate primary key on the slave
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000005
          Read_Master_Log_Pos: 490
               Relay_Log_File: node245-relay-bin.000004
                Relay_Log_Pos: 417
        Relay_Master_Log_File: node233-binlog.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                ................
               Last_SQL_Error: Could not execute Write_rows event on table tempdb.t1;
 Duplicate entry '1' for key 'PRIMARY', Error_code: 1062;
 handler error HA_ERR_FOUND_DUPP_KEY; the event's master log node233-binlog.000005, end_log_pos 459
           Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100:102
            Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101
                Auto_Position: 1

-- Delete the row with id 1 on the slave
(root@Slave)[tempdb]>delete from t1 where id=1;
Query OK, 1 row affected (0.05 sec)

-- Start the slave's sql_thread
(root@Slave)[tempdb]>start slave sql_thread;
Query OK, 0 rows affected (0.02 sec)

-- Check again: replication is back to normal
(root@Slave)[tempdb]>show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: Master
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: node233-binlog.000005
          Read_Master_Log_Pos: 490
               Relay_Log_File: node245-relay-bin.000004
                Relay_Log_Pos: 713
        Relay_Master_Log_File: node233-binlog.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

-- The example above mainly demonstrates skipping transactions by setting gtid_purged
-- In fact the master and slave data are now inconsistent; decide according to actual needs whether to repair it
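Error 1236 in this form appears when the master has purged GTIDs that the slave never received. As a simplified illustration of that check, the Python sketch below compares the master's `gtid_purged` with the slave's `gtid_executed` and lists what is lost. The helper names `parse_gtid_set` and `purged_but_not_applied` are hypothetical, the comparison ignores transactions that were retrieved but not yet executed, and intervals are expanded into individual numbers, so it is meant for small sets such as those in this demonstration.

```python
from typing import Dict, List, Set


def parse_gtid_set(text: str) -> Dict[str, Set[int]]:
    """Parse 'uuid:1-100,uuid2:1-4' (intervals separated by ':') into sets of ints."""
    result: Dict[str, Set[int]] = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        server_uuid, *intervals = part.split(":")
        ids = result.setdefault(server_uuid, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            ids.update(range(int(first), int(last or first) + 1))
    return result


def purged_but_not_applied(master_purged: str, slave_executed: str) -> List[str]:
    """GTIDs already purged on the master that the slave has not executed."""
    applied = parse_gtid_set(slave_executed)
    missing = []
    for server_uuid, ids in parse_gtid_set(master_purged).items():
        for number in sorted(ids - applied.get(server_uuid, set())):
            missing.append(f"{server_uuid}:{number}")
    return missing


master_purged = "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101"
slave_executed = ("1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,"
                  "78336cdc-8cfb-11e6-ba9f-000c29328504:1-4")
print(purged_but_not_applied(master_purged, slave_executed))
# prints ['1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:101']
```

A non-empty result means those transactions can no longer be replicated from the master's binlog; here it is transaction 101, the `delete from t1` that was purged while the slave was stopped, which is why the data ends up inconsistent after the `gtid_purged` workaround.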


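5. Summary

1. GTID, the global transaction identifier, simplifies deploying a master-slave architecture: the slave no longer needs to care about log_file and log_pos.

2. Because every transaction ID is unique, GTIDs from one slave can be applied to other slaves, which makes failover convenient.

3. GTIDs are continuous with no gaps, so a conflict has to be resolved by injecting an empty transaction.

4. Configuring a delayed slave can guard against human error, such as accidentally deleting objects on the master.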
