MHA with binlog server
In this post ,we mainly talk about MHA GTID behavior, we test different cases and find something is different from previous versions . we have four machines for this test. environment: master server: 10.0.128.77slave server : 10.0.128.110/
In this post ,we mainly talk about MHA GTID behavior, we test different cases and find something is different from previous versions .
we have four machines for this test.
environment:
master server: 10.0.128.77 slave server : 10.0.128.110/113/114 port : 3306
————————————————————————————–
we first do normal failover .
kill master server
we find MHA outputs:
Tue Dec 30 13:32:14 2014 - [warning] Got error on MySQL connect ping: DBI connect(';host=10.0.128.77;port=3306;mysql_connect_timeout=1','dbadmin',...) failed: Lost connection to MySQL server at 'reading initial communication packet', system error: 111 at /usr/share/perl5/vendor_perl/MHA/HealthCheck.pm line 97 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Tue Dec 30 13:32:14 2014 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 10.0.128.113 -s 10.0.128.114 --user=root --master_host=10.0.128.77 --master_ip=10.0.128.77 --master_port=3306 --master_user=dbadmin --master_password=NV7yVBpn88cg4WJCVlZd --ping_type=CONNECT Tue Dec 30 13:32:14 2014 - [info] Executing SSH check script: exit 0 Tue Dec 30 13:32:14 2014 - [info] HealthCheck: SSH to 10.0.128.77 is reachable. Monitoring server 10.0.128.113 is reachable, Master is not reachable from 10.0.128.113. OK. Monitoring server 10.0.128.114 is reachable, Master is not reachable from 10.0.128.114. OK. Tue Dec 30 13:32:14 2014 - [info] Master is not reachable from all other monitoring servers. Failover should start. Tue Dec 30 13:32:16 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Tue Dec 30 13:32:16 2014 - [warning] Connection failed 2 time(s).. Tue Dec 30 13:32:18 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Tue Dec 30 13:32:18 2014 - [warning] Connection failed 3 time(s).. Tue Dec 30 13:32:20 2014 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111) Tue Dec 30 13:32:20 2014 - [warning] Connection failed 4 time(s).. Tue Dec 30 13:32:20 2014 - [warning] Master is not reachable from health checker! Tue Dec 30 13:32:20 2014 - [warning] Master 10.0.128.77(10.0.128.77:3306) is not reachable! Tue Dec 30 13:32:20 2014 - [warning] SSH is reachable. Tue Dec 30 13:32:20 2014 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/test.cnf again, and trying to connect to all servers to check server status.. Tue Dec 30 13:32:20 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Dec 30 13:32:20 2014 - [info] Reading application default configuration from /etc/masterha/test.cnf.. Tue Dec 30 13:32:20 2014 - [info] Reading server configuration from /etc/masterha/test.cnf.. Tue Dec 30 13:32:20 2014 - [info] GTID failover mode = 1 Tue Dec 30 13:32:20 2014 - [info] Dead Servers: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] Alive Servers: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.110(10.0.128.110:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.113(10.0.128.113:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.114(10.0.128.114:3306) Tue Dec 30 13:32:20 2014 - [info] Alive Slaves: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.110(10.0.128.110:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.113(10.0.128.113:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.114(10.0.128.114:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] Checking slave configurations.. Tue Dec 30 13:32:20 2014 - [info] read_only=1 is not set on slave 10.0.128.110(10.0.128.110:3306). Tue Dec 30 13:32:20 2014 - [info] read_only=1 is not set on slave 10.0.128.113(10.0.128.113:3306). Tue Dec 30 13:32:20 2014 - [info] read_only=1 is not set on slave 10.0.128.114(10.0.128.114:3306). Tue Dec 30 13:32:20 2014 - [info] Checking replication filtering settings.. Tue Dec 30 13:32:20 2014 - [info] Replication filtering check ok. Tue Dec 30 13:32:20 2014 - [info] Master is down! Tue Dec 30 13:32:20 2014 - [info] Terminating monitoring script. Tue Dec 30 13:32:20 2014 - [info] Got exit code 20 (Master dead). Tue Dec 30 13:32:20 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Dec 30 13:32:20 2014 - [info] Reading application default configuration from /etc/masterha/test.cnf.. Tue Dec 30 13:32:20 2014 - [info] Reading server configuration from /etc/masterha/test.cnf.. Tue Dec 30 13:32:20 2014 - [info] MHA::MasterFailover version 0.56. Tue Dec 30 13:32:20 2014 - [info] Starting master failover. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] * Phase 1: Configuration Check Phase.. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] GTID failover mode = 1 Tue Dec 30 13:32:20 2014 - [info] Dead Servers: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] Checking master reachability via MySQL(double check)... Tue Dec 30 13:32:20 2014 - [info] ok. Tue Dec 30 13:32:20 2014 - [info] Alive Servers: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.110(10.0.128.110:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.113(10.0.128.113:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.114(10.0.128.114:3306) Tue Dec 30 13:32:20 2014 - [info] Alive Slaves: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.110(10.0.128.110:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.113(10.0.128.113:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.114(10.0.128.114:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] Starting GTID based failover. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] ** Phase 1: Configuration Check Phase completed. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] * Phase 2: Dead Master Shutdown Phase.. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] Forcing shutdown so that applications never connect to the current master.. Tue Dec 30 13:32:20 2014 - [info] Executing master IP deactivation script: Tue Dec 30 13:32:20 2014 - [info] /usr/local/dbadmin/mhascripts/master_ip_failover_pajk01.pl --orig_master_host=10.0.128.77 --orig_master_ip=10.0.128.77 --orig_master_port=3306 --command=stopssh --ssh_user=root Disabling the VIP on old master: 10.0.128.77 RTNETLINK answers: Cannot assign requested address Tue Dec 30 13:32:20 2014 - [info] done. Tue Dec 30 13:32:20 2014 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master. Tue Dec 30 13:32:20 2014 - [info] * Phase 2: Dead Master Shutdown Phase completed. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] * Phase 3: Master Recovery Phase.. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] * Phase 3.1: Getting Latest Slaves Phase.. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] The latest binary log file/position on all slaves is master-bin.000001:3679 Tue Dec 30 13:32:20 2014 - [info] Retrieved Gtid Set: a861ddd1-8f4e-11e4-aeac-86f5fb218bef:1-15 Tue Dec 30 13:32:20 2014 - [info] Latest slaves (Slaves that received relay log files to the latest): Tue Dec 30 13:32:20 2014 - [info] 10.0.128.110(10.0.128.110:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.113(10.0.128.113:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.114(10.0.128.114:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] The oldest binary log file/position on all slaves is master-bin.000001:3679 Tue Dec 30 13:32:20 2014 - [info] Retrieved Gtid Set: a861ddd1-8f4e-11e4-aeac-86f5fb218bef:1-15 Tue Dec 30 13:32:20 2014 - [info] Oldest slaves: Tue Dec 30 13:32:20 2014 - [info] 10.0.128.110(10.0.128.110:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.113(10.0.128.113:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] 10.0.128.114(10.0.128.114:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 13:32:20 2014 - [info] GTID ON Tue Dec 30 13:32:20 2014 - [info] Replicating from 10.0.128.77(10.0.128.77:3306) Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] * Phase 3.3: Determining New Master Phase.. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] Searching new master from slaves.. Tue Dec 30 13:32:20 2014 - [info] Candidate masters from the configuration file: Tue Dec 30 13:32:20 2014 - [info] Non-candidate masters: Tue Dec 30 13:32:20 2014 - [info] New master is 10.0.128.110(10.0.128.110:3306) Tue Dec 30 13:32:20 2014 - [info] Starting master failover.. Tue Dec 30 13:32:20 2014 - [info] From: 10.0.128.77(10.0.128.77:3306) (current master) +--10.0.128.110(10.0.128.110:3306) +--10.0.128.113(10.0.128.113:3306) +--10.0.128.114(10.0.128.114:3306) To: 10.0.128.110(10.0.128.110:3306) (new master) +--10.0.128.113(10.0.128.113:3306) +--10.0.128.114(10.0.128.114:3306) Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] * Phase 3.3: New Master Recovery Phase.. Tue Dec 30 13:32:20 2014 - [info] Tue Dec 30 13:32:20 2014 - [info] Waiting all logs to be applied.. Tue Dec 30 13:32:20 2014 - [info] done. Tue Dec 30 13:32:20 2014 - [info] Getting new master's binlog name and position.. Tue Dec 30 13:32:20 2014 - [info] mysql-bin.000001:4142 Tue Dec 30 13:32:20 2014 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.128.110', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='rep', MASTER_PASSWORD='xxx'; Tue Dec 30 13:32:20 2014 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 4142, a861ddd1-8f4e-11e4-aeac-86f5fb218bef:1-15 Tue Dec 30 13:32:20 2014 - [info] Executing master IP activate script: Tue Dec 30 13:32:20 2014 - [info] /usr/local/dbadmin/mhascripts/master_ip_failover_pajk01.pl --command=start --ssh_user=root --orig_master_host=10.0.128.77 --orig_master_ip=10.0.128.77 --orig_master_port=3306 --new_master_host=10.0.128.110 --new_master_ip=10.0.128.110 --new_master_port=3306 --new_master_user='dbadmin' --new_master_password='NV7yVBpn88cg4WJCVlZd' Set read_only=0 on the new master. Enabling the VIP - 10.0.128.35/24 on the new master - 10.0.128.110 Tue Dec 30 13:32:23 2014 - [info] OK. Tue Dec 30 13:32:23 2014 - [info] ** Finished master recovery successfully. Tue Dec 30 13:32:23 2014 - [info] * Phase 3: Master Recovery Phase completed. Tue Dec 30 13:32:23 2014 - [info] Tue Dec 30 13:32:23 2014 - [info] * Phase 4: Slaves Recovery Phase.. Tue Dec 30 13:32:23 2014 - [info] Tue Dec 30 13:32:23 2014 - [info] Tue Dec 30 13:32:23 2014 - [info] * Phase 4.1: Starting Slaves in parallel.. Tue Dec 30 13:32:23 2014 - [info] Tue Dec 30 13:32:23 2014 - [info] -- Slave recovery on host 10.0.128.113(10.0.128.113:3306) started, pid: 31607. Check tmp log /var/log/masterha/test/10.0.128.113_3306_20141230133220.log if it takes time.. Tue Dec 30 13:32:23 2014 - [info] -- Slave recovery on host 10.0.128.114(10.0.128.114:3306) started, pid: 31608. Check tmp log /var/log/masterha/test/10.0.128.114_3306_20141230133220.log if it takes time.. Tue Dec 30 13:32:24 2014 - [info] Tue Dec 30 13:32:24 2014 - [info] Log messages from 10.0.128.113 ... Tue Dec 30 13:32:24 2014 - [info] Tue Dec 30 13:32:23 2014 - [info] Resetting slave 10.0.128.113(10.0.128.113:3306) and starting replication from the new master 10.0.128.110(10.0.128.110:3306).. Tue Dec 30 13:32:24 2014 - [info] Executed CHANGE MASTER. Tue Dec 30 13:32:24 2014 - [info] Slave started. Tue Dec 30 13:32:24 2014 - [info] gtid_wait(a861ddd1-8f4e-11e4-aeac-86f5fb218bef:1-15) completed on 10.0.128.113(10.0.128.113:3306). Executed 0 events. Tue Dec 30 13:32:24 2014 - [info] End of log messages from 10.0.128.113. Tue Dec 30 13:32:24 2014 - [info] -- Slave on host 10.0.128.113(10.0.128.113:3306) started. Tue Dec 30 13:32:24 2014 - [info] Tue Dec 30 13:32:24 2014 - [info] Log messages from 10.0.128.114 ... Tue Dec 30 13:32:24 2014 - [info] Tue Dec 30 13:32:23 2014 - [info] Resetting slave 10.0.128.114(10.0.128.114:3306) and starting replication from the new master 10.0.128.110(10.0.128.110:3306).. Tue Dec 30 13:32:24 2014 - [info] Executed CHANGE MASTER. Tue Dec 30 13:32:24 2014 - [info] Slave started. Tue Dec 30 13:32:24 2014 - [info] gtid_wait(a861ddd1-8f4e-11e4-aeac-86f5fb218bef:1-15) completed on 10.0.128.114(10.0.128.114:3306). Executed 0 events. Tue Dec 30 13:32:24 2014 - [info] End of log messages from 10.0.128.114. Tue Dec 30 13:32:24 2014 - [info] -- Slave on host 10.0.128.114(10.0.128.114:3306) started. Tue Dec 30 13:32:24 2014 - [info] All new slave servers recovered successfully. Tue Dec 30 13:32:24 2014 - [info] Tue Dec 30 13:32:24 2014 - [info] * Phase 5: New master cleanup phase.. Tue Dec 30 13:32:24 2014 - [info] Tue Dec 30 13:32:24 2014 - [info] Resetting slave info on the new master.. Tue Dec 30 13:32:24 2014 - [info] 10.0.128.110: Resetting slave info succeeded. Tue Dec 30 13:32:24 2014 - [info] Master failover to 10.0.128.110(10.0.128.110:3306) completed successfully. Tue Dec 30 13:32:24 2014 - [info] ----- Failover Report ----- test: MySQL Master failover 10.0.128.77(10.0.128.77:3306) to 10.0.128.110(10.0.128.110:3306) succeeded Master 10.0.128.77(10.0.128.77:3306) is down! Check MHA Manager logs at a3-relay00.sh for details. Started automated(non-interactive) failover. Invalidated master IP address on 10.0.128.77(10.0.128.77:3306) Selected 10.0.128.110(10.0.128.110:3306) as a new master. 10.0.128.110(10.0.128.110:3306): OK: Applying all logs succeeded. 10.0.128.110(10.0.128.110:3306): OK: Activated master IP address. 10.0.128.113(10.0.128.113:3306): OK: Slave started, replicating from 10.0.128.110(10.0.128.110:3306) 10.0.128.114(10.0.128.114:3306): OK: Slave started, replicating from 10.0.128.110(10.0.128.110:3306) 10.0.128.110(10.0.128.110:3306): Resetting slave info succeeded. Master failover to 10.0.128.110(10.0.128.110:3306) completed successfully.
failover to new master 10.0.128.110:
create new database liuyang3 on 10.0.128.110:
mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | liuyang2 | | mysql | | performance_schema | | test | +--------------------+ 5 rows in set (0.00 sec) mysql> create database liuyang3; Query OK, 1 row affected (0.00 sec) show all slaves status : mysql> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 10.0.128.110 Master_User: rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 4296 Relay_Log_File: hadoop-vm-datanode3-relay-bin.000002 Relay_Log_Pos: 562 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 4296 Relay_Log_Space: 780 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 2 Master_UUID: 909ee6b1-8f51-11e4-aebf-00163f00801f Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: 909ee6b1-8f51-11e4-aebf-00163f00801f:1 Executed_Gtid_Set: 909ee6b1-8f51-11e4-aebf-00163f00801f:1, a861ddd1-8f4e-11e4-aeac-86f5fb218bef:1-15 Auto_Position: 1 1 row in set (0.00 sec)
start from new GTID — This GTID is new and we can also see old master GTID
#cat /data/mysql/data/auto.cnf [auto] server-uuid=909ee6b1-8f51-11e4-aebf-00163f00801f
this GTID was gathered when slave has been started.
start old master 10.0.128.77 again and add it to the cluster.
Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.6.17-log MySQL Community Server (GPL) Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CHANGE MASTER TO MASTER_HOST='10.0.128.110', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='rep', MASTER_PASSWORD='rep'; Query OK, 0 rows affected, 2 warnings (0.32 sec) mysql> start slave ; Query OK, 0 rows affected, 1 warning (0.06 sec) mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | liuyang2 | | liuyang3 | | mysql | | performance_schema | | test | +--------------------+ 6 rows in set (0.00 sec)
database liuyang3 appeared.
———————————————————————–
initialize cluster:
drop all database;
master: 10.0.128.110 slave: 10.0.128.77/114/113
load data into master. we keep one specific slave has lag and set “sync_binlog=1″ on all slaves.
mysql> delimiter $$ mysql> mysql> CREATE PROCEDURE myprocedure() -> BEGIN -> DECLARE i INT DEFAULT 1; -> -> CREATE TABLE test -> (ascii_code int, ascii_char CHAR(1)); -> -> WHILE (i INSERT INTO test VALUES(i,CHAR(i)); -> SET i=i+1; -> END WHILE; -> -> END$$ Query OK, 0 rows affected (0.03 sec) mysql> delimiter ; mysql> mysql> call myprocedure(); Query OK, 1 row affected, 1 warning (5 min 53.61 sec) mysql> select count(*) from liuyang1.test; +----------+ | count(*) | +----------+ | 10000 | +----------+ 1 row in set (0.01 sec)
10.0.128.113/114 are both consistent with master
mysql> select count(*) from liuyang1.test; +----------+ | count(*) | +----------+ | 10000 | +----------+ 1 row in set (0.01 sec) Retrieved_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-10003 Executed_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-10003 Auto_Position: 1
10.0.128.77 has a huge lag:
Retrieved_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-2548 Executed_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-1979 Auto_Position: 1
now kill master server:
oldest slave will change master to 10.0.128.113:
Tue Dec 30 16:36:15 2014 - [info] Retrieved Gtid Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-10003 Tue Dec 30 16:36:15 2014 - [info] Latest slaves (Slaves that received relay log files to the latest): Tue Dec 30 16:36:15 2014 - [info] 10.0.128.113(10.0.128.113:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 16:36:15 2014 - [info] GTID ON Tue Dec 30 16:36:15 2014 - [info] Replicating from 10.0.128.110(10.0.128.110:3306) Tue Dec 30 16:36:15 2014 - [info] 10.0.128.114(10.0.128.114:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 16:36:15 2014 - [info] GTID ON Tue Dec 30 16:36:15 2014 - [info] Replicating from 10.0.128.110(10.0.128.110:3306) Tue Dec 30 16:36:15 2014 - [info] The oldest binary log file/position on all slaves is mysql-bin.000001:885010 Tue Dec 30 16:36:15 2014 - [info] Retrieved Gtid Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-2864 Tue Dec 30 16:36:15 2014 - [info] Oldest slaves: Tue Dec 30 16:36:15 2014 - [info] 10.0.128.77(10.0.128.77:3306) Version=5.6.17-log (oldest major version between slaves) log-bin:enabled Tue Dec 30 16:36:15 2014 - [info] GTID ON Tue Dec 30 16:36:15 2014 - [info] Replicating from 10.0.128.110(10.0.128.110:3306) Tue Dec 30 16:36:15 2014 - [info] Tue Dec 30 16:36:15 2014 - [info] * Phase 3.3: Determining New Master Phase.. Tue Dec 30 16:36:15 2014 - [info] Tue Dec 30 16:36:15 2014 - [info] Searching new master from slaves.. Tue Dec 30 16:36:15 2014 - [info] Candidate masters from the configuration file: Tue Dec 30 16:36:15 2014 - [info] Non-candidate masters: Tue Dec 30 16:36:15 2014 - [info] New master is 10.0.128.113(10.0.128.113:3306)
and MHA will control 10.0.128.77 to do this command:
SELECT WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS('4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-10003',0) AS Result show slave status\G; <pre class='brush:php;toolbar:false;'> shows: Retrieved_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:2786-5067 Executed_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-3929 Auto_Position: 1
this means oldest slave 10.0.128.77 changes to 10.0.128.113 for the new master . Star from transaction id 2785:
Retrieved_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-2785 in old master
Retrieved_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:2786-5067 in current master
wait a long time ……
Retrieved_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:2786-10003 Executed_Gtid_Set: 4f1dfbe6-8ffc-11e4-b318-00163f00801f:1-10003 Auto_Position: 1 mysql> select count(*) from liuyang1.test; +----------+ | count(*) | +----------+ | 10000 | +----------+
oldest slave 10.0.128.77 was OK !
—————————-
TEST more details:
We test two cases:
1. just 10.0.128.77 has lag — make 10.0.128.77 as candidate master
2. all slaves have lag — make 10.0.128.114 as candidate master
initialize environment
all slave should get new GTID:
master: 10.0.128.113
slave: 10.0.128.110/77/114
make all slaves have lag manually .
First rerun procedure to initialize data
mysql> call myprocedure(); Query OK, 1 row affected, 1 warning (1 min 9.98 sec)
kill master MHA will choose a new candidate master and do slave recovery:
10.0.128.110/114 position:
Retrieved_Gtid_Set: f776aa4d-908d-11e4-b6ce-00163f008023:1-10006 Executed_Gtid_Set: f776aa4d-908d-11e4-b6ce-00163f008023:1-10006 Auto_Position: 1
10.0.128.77 position:
Retrieved_Gtid_Set: f776aa4d-908d-11e4-b6ce-00163f008023:1-1269 Executed_Gtid_Set: f776aa4d-908d-11e4-b6ce-00163f008023:1-684 Auto_Position: 1
+--10.0.128.114(10.0.128.114:3306) Wed Dec 31 10:09:50 2014 - [info] Wed Dec 31 10:09:50 2014 - [info] * Phase 3.3: New Master Recovery Phase.. Wed Dec 31 10:09:50 2014 - [info] Wed Dec 31 10:09:50 2014 - [info] Waiting all logs to be applied.. Wed Dec 31 10:10:26 2014 - [info] done. Wed Dec 31 10:11:28 2014 - [info] Replicating from the latest slave 10.0.128.110(10.0.128.110:3306) and waiting to apply.. Wed Dec 31 10:11:28 2014 - [info] Waiting all logs to be applied on the latest slave.. Wed Dec 31 10:11:28 2014 - [info] Resetting slave 10.0.128.77(10.0.128.77:3306) and starting replication from the new master 10.0.128.110(10.0.128.110:3306).. Wed Dec 31 10:11:28 2014 - [info] Executed CHANGE MASTER. Wed Dec 31 10:11:28 2014 - [info] Slave started. Wed Dec 31 10:11:28 2014 - [info] Waiting to execute all relay logs on 10.0.128.77(10.0.128.77:3306)..
mysql> show slave status\G; *************************** 1. row *************************** Slave_IO_State: System lock Master_Host: 10.0.128.110 Master_User: rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 487380 Relay_Log_File: a3-oracle-128-77-relay-bin.000002 Relay_Log_Pos: 23897 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 458828 Relay_Log_Space: 52664 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 333 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 2 Master_UUID: 354a7e91-908e-11e4-b6d0-00163f00801f Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Reading event from the relay log Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: f776aa4d-908d-11e4-b6ce-00163f008023:1539-1722 Executed_Gtid_Set: f776aa4d-908d-11e4-b6ce-00163f008023:1-1622 Auto_Position: 1
MHA will control 77 to change master to 10.0.128.110 (prior is 10.0.128.113)
so 77 start from new transaction (old is 1-1538, new is 1539-1722)
we can wait until reach 10006 on lastest
dbadmin | 10.0.128.25:17331 | NULL | Query | 60 | Waiting for the slave SQL thread to advance position | SELECT MASTER_POS_WAIT('mysql-bin.000001','2831783',0) AS Result
in 10.0.128.110 show master status:
| mysql-bin.000001 | 2831783 | | | f776aa4d-908d-11e4-b6ce-00163f008023:1-10006 |
until 77 reached transaction 10006 ,MHA will continue:
Wed Dec 31 10:36:14 2014 - [info] master_pos_wait(mysql-bin.000001:2831783) completed on 10.0.128.77(10.0.128.77:3306). Executed 4229 events. Wed Dec 31 10:36:14 2014 - [info] done. Wed Dec 31 10:36:14 2014 - [info] done. Wed Dec 31 10:36:14 2014 - [info] Getting new master's binlog name and position.. Wed Dec 31 10:36:14 2014 - [info] master-bin.000001:2831783 Wed Dec 31 10:36:14 2014 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.128.77', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='rep', MASTER_PASSWORD='xxx'; Wed Dec 31 10:36:14 2014 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: master-bin.000001, 2831783, f776aa4d-908d-11e4-b6ce-00163f008023:1-10006 Wed Dec 31 10:36:14 2014 - [info] Executing master IP activate script: Wed Dec 31 10:36:14 2014 - [info] /usr/local/dbadmin/mhascripts/master_ip_failover_pajk01.pl --command=start --ssh_user=root --orig_master_host=10.0.128.113 --orig_master_ip=10.0.128.113 --orig_master_port=3306 --new_master_host=10.0.128.77 --new_master_ip=10.0.128.77 --new_master_port=3306 --new_master_user='dbadmin' --new_master_password='NV7yVBpn88cg4WJCVlZd' Set read_only=0 on the new master. Enabling the VIP - 10.0.128.35/24 on the new master - 10.0.128.77 sysfs read broadcast value: No such file or directory falling back to default broadcast value Wed Dec 31 10:36:17 2014 - [info] OK. Wed Dec 31 10:36:17 2014 - [info] ** Finished master recovery successfully. Wed Dec 31 10:36:17 2014 - [info] * Phase 3: Master Recovery Phase completed. Wed Dec 31 10:36:17 2014 - [info] Wed Dec 31 10:36:17 2014 - [info] * Phase 4: Slaves Recovery Phase.. Wed Dec 31 10:36:17 2014 - [info] Wed Dec 31 10:36:17 2014 - [info] Wed Dec 31 10:36:17 2014 - [info] * Phase 4.1: Starting Slaves in parallel.. Wed Dec 31 10:36:17 2014 - [info] Wed Dec 31 10:36:17 2014 - [info] -- Slave recovery on host 10.0.128.110(10.0.128.110:3306) started, pid: 13177. Check tmp log /var/log/masterha/test/10.0.128.110_3306_20141231100948.log if it takes time.. Wed Dec 31 10:36:17 2014 - [info] -- Slave recovery on host 10.0.128.114(10.0.128.114:3306) started, pid: 13178. Check tmp log /var/log/masterha/test/10.0.128.114_3306_20141231100948.log if it takes time.. Wed Dec 31 10:36:18 2014 - [info] Wed Dec 31 10:36:18 2014 - [info] Log messages from 10.0.128.114 ... Wed Dec 31 10:36:18 2014 - [info] Wed Dec 31 10:36:17 2014 - [info] Resetting slave 10.0.128.114(10.0.128.114:3306) and starting replication from the new master 10.0.128.77(10.0.128.77:3306).. Wed Dec 31 10:36:17 2014 - [info] Executed CHANGE MASTER. Wed Dec 31 10:36:18 2014 - [info] Slave started. Wed Dec 31 10:36:18 2014 - [info] gtid_wait(f776aa4d-908d-11e4-b6ce-00163f008023:1-10006) completed on 10.0.128.114(10.0.128.114:3306). Executed 0 events. Wed Dec 31 10:36:18 2014 - [info] End of log messages from 10.0.128.114. Wed Dec 31 10:36:18 2014 - [info] -- Slave on host 10.0.128.114(10.0.128.114:3306) started. Wed Dec 31 10:36:18 2014 - [info] Wed Dec 31 10:36:18 2014 - [info] Log messages from 10.0.128.110 ... Wed Dec 31 10:36:18 2014 - [info] Wed Dec 31 10:36:17 2014 - [info] Resetting slave 10.0.128.110(10.0.128.110:3306) and starting replication from the new master 10.0.128.77(10.0.128.77:3306).. Wed Dec 31 10:36:18 2014 - [info] Executed CHANGE MASTER. Wed Dec 31 10:36:18 2014 - [info] Slave started. Wed Dec 31 10:36:18 2014 - [info] gtid_wait(f776aa4d-908d-11e4-b6ce-00163f008023:1-10006) completed on 10.0.128.110(10.0.128.110:3306). Executed 0 events. Wed Dec 31 10:36:18 2014 - [info] End of log messages from 10.0.128.110. Wed Dec 31 10:36:18 2014 - [info] -- Slave on host 10.0.128.110(10.0.128.110:3306) started. Wed Dec 31 10:36:18 2014 - [info] All new slave servers recovered successfully. Wed Dec 31 10:36:18 2014 - [info] Wed Dec 31 10:36:18 2014 - [info] * Phase 5: New master cleanup phase.. Wed Dec 31 10:36:18 2014 - [info] Wed Dec 31 10:36:18 2014 - [info] Resetting slave info on the new master.. Wed Dec 31 10:36:18 2014 - [info] 10.0.128.77: Resetting slave info succeeded. Wed Dec 31 10:36:18 2014 - [info] Master failover to 10.0.128.77(10.0.128.77:3306) completed successfully. Wed Dec 31 10:36:18 2014 - [info]
try to recover other slaves but other slave are both newest. So change master to 77 directly.
——————————————————————–
2. all slaves have lag and 114 is candidate master.
slaves will do not merge master’s binlog (I don’t know why)
initialize cluster again:
master : 10.0.128.113
slave : 10.0.128.114(candidate master)/77/110
drop table test ;
Run procedure again:
we got final result .
mysql> select count(*) from liuyang1.test; +----------+ | count(*) | +----------+ | 5852 | +----------+
we lost 4148 records.
Wed Dec 31 10:50:33 2014 - [info] * Phase 3.3: New Master Recovery Phase.. Wed Dec 31 10:50:33 2014 - [info] Wed Dec 31 10:50:33 2014 - [info] Waiting all logs to be applied.. Wed Dec 31 10:50:33 2014 - [info] done. Wed Dec 31 10:51:35 2014 - [info] Getting new master's binlog name and position.. Wed Dec 31 10:51:35 2014 - [info] mysql-bin.000001:4488549 Wed Dec 31 10:51:35 2014 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.128.114', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='rep', MASTER_PASSWORD='xxx'; Wed Dec 31 10:51:35 2014 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 4488549, 5d04dd5c-908e-11e4-b6d1-86f5fb218bef:1, f776aa4d-908d-11e4-b6ce-00163f008023:1-15861 Wed Dec 31 10:51:35 2014 - [info] Executing master IP activate script: Wed Dec 31 10:51:35 2014 - [info] /usr/local/dbadmin/mhascripts/master_ip_failover_pajk01.pl --command=start --ssh_user=root --orig_master_host=10.0.128.113 --orig_master_ip=10.0.128.113 --orig_master_port=3306 --new_master_host=10.0.128.114 --new_master_ip=10.0.128.114 --new_master_port=3306 --new_master_user='dbadmin' --new_master_password='NV7yVBpn88cg4WJCVlZd' Set read_only=0 on the new master. Enabling the VIP - 10.0.128.35/24 on the new master - 10.0.128.114 Wed Dec 31 10:51:38 2014 - [info] OK. Wed Dec 31 10:51:38 2014 - [info] ** Finished master recovery successfully. Wed Dec 31 10:51:38 2014 - [info] * Phase 3: Master Recovery Phase completed. Wed Dec 31 10:51:38 2014 - [info] Wed Dec 31 10:51:38 2014 - [info] * Phase 4: Slaves Recovery Phase.. Wed Dec 31 10:51:38 2014 - [info] Wed Dec 31 10:51:38 2014 - [info] Wed Dec 31 10:51:38 2014 - [info] * Phase 4.1: Starting Slaves in parallel.. Wed Dec 31 10:51:38 2014 - [info] Wed Dec 31 10:51:38 2014 - [info] -- Slave recovery on host 10.0.128.77(10.0.128.77:3306) started, pid: 13508. Check tmp log /var/log/masterha/test/10.0.128.77_3306_20141231105020.log if it takes time.. Wed Dec 31 10:51:38 2014 - [info] -- Slave recovery on host 10.0.128.110(10.0.128.110:3306) started, pid: 13509. Check tmp log /var/log/masterha/test/10.0.128.110_3306_20141231105020.log if it takes time.. Wed Dec 31 11:01:34 2014 - [info] Wed Dec 31 11:01:34 2014 - [info] Log messages from 10.0.128.77 ... Wed Dec 31 11:01:34 2014 - [info] Wed Dec 31 10:51:38 2014 - [info] Resetting slave 10.0.128.77(10.0.128.77:3306) and starting replication from the new master 10.0.128.114(10.0.128.114:3306).. Wed Dec 31 10:52:41 2014 - [info] Executed CHANGE MASTER. Wed Dec 31 10:52:41 2014 - [info] Slave started. Wed Dec 31 11:01:34 2014 - [info] gtid_wait(5d04dd5c-908e-11e4-b6d1-86f5fb218bef:1, f776aa4d-908d-11e4-b6ce-00163f008023:1-15861) completed on 10.0.128.77(10.0.128.77:3306). Executed 855 events. Wed Dec 31 11:01:34 2014 - [info] End of log messages from 10.0.128.77. Wed Dec 31 11:01:34 2014 - [info] -- Slave on host 10.0.128.77(10.0.128.77:3306) started. Wed Dec 31 11:02:03 2014 - [info] Wed Dec 31 11:02:03 2014 - [info] Log messages from 10.0.128.110 ... Wed Dec 31 11:02:03 2014 - [info] Wed Dec 31 10:51:38 2014 - [info] Resetting slave 10.0.128.110(10.0.128.110:3306) and starting replication from the new master 10.0.128.114(10.0.128.114:3306).. Wed Dec 31 10:52:41 2014 - [info] Executed CHANGE MASTER. Wed Dec 31 10:52:41 2014 - [info] Slave started. Wed Dec 31 11:02:03 2014 - [info] gtid_wait(5d04dd5c-908e-11e4-b6d1-86f5fb218bef:1, f776aa4d-908d-11e4-b6ce-00163f008023:1-15861) completed on 10.0.128.110(10.0.128.110:3306). Executed 142 events. Wed Dec 31 11:02:03 2014 - [info] End of log messages from 10.0.128.110. Wed Dec 31 11:02:03 2014 - [info] -- Slave on host 10.0.128.110(10.0.128.110:3306) started. Wed Dec 31 11:02:03 2014 - [info] All new slave servers recovered successfully. Wed Dec 31 11:02:03 2014 - [info] Wed Dec 31 11:02:03 2014 - [info] * Phase 5: New master cleanup phase.. Wed Dec 31 11:02:03 2014 - [info] Wed Dec 31 11:02:03 2014 - [info] Resetting slave info on the new master.. Wed Dec 31 11:02:03 2014 - [info] 10.0.128.114: Resetting slave info succeeded. Wed Dec 31 11:02:03 2014 - [info] Master failover to 10.0.128.114(10.0.128.114:3306) completed successfully. Wed Dec 31 11:02:03 2014 - [info]
10.0.128.114 do not merge newest binlog from 10.0.128.113 (although it can ssh to 113 server )
————————————————————————————-
add binlog server and make 77 server as candidate master
[binlog1] hostname=10.0.128.252
10.0.128.252 is manager node machine. we set binlog server in this machine.
#ls -l /data/mysql/data/
total 2896
-rw-rw-r– 1 mysql mysql 2960898 Dec 31 13:57 mysql-bin.000001
drop test table
run procedure again
we check lastest slave
10.0.128.114 :
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-3357 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-3356
10.0.128.77:
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-102 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-48
10.0.128.110:
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-344 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-270
and all slaves have lag : 10.0.128.77>10.0.128.110>10.0.128.114 (lastest slave)
now we kill master server:
Waiting to execute all relay logs on 10.0.128.77(10.0.128.77:3306).. Wed Dec 31 13:59:28 2014 - [info] master_pos_wait(mysql-bin.000001:1219763) completed on 10.0.128.77(10.0.128.77:3306). Executed 37 events. Wed Dec 31 13:59:28 2014 - [info] done. Wed Dec 31 13:59:28 2014 - [info] done. Wed Dec 31 13:59:28 2014 - [info] -- Saving binlog from host 10.0.128.252 started, pid: 15893 Wed Dec 31 13:59:29 2014 - [info] Wed Dec 31 13:59:29 2014 - [info] Log messages from 10.0.128.252 ... Wed Dec 31 13:59:29 2014 - [info] Wed Dec 31 13:59:28 2014 - [info] Fetching binary logs from binlog server 10.0.128.252.. Wed Dec 31 13:59:28 2014 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000001 --start_pos=1275770 --output_file=/var/log/masterha/test/saved_binlog_binlog1_20141231135735.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.56 --oldest_version=5.6.17-log --binlog_dir=/data/mysql/data Creating /var/log/masterha/test if not exists.. ok. Concat binary/relay logs from mysql-bin.000001 pos 1275770 to mysql-bin.000001 EOF into /var/log/masterha/test/saved_binlog_binlog1_20141231135735.binlog .. Concat succeeded. Wed Dec 31 13:59:29 2014 - [info] scp from root@10.0.128.252:/var/log/masterha/test/saved_binlog_binlog1_20141231135735.binlog to local:/var/log/masterha/test/saved_binlog_10.0.128.252_binlog1_20141231135735.binlog succeeded. Wed Dec 31 13:59:29 2014 - [info] End of log messages from 10.0.128.252. Wed Dec 31 13:59:29 2014 - [info] Saved mysqlbinlog size from 10.0.128.252 is 4885833 bytes. Wed Dec 31 13:59:29 2014 - [info] Applying differential binlog /var/log/masterha/test/saved_binlog_10.0.128.252_binlog1_20141231135735.binlog .. Wed Dec 31 13:59:39 2014 - [info] Differential log apply from binlog server succeeded. Wed Dec 31 13:59:39 2014 - [info] Getting new master's binlog name and position.. Wed Dec 31 13:59:39 2014 - [info] master-bin.000001:3485577
MHA will scp binlog server binlog and Concat binlog, all slaves are consistent in the end. (same with old behavior)
check all slaves:
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:4311-10003 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-10003 Auto_Position: 1 mysql> select count(*) from liuyang1.test; +----------+ | count(*) | +----------+ | 10000 | +----------+
no data lost. and every slaves got final transaction 10003.
—————————————————————-
add an other binlog server (master server)
[binlog1] hostname=10.0.128.252 [binlog2] hostname=10.0.128.113
add 113 to cluster:
master : 10.0.128.113
slave : 10.0.128.77/110/114
run procedure again:
mysql> delimiter ; mysql> call myprocedure();
10.0.128.77:
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:10004-10327 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-10151, cfd90b74-90af-11e4-b7ab-86f5fb218bef:1 Auto_Position: 1
10.0.128.110:
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:10004-10621 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-10177, cfd90b74-90af-11e4-b7ab-86f5fb218bef:1 Auto_Position: 1
10.0.128.114:
Retrieved_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:10004-12723 Executed_Gtid_Set: 1cc021e4-90b0-11e4-b7ad-00163f008023:1-12722, cfd90b74-90af-11e4-b7ab-86f5fb218bef:1 Auto_Position: 1
now we remove binlog on 10.0.128.252 manually and kill master server :
find these messages:
Wed Dec 31 15:26:16 2014 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000003 --start_pos=1243184 --output_file=/var/log/masterha/test/saved_binlog_binlog1_20141231152512.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.56 --oldest_version=5.6.17-log --binlog_dir=/data/mysql/data Failed to save binary log: Binlog not found from /data/mysql/data! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again. at /usr/bin/save_binary_logs line 123 eval {...} called at /usr/bin/save_binary_logs line 70 main::main() called at /usr/bin/save_binary_logs line 66 Wed Dec 31 15:26:16 2014 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln660] Failed to save binary log events from the binlog server. Maybe disks on binary logs are not accessible or binary log itself is corrupt? Wed Dec 31 15:26:16 2014 - [info] End of log messages from 10.0.128.252. Wed Dec 31 15:26:16 2014 - [warning] Got error from 10.0.128.252. Wed Dec 31 15:26:18 2014 - [info] Wed Dec 31 15:26:18 2014 - [info] Log messages from 10.0.128.113 ... Wed Dec 31 15:26:18 2014 - [info] Wed Dec 31 15:26:16 2014 - [info] Fetching binary logs from binlog server 10.0.128.113.. Wed Dec 31 15:26:16 2014 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000003 --start_pos=1243184 --output_file=/var/log/masterha/test/saved_binlog_binlog2_20141231152512.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.56 --oldest_version=5.6.17-log --binlog_dir=/data/mysql/data Creating /var/log/masterha/test if not exists.. ok.
MHA check binlog server 1 and find error with binlog then try to recover from binlog server 2.
With this feature we can add master server as first binlog server to guarantee no binlog loss.If master server crash and can not been reached,binlog server 2 will also provide service.
Final:
MHA has changed failover procedure when turn on GTID. 1.If you want to use binlog server you must open GTID (I don't know why Yoshi design like this) 2.MHA will not use relay-log to reocver (apply_diff_relay_logs will not be used ,just change master to lastest slave to recover with GTID) 3.if there's no binlog server setting. MHA will ignore master binlog (even Manager node can ssh to Master server) 4.if you do not set binlog server on MHA, data may lost (MHA just keep all slaves consistent,so if lastest slave has lag data will be lost) 5.you can set multiple binlog server ,MHA will check them orderly,you can even set Master server as binlog server.
原文地址:MHA with binlog server, 感谢原作者分享。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

1. Binlog Binlog is used to record write operations (excluding queries) information performed by the database and save it on the disk in binary form. Binlog is the logical log of mysql and is recorded by the server layer. Mysql databases using any storage engine will record binlog logs. Logical log: can be simply understood as a sql statement; physical log: data in MySQL is stored in the data page, and the physical log records changes on the data page; insert the code piece here and the binlog is written by appending Input, you can set the size of each binlog file through the max_binlog_size parameter. When the file size reaches the given value

WindowsServerBackup is a function that comes with the WindowsServer operating system, designed to help users protect important data and system configurations, and provide complete backup and recovery solutions for small, medium and enterprise-level enterprises. Only users running Server2022 and higher can use this feature. In this article, we will explain how to install, uninstall or reset WindowsServerBackup. How to Reset Windows Server Backup If you are experiencing problems with your server backup, the backup is taking too long, or you are unable to access stored files, then you may consider resetting your Windows Server backup settings. To reset Windows

Preface There are six types of log files in MySQL, namely: redo log (redolog), rollback log (undolog), binary log (binlog), error log (errorlog), slow query log (slowquerylog), general query log (generallog) ), relay log (relaylog). 1. What is redolog? Redolog, also known as redo log file, is used to record changes in transaction operations. It records the value after data modification. It will be recorded regardless of whether the transaction is submitted or not. Redolog files can come in handy when instances and media fail (mediafailure), such as database power outage, Inn

On the occasion of releasing the build 26040 version of Windows Server, Microsoft announced the official name of the product: Windows Server 2025. Also launched is the Windows11WindowsInsiderCanaryChannel version build26040. Some friends may still remember that many years ago someone successfully converted Windows NT from workstation mode to server mode, showing the commonalities between various versions of Microsoft operating systems. Although there are clear differences between Microsoft's current version of the server operating system and Windows 11, those who pay attention to the details may be curious: why Windows Server updated the brand,

Regarding the MySQL binary log (binlog), we all know that the binary log (binlog) is very important, especially when you need point-to-point disaster recovery, so we need to back it up. Regarding the backup of binary log (binlog), you can first switch the binlog based on the flushlogs method, and then copy & compress it to other storage on the remote server or local server, such as mounted NAS storage. You can also use mysqlbinlog to backup the binlog. Implement local backup or remote backup of MySQL binary log (binlog). Finally, the MySQL binary log (binlog

1. Introduction to Binlog log Binlog is the abbreviation of Binarylog, that is, binary log. Binlog has three main functions: converting random IO into sequential IO during persistence, master-slave replication and data recovery. This article focuses on issues related to master-slave replication. The Binlog log consists of an index file and many log files. Each log file consists of a magic number and an event. Each log file ends with a Rotate type event. For each event, it can be divided into two parts: event header and event body: The structure of the event header is as follows: The structure of the event body includes two parts: fixed size and variable size. For the format of Binlog log, you can have a simple understanding. Interested students can go deeper.

Many of us have seen this snippet over and over again in Python code: with open('Hi.text', 'w') as f: f.write("Hello, there") However, some of us Some people don't know what with is used for and why we need to use it here. In this read, you'll find out about almost any problem that can be solved with. let's start! First, let's consider what we need to do without using the with keyword. In this case we need to open the file first and try to write. Regardless of success or failure, we'd better be

1. Source of the problem When analyzing performance problems, slow queries and binlog slow transactions are commonly used methods. Recently, I was analyzing a slow query and found that it contained a large number of commit statements that were slow, but the matching could not be completed when analyzing the binlog slow transactions. For example, there may be 1,000 commit statements during this period, but there may be only 100 slow transactions. This is too big a difference, so why does this phenomenon occur? 2. The respective determination methods for slow transactions are usually as follows for an explicitly submitted (insert) transaction: GTID_LOG_EVENT and XID_EVENT are the time when the command ‘COMMIT’ is initiated.
