Table of Contents
一:问题描述
二:出错原因
三:解决办法
Home Database Mysql Tutorial 从库crash一直自动重启(mysqld got signal 11)问题解决

从库crash一直自动重启(mysqld got signal 11)问题解决

Jun 07, 2016 pm 02:50 PM
crash got mysqld s automatic Restart

一:问题描述 今天收到邮件报警,遂进数据库查看slave状态,发现io进程和sql进程都为NO. mysql show slave status \G;*************************** 1. row*************************** Slave_IO_State: Master_Host: 此处不予显示,哈哈 Master_User: replic

一:问题描述

 

今天收到邮件报警,遂进数据库查看slave状态,发现io进程和sql进程都为NO.

mysql> show slave status \G;
*************************** 1. row***************************
               Slave_IO_State:
                  Master_Host: 此处不予显示,哈哈
                  Master_User: replica
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File:master1-bin.001191
         Read_Master_Log_Pos: 29214749
               Relay_Log_File:web_appdb_10-relay-bin.000663
                Relay_Log_Pos: 29213639
       Relay_Master_Log_File: master1-bin.001191
            Slave_IO_Running: No
           Slave_SQL_Running: No
              Replicate_Do_DB:
         Replicate_Ignore_DB:
          Replicate_Do_Table:
      Replicate_Ignore_Table:
     Replicate_Wild_Do_Table: ccda.%,eip_fileservice.%
 Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
         Exec_Master_Log_Pos: 29213491
              Relay_Log_Space: 29215212
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
          Master_SSL_Allowed: No
          Master_SSL_CA_File:
          Master_SSL_CA_Path:
              Master_SSL_Cert:
           Master_SSL_Cipher:
               Master_SSL_Key:
       Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
 Replicate_Ignore_Server_Ids:
            Master_Server_Id: 0
1 row in set (0.01 sec)
 
ERROR:
No query specified
Copy after login

尝试启动,然后再次查看状态,竟然报错,说连不上数据库。

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
 
mysql> show slave status \G;
ERROR 2006 (HY000): MySQL server has goneaway
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to localMySQL server through socket '/tmp/mysql.sock' (2)
ERROR:
Can't connect to the server
 
ERROR:
No query specified
Copy after login

连续尝试多次,可以登录数据库了,再次查询复制,发现状态还是NO.

 

mysql> show slave status \G;
No connection. Trying to reconnect...
Connection id:    1
Current database: *** NONE ***
 
*************************** 1. row***************************
               Slave_IO_State:
                  Master_Host: 10.0.3.34
                  Master_User: replica
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File:master1-bin.001191
         Read_Master_Log_Pos: 29214749
               Relay_Log_File:web_appdb_10-relay-bin.000663
                Relay_Log_Pos: 29213639
       Relay_Master_Log_File: master1-bin.001191
            Slave_IO_Running: No
           Slave_SQL_Running: No
              Replicate_Do_DB:
         Replicate_Ignore_DB:
          Replicate_Do_Table:
      Replicate_Ignore_Table:
     Replicate_Wild_Do_Table: ccda.%,eip_fileservice.%
 Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                  Last_Error:
                 Skip_Counter: 0
         Exec_Master_Log_Pos: 29213491
              Relay_Log_Space: 29215426
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
          Master_SSL_Allowed: No
          Master_SSL_CA_File:
          Master_SSL_CA_Path:
              Master_SSL_Cert:
           Master_SSL_Cipher:
               Master_SSL_Key:
       Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
               Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
 Replicate_Ignore_Server_Ids:
            Master_Server_Id: 0
1 row in set (0.00 sec)
 
ERROR:
No query specified
Copy after login
 

发现只要start slave,该服务器数据库就会自动重启。

而且start slave io_thread没问题,当start slave sql_thread时,才会导致数据库自动重启。

 

查看错误日志:

160429 9:09:00 [Note] Event Scheduler: Loaded 0 events
160429 9:09:00 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.5.19-log'  socket: '/tmp/mysql.sock'  port: 3306 Source distribution
160429 11:04:47 [Note] Slave SQL threadinitialized, starting replication in log 'master1-bin.001191' at position29213491, relay log './web_appdb_10-relay-bin.000663' position: 29213639
160429 11:04:47 - mysqld got signal 11 ;
This could be because you hit a bug. It isalso possible that this binary
or one of the libraries it was linkedagainst is corrupt, improperly built,
or misconfigured. This error can also becaused by malfunctioning hardware.
We will try our best to scrape up some infothat will hopefully help diagnose
the problem, but since we have alreadycrashed, something is definitely wrong
and this may fail.
 
key_buffer_size=268435456
read_buffer_size=6291456
max_used_connections=3
max_threads=2000
thread_count=2
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size +sort_buffer_size)*max_threads = 20764878 K
bytes of memory
Hope that's ok; if not, decrease somevariables in the equation.
 
Thread pointer: 0x2ab2f1b54740
Attempting backtrace. You can use thefollowing information to find out
where mysqld died. If you see no messagesafter this, something went
terribly wrong...
stack_bottom = 0x594310e8 thread_stack0x30000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x33)[0x765df3]
/usr/local/mysql/bin/mysqld(handle_segfault+0x36e)[0x4ee4fe]
/lib64/libpthread.so.0[0x31a640ebe0]
/usr/local/mysql/bin/mysqld(_ZNK9table_def15compatible_withEP3THDP14Relay_log_infoP5TABLEPS5_+0x31a)[0x74c29a]
/usr/local/mysql/bin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0xcdc)[0x6f0d3c]
/usr/local/mysql/bin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x14d)[0x5021ed]
/usr/local/mysql/bin/mysqld[0x504b19]
/usr/local/mysql/bin/mysqld(handle_slave_sql+0xc0a)[0x5061ea]
/lib64/libpthread.so.0[0x31a640677d]
/lib64/libc.so.6(clone+0x6d)[0x31a54d49ad]
 
Trying to get some variables.
Some pointers may be invalid and cause thedump to abort.
Query ((nil)): is an invalid pointer
Connection ID (thread ID): 353
Status: NOT_KILLED
 
The manual page athttp://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find outwhat is causing the crash.
160429 11:04:48 mysqld_safe Number ofprocesses running now: 0
160429 11:04:48 mysqld_safe mysqldrestarted
160429 11:04:48 InnoDB: The InnoDB memoryheap is disabled
160429 11:04:48 InnoDB: Mutexes andrw_locks use GCC atomic builtins
160429 11:04:48 InnoDB: Compressed tablesuse zlib 1.2.3
160429 11:04:48 InnoDB: Initializing bufferpool, size = 32.0G
160429 11:04:50 InnoDB: Completedinitialization of buffer pool
160429 11:04:50 InnoDB: highest supportedfile format is Barracuda.
InnoDB: The log sequence number in ibdatafiles does not match
InnoDB: the log sequence number in theib_logfiles!
160429 11:04:50  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information fromthe .ibd files...
InnoDB: Restoring possible half-writtendata pages from the doublewrite
InnoDB: buffer...
InnoDB: Last MySQL binlog file position 0112571, file name ./mysql-bin.048292
160429 11:04:52  InnoDB: Waiting for the background threads tostart
160429 11:04:53 InnoDB: 1.1.8 started; logsequence number 5992159806777
160429 11:04:53 [Note] Recovering after acrash using mysql-bin
160429 11:04:53 [Note] Starting crashrecovery...
160429 11:04:53 [Note] Crash recoveryfinished.
160429 11:04:53 [Warning] Neither--relay-log nor --relay-log-index were used; so replication may break when thisMySQL server acts as a slave and has his hostname changed!! Please use'--relay-log=web_appdb_10-relay-bin' to avoid this problem.
160429 11:04:53 [Note] Event Scheduler:Loaded 0 events
160429 11:04:53 [Note]/usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.5.19-log'  socket: '/tmp/mysql.sock'  port: 3306 Source distribution
Copy after login

 

对于这个错误”mysqld got signal 11”,我在网上查了,有的说是磁盘空间满了,有的说是内存问题,也有可能是硬件错误,也有可能是中继日志重放位置的sql导致的。

 

查看中继日志该位置执行的语句:

Relay_Log_File:web_appdb_10-relay-bin.000663

                Relay_Log_Pos: 29213639

 

# at 29213639
#160428 21:29:32 server id 1  end_log_pos 29213559      Query  thread_id=624506       exec_time=0     error_code=0
SET TIMESTAMP=1461850172/*!*/;
/*!\C utf8mb4 *//*!*/;
SET@@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33/*!*/;
BEGIN
/*!*/;
# at 29213707
#160428 21:29:32 server id 1  end_log_pos 29213657      Table_map: `ccda`.`ess_accounting_relation`mapped to number 311993
# at 29213805
#160428 21:29:32 server id 1  end_log_pos 29213757      Table_map:`ccda`.`ess_accounting_relation_1` mapped to number 311994
# at 29213905
#160428 21:29:32 server id 1  end_log_pos 29214051      Update_rows: table id 311993 flags:STMT_END_F
 
BINLOG '
PBAiVxMBAAAAYgAAANnDvQEAALnCBAAAAAEABGNjZGEAF2Vzc19hY2NvdW50aW5nX3JlbGF0aW9u
AA4DDw8PDw8PDw8PDw8MAxYsATwAPAAGACwBPAAsAZYAlgCWAJYA/D8=
PBAiVxMBAAAAZAAAAD3EvQEAALrCBAAAAAEABGNjZGEAGWVzc19hY2NvdW50aW5nX3JlbGF0aW9u
XzEADgMPDw8PDw8PDw8PDwwDFiwBPAA8AAYALAE8ACwBlgCWAJYAlgD8Pw==
PBAiVxgBAAAAJgEAAGPFvQEAALnCBAAAAAEADv////8Q4Dnl7wATADEyMTEwMTAxMTE1MTEwMDE2
MzUS5bqU5LuY5LuY5qy+5Yet6K+BA0VBUwAACeaKpei0puWNlRYAemhhbmd5MTA0NzE1MTExODEz
MTYyMhcxMTExMDExNTExMTg1MDUyMzA2MjE2MQAAAI3GveRVEgAAEOA55e8AEwAxMjExMDEwMTEx
NTExMDAxNjM1EuW6lOS7mOS7mOasvuWHreivgQNFQVMUADExMTEwMTE1MTEwOTEwMzgwMDg5CeaK
pei0puWNlRYAemhhbmd5MTA0NzE1MTExODEzMTYyMhcxMTExMDExNTExMTg1MDUyMzA2MjE2MQAA
AI3GveRVEgAA
'/*!*/;
### UPDATE `ccda`.`ess_accounting_relation`
### WHERE
###  @1=15721785
###  @2='1211010111511001635'
###  @3='应付付款凭证'
###  @4='EAS'
###  @5=NULL
###  @6=''
###  @7='报账单'
###  @8='zhangy1047151118131622'
###  @9='11110115111850523062161'
###  @10=''
###  @11=''
###  @12=''
###  @13=2016-01-19 16:25:09
###  @14=NULL
### SET
###  @1=15721785
###  @2='1211010111511001635'
###  @3='应付付款凭证'
###  @4='EAS'
###  @5=NULL
###  @6='11110115110910380089'
###  @7='报账单'
###  @8='zhangy1047151118131622'
###  @9='11110115111850523062161'
###  @10=''
###  @11=''
###  @12=''
###  @13=2016-01-19 16:25:09
###  @14=NULL
Copy after login

 

先备份一下该记录,然后手动在从库上更新一下,看是否报错。

UPDATE `ccda`.`ess_accounting_relation`

SET attachId='11110115110910380089'

WHERE id = 15721785;

结果发现在从库上也可以正常update呀。

 

后来我想查看下该表表结构,结果出现错误:

mysql> show create table`ccda`.`ess_accounting_relation` \G;

ERROR 144 (HY000): Table'./ccda/ess_accounting_relation_1' is marked as crashed and last (automatic?)repair failed

ERROR:

No query specified

mysql> select count(*) fromccda.ess_accounting_relation_1;

ERROR 144 (HY000): Table'./ccda/ess_accounting_relation_1' is marked as crashed and last (automatic?)repair failed

 

诡异,刚才还能更新呢,现在却又不能正常访问了。

 

然后,尝试修复出问题的表:

check table ccda.ess_accounting_relation_1;

repair table ccda.ess_accounting_relation_1;

 

修复成功后,查看ccda.ess_accounting_relation 表结构,发现该表是个合并表,ess_accounting_relation_1是myisam引擎:

CREATE TABLE `ess_accounting_relation` (

……)

ENGINE=MRG_MyISAM DEFAULT CHARSET=utf8 INSERT_METHOD=LASTUNION=(`ess_accounting_relation_1`)

 

原本以为修复该表成功后,start slave,就正常了。结果还是会导致数据库重启。

此时,再检查ess_accounting_relation_1,也是正常的,没有显示崩溃。

 

我试验,在从库跳过该表的操作(用change master to或者set global sql_slave_skip_counter=n),当执行其他表的操作时,并没有导致从库重启。

 

我试验在从库配置文件里添加参数:replicate_ignore_table=ccda.ess_accounting_relation过滤掉这个表,然后重启数据库,再start slave,没有导致从库重启。

 

最后,我大胆试验下,在主库直接操作ess_accounting_relation_1的某条数据(前提是已经注释掉了上面的参数replicate_ignore_table),发现从库在应用相应数据时,并没有导致重启。

 

所以,问题就出在了这个mrg_myisam存储引擎。

 

这个表,其实每天也都有update,可是为什么最近才出现了这个问题,那就不知道了。因为这个表引用的子表数据量太大了吗?该表大概1600万数据。不晓得。


二:出错原因


一个mrg_myisam存储引擎的合并表,引用了一个myisam引擎的子表,更新前者导致slave数据库一直自动重启,且偶尔子表也会发生崩溃。 

这估计是mysql的一个bug吧。

 

三:解决办法


由于ess_accounting_relation表数据只来源于ccda.ess_accounting_relation_1这一个表,实际上并没有合并的意义,而且,通过了解发现,这个myisam表经常更新。myisam容易崩溃,且不支持行锁,故建议将ccda.ess_accounting_relation_1改成innodb存储引擎(可以先mysqldump备份下这个表,然后在备份文件里将MyISAM改成innodb),删掉ccda.ess_accounting_relation,将ccda.ess_accounting_relation_1重命名为ccda.ess_accounting_relation。

 

 

 --关于mrg_myisam介绍,请参考http://blog.csdn.net/yabingshi_tech/article/details/51320701

 --备注:mysql版本5.5.19

 

 

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
WWE 2K25: How To Unlock Everything In MyRise
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

How to restart Samsung s24Ultra phone? How to restart Samsung s24Ultra phone? Feb 09, 2024 pm 09:54 PM

When using the Samsung S24 Ultra mobile phone, you may occasionally encounter some problems or need to reset the device. In this case, restarting the phone is a common solution. However, it may be confusing if you don't know much about the steps. However, don’t worry, I will show you how to restart your Samsung S24 Ultra phone properly. How to restart the Samsung s24 Ultra 1. Bring up the control menu to shut down: Slide down from the top of the Samsung screen to bring up the shortcut tool menu, click the power icon (a combination of arc and vertical line) to bring up the shutdown and restart selection interface, click Just restart; 2. Use the key combination to shut down: long press the volume-key plus the power key to bring up the shutdown and restart selection menu, click to select shutdown. By pressing and holding

How to solve the computer prompt 'reboot and select proper boot device' How to solve the computer prompt 'reboot and select proper boot device' Jan 15, 2024 pm 02:00 PM

Reinstalling the system may not be a foolproof solution, but after reinstalling, I found that when the computer is turned on, it will display white text on a black background, and then give a prompt: rebootandselectproperbootdevice, what is going on? Such a prompt is usually caused by a boot error. In order to help everyone, the editor has brought you a solution. Computer use is becoming more and more popular, and computer failures are becoming more and more common. No, recently some users encountered a black screen when turning on the computer, and prompted Reboot and Select Proper Boot device, and the computer system could not start normally. What's going on? How to solve it? The user is confused. Next, the editor will follow

How to restart nginx How to restart nginx Jul 27, 2023 pm 05:21 PM

How to restart nginx: 1. Restart Nginx on Linux and use systemd to manage the Nginx service. It will restart Nginx and read any new configuration changes. 2. Restart Nginx on Windows. Nginx will be reloaded and any configuration changes will be applied. , without having to completely stop and start the server; 3. Restart Nginx on your Mac, which will restart Nginx and apply any new configuration changes, etc.

Python script to restart computer Python script to restart computer Sep 08, 2023 pm 05:21 PM

Restarting your computer is a common task that we often perform to troubleshoot problems, install updates, or apply system changes. While there are many ways to restart your computer, using a Python script provides automation and convenience. In this article, we will explore how to create a Python script that can restart your computer with a simple execution. We will first discuss the importance of restarting your computer and the benefits it brings. We will then delve into the implementation details of the Python script, explaining the necessary modules and functionality involved. Throughout this article, we will provide detailed explanations and code snippets to ensure clear understanding. Importance of Restarting Your Computer Restarting your computer is a basic troubleshooting step that can

Ubisoft share falls to 10-year low after disappointing launch of Star Wars: Outlaws Ubisoft share falls to 10-year low after disappointing launch of Star Wars: Outlaws Sep 07, 2024 am 06:30 AM

The shares of the French video game developer Ubisoft reached a record low on Thursday. At a price of around 15.50 euros per share, the value has plummeted by around 10%. As a result, the company's market capitalization fell below 2 billion euros. In

What is the correct way to restart a service in Linux? What is the correct way to restart a service in Linux? Mar 15, 2024 am 09:09 AM

What is the correct way to restart a service in Linux? When using a Linux system, we often encounter situations where we need to restart a certain service, but sometimes we may encounter some problems when restarting the service, such as the service not actually stopping or starting. Therefore, it is very important to master the correct way to restart services. In Linux, you can usually use the systemctl command to manage system services. The systemctl command is part of the systemd system manager

Solve the problem of restarting in a loop after entering the password in win10 Solve the problem of restarting in a loop after entering the password in win10 Dec 29, 2023 pm 09:53 PM

When we inadvertently perform some wrong operations, or there are certain errors in the system itself, we may be unable to enter the desktop after entering the password and keep restarting. At this time we can repair it in safe mode. Let’s take a look at the specific methods below. Win10 cannot enter the desktop after entering a password and keeps restarting. Solution 1. First, press and hold "shift" on the keyboard and click the power button in the lower right corner, then choose to restart the computer until the repair interface appears and then release the "shift" key. 2. If there is no power button in the lower right corner, you can also use the power button of the computer host, but you need to restart it three times or more in a row. 3. After the repair interface appears, we click "View advanced repair options". 4. Select "Troubleshoot". 5

Common system crash and automatic restart problems in Linux systems and their solutions Common system crash and automatic restart problems in Linux systems and their solutions Jun 29, 2023 pm 01:28 PM

As a stable and efficient operating system, Linux system is widely used in servers, embedded devices and other fields. Still, it still suffers from system crashes and automatic reboots. This article discusses common system crash and auto-restart issues and provides solutions. 1. System crash problem Memory problem: In Linux systems, excessive memory usage may cause the system to crash. Solutions include optimizing code, releasing unnecessary memory, increasing physical memory capacity, etc. Software conflicts: In Linux systems, the combination of different software

See all articles