
Resolving frequent automatic reboots of the hosts in a two-node RAC

Jun 07, 2016 pm 05:13 PM


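1)         Background:

I recently built a two-node Oracle 10g RAC test platform in VMware and upgraded Oracle RAC from 10.2.0.1 to 10.2.0.5. Afterwards I found that both Linux hosts frequently rebooted on their own.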
 
2)         Platform:
VMware 7 + OEL 5.7 x64 + ASMLib 2.0 + Oracle 10.2.0.5
 
3)         /var/log/messages log:
NODE1:Linux1
Apr 18 20:44:18 Linux1 syslogd 1.4.1: restart.
Apr 18 20:44:18 Linux1 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Apr 18 20:44:18 Linux1 kernel: Initializing cgroup subsys cpuset
Apr 18 20:44:18 Linux1 kernel: Initializing cgroup subsys cpu
Apr 18 20:44:18 Linux1 kernel: Linux version 2.6.32-200.13.1.el5uek (mockbuild@ca-build9.us.oracle.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP Wed Jul 27 21:02:33 EDT 2011
Apr 18 20:44:18 Linux1 kernel: Command line: ro root=/dev/VolGroup00/LogVol00 rhgb quiet
Apr 18 20:44:18 Linux1 kernel: KERNEL supported cpus:
Apr 18 20:44:18 Linux1 kernel:   Intel GenuineIntel
Apr 18 20:44:18 Linux1 kernel:   AMD AuthenticAMD
Apr 18 20:44:18 Linux1 kernel:   Centaur CentaurHauls
Apr 18 20:44:18 Linux1 kernel: BIOS-provided physical RAM map:
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000000dc000 - 00000000000e4000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 0000000000100000 - 00000000bfef0000 (usable)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000bfef0000 - 00000000bfeff000 (ACPI data)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000bfeff000 - 00000000bff00000 (ACPI NVS)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000bff00000 - 00000000c0000000 (usable)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 00000000fffe0000 - 0000000100000000 (reserved)
Apr 18 20:44:18 Linux1 kernel: BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
Apr 18 20:44:18 Linux1 kernel: DMI present.
NODE2:Linux2
Apr 18 20:43:35 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 has been idle for 30.0 seconds, shutting it down.
Apr 18 20:43:35 Linux2 kernel: (swapper,0,0):o2net_idle_timer:1498 here are some times that might help debug the situation: (tmr 1334752985.559806 now 1334753015.306532 dr 1334752985.559360 adv 1334752985.559806:1334752985.559807 func (b651ea27:504) 1334752951.27068:1334752951.27323)
Apr 18 20:43:35 Linux2 kernel: o2net: no longer connected to node Linux1 (num 0) at 192.168.3.131:7777
Apr 18 20:43:56 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 shutdown, state 7
Apr 18 20:44:05 Linux2 kernel: (o2net,3480,0):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
Apr 18 20:44:24 Linux2 avahi-daemon[4341]: Registering new address record for 192.168.0.136 on eth0.
Apr 18 20:44:26 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 shutdown, state 7
Apr 18 20:44:28 Linux2 last message repeated 2 times
Apr 18 20:44:28 Linux2 kernel: (o2hb-9938799A41,3564,1):o2dlm_eviction_cb:267 o2dlm has evicted node 0 from group 9938799A418642218A66FE77029DE473
Apr 18 20:44:28 Linux2 kernel: (ocfs2rec,19793,1):ocfs2_replay_journal:1605 Recovering node 0 from slot 0 on device (8,65)
Apr 18 20:44:30 Linux2 kernel: o2net: connection to node Linux1 (num 0) at 192.168.3.131:7777 shutdown, state 8
Apr 18 20:44:31 Linux2 kernel: (ocfs2rec,19793,0):ocfs2_begin_quota_recovery:407 Beginning quota recovery in slot 0
Apr 18 20:44:31 Linux2 kernel: (ocfs2_wq,3567,1):ocfs2_finish_quota_recovery:598 Finishing quota recovery in slot 0
Apr 18 20:44:31 Linux2 kernel: (dlm_reco_thread,3573,0):dlm_get_lock_resource:836 9938799A418642218A66FE77029DE473:$RECOVERY: at least one node (0) to recover before lock mastery can begin
Apr 18 20:44:31 Linux2 kernel: (dlm_reco_thread,3573,0):dlm_get_lock_resource:870 9938799A418642218A66FE77029DE473: recovery map is not empty, but must master $RECOVERY lock now
Apr 18 20:44:31 Linux2 kernel: (dlm_reco_thread,3573,0):dlm_do_recovery:523 (3573) Node 1 is the Recovery Master for the Dead Node 0 for Domain 9938799A418642218A66FE77029DE473
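The messages above appear on both machines with the roles swapped from one occurrence to the next, so it is not always the same machine that times out on the other.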
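To check which node dropped which peer across many incidents, the o2net idle-timeout lines can be pulled out of each node's messages log. The following is only a minimal sketch: the regular expression is modelled on the single log line quoted above, and the function name `find_idle_timeouts` and the `sample` list are illustrative assumptions, not part of any OCFS2 tooling.

```python
import re

# Pattern modelled on the o2net idle-timeout line quoted in the log above.
# Assumption: other kernel versions may word this message differently.
IDLE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]+)\s+(?P<host>\S+)\s+kernel:\s+o2net:\s+"
    r"connection to node (?P<peer>\S+) \(num \d+\) at [\d.]+:\d+ "
    r"has been idle for (?P<idle>[\d.]+) seconds"
)

def find_idle_timeouts(lines):
    """Return (timestamp, reporting host, idle peer, idle seconds) tuples."""
    events = []
    for line in lines:
        m = IDLE_RE.match(line)
        if m:
            events.append((m["ts"], m["host"], m["peer"], float(m["idle"])))
    return events

# Two lines copied from the NODE2 excerpt; only the first is an idle timeout.
sample = [
    "Apr 18 20:43:35 Linux2 kernel: o2net: connection to node Linux1 (num 0) "
    "at 192.168.3.131:7777 has been idle for 30.0 seconds, shutting it down.",
    "Apr 18 20:43:35 Linux2 kernel: o2net: no longer connected to node Linux1 "
    "(num 0) at 192.168.3.131:7777",
]

for ts, host, peer, idle in find_idle_timeouts(sample):
    print(f"{ts}: {host} dropped {peer} after {idle:.1f}s idle")
```

Running the same function over the messages log of each node and comparing the `host` and `peer` columns would show whether the timeouts really alternate between the two machines.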
 
 
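4)         Judging from the errors in the messages log, the reboots are most likely caused by the o2cb network idle timeout being exceeded. The O2CB service status on the system is: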
[oracle@Linux1]$ service o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 301
 Network idle timeout: 30000                            / the unit here is milliseconds, which is exactly the 30 seconds reported in the messages log
 Network keepalive delay: 2000
 Network reconnect delay: 2000
Checking O2CB heartbeat: Active
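The link between the status output and the log can be checked with a little arithmetic. The sketch below parses the numeric settings from text shaped like the `service o2cb status` output above and converts the network idle timeout from milliseconds to seconds. The function names are hypothetical. The heartbeat conversion, (threshold - 1) * 2 seconds, is the rule as I recall it from the OCFS2 documentation and should be treated as an assumption to verify against your version.

```python
import re

def parse_o2cb_status(text):
    """Collect the 'name: number' and 'name = number' lines into a dict."""
    settings = {}
    for line in text.splitlines():
        m = re.match(r"\s*([A-Za-z ]+?)\s*[:=]\s*(\d+)\s*$", line)
        if m:
            settings[m.group(1).strip().lower()] = int(m.group(2))
    return settings

def idle_timeout_seconds(settings):
    """Network idle timeout is reported in milliseconds."""
    return settings["network idle timeout"] / 1000

def heartbeat_timeout_seconds(settings):
    """Assumed rule: disk heartbeat timeout = (threshold - 1) * 2 seconds."""
    return (settings["heartbeat dead threshold"] - 1) * 2

# Values copied from the status output above (annotation removed).
status_text = """\
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold = 301
 Network idle timeout: 30000
 Network keepalive delay: 2000
 Network reconnect delay: 2000
Checking O2CB heartbeat: Active
"""

settings = parse_o2cb_status(status_text)
print("network idle timeout (s):", idle_timeout_seconds(settings))
print("heartbeat timeout (s):", heartbeat_timeout_seconds(settings))
```

With these values the network idle timeout works out to 30 seconds, matching the "idle for 30.0 seconds" message, while the disk heartbeat timeout under the assumed rule is far longer. That is consistent with the network idle timeout, not the disk heartbeat, being the limit that is hit first.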
