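Original blog post link: How to rerun root.sh on 10gR2 RAC?
A few days ago a customer's 10.2.0.5 RAC hit an IPC SEND TIMEOUT problem in the LMD process. I wanted to dig deeper into the Oracle RAC LMON, LMD and LMS processes, but found that my own VM RAC would no longer start. On closer inspection, the partitions on one node had disappeared.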
++++Node2
[root@rac2 raw]# ls -ltr /dev/sdf*
brw-r----- 1 root disk 8, 84 Dec 4 2013 /dev/sdf4
brw-r----- 1 root disk 8, 83 Dec 4 2013 /dev/sdf3
brw-r----- 1 root disk 8, 82 Dec 4 2013 /dev/sdf2
brw-r----- 1 root disk 8, 81 Dec 4 2013 /dev/sdf1
brwxrwxr-x 1 oracle oinstall 8, 80 Dec 4 2013 /dev/sdf
[root@rac2 bin]# cat /etc/rc.d/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
chown -R oracle:oinstall /dev/sdf
chown -R oracle:oinstall /dev/sde
chown -R oracle:oinstall /dev/sdb
chown -R oracle:oinstall /dev/sdd
chown -R oracle:oinstall /dev/sdc
chmod -R 775 /dev/sdf
chmod -R 775 /dev/sde
chmod -R 775 /dev/sdb
chmod -R 775 /dev/sdd
chmod -R 775 /dev/sdc
raw /dev/raw/raw1 /dev/sdf1
raw /dev/raw/raw2 /dev/sdf2
raw /dev/raw/raw3 /dev/sdf3
raw /dev/raw/raw4 /dev/sdf4
chown -R oracle:dba /dev/raw
++++Node1
[root@rac1 bin]# partprobe
[root@rac1 bin]# ls -ltr /dev/sdf*
brwxrwxr-x 1 oracle oinstall 8, 80 Jun 29 01:37 /dev/sdf
[root@rac1 bin]#
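Here I had partitioned one of the shared disks and bound the partitions as raw devices. The partitions had disappeared on one node, and after I rebooted node 2 they were no longer visible on either node. I was speechless.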
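A quick way to confirm this kind of problem on each node is to test whether the expected partition device files exist before rc.local binds them as raw devices. The following is only a minimal sketch and not part of the original setup; the function name missing_partitions is made up for illustration, and the partition names in the example are the ones from the rc.local shown earlier.

```shell
#!/bin/sh
# Print every path from the argument list that is not a block device.
# Prints nothing when all expected partitions are present.
missing_partitions() {
    for dev in "$@"; do
        # -b is true only if the path exists and is a block special file
        [ -b "$dev" ] || printf '%s\n' "$dev"
    done
}

# Example call (not executed here):
#   missing_partitions /dev/sdf1 /dev/sdf2 /dev/sdf3 /dev/sdf4
```

If the call prints anything, the raw bindings for those partitions cannot work, which matches the symptom seen on node 1.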
I recalled that Linux has some tools for recovering partitions, so I tried gpart. Here is my attempt:
[root@rac1 repodata]# gpart /dev/sdf
Begin scan...
End scan.
Checking partitions...
Ok.
Guessed primary partition table:
Primary partition(1)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Primary partition(2)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Primary partition(3)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Primary partition(4)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
[root@rac1 repodata]# gpart -W /dev/sdf /dev/sdf
Begin scan...
End scan.
Checking partitions...
Ok.
Guessed primary partition table:
Primary partition(1)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Primary partition(2)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Primary partition(3)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Primary partition(4)
   type: 000(0x00)(unused)
   size: 0mb #s(0) s(0-0)
   chs: (0/0/0)-(0/0/0)d (0/0/0)-(0/0/0)r
Edit this table (y,n) : y
Edit which partition (1..4, q to quit) : q
Activate which partition (1..4, q to quit) : 1
Write this partition table (y,n) : y
* Warning: partition table written, you should reboot now.
[root@rac1 repodata]# ls -ltr /dev/sdf*
brwxrwxr-x 1 oracle oinstall 8, 80 Jun 29 03:57 /dev/sdf
As you can see, gpart does report four partitions, but every one of them is empty. I don't know why; it is very strange.
[root@rac1 ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1           6       48163+  83  Linux
/dev/sda2               7         515     4088542+  83  Linux
/dev/sda3             516         776     2096482+  82  Linux swap / Solaris
/dev/sda4             777        2610    14731605    5  Extended
/dev/sda5             777        2610    14731573+  83  Linux
Disk /dev/sdb: 524 MB, 524288000 bytes
64 heads, 32 sectors/track, 500 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sdc: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sdd: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdd doesn't contain a valid partition table
Disk /dev/sde: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sde doesn't contain a valid partition table
Disk /dev/sdf: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1   *           1           1           0    0  Empty
Partition 1 has different physical/logical beginnings (non-Linux?):
     phys=(0, 0, 0) logical=(0, 0, 1)
Partition 1 has different physical/logical endings:
     phys=(0, 0, 0) logical=(267349, 89, 4)
Partition 1 does not end on cylinder boundary.
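So writing the table back with gpart did not help either, because the data was already gone. Since I had no OCR backup, the only option was to rebuild.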
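Because gpart -W writes its guessed table straight to the disk, it may be worth saving the first sector beforehand so that a bad guess can be undone. The following is a minimal sketch under that assumption, not something done in the original session; the function names and the backup path are hypothetical, and the signature check only looks for the standard 0x55 0xAA bytes at offset 510 of an MBR.

```shell
#!/bin/sh
# Copy the first 512-byte sector (the MBR, which holds the partition table)
# of a disk or image file to a backup file.
backup_first_sector() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=512 count=1 2>/dev/null
}

# Succeed only if bytes 510-511 of the given file are 0x55 0xAA,
# the boot signature that marks a sector as containing an MBR.
has_mbr_signature() {
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Example calls (run as root, not executed here):
#   backup_first_sector /dev/sdf /root/sdf.mbr.bak
#   has_mbr_signature /root/sdf.mbr.bak && echo "signature present"
```

The saved sector could later be written back with dd using the same bs=512 count=1 options, with the backup file as input and the disk as output.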
In 10gR2 there is no need to reinstall; rerunning the root.sh script is enough. So how is it done?
If you simply run root.sh, you will hit something like the following:
[root@rac1 crs]# ./root.sh
WARNING: directory '/home/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/app/oracle/product' is not owned by root
WARNING: directory '/home/oracle/app/oracle' is not owned by root
WARNING: directory '/home/oracle/app' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
No value set for the CRS parameter CRS_OCR_LOCATIONS. Using Values in paramfile.crs
Checking to see if Oracle CRS stack is already configured
Oracle CRS stack is already configured and will be running under init(1M)
[root@rac1 crs]#
##### Clean up some files before running root.sh
1. Remove /etc/oracle/ocr.loc
[root@rac1 crs]# mv /etc/oracle/ocr.loc /etc/oracle/ocr.loc.bak
mv: overwrite `/etc/oracle/ocr.loc.bak'? y
2. Remove the cssfatal file
[root@rac1 crs]# cd /etc/oracle/
[root@rac1 oracle]# pwd
/etc/oracle/scls_scr/rac1/oracle
[root@rac1 oracle]# rm cssfatal
rm: remove regular file `cssfatal'? n
[root@rac1 oracle]# mv cssfatal cssfatal.bak
3. Remove the CRS entries from /etc/inittab
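Step 3 is not shown in the transcript. As a rough sketch of one way to do it, assuming the CRS entries in /etc/inittab are the lines that reference init.evmd, init.cssd and init.crsd (please verify this against your own file), the helper below writes a filtered copy. The function name strip_crs_inittab and the backup file name are hypothetical.

```shell
#!/bin/sh
# Write a copy of an inittab-style file with the CRS daemon entries removed.
# Assumption: the CRS entries are the lines that mention init.evmd,
# init.cssd or init.crsd; all other lines are copied unchanged.
strip_crs_inittab() {
    src=$1
    dst=$2
    awk '!/init\.(evmd|cssd|crsd)/' "$src" > "$dst"
}

# Example calls (run as root, not executed here):
#   cp -p /etc/inittab /etc/inittab.bak
#   strip_crs_inittab /etc/inittab.bak /etc/inittab
```

Working from a backup copy keeps the original entries available for comparison; as the output in the next sections shows, root.sh adds the daemons to inittab again when it runs.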
In the end I ran the root.sh script on each node separately. The process was as follows:
Node 1:
[root@rac1 rac1]# /home/oracle/app/oracle/product/10.2.0/crs/root.sh
WARNING: directory '/home/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/app/oracle/product' is not owned by root
WARNING: directory '/home/oracle/app/oracle' is not owned by root
WARNING: directory '/home/oracle/app' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
No value set for the CRS parameter CRS_OCR_LOCATIONS. Using Values in paramfile.crs
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/home/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/app/oracle/product' is not owned by root
WARNING: directory '/home/oracle/app/oracle' is not owned by root
WARNING: directory '/home/oracle/app' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
   rac1
CSS is inactive on these nodes.
   rac2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
Node 2:
[root@rac2 oracle]# /home/oracle/app/oracle/product/10.2.0/crs/root.sh
WARNING: directory '/home/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/app/oracle/product' is not owned by root
WARNING: directory '/home/oracle/app/oracle' is not owned by root
WARNING: directory '/home/oracle/app' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
No value set for the CRS parameter CRS_OCR_LOCATIONS. Using Values in paramfile.crs
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/home/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/app/oracle/product' is not owned by root
WARNING: directory '/home/oracle/app/oracle' is not owned by root
WARNING: directory '/home/oracle/app' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
   rac1
   rac2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Creating VIP application resource on (2) nodes...
Creating GSD application resource on (2) nodes...
Creating ONS application resource on (2) nodes...
Starting VIP application resource on (2) nodes...
Starting GSD application resource on (2) nodes...
Starting ONS application resource on (2) nodes...
Done.
Finally, we can see that the CRS processes have all started normally:
[root@rac1 oracle]# ps -ef|grep d.bin
oracle 12371 12370 0 04:34 ? 00:00:00 /home/oracle/app/oracle/product/10.2.0/crs/bin/evmd.bin
root 12446 11819 0 04:34 ? 00:00:00 /home/oracle/app/oracle/product/10.2.0/crs/bin/crsd.bin reboot
root 12688 12452 0 04:34 ? 00:00:00 /home/oracle/app/oracle/product/10.2.0/crs/bin/oprocd.bin run -t 1000 -m 500
oracle 12914 12520 0 04:34 ? 00:00:00 /home/oracle/app/oracle/product/10.2.0/crs/bin/ocssd.bin
root 15267 5027 0 04:41 pts/1 00:00:00 grep d.bin
[root@rac1 oracle]# cd /home/oracle/app/oracle/product/10.2.0/crs/bin
[root@rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
   Version : 2
   Total space (kbytes) : 521836
   Used space (kbytes) : 4604
   Available space (kbytes) : 517232
   ID : 559767577
   Device/File Name : /dev/raw/raw1
   Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
[root@rac1 bin]# ./crsctl query css votedisk
   0. 0 /dev/raw/raw4
located 1 votedisk(s).
[root@rac1 bin]#
[root@rac2 bin]# ./crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.rac1.gsd   application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1
ora.rac1.vip   application    0/0    0/0    ONLINE    ONLINE    rac1
ora.rac2.gsd   application    0/5    0/0    ONLINE    ONLINE    rac2
ora.rac2.ons   application    0/3    0/0    ONLINE    ONLINE    rac2
ora.rac2.vip   application    0/0    0/0    ONLINE    ONLINE    rac2
[root@rac2 bin]#
+++Register the database and ASM
[oracle@rac1 bdump]$ srvctl add database -d roger -o /home/oracle/app/oracle/product/10.2.0/db_1
[oracle@rac1 bdump]$ srvctl add instance -d roger -i roger1 -n rac1
[oracle@rac1 bdump]$ srvctl add instance -d roger -i roger2 -n rac2
[oracle@rac1 bdump]$ srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/app/oracle/product/10.2.0/db_1
null
   [PRKS-1030 : Failed to add configuration for ASM instance "+ASM1" on node "rac1" in cluster registry, [PROC-5: User does not have permission to perform a cluster registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]]
   [PROC-5: User does not have permission to perform a cluster registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]]]
[oracle@rac1 bdump]$
[root@rac2 bin]# ./crs_getperm ora.rac1.vip
Name: ora.rac1.vip
owner:root:rwx,pgrp:oinstall:r-x,other::r--,user:oracle:r-x,
[root@rac2 bin]# ./crs_getperm ora.rac2.vip
Name: ora.rac2.vip
owner:root:rwx,pgrp:oinstall:r-x,other::r--,user:oracle:r-x,
[root@rac2 bin]#
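As you can see, the command above failed. At first I assumed the VIP resources were the problem, so I changed their owner; this later turned out to be the wrong step: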
[root@rac2 bin]# ./crs_setperm ora.rac1.vip -o oracle
[root@rac2 bin]# ./crs_setperm ora.rac1.vip -g oinstall
[root@rac2 bin]# ./crs_setperm ora.rac2.vip -o oracle
[root@rac2 bin]# ./crs_setperm ora.rac2.vip -g oinstall
[root@rac2 bin]#
[oracle@rac1 bdump]$ srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/app/oracle/product/10.2.0/db_1
null
   [PRKS-1030 : Failed to add configuration for ASM instance "+ASM1" on node "rac1" in cluster registry, [PROC-5: User does not have permission to perform a cluster registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]]
   [PROC-5: User does not have permission to perform a cluster registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]]]
Since the add still failed, I switched to running it as root instead:
[root@rac1 bin]# ./srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/app/oracle/product/10.2.0/db_1
[root@rac1 bin]# ./srvctl add asm -n rac2 -i +ASM2 -o /home/oracle/app/oracle/product/10.2.0/db_1
[root@rac1 bin]# ./crs_stat -p|grep asm
NAME=ora.rac1.ASM1.asm
NAME=ora.rac2.ASM2.asm
[root@rac1 bin]# ./crs_setperm ora.rac1.ASM1.asm -o oracle
[root@rac1 bin]# ./crs_setperm ora.rac1.ASM2.asm -o oracle
[root@rac1 bin]# ./crs_setperm ora.rac1.ASM1.asm -g oinstall
[root@rac1 bin]# ./crs_setperm ora.rac2.ASM2.asm -g oinstall
[oracle@rac1 bdump]$ crs_start ora.rac1.ASM1.asm
Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
[oracle@rac1 bdump]$ crs_start ora.rac2.ASM2.asm
Attempting to start `ora.rac2.ASM2.asm` on member `rac2`
Start of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
[oracle@rac1 bdump]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    OFFLINE
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    OFFLINE
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    OFFLINE
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    OFFLINE
ora.roger.db   application    ONLINE    ONLINE    rac2
ora....r1.inst application    ONLINE    ONLINE    rac1
ora....r2.inst application    ONLINE    ONLINE    rac2
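In the crs_stat -t output above, several resources have Target ONLINE but State OFFLINE. Picking them out by eye gets tedious on larger clusters, so here is a small filter as an illustration. It is only a sketch: the function name offline_resources is made up, and it assumes the five-column layout printed by crs_stat -t (Name, Type, Target, State, Host), not the wider -t -v layout.

```shell
#!/bin/sh
# Read `crs_stat -t` output on stdin and print the name of every resource
# whose Target is ONLINE but whose State is not ONLINE.
# The header and separator lines are skipped because their third
# whitespace-separated field is never the word ONLINE.
offline_resources() {
    awk '$3 == "ONLINE" && $4 != "ONLINE" { print $1 }'
}

# Example call (not executed here):
#   crs_stat -t | offline_resources
```

The names printed are the abbreviated ones that crs_stat -t shows, for example ora....C1.lsnr.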
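The vip and lsnr resources then simply refused to start. Only after reading the logs did I realize that I had got the VIP resources wrong at the start: the owner of a VIP resource is supposed to be root.
The crsd.log file shows the following: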
2014-06-29 09:04:56.578: [ CRSRES][2719009680]0startRunnable: setting CLI values
2014-06-29 09:04:56.775: [ CRSRES][2708519824]0startRunnable: setting CLI values
2014-06-29 09:04:56.820: [ CRSRES][2687540112]0startRunnable: setting CLI values
2014-06-29 09:04:56.903: [ CRSRES][2719009680]0Attempting to start `ora.rac1.vip` on member `rac1`
2014-06-29 09:04:56.929: [ CRSRES][2708519824]0Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
2014-06-29 09:04:56.951: [ CRSRES][2687540112]0Attempting to start `ora.roger.roger1.inst` on member `rac1`
2014-06-29 09:04:58.798: [ CRSAPP][2719009680]0StartResource error for ora.rac1.vip error code = 1
2014-06-29 09:04:59.579: [ CRSRES][2719009680]0Start of `ora.rac1.vip` on member `rac1` failed.
2014-06-29 09:05:00.007: [ COMMCRS][2644503440]clsc_send_msg: (0x98bede0) NS err (12571, 12560), transport (530, 111, 0)
2014-06-29 09:05:00.007: [ CRSCOMM][2719009680]0CLSC connect failed torac2ret = 9
2014-06-29 09:05:00.008: [ CRSEVT][2719009680]0invokepeer ret 200
2014-06-29 09:05:00.040: [ CRSRES][2719009680]0Remote start never sent to rac2: X_E2E_NotSent : Failed to connect to node: rac2
(File: caa_CmdRTI.cpp, line: 504
2014-06-29 09:05:00.040: [ CRSRES][2719009680][ALERT]0Remote start for `ora.rac1.vip` failed on member `rac2`
2014-06-29 09:05:01.047: [ CRSRES][2719009680]0startRunnable: setting CLI values
2014-06-29 09:05:01.147: [ CRSRES][2719009680]0Attempting to start `ora.rac1.vip` on member `rac1`
2014-06-29 09:05:02.400: [ CRSAPP][2719009680]0StartResource error for ora.rac1.vip error code = 1
2014-06-29 09:05:03.702: [ CRSRES][2719009680]0Start of `ora.rac1.vip` on member `rac1` failed.
2014-06-29 09:05:04.811: [ CRSRES][2613033872]0startRunnable: setting CLI values
2014-06-29 09:05:04.967: [ CRSRES][2613033872]0Attempting to start `ora.rac1.vip` on member `rac1`
2014-06-29 09:05:05.268: [ CRSAPP][2613033872]0StartResource error for ora.rac1.vip error code = 1
2014-06-29 09:05:06.769: [ CRSRES][2613033872]0Start of `ora.rac1.vip` on member `rac1` failed.
2014-06-29 09:05:11.078: [ CRSRES][2613033872]0startRunnable: setting CLI values
2014-06-29 09:05:11.342: [ CRSRES][2613033872]0Attempting to start `ora.rac1.ons` on member `rac1`
2014-06-29 09:05:13.926: [ CRSRES][2613033872]0Start of `ora.rac1.ons` on member `rac1` succeeded.
2014-06-29 09:05:13.966: [ CRSRES][2708519824]0Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
2014-06-29 09:05:45.321: [ CRSRES][2708519824]0CRS-1002: Resource 'ora.rac1.ons' is already running on member 'rac1'
2014-06-29 09:05:46.461: [ CRSRES][2687540112]0Start of `ora.roger.roger1.inst` on member `rac1` succeeded.
2014-06-29 09:05:46.472: [ CRSRES][2698029968]0Skip online resource: ora.rac1.ons
2014-06-29 09:05:49.505: [ CRSRES][2687540112]0startRunnable: setting CLI values
2014-06-29 09:05:49.969: [ CRSRES][2613033872]0startRunnable: setting CLI values
2014-06-29 09:05:50.186: [ CRSRES][2613033872]0Attempting to start `ora.rac1.vip` on member `rac1`
2014-06-29 09:05:50.307: [ CRSRES][2687540112]0Attempting to start `ora.rac1.gsd` on member `rac1`
2014-06-29 09:05:50.788: [ CRSRES][2677050256]0Attempting to start `ora.rac2.vip` on member `rac2`
2014-06-29 09:05:50.906: [ CRSRES][2698029968]0Attempting to start `ora.rac2.gsd` on member `rac2`
2014-06-29 09:05:50.985: [ CRSRES][2719009680]0Attempting to start `ora.rac2.ons` on member `rac2`
2014-06-29 09:05:51.079: [ CRSRES][2708519824]0Attempting to start `ora.roger.db` on member `rac2`
2014-06-29 09:05:51.082: [ CRSAPP][2613033872]0StartResource error for ora.rac1.vip error code = 1
2014-06-29 09:05:51.978: [ CRSRES][2613033872]0Start of `ora.rac1.vip` on member `rac1` failed.
2014-06-29 09:05:52.059: [ CRSRES][2613033872]0rac2 : CRS-1019: Resource ora.rac1.LISTENER_RAC1.lsnr (application) cannot run on rac2
2014-06-29 09:05:53.001: [ CRSRES][2687540112]0Start of `ora.rac1.gsd` on member `rac1` succeeded.
2014-06-29 09:05:54.193: [ CRSRES][2708519824]0Start of `ora.roger.db` on member `rac2` succeeded.
2014-06-29 09:05:54.505: [ CRSRES][2698029968]0Start of `ora.rac2.gsd` on member `rac2` succeeded.
2014-06-29 09:05:54.869: [ CRSRES][2634013584]0CRS-1002: Resource 'ora.roger.db' is already running on member 'rac2'
2014-06-29 09:05:55.054: [ CRSRES][2677050256]0Start of `ora.rac2.vip` on member `rac2` failed.
2014-06-29 09:05:55.226: [ CRSRES][2677050256]0startRunnable: setting CLI values
2014-06-29 09:05:55.277: [ CRSRES][2677050256]0Attempting to start `ora.rac2.vip` on member `rac1`
2014-06-29 09:05:55.585: [ CRSAPP][2677050256]0StartResource error for ora.rac2.vip error code = 1
2014-06-29 09:05:55.714: [ CRSRES][2719009680]0Start of `ora.rac2.ons` on member `rac2` succeeded.
2014-06-29 09:05:55.910: [ CRSRES][2677050256]0Start of `ora.rac2.vip` on member `rac1` failed.
2014-06-29 09:05:56.363: [ CRSRES][2677050256]0Attempting to start `ora.rac2.vip` on member `rac2`
2014-06-29 09:05:57.180: [ CRSRES][2677050256]0Start of `ora.rac2.vip` on member `rac2` failed.
2014-06-29 09:05:57.993: [ CRSRES][2654993296]0startRunnable: setting CLI values
2014-06-29 09:05:58.611: [ CRSAPP][2654993296]0StartResource error for ora.rac1.vip error code = 1
2014-06-29 09:05:59.333: [ CRSRES][2708519824]0startRunnable: setting CLI values
2014-06-29 09:06:00.129: [ CRSAPP][2708519824]0StartResource error for ora.rac2.vip error code = 1
2014-06-29 09:06:06.328: [ CRSRES][2708519824]0startRunnable: setting CLI values
2014-06-29 09:06:06.916: [ CRSAPP][2708519824]0StartResource error for ora.rac1.vip error code = 1
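The log above interleaves many resources, so the repeated VIP failures are easy to lose among the successful starts. The sketch below counts the "Start of ... failed." lines per resource and member. It is only an illustration: the function name failed_starts is hypothetical, and it assumes the backtick-quoted message format seen in this crsd.log excerpt.

```shell
#!/bin/sh
# Count "Start of `<resource>` on member `<node>` failed." lines.
# Input: one or more log files as arguments, or stdin when none are given.
# Output: one line per resource/member pair: "<resource> <node> <count>".
failed_starts() {
    # Splitting on the backtick puts the resource in $2 and the node in $4.
    awk -F'`' '/Start of `.*` on member `.*` failed\./ { n[$2 " " $4]++ }
               END { for (k in n) print k, n[k] }' "$@" | sort
}

# Example call (not executed here; the log path depends on your CRS home):
#   failed_starts crsd.log
```

A resource that fails on every member it is tried on, as the VIPs do here, points at the resource definition itself and not at a single node.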
Finally I changed the owner of the VIP resources back to root, and everything returned to normal:
[root@rac1 bin]# ./crs_setperm ora.rac1.vip -o root
[root@rac1 bin]# ./crs_setperm ora.rac1.vip -g root
[root@rac1 bin]# ./crs_start ora.rac1.vip
Attempting to start `ora.rac1.vip` on member `rac1`
Start of `ora.rac1.vip` on member `rac1` succeeded.
[root@rac1 bin]# ./crs_setperm ora.rac2.vip -o root
[root@rac1 bin]# ./crs_setperm ora.rac2.vip -g root
[root@rac1 bin]# ./crs_start ora.rac2.vip
Attempting to start `ora.rac2.vip` on member `rac2`
Start of `ora.rac2.vip` on member `rac2` succeeded.
[root@rac1 bin]#
[oracle@rac1 racg]$ crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    rac1
ora....C1.lsnr application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1
ora.rac1.vip   application    0/0    0/0    ONLINE    ONLINE    rac1
ora....SM2.asm application    0/5    0/0    ONLINE    ONLINE    rac2
ora....C2.lsnr application    0/5    0/0    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    0/5    0/0    ONLINE    ONLINE    rac2
ora.rac2.ons   application    0/3    0/0    ONLINE    ONLINE    rac2
ora.rac2.vip   application    0/0    0/0    ONLINE    ONLINE    rac2
ora.roger.db   application    0/0    0/1    ONLINE    ONLINE    rac2
ora....r1.inst application    0/5    0/0    ONLINE    ONLINE    rac1
ora....r2.inst application    0/5    0/0    ONLINE    ONLINE    rac2
[oracle@rac1 racg]$
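Since the whole detour came from the VIP owner being changed away from root, a quick owner check before touching permissions might have saved some time. Below is a minimal sketch that extracts the owner from crs_getperm output. The function name resource_owner is hypothetical, and it assumes the "owner:<user>:<perms>,..." line format shown in the crs_getperm output earlier in this post.

```shell
#!/bin/sh
# Read `crs_getperm <resource>` output on stdin and print the owner name.
# The owner line looks like: owner:root:rwx,pgrp:oinstall:r-x,...
# so splitting on ':' and ',' leaves the user name in the second field.
resource_owner() {
    awk -F'[:,]' '/^owner:/ { print $2; exit }'
}

# Example call (not executed here):
#   crs_getperm ora.rac1.vip | resource_owner
```

For the VIP resources in this post the expected answer is root.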
It has been a long time since I last worked on 10gR2 RAC, and I have become a bit rusty. Damn!