Oracle里Instance Recovery的终点 Posted: August 1, 2012 | Author: Cui Hua | Filed under: Oracle | 4 Comments ? 在这篇文章里,我证明了On Disk RBA不是Instance Recovery的终点,Instance Recovery的终点就是current redo log file的最尾端。 DSI403e
在这篇文章里,我证明了On Disk RBA不是Instance Recovery的终点,Instance Recovery的终点就是current redo log file的最尾端。
DSI403e的第152页明确指出了Low Cache RBA和On Disk RBA各自的含义:
Low Cache RBA: The low RBA of the checkpoint queues indicate where recovery can begin. All buffers with lower RBAs were already written. Note that the buffer, or buffers, at this low RBA may be in the middle of a disk write when this value is constructed.
On Disk RBA: The on disk RBA is the highest RBA that is definitely on disk. Instance recovery must apply redo at least up to here. There are unusual circumstances where the on disk RBA may actually be lower than the low cache RBA. In these circumstances, the dirty buffers could not be written because a log force is needed. If the instance dies without a log force, then the redo at the low cache RBA may not even exist. Process death during redo generation, and buffer invalidation due to offline immediate, may remove the buffer at the on disk RBA from the checkpoint queue.
从中我们可以看出:
1、可能会出现Low Cache RBA的值已被写入control文件,但它所对应的redo record和dirty buffer还没有被写入redo log和数据文件;
2、On Disk RBA只是表示Instance Recovery的时候至少要恢复到On Disk RBA这个点;说白了,它只是真正的current redo log file的最尾端一个前镜像。
3、可能会出现On Disk RBA比Low Cache RBA小的情况,如果Oracle发现存在这种情况,则会强制写redo;
实际上,Instance Recovery的起点是Low Cache RBA和Thread Checkpoint RBA中的最大值,Instance Recovery的终点就是current redo log file的最尾端。
Oracle在做Instance Recovery的时候是受隐含参数_two_pass的控制,其默认为true,这表示要Oracle做Instance Recovery的时候是采用Two Pass Recovery,即要扫描current redo log file两次。
Two Pass Recovery的核心是在做Instance Recovery时要去掉那些已经被写入数据文件的数据块所对应的redo record,Oracle称这些redo record为Block Written Record (BWR)。
BWR所对应的op codes可能是23.1(这个我不确定,是猜的),如下是我从redo中dump的结果:
CHANGE #1 MEDIA RECOVERY MARKER SCN:0x0000.00000000 SEQ:0 OP:23.1 ENC:0
Block Written – afn: 2 rdba: 0x0080f3f4 BFT:(1024,8451060) non-BFT:(2,62452)
scn: 0x0000.13cc668e seq: 0x01 flg:0x06
Block Written – afn: 2 rdba: 0x0080f3f5 BFT:(1024,8451061) non-BFT:(2,62453)
scn: 0x0000.13cc6782 seq: 0x01 flg:0x06
Block Written - afn: 2 rdba: 0x0080f3f6 BFT:(1024,8451062) non-BFT:(2,62454)
scn: 0x0000.13cc6782 seq: 0x01 flg:0x06
好了,言归正传,我们现在来证明On Disk RBA不是Instance Recovery的终点,Instance Recovery的终点就是current redo log file的最尾端。
我们同时开三个session,先在session 1做一些准备工作:
Session 1:
创建一个自己到自己的db link,这是为了规避Oracle对PL/SQL循环中commit的优化过程,确保后续的PL/SQL循环里每一次commit后redo都能被及时写入redo log:
SQL> create public database link CUIHUA112 connect to SCOTT identified by tiger using ‘cuihua112′;
Database link created
验证一下上述自己到自己的db link是否有效:
SQL> select count(*) from dba_objects;
COUNT(*)
———-
71962
SQL> select count(*) from dba_objects@cuihua112;
COUNT(*)
———-
71962
创建一个测试表T:
SQL> create table t(id number);
Table created
创建一个用于测试的存储过程,在PL/SQL循环里每隔1秒钟就向表T插入一条数据,每插入一条就commit一次:
SQL> create or replace procedure p_instance_recovery_demo is
2 i number;
3 begin
4 for i in 1..100 loop
5 insert into t@cuihua112 values (i);
6 commit;
7 dbms_lock.sleep(1);
8 end loop;
9 end p_instance_recovery_demo;
10 /
Procedure created
准备工作做完了,我开始在session 1中执行上述存储过程:
Session 1:
SQL> exec p_instance_recovery_demo;
……这里正在执行
然后我们到session 2中去查询表T的插入数据的情况,并同时确定T1中插入数据的物理存储位置:
Session 2:
SQL> select t.id,dbms_rowid.rowid_relative_fno(rowid)||’_’||dbms_rowid.rowid_block_number(rowid) location from t;
ID LOCATION
———- ——————————————————————————–
1 4_87284
2 4_87284
3 4_87284
4 4_87284
……省略显示部分内容
27 4_87284
28 4_87284
29 4_87284
30 4_87284
30 rows selected
从上述结果中我们可以看到,现在表T1中已被插入了30条记录,这些记录的物理存储位置都是file 4,block 87274。
在session 3中我执行一次thread checkpoint,同时马上紧跟着执行shutdown abort:
Session 3:
SQL> alter system checkpoint;
系统已更改。
SQL> shutdown abort;
ORACLE 例程已经关闭。
回到session 1,我们发现上述存储过程p_instance_recovery_demo已被Oracle中断:
Session 1:
SQL> exec p_instance_recovery_demo;
begin p_instance_recovery_demo; end;
ORA-03113: 通信通道的文件结尾
进程 ID: 5816
会话 ID: 24 序列号: 3
我们现在新开一个command窗口,先用BBED去看一下现在数据文件上的Thread Checkpoint RBA是多少。这里有一点背景知识需要交待,在Oracle 10g以上的版本里,Checkpoint RBA的位置跟Oracle 9i是不一样的,其起始位置是在数据文件头的offset 500处:
BBED> p kcvfhckp
struct kcvfhckp, 160 bytes @484
struct kcvcpscn, 8 bytes @484
ub4 kscnbas @484 0x78799017
ub2 kscnwrp @488 0x079b
ub4 kcvcptim @492 0x2f1875d4
ub2 kcvcpthr @496 0x0001
union u, 12 bytes @500
struct kcvcprba, 12 bytes @500
ub4 kcrbaseq @500 0x00001fb9
ub4 kcrbabno @504 0x00000002
ub2 kcrbabof @508 0x0010
ub1 kcvcpetb[0] @512 0x02
……省略显示部分内容
ub1 kcvcpetb[127] @639 0x00
ub1 kcvcpetb[128] @640 0x00
回到上述command窗口,这里因为是将windows平台上Oracle 9i的BBED用于Oracle 11gR2,所以定位数据块的时候block number要往后错一位:
BBED> set file 4 block 2
FILE# 4
BLOCK# 2
BBED> set offset 500
OFFSET 500
BBED> dump
File: C:\APP\CUIHUA\ORADATA\CUIHUA112\USERS01.DBF (4)
Block: 2 Offsets: 500 to 1011 Dba:0x01000002
————————————————————————
c3010000 0fff0000 10000000 02000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
……省略显示部分内容
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
可以看到现在的Thread Checkpoint RBA是000001c3.0000ff0f.0010。
我们在来看一下表T的数据插入情况:
BBED> set file 4 block 87285
FILE# 4
BLOCK# 87285
BBED> map /v
File: C:\APP\CUIHUA\ORADATA\CUIHUA112\USERS01.DBF (4)
Block: 87285 Dba:0x010154f5
————————————————————
KTB Data Block (Table/Cluster)
struct kcbh, 20 bytes @0
ub1 type_kcbh @0
……省略显示部分内容
ub2 spare3_kcbh @18
struct ktbbh, 72 bytes @20
ub1 ktbbhtyp @20
……省略显示部分内容
struct ktbbhitl[2], 48 bytes @44
struct kdbh, 14 bytes @100
ub1 kdbhflag @100
……省略显示部分内容
b2 kdbhtosp @112
struct kdbt[1], 4 bytes @114
b2 kdbtoffs @114
b2 kdbtnrow @116
sb2 kdbr[49] @118
ub1 freespace[7678] @216
ub1 rowdata[294] @7894
ub4 tailchk @8188
很明显,现在表T在磁盘上只有49条记录。
我们来看一下这些记录:
BBED> p *kdbr[0]
rowdata[288]
————
ub1 rowdata[288] @8182 0x2c
BBED> x /rn
rowdata[288] @8182
————
flag@8182: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@8183: 0x00
cols@8184: 1
col 0[2] @8185: 1
BBED> p *kdbr[1]
rowdata[282]
————
ub1 rowdata[282] @8176 0x2c
BBED> x /rn
rowdata[282] @8176
————
flag@8176: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@8177: 0x00
cols@8178: 1
col 0[2] @8179: 2
BBED> p *kdbr[48]
rowdata[0]
———-
ub1 rowdata[0] @7894 0x2c
BBED> x /rn
rowdata[0] @7894
———-
flag@7894: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@7895: 0x01
cols@7896: 1
col 0[2] @7897: 49
从上述内容我们可以看出,表T现在在磁盘上的第1条记录的id列的值为1,第2条记录的id列的值为2,第49条记录的id列的值为49,这和我们预期的一致。
现在我们将上述shutdown abort后的库启动到mount状态,然后对control文件和current redo log file执行dump操作:
E:\>sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on 星期三 8月 1 10:20:58 2012
Copyright (c) 1982, 2010, Oracle. All rights reserved.
SQL> conn / as sysdba;
已连接到空闲例程。
SQL> startup mount
ORACLE 例程已经启动。
Total System Global Area 640286720 bytes
Fixed Size 1376492 bytes
Variable Size 234884884 bytes
Database Buffers 398458880 bytes
Redo Buffers 5566464 bytes
数据库装载完毕。
SQL> select group# from v$log where status=’CURRENT';
GROUP#
———-
1
SQL> select member from v$logfile where group#=1;
MEMBER
——————————————————————————–
C:\APP\CUIHUA\ORADATA\CUIHUA112\REDO01.LOG
SQL> oradebug setmypid
已处理的语句
SQL> alter session set events ‘immediate trace name controlf level 3′;
会话已更改。
SQL> alter system dump logfile ‘C:\APP\CUIHUA\ORADATA\CUIHUA112\REDO01.LOG';
系统已更改。
SQL> oradebug tracefile_name
c:\app\cuihua\diag\rdbms\cuihua112\cuihua112\trace\cuihua112_ora_4964.trc
我们从上述trace文件c:\app\cuihua\diag\rdbms\cuihua112\cuihua112\trace\cuihua112_ora_4964.trc中就可以看到现在上述库的控制文件和current redo log的内容。
先来看上述trace文件中包含的对控制文件的dump内容里的Low Cache RBA和On Disk RBA:
***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
(size = 8180, compat size = 8180, div max = 11, div in-use = 0,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 2, numrecs = 11)
THREAD #1 – status:0x2 flags:0x0 dirty:1
low cache rba:(0x1c3.ff1a.0) on disk rba:(0x1c3.ff1d.0)
on disk scn: 0x0001.c0184127 08/01/2012 09:56:43
resetlogs scn: 0x0000.000e5bb0 09/18/2011 16:03:45
heartbeat: 790133046 mount id: 10175328
现在的Low Cache RBA是0x1c3.ff1a.0,刚才我们已经用BBED看到Thread Checkpoint RBA是000001c3.0000ff0f.0010,很显然现在Low Cache RBA大于Thread Checkpoint RBA,所以Oracle在做Instance Recovery的时候就会以Low Cache RBA为准。
SQL> select to_number(‘1c3′,’XXXXXXXX’) from dual;
TO_NUMBER(‘1C3′,’XXXXXXXX’)
—————————
451
SQL> select to_number(‘ff1a’,’XXXXXXXX’) from dual;
TO_NUMBER(‘FF1A’,’XXXXXXXX’)
—————————-
65306
从上面的结果里我们可以看出,Low Cache RBA是0x1c3.ff1a.0,转换过来就是Low Cache RBA的logfile sequence是451,logfile block number是65306。
另外,我们可以看到现在控制文件里记录的On Disk RBA是0x1c3.ff1d.0,转换过来就是On Disk RBA的logfile sequence是451,logfile block number是65309:
SQL> select to_number(‘1c3′,’XXXXXXXX’) from dual;
TO_NUMBER(‘1C3′,’XXXXXXXX’)
—————————
451
SQL> select to_number(‘ff1d’,’XXXXXXXX’) from dual;
TO_NUMBER(‘FF1D’,’XXXXXXXX’)
—————————-
65309
我们再来看上述trace文件中包含的对current redo log file的dump内容里的尾端redo record的情况:
首先来看current redo log file的尾端的插入记录:
REDO RECORD – Thread:1 RBA: 0x0001c3.0000ff31.0110 LEN: 0x0140 VLD: 0x01
SCN: 0x0001.c018412d SUBSCN: 2 08/01/2012 09:56:46
CHANGE #1 TYP:0 CLS:25 AFN:3 DBA:0x00c000c0 OBJ:4294967295 SCN:0x0001.c018412d SEQ:1 OP:5.2 ENC:0 RBL:0
ktudh redo: slt: 0x0013 sqn: 0x00000000 flg: 0x0002 siz: 80 fbi: 0
uba: 0x00c019b8.02d7.16 pxid: 0x0000.000.00000000
CHANGE #2 TYP:0 CLS:26 AFN:3 DBA:0x00c019b8 OBJ:4294967295 SCN:0x0001.c018412d SEQ:1 OP:5.1 ENC:0 RBL:0
ktudb redo: siz: 80 spc: 5148 flg: 0x1022 seq: 0x02d7 rec: 0x16
xid: 0x0005.013.000009f4
ktubu redo: slt: 19 rci: 0 opc: 11.1 objn: 82084 objd: 82084 tsn: 4
Undo type: Regular undo Undo type: Last buffer split: No
Tablespace Undo: No
0x00000000
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
compat bit: 4 (post-11) padding: 0
op: L itl: xid: 0x0003.007.0000091e uba: 0x00c000ec.0451.0f
flg: C— lkc: 0 scn: 0x0001.c0184129
KDO Op code: DRP row dependencies Disabled
xtype: XA flags: 0x00000000 bdba: 0x010154f4 hdba: 0x010154f2
itli: 2 ispac: 0 maxfr: 4858
tabn: 0 slot: 51(0x33)
CHANGE #3 TYP:2 CLS:1 AFN:4 DBA:0x010154f4 OBJ:82084 SCN:0x0001.c018412c SEQ:1 OP:11.2 ENC:0 RBL:0
KTB Redo
op: 0x01 ver: 0x01
compat bit: 4 (post-11) padding: 0
op: F xid: 0x0005.013.000009f4 uba: 0x00c019b8.02d7.16
KDO Op code: IRP row dependencies Disabled
xtype: XA flags: 0x00000000 bdba: 0x010154f4 hdba: 0x010154f2
itli: 2 ispac: 0 maxfr: 4858
tabn: 0 slot: 51(0x33) size/delt: 6
fb: –H-FL– lb: 0x2 cc: 1
null: -
col 0: [ 2] c1 35
从上述内容我们可以看到,最后一条对object 82084的插入记录(这里op codes为11.2,表示是insert操作)的第1列(即表T的id列)的插入值是0xc135,它所对应的RBA是0x0001c3.0000ff31.0110:
对象82084就是表T:
SQL> select owner,object_name from dba_objects where object_id=82084;
OWNER OBJECT_NAME
—————————— ——————–
SCOTT T
0xc135就是52:
SQL> select utl_raw.cast_to_number(‘c135′) from dual;
UTL_RAW.CAST_TO_NUMBER(‘C135′)
——————————
52
注意到这里差异就产生了。我们刚才用BBED查看了表T在磁盘上的最后一条记录,其id值是49。但这里对current redo log file的dump清晰的告诉我们,上述表T的最后一条被成功插入的记录的id值是52。也就是说,id为50、51和52的那三条记录还在buffer cache里,还没有被写回到数据文件。
另外我们刚才已经从对控制文件的dump内容看到On Disk RBA的值是0x1c3.ff1d.0,而上述插入id值为52的这条redo record的RBA是0x0001c3.0000ff31.0110,即现在的On Disk RBA小于id值为52的这条redo record所在的RBA。如果Oracle在做Instance Recovery的时候只恢复到On Disk RBA,那么就意味着id为52的这条记录就真的丢掉了,这显然是很扯淡的事情,不可能这样的。
我们接着来看current redo log file尾端的最后一条redo record:
REDO RECORD – Thread:1 RBA: 0x0001c3.0000ff34.0010 LEN: 0x008c VLD: 0x05
SCN: 0x0001.c018412f SUBSCN: 1 08/01/2012 09:56:46
CHANGE #1 TYP:0 CLS:25 AFN:3 DBA:0x00c000c0 OBJ:4294967295 SCN:0x0001.c018412d SEQ:4 OP:5.4 ENC:0 RBL:0
ktucm redo: slt: 0x0013 sqn: 0x000009f4 srt: 0 sta: 9 flg: 0x2 ktucf redo: uba: 0x00c019b8.02d7.18 ext: 2 spc: 4638 fbi: 0
END OF REDO DUMP
从上面的内容我们可以看到,现在current redo log file尾端的最后一条redo record对应的RBA是0x0001c3.0000ff34.0010,翻译过来就是current redo log file尾端的最后一条redo record对应的logfile sequence是451,logfile block number是65332:
SQL> select to_number(‘1c3′,’XXXXXXXX’) from dual;
TO_NUMBER(‘1C3′,’XXXXXXXX’)
—————————
451
SQL> select to_number(‘ff34′,’XXXXXXXX’) from dual;
TO_NUMBER(‘FF34′,’XXXXXXXX’)
—————————-
65332
好了,我们现在回到上述command窗口来把上述数据库open。在open完毕后我们马上紧跟着执行对当前控制文件的dump操作:
SQL> alter database open;
数据库已更改。
SQL> alter session set events ‘immediate trace name controlf level 3′;
会话已更改。
很显然,Oracle在open上述库的过程中做了Instance Recovery,我们现在去看相关的alert log里记录的内容:
alter database open
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Started redo scan
Completed redo scan
read 13 KB redo, 8 data blocks need recovery
Started redo application at
Thread 1: logseq 451, block 65306
Recovery of Online Redo Log: Thread 1 Group 1 Seq 451 Reading mem 0
Mem# 0: C:\APP\CUIHUA\ORADATA\CUIHUA112\REDO01.LOG
Completed redo application of 0.00MB
Completed crash recovery at
Thread 1: logseq 451, block 65333, scn 7517802320
8 data blocks read, 8 data blocks written, 13 redo k-bytes read
从上述alert log我们可以知道,Oracle做Instance Recovery的起点是logseq 451, block 65306;终点是logseq 451, block 65333。
从之前的对控制文件的dump我们可以看到控制文件里记录的Low Cache RBA是0x1c3.ff1a.0,转换过来就是Low Cache RBA的logfile sequence是451,logfile block number是65306。”这和alert log里记录的Instance Recovery的起点一致,即Instance Recovery的起点就是Low Cache RBA和Thread Checkpoint RBA中的最大值。
另外,控制文件里记录的On Disk RBA是0x1c3.ff1d.0,转换过来就是On Disk RBA的logfile sequence是451,logfile block number是65309。显然这个和alert log里记录的Instance Recovery的终点不一致,所以On Disk RBA不是Instance Recovery的终点。
从之前的对current redo log的dump我们可以看到在shutdown abort时current redo log的最后一条redo record所对应的RBA是0x0001c3.0000ff34.0010,转换过来就是current redo log file的最后一条redo record对应的logfile sequence是451,logfile block number是65332,这个再往后错一位就匹配上了alert log里记录的Instance Recovery的终点。即Instance Recovery的终点就是current redo log file的最尾端。
正是因为Instance Recovery的终点是current redo log file的最尾端,这就保证了只要我们成功commit了,那这些commit后的数据在Oracle数据库里就不会丢失。
我们来看一下开库后现在表T的数据情况:
SQL> select count(*) from scott.t;
COUNT(*)
———-
52
SQL> select max(id) from scott.t;
MAX(ID)
———-
52
显然,最后插入的那条id为52的记录没有丢失。
至此,我们已经完成了整个的证明过程。最后我们来看一下我们开库后马上做的那个对控制文件的dump的内容:
***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
(size = 8180, compat size = 8180, div max = 11, div in-use = 0,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 2, numrecs = 11)
THREAD #1 – status:0x2 flags:0x0 dirty:31
low cache rba:(0x1c4.3.0) on disk rba:(0x1c4.5d.0)
on disk scn: 0x0001.c018901b 08/01/2012 10:35:50
resetlogs scn: 0x0000.000e5bb0 09/18/2011 16:03:45
heartbeat: 790133317 mount id: 10175328
从上述内容我们可以看到,Oracle在Open上述库后马上递增了Low Cache RBA和On Disk RBA。
Kamus在讨论这个问题的时候曾提到这样一个观点:他认为Oracle在做Instance Recovery的时候是从Low Cache RBA到On Disk RBA,如果出现了上述这种On Disk RBA滞后于current redo log file的最末端的情况,Oracle会先用current redo log file的最末端去更新On Disk RBA,再做Instance Recovery,这样就依然可以说Oracle在做Instance Recovery的时候是从Low Cache RBA到On Disk RBA了。
本来这个问题不好证明真伪,但他提示我在做Instance Recovery的时候会产生一个trace文件。这个提醒了我,很可惜上文中那包含对控制文件和current redo log file的dump文件被我删掉了,我只好再做一遍。
再如法炮制一遍,我们先来看对shutdown abort后控制文件的dump:
***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
(size = 8180, compat size = 8180, div max = 11, div in-use = 0,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 2, numrecs = 11)
THREAD #1 – status:0x2 flags:0x0 dirty:119
low cache rba:(0x1c4.5357.0) on disk rba:(0x1c4.564a.0)
on disk scn: 0x0001.c018ba6b 08/01/2012 15:34:01
resetlogs scn: 0x0000.000e5bb0 09/18/2011 16:03:45
heartbeat: 790232485 mount id: 10259158
从上述对控制文件的dump我们可以看出,现在的Low Cache RBA是0x1c4.5357.0,即logseq 452, block 21335。On Disk RBA是0x1c4.564a.0,即logseq 452, block 22090:
SQL> select to_number(‘1c4′,’XXXXXXXX’) from dual;
TO_NUMBER(‘1C4′,’XXXXXXXX’)
—————————
452
SQL> select to_number(‘5357′,’XXXXXXXX’) from dual;
TO_NUMBER(‘5357′,’XXXXXXXX’)
—————————-
21335
SQL> select to_number(‘564a’,’XXXXXXXX’) from dual;
TO_NUMBER(‘564A’,’XXXXXXXX’)
—————————-
22090
再来看对current redo log的最尾端的dump内容:
REDO RECORD – Thread:1 RBA: 0x0001c4.00005651.0010 LEN: 0x008c VLD: 0x05
SCN: 0x0001.c018ba70 SUBSCN: 1 08/01/2012 15:34:03
CHANGE #1 TYP:0 CLS:19 AFN:3 DBA:0x00c00090 OBJ:4294967295 SCN:0x0001.c018ba6e SEQ:4 OP:5.4 ENC:0 RBL:0
ktucm redo: slt: 0x000a sqn: 0x0000090b srt: 0 sta: 9 flg: 0x2 ktucf redo: uba: 0x00c01400.0265.2b ext: 3 spc: 2988 fbi: 0
END OF REDO DUMP
从上述对current redo log的dump内容我们可以看出,现在最尾端的redo record所对应的RBA是0x0001c4.00005651.0010,即logseq 452, block 22097:
SQL> select to_number(‘1c4′,’XXXXXXXX’) from dual;
TO_NUMBER(‘1C4′,’XXXXXXXX’)
—————————
452
SQL> select to_number(‘5651′,’XXXXXXXX’) from dual;
TO_NUMBER(‘5651′,’XXXXXXXX’)
—————————-
22097
然后,我们看一下开库时alert log中记录的跟Instance Recovery相关的内容:
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Started redo scan
Completed redo scan
read 381 KB redo, 119 data blocks need recovery
Started redo application at
Thread 1: logseq 452, block 21335
Recovery of Online Redo Log: Thread 1 Group 2 Seq 452 Reading mem 0
Mem# 0: C:\APP\CUIHUA\ORADATA\CUIHUA112\REDO02.LOG
Completed redo application of 0.30MB
Completed crash recovery at
Thread 1: logseq 452, block 22098, scn 7517833361
119 data blocks read, 119 data blocks written, 381 redo k-bytes read
还是和之前的结论一样,当On Disk RBA滞后于current redo log file的最尾端的时候,Oracle做Instance Recovery的终点不是On Disk RBA,而是current redo log file的最尾端。
注意这里Oracle在做Instance Recovery的时候是使用的并行恢复,并且启了两个子进程(parallel recovery started with 2 processes)。
子进程P000需要负责44个block的恢复,它的trace文件内容为如下所示:
KCRP: blocks claimed = 44, eliminated = 0
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 1
Average hash chain = 44/44 = 1.0
Max compares per lookup = 0
Avg compares per lookup = 0/44 = 0.0
———————————————-
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 1
Average hash chain = 44/44 = 1.0
Max compares per lookup = 1
Avg compares per lookup = 1147/1147 = 1.0
子进程P001需要负责75个block的恢复,它的trace文件内容为如下所示:
KCRP: blocks claimed = 75, eliminated = 0
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 1
Average hash chain = 75/75 = 1.0
Max compares per lookup = 0
Avg compares per lookup = 0/75 = 0.0
———————————————-
*** 2012-08-01 15:39:15.328
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 1
Average hash chain = 75/75 = 1.0
Max compares per lookup = 1
Avg compares per lookup = 1341/1341 = 1.0
———————————————-
44+75=119,这刚好就是alert log中记录的Instance Recovery需要恢复的block的总数119。
我们再来看QC的trace文件:
Successfully allocated 2 recovery slaves
*** 2012-08-01 15:39:13.734
Using 66 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 452, block 21334, scn 7517812856
cache-low rba: logseq 452, block 21335
on-disk rba: logseq 452, block 22090, scn 7517813355
start recovery at logseq 452, block 21335, scn 0
*** 2012-08-01 15:39:14.078
Started writing zeroblks thread 1 seq 452 blocks 22098-22105
*** 2012-08-01 15:39:14.078
Completed writing zeroblks thread 1 seq 452
==== Redo read statistics for thread 1 ====
Total physical reads (from disk and memory): 4096Kb
– Redo read_disk statistics –
Read rate (ASYNC): 381Kb in 0.32s => 1.16 Mb/sec
Longest record: 21Kb, moves: 0/665 (0%)
Longest LWN: 23Kb, moves: 0/155 (0%), moved: 0Mb
Last redo scn: 0x0001.c018ba70 (7517813360)
———————————————-
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 1
Average hash chain = 119/119 = 1.0
Max compares per lookup = 1
Avg compares per lookup = 2369/2488 = 1.0
———————————————-
*** 2012-08-01 15:39:14.140
KCRA: start recovery claims for 119 data blocks
*** 2012-08-01 15:39:14.546
KCRA: blocks processed = 119/119, claimed = 119, eliminated = 0
*** 2012-08-01 15:39:14.656
Recovery of Online Redo Log: Thread 1 Group 2 Seq 452 Reading mem 0
*** 2012-08-01 15:39:14.734
Completed redo application of 0.30MB
*** 2012-08-01 15:39:14.781
Completed recovery checkpoint
IR RIA: redo_size 390144 bytes, time_taken 87 ms
—– Recovery Hash Table Statistics ———
Hash table buckets = 32768
Longest hash chain = 1
Average hash chain = 119/119 = 1.0
Max compares per lookup = 1
Avg compares per lookup = 2488/2488 = 1.0
———————————————-
Recovery sets nab of thread 1 seq 452 to 22098 with 8 zeroblks
*** 2012-08-01 15:39:21.156
kwqmnich: current time:: 7: 39: 20: 0
kwqmnich: instance no 0 repartition flag 1
kwqmnich: initialized job cache structure
*** 2012-08-01 15:39:21.906
kwqinfy: Call kwqrNondurSubInstTsk
我们注意到按时间顺序,上述trace文件中有如下内容:
Thread 1 checkpoint: logseq 452, block 21334, scn 7517812856
cache-low rba: logseq 452, block 21335
on-disk rba: logseq 452, block 22090, scn 7517813355
start recovery at logseq 452, block 21335, scn 0
很明显Oracle这里比对了Thread Checkpoint RBA和Low Cache RBA的值,并从中选择了这两个中的最大值作为Instance Recovery的起点。
并且这里Oracle没有去更新On Disk RBA的值,用的还是控制文件中记录的shutdown abort时的值logseq 452, block 22090。
*** 2012-08-01 15:39:14.078
Started writing zeroblks thread 1 seq 452 blocks 22098-22105
*** 2012-08-01 15:39:14.078
Completed writing zeroblks thread 1 seq 452
Oracle这里从current redo log file的最尾端thread 1 seq 452 blocks 22098开始写入了8个空块,这应该为开库后做的redo log切换做准备。
*** 2012-08-01 15:39:14.781
Completed recovery checkpoint
IR RIA: redo_size 390144 bytes, time_taken 87 ms
最后Oracle做完了所有的Instance Recovery工作才用当时的recovery checkpoint去更新控制文件。
最后我们再次总结一下:
Instance Recovery的起点是Low Cache RBA和Thread Checkpoint RBA中的最大值,Instance Recovery的终点就是current redo log file的最尾端,不是On Disk RBA。