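A RAC database used for archive storage would not start. The production environment is Linux RAC 11.2.0.4. The cause was a disk I/O test tool that damaged the disk headers of both the disk group holding the OCR and the ASM disk holding the data. The recovery process is described below: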
1. Check the CRS status:
[grid@darac1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
[root@darac1 crsd]# ps -ef|grep crs
root 3126 1 1 10:34 ? 00:00:31 /u01/app/product/11.2.0/crs/bin/ohasd.bin reboot
grid 3514 1 0 10:34 ? 00:00:07 /u01/app/product/11.2.0/crs/bin/oraagent.bin
grid 3525 1 0 10:34 ? 00:00:00 /u01/app/product/11.2.0/crs/bin/mdnsd.bin
grid 3537 1 0 10:34 ? 00:00:16 /u01/app/product/11.2.0/crs/bin/gpnpd.bin
grid 3549 1 1 10:34 ? 00:00:33 /u01/app/product/11.2.0/crs/bin/gipcd.bin
root 4128 1 0 10:54 ? 00:00:02 /u01/app/product/11.2.0/crs/bin/cssdmonitor
root 4144 1 0 10:54 ? 00:00:01 /u01/app/product/11.2.0/crs/bin/cssdagent
grid 4167 1 2 10:55 ? 00:00:14 /u01/app/product/11.2.0/crs/bin/ocssd.bin
root 4354 3680 0 11:04 pts/1 00:00:00 grep crs
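The four status lines correspond to different layers of the Grid Infrastructure stack. As a rough illustration only (it is not part of the original procedure), the CRS-nnnn codes can be pulled out of the output to see at a glance which layers are unreachable. The helper name `summarize_crs_check` and the "is online" rule are assumptions made for this sketch; the sample text is copied from the output above.

```python
import re

# Sample text copied from the `crsctl check crs` output in step 1.
SAMPLE = """\
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
"""


def summarize_crs_check(text):
    """Split `crsctl check crs` messages into healthy and failing codes.

    Assumption for this sketch: a message counts as healthy only if it
    ends with "is online"; everything else is treated as a failure.
    """
    healthy, failing = [], []
    for code, message in re.findall(r"(CRS-\d+): (.+)", text):
        if message.strip().endswith("is online"):
            healthy.append(code)
        else:
            failing.append(code)
    return healthy, failing


healthy, failing = summarize_crs_check(SAMPLE)
print("online:", healthy)
print("not reachable:", failing)
```

Here only the High Availability Services layer (ohasd) is up, which matches the `ps` listing: ohasd.bin and its agents are running, but there is no crsd.bin or evmd.bin process.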
2. Force-stop CRS
[root@darac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'darac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'darac1'
CRS-2673: Attempting to stop 'ora.gipcd' on 'darac1'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'darac1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'darac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'darac1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'darac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'darac1'
CRS-2677: Stop of 'ora.gpnpd' on 'darac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'darac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
3. Start CRS in exclusive mode
[root@darac1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'darac1'
CRS-2676: Start of 'ora.mdnsd' on 'darac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'darac1'
CRS-2676: Start of 'ora.gpnpd' on 'darac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'darac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'darac1'
CRS-2676: Start of 'ora.gipcd' on 'darac1' succeeded
CRS-2676: Start of 'ora.cssdmonitor' on 'darac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'darac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'darac1'
CRS-2676: Start of 'ora.diskmon' on 'darac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'darac1' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'darac1'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'darac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'darac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'darac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'darac1'
CRS-2676: Start of 'ora.ctssd' on 'darac1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'darac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'darac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'darac1'
CRS-2676: Start of 'ora.asm' on 'darac1' succeeded
4. Check what the GI alert.log says:
[ohasd(5040)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2016-10-13 11:20:47.302:
[gpnpd(5215)]CRS-2328:GPNPD started on node darac1.
2016-10-13 11:20:58.388:
[ohasd(5040)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2016-10-13 11:21:00.608:
[cssd(5318)]CRS-1713:CSSD daemon is started in clustered mode
2016-10-13 11:21:01.521:
[/u01/app/product/11.2.0/crs/bin/orarootagent.bin(5304)]CRS-5013:Agent "/u01/app/product/11.2.0/crs/bin/orarootagent.bin" failed to start process
"/u01/app/product/11.2.0/crs/bin/osysmond" for action "start": details at "(:CLSN00008:)" in
"/u01/app/product/11.2.0/crs/log/darac1/agent/ohasd/orarootagent_root//orarootagent_root.log"
2016-10-13 11:21:03.585:
[ohasd(5040)]CRS-2878:Failed to restart resource 'ora.crf'
2016-10-13 11:21:05.399:
[/u01/app/product/11.2.0/crs/bin/orarootagent.bin(5340)]CRS-5013:Agent "/u01/app/product/11.2.0/crs/bin/orarootagent.bin" failed to start process
"/u01/app/product/11.2.0/crs/bin/osysmond" for action "start": details at "(:CLSN00008:)" in
"/u01/app/product/11.2.0/crs/log/darac1/agent/ohasd/orarootagent_root//orarootagent_root.log"
2016-10-13 11:21:10.703:
[ohasd(5040)]CRS-2878:Failed to restart resource 'ora.crf'
2016-10-13 11:21:23.464:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:21:38.698:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:21:53.925:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:22:09.463:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:22:24.804:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:22:40.252:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:22:56.722:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:23:12.009:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:23:27.290:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:23:42.872:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:23:58.198:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:24:13.500:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:24:28.786:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:24:43.488:
[client(5394)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/product/11.2.0/crs/log/darac1/client/ocrcheck_5394.log.
2016-10-13 11:24:43.959:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:24:51.823:
[client(5424)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/product/11.2.0/crs/log/darac1/client/crsctl_grid.log.
2016-10-13 11:24:59.345:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:25:14.526:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:25:29.696:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:25:44.860:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:26:00.042:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:26:15.218:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:26:30.409:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:26:45.577:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:26:49.031:
[client(5460)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /u01/app/product/11.2.0/crs/log/darac1/client/ocrconfig_5460.log.
2016-10-13 11:27:00.766:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:27:15.951:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:27:31.142:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:27:46.339:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:28:01.530:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:28:16.733:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:28:32.008:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:28:47.191:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:29:02.389:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:29:17.610:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:29:32.832:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:29:48.035:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:30:03.229:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:30:18.434:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:30:33.679:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:30:48.876:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:31:01.534:
[/u01/app/product/11.2.0/crs/bin/cssdagent(5284)]CRS-5818:Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:0:2} in
/u01/app/product/11.2.0/crs/log/darac1/agent/ohasd/oracssdagent_root//oracssdagent_root.log.
2016-10-13 11:31:01.540:
[cssd(5318)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:31:01.541:
[cssd(5318)]CRS-1603:CSSD on node darac1 shutdown by user.
The messages above show that CSSD cannot find any voting files.
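With a log this repetitive it can help to reduce the noise to "how many times, from when to when". The following is a minimal sketch, assuming the alert.log layout seen above (a timestamp line followed by a `[daemon(pid)]CRS-nnnn:` message line). The function name `first_and_last` and the shortened sample are hypothetical; the sample lines themselves are copied from the excerpt.

```python
import re
from datetime import datetime

# A shortened sample in the same shape as the GI alert.log excerpt above.
SAMPLE_LOG = """\
2016-10-13 11:21:23.464:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:21:38.698:
[cssd(5318)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in
/u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
2016-10-13 11:31:01.540:
[cssd(5318)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/product/11.2.0/crs/log/darac1/cssd/ocssd.log
"""

TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+:\s*$")
CODE = re.compile(r"\](CRS-\d+):")


def first_and_last(log_text, code):
    """Return (count, first_timestamp, last_timestamp) for one CRS code."""
    stamps = []
    current = None
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            # Remember the timestamp; the message follows on the next line.
            current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            continue
        match = CODE.search(line)
        if match and match.group(1) == code and current is not None:
            stamps.append(current)
    if not stamps:
        return 0, None, None
    return len(stamps), stamps[0], stamps[-1]


count, first, last = first_and_last(SAMPLE_LOG, "CRS-1714")
print(count, first, last)
```

Run against the full log, this kind of summary makes the pattern obvious: CSSD retries voting file discovery every 15 seconds until the agent aborts the start with CRS-5818 and CSSD terminates with CRS-1656.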
5. The ASM alert.log still contains the statements that originally created the CRSDG and DATADG disk groups. The one for CRSDG:
Wed Dec 02 16:09:01 2015
SQL> CREATE DISKGROUP CRSDG EXTERNAL REDUNDANCY DISK '/dev/raw/raw1' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
6. Check the disk header
[grid@darac1 ~]$ kfed read /dev/raw/raw1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 300392945 ; 0x00c: 0x11e7a1f1
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
B7F46200 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
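7. Try to repair the CRSDG disk header with kfed. The repair fails because the backup copy of the header was damaged as well, and no manual backup had been taken.
[grid@darac1 ~]$ kfed repair /dev/raw/raw1
KFED-00320: Invalid block num1 = [0], num2 = [1], error = [endian_kfbh]
Since the disk header could not be restored from its automatic backup copy, the only remaining option was to recover from the automatic OCR backup, as follows.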
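As background for why `kfed repair` can work at all: since 11.1.0.7, ASM keeps a backup copy of each disk header in the second-to-last metadata block of allocation unit 1, and `kfed repair` copies that block back over block 0. The small calculation below is a sketch of where that copy sits, assuming the default 4096-byte metadata block size (an assumption, it is not stated in this post) and the 1M AU size from the CREATE DISKGROUP statement in step 5. The function name `header_backup_block` is hypothetical.

```python
def header_backup_block(au_size_bytes, block_size_bytes=4096):
    """Return (aun, blkn) of the ASM disk header backup copy.

    Assumes the layout used since 11.1.0.7: the copy is the
    second-to-last metadata block of allocation unit 1. The 4096-byte
    metadata block size is the default and an assumption here.
    """
    if au_size_bytes % block_size_bytes:
        raise ValueError("AU size must be a multiple of the block size")
    blocks_per_au = au_size_bytes // block_size_bytes
    return 1, blocks_per_au - 2


# au_size = 1M, as recorded in the ASM alert.log for CRSDG.
aun, blkn = header_backup_block(1024 * 1024)
print(f"kfed read /dev/raw/raw1 aun={aun} blkn={blkn}")
```

Reading that block with kfed before attempting a repair would show whether the backup copy is still a valid KFBTYP_DISKHEAD block. On /dev/raw/raw1 it evidently was not, which is consistent with the KFED-00320 error above.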
8. Create the disk group
[grid@darac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 13 13:00:42 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> select * from v$asm_diskgroup;
no rows selected
SQL> create diskgroup CRSDG external redundancy disk '/dev/raw/raw1' attribute 'COMPATIBLE.ASM' = '11.2.0.0.0';
Diskgroup created.
9. List the automatic OCR backups
[root@darac1 bin]# ./ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
darac2 2016/10/13 06:29:53 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
darac2 2016/10/13 02:29:45 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup01.ocr
darac2 2016/10/12 22:29:37 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup02.ocr
darac2 2016/10/12 02:27:20 /u01/app/product/11.2.0/crs/cdata/darac-cluster/day.ocr
darac2 2016/10/11 22:27:10 /u01/app/product/11.2.0/crs/cdata/darac-cluster/week.ocr
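The restore command needs an explicit backup file name, so the listing has to be read for the most recent entry. As an illustration only, the listing can be parsed like this; the helper names `parse_backups` and `newest_backup` are hypothetical, and the sample rows are copied from the output above.

```python
import re
from datetime import datetime

# Rows copied from the `ocrconfig -showbackup` output in step 9.
SAMPLE = """\
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
darac2 2016/10/13 06:29:53 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
darac2 2016/10/13 02:29:45 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup01.ocr
darac2 2016/10/12 22:29:37 /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup02.ocr
darac2 2016/10/12 02:27:20 /u01/app/product/11.2.0/crs/cdata/darac-cluster/day.ocr
darac2 2016/10/11 22:27:10 /u01/app/product/11.2.0/crs/cdata/darac-cluster/week.ocr
"""

# node, timestamp, path; lines that do not match (such as PROT-26) are skipped.
ROW = re.compile(r"^(\S+)\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+(\S+\.ocr)\s*$")


def parse_backups(text):
    """Return a list of (timestamp, node, path) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            node, stamp, path = match.groups()
            rows.append((datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"), node, path))
    return rows


def newest_backup(text):
    """Return (node, path) of the most recent backup, or None if none parse."""
    rows = parse_backups(text)
    if not rows:
        return None
    _, node, path = max(rows, key=lambda row: row[0])
    return node, path


print(newest_backup(SAMPLE))
```

Note that the first column is the node that wrote the backup. Here every entry lists darac2, so the chosen file has to be readable from the node where the restore is run.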
10. Restore the OCR
[root@darac1 bin]# ./ocrconfig -restore /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
11. Recreate the voting disk
[root@darac1 bin]# ./ocrconfig -restore /u01/app/product/11.2.0/crs/cdata/darac-cluster/backup00.ocr
[root@darac1 bin]# ./crsctl replace votedisk +CRSDG
Successful addition of voting disk 44eaf86504ea4f76bfb43cb7931a3fc7.
Successfully replaced voting disk group with +CRSDG.
CRS-4266: Voting file(s) successfully replaced
12. Create the ASM spfile
[grid@darac1 ~]$ vi /tmp/asm.txt
instance_type='asm'
large_pool_size=12M
remote_login_passwordfile= 'EXCLUSIVE'
asm_diskstring = '/dev/raw/raw*'
asm_power_limit =1
[grid@darac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 13 13:40:02 2016
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create spfile='+CRSDG' FROM pfile='/tmp/asm.txt';
File created.
13. Restart CRS
[root@darac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'darac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'darac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'darac1'
CRS-2673: Attempting to stop 'ora.asm' on 'darac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'darac1'
CRS-2677: Stop of 'ora.ctssd' on 'darac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'darac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'darac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'darac1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'darac1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'darac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'darac1'
CRS-2677: Stop of 'ora.cssd' on 'darac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'darac1'
CRS-2677: Stop of 'ora.gipcd' on 'darac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'darac1'
CRS-2677: Stop of 'ora.gpnpd' on 'darac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'darac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@darac1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[grid@darac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRSDG.dg
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.DATADG.dg
ONLINE OFFLINE darac1
ONLINE OFFLINE darac2
ora.LISTENER.lsnr
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.asm
ONLINE ONLINE darac1 Started
ONLINE ONLINE darac2 Started
ora.gsd
OFFLINE OFFLINE darac1
OFFLINE OFFLINE darac2
ora.net1.network
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.ons
ONLINE ONLINE darac1
ONLINE OFFLINE darac2
ora.registry.acfs
ONLINE ONLINE darac1
ONLINE ONLINE darac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE darac1
ora.cvu
1 ONLINE ONLINE darac1
ora.darac.db
1 ONLINE OFFLINE Corrupted Controlfi
le
2 ONLINE OFFLINE Corrupted Controlfi
le
ora.darac1.vip
1 ONLINE ONLINE darac1
ora.darac2.vip
1 ONLINE ONLINE darac2
ora.darac3.vip
1 ONLINE OFFLINE
ora.oc4j
1 ONLINE OFFLINE STARTING
ora.scan1.vip
1 ONLINE ONLINE darac1
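The output above shows that the DATADG disk group is not mounted and that the darac database has not started; its state detail reports a corrupted control file. alert_asm1.log contains the entry from when the disk group was originally created:
Wed Dec 02 18:27:46 2015
SQL> CREATE DISKGROUP DATADG EXTERNAL REDUNDANCY DISK '/dev/raw/raw3' SIZE 10240M ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */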
14. Check the disk group status
SQL> select name,state from v$asm_diskgroup;
NAME STATE
-------------------------------------------------- ----------------------
CRSDG MOUNTED
ARCH MOUNTED
15. Mounting the DATADG disk group manually fails
SQL> alter diskgroup DATADG mount;
alter diskgroup DATADG mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATADG" cannot be mounted
ORA-15040: diskgroup is incomplete
16. Check the header status of the disks; /dev/raw/raw3 shows up as CANDIDATE
SQL> select name,path,header_status from v$asm_disk;
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
/dev/raw/raw3 CANDIDATE
ARCH_0000 /dev/raw/raw2 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
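A disk that used to belong to a disk group but now reports anything other than MEMBER is the one to investigate. The following is a minimal sketch of picking such disks out of the query output, assuming whitespace-separated rows as shown above, where a row with only two fields is a disk with an empty NAME column. The function name `non_member_disks` is hypothetical.

```python
# Rows copied from the v$asm_disk query output in step 16.
SAMPLE = """\
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
/dev/raw/raw3 CANDIDATE
ARCH_0000 /dev/raw/raw2 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
"""


def non_member_disks(text):
    """Return [(path, header_status)] for disks whose header is not MEMBER.

    Assumption for this sketch: columns are whitespace-separated and
    contain no embedded spaces; two fields means NAME is empty.
    """
    result = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the column header and the dashed separator.
        if not fields or fields[0] == "NAME" or set(fields[0]) == {"-"}:
            continue
        if len(fields) == 2:
            path, status = fields
        elif len(fields) == 3:
            _, path, status = fields
        else:
            continue
        if status != "MEMBER":
            result.append((path, status))
    return result


print(non_member_disks(SAMPLE))
```

For the output above this singles out /dev/raw/raw3, the disk named in the original CREATE DISKGROUP DATADG statement.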
17. Try to repair the disk header from its automatic backup copy. For the DATADG disk, the repair succeeds.
[grid@darac1 ~]$ kfed repair /dev/raw/raw3
SQL> select name,state from v$asm_diskgroup;
NAME STATE
-------------------------------------------------- ----------------------
CRSDG MOUNTED
DATADG DISMOUNTED
ARCH MOUNTED
SQL> select name,path,header_status from v$asm_disk;
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
/dev/raw/raw3 MEMBER
ARCH_0000 /dev/raw/raw2 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
18. Mount the DATADG disk group manually; this time it succeeds
SQL> alter diskgroup DATADG mount;
Diskgroup altered.
SQL> select name,state from v$asm_diskgroup;
NAME STATE
-------------------------------------------------- ----------------------
CRSDG MOUNTED
DATADG MOUNTED
ARCH MOUNTED
19. Check the header status of the disks; /dev/raw/raw3 now shows up as MEMBER
SQL> select name,path,header_status from v$asm_disk;
NAME PATH HEADER_STATUS
-------------------------------------------------- -------------------------------------------------- ------------------------------
ARCH_0000 /dev/raw/raw2 MEMBER
DATADG_0000 /dev/raw/raw3 MEMBER
CRSDG_0000 /dev/raw/raw1 MEMBER
20. Start the darac database
[grid@darac1 ~]$ srvctl start database -d darac
[grid@darac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.CRSDG.dg
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.DATADG.dg
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.LISTENER.lsnr
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.asm
ONLINE ONLINE darac1 Started
ONLINE ONLINE darac2 Started
ora.gsd
OFFLINE OFFLINE darac1
OFFLINE OFFLINE darac2
ora.net1.network
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.ons
ONLINE ONLINE darac1
ONLINE ONLINE darac2
ora.registry.acfs
ONLINE ONLINE darac1
ONLINE ONLINE darac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE darac1
ora.cvu
1 ONLINE ONLINE darac1
ora.darac.db
1 ONLINE ONLINE darac1 Open
2 ONLINE ONLINE darac2 Open
ora.darac1.vip
1 ONLINE ONLINE darac1
ora.darac2.vip
1 ONLINE ONLINE darac2
ora.darac3.vip
1 ONLINE OFFLINE
ora.oc4j
1 ONLINE ONLINE darac1
ora.scan1.vip
1 ONLINE ONLINE darac1
At this point the database has been recovered successfully.
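1 In step 13, DATADG is recreated. Doesn't that risk wiping out the business data that was already on the disk group?
2 In step 8, after recreating the CRSDG disk group, couldn't you simply run kfed repair /dev/raw/raw1? In other words, what is the point of step 9?
Or is it that kfed can only be used while the CRS disk group is healthy, and when the CRS disk group has failed you have to use this command instead:
./crsctl replace votedisk +CRSDG
3 In step 7, kfed repair /dev/raw/raw1 fails, yet in step 17 the automatically backed-up disk header can be used for the repair. Why?
In step 7, kfed repair /dev/raw/raw1 fails, yet in step 17 the automatically backed-up disk header can be used for the repair. Why?
Because for the DATADG disk group repaired in step 17, the I/O test tool had not destroyed the automatic backup copy of the disk header on /dev/raw/raw3.
In step 8, after recreating the CRSDG disk group, couldn't you simply run kfed repair /dev/raw/raw1? In other words, what is the point of step 9?
Or is it that kfed can only be used while the CRS disk group is healthy, and when the CRS disk group has failed you have to use this command instead:
./crsctl replace votedisk +CRSDG
The point of step 9 is to list the automatic OCR backups. A backup is taken every four hours, and the backup file name has to be specified for the restore.
Once the CRSDG disk group has been recreated, the OCR can only be restored from a backup OCR file.
As for step 13, what is shown there is an entry from the ASM log file. I looked it up because the disk group could not be mounted; it is the ASM instance's log record of the disk group being created, not a recovery operation that I performed.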