11g RAC with Multipath and ASMLib: asm_open error "Operation not permitted"

On a production database (Oracle Linux, 11.2.0.4 RAC), node 1 failed to start normally after a reboot.

[root@test1 ~]# su - grid
[grid@test1 ~]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

The healthy node looks like this:

[grid@test2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       test2
ora.DATA.dg
               ONLINE  ONLINE       test2
ora.LISTENER.lsnr
               ONLINE  ONLINE       test2
ora.OCR.dg
               ONLINE  ONLINE       test2
ora.asm
               ONLINE  ONLINE       test2                    Started
ora.gsd
               OFFLINE OFFLINE      test2
ora.net1.network
               ONLINE  ONLINE       test2
ora.ons
               ONLINE  ONLINE       test2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       test2
ora.cvu
      1        ONLINE  ONLINE       test2
ora.dgdb1.vip
      1        ONLINE  INTERMEDIATE test2                    FAILED OVER
ora.dgdb2.vip
      1        ONLINE  ONLINE       test2
ora.oc4j
      1        ONLINE  ONLINE       test2
ora.test.db
      1        ONLINE  OFFLINE
      2        ONLINE  ONLINE       test2                    Open
ora.scan1.vip
      1        ONLINE  ONLINE       test2

[grid@test1 grid]$ crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Check the CSS service status. The connection fails.

[grid@test1 grid]$ crsctl check css
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

Check the cssd process. Only cssdmonitor is running; the CSS daemon itself has not started.

[grid@test1 grid]$ ps -ef |grep cssd
root      22124      1  0 19:37 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdmonitor
grid      22496  15743  0 19:40 pts/3    00:00:00 grep cssd


[grid@test1 grid]$ crs_stat -p ora.cssd
CRS-0184: Cannot communicate with the CRS daemon.
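Because `grep cssd` also matches cssdmonitor and the grep command itself, the process check above is easy to misread. A minimal sketch of a stricter check follows; it assumes the 11.2 CSS daemon appears in the process list as `ocssd.bin`, and the function name `ocssd_running` is a hypothetical helper, not part of Grid Infrastructure.

```shell
#!/bin/sh
# Reads a process listing on stdin and returns 0 only if the CSS daemon
# binary (assumed to be named ocssd.bin) is present.
# cssdmonitor lines do not match this pattern.
ocssd_running() {
    grep -q '/ocssd\.bin'
}

# Example use on a live node:
#   ps -ef | ocssd_running && echo "ocssd.bin is up" || echo "ocssd.bin is down"
```

Fed the listing shown above, this reports the daemon as down, which agrees with the CRS-4530 message.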

Check ocssd.log:

[root@dgdb1 grid]# tail -f /u01/app/11.2.0/grid/log/test1/cssd/ocssd.log

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR1:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR2:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR3:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:TEST_ARCH1:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:TEST_DATA1:

2016-11-21 16:51:34.870: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:TEST_DATA2:

2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted

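The errors above show that ASMLib failed in asm_open with "Operation not permitted". The fix is to tell ASMLib which disks to check and which to ignore during disk discovery. Our environment uses Multipath to manage several disks over multiple paths, so the scan must include devices whose names begin with dm and exclude those beginning with sd. This problem should only occur on disks managed by Multipath. Edit /etc/sysconfig/oracleasm: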

[root@test bin]# vi /etc/sysconfig/oracleasm
#
# This is a configuration file for automatic loading of the Oracle
# Automatic Storage Management library kernel driver.  It is generated
# By running /etc/init.d/oracleasm configure.  Please use that method
# to modify this file
#

# ORACLEASM_ENABLED: 'true' means to load the driver on boot.
ORACLEASM_ENABLED=true

# ORACLEASM_UID: Default user owning the /dev/oracleasm mount point.
ORACLEASM_UID=grid

# ORACLEASM_GID: Default group owning the /dev/oracleasm mount point.
ORACLEASM_GID=asmadmin

# ORACLEASM_SCANBOOT: 'true' means scan for ASM disks on boot.
ORACLEASM_SCANBOOT=true

# ORACLEASM_SCANORDER: Matching patterns to order disk scanning
ORACLEASM_SCANORDER="dm"    # name pattern of the disks to scan

# ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
ORACLEASM_SCANEXCLUDE="sd"  # name pattern of the disks to exclude from the scan

# ORACLEASM_USE_LOGICAL_BLOCK_SIZE: 'true' means use the logical block size
# reported by the underlying disk instead of the physical. The default
# is 'false'
ORACLEASM_USE_LOGICAL_BLOCK_SIZE=false
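After editing the file, it can help to confirm that the two scan settings really hold the intended values. The following is a minimal sketch, not an Oracle-supplied tool: the helper names `asm_setting` and `check_multipath_scan` are hypothetical, and the check assumes the plain `KEY="value"` layout shown above.

```shell
#!/bin/sh
# Print the value of one KEY=value setting from an oracleasm config file.
# Surrounding double quotes are stripped; commented-out lines are ignored.
# $1 = config file, $2 = key name
asm_setting() {
    sed -n "s/^$2=\"\{0,1\}\([^\"]*\)\"\{0,1\}[[:space:]]*\$/\1/p" "$1" | tail -n 1
}

# Succeeds only if the scan includes dm devices and excludes sd devices.
check_multipath_scan() {
    [ "$(asm_setting "$1" ORACLEASM_SCANORDER)" = "dm" ] &&
    [ "$(asm_setting "$1" ORACLEASM_SCANEXCLUDE)" = "sd" ]
}

# Example use:
#   check_multipath_scan /etc/sysconfig/oracleasm && echo "scan settings OK"
```

The pattern is deliberately strict: a line with trailing text other than whitespace after the value does not match, so the check fails rather than silently accepting a malformed entry.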

Reload the ASMLib driver and remount its filesystem:

[root@test1 bin]# oracleasm exit
Unmounting ASMlib driver filesystem: /dev/oracleasm
Unloading module "oracleasm": oracleasm
[root@test1 bin]# oracleasm init
Loading module "oracleasm": oracleasm
Configuring "oracleasm" to use device physical block size
Mounting ASMlib driver filesystem: /dev/oracleasm

Scan the disks:

[root@test1 ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [  OK  ]
[root@test1 ~]# oracleasm listdisks
OCR1
OCR2
OCR3
TEST_ARCH1
TEST_DATA1
TEST_DATA2
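Listing the labels does not show which block device each label was bound to. One way to check is to compare the major number reported by `oracleasm querydisk -d` with the device-mapper major in /proc/devices. The sketch below assumes querydisk prints a line ending in `[major,minor]`; the helper names `asm_disk_major` and `dm_major` are hypothetical.

```shell
#!/bin/sh
# Extract the major number from a line such as
#   Disk "OCR1" is a valid ASM disk on device [253,5]
# (assumed output format of `oracleasm querydisk -d`).
asm_disk_major() {
    sed -n 's/.*\[\([0-9][0-9]*\),[[:space:]]*[0-9][0-9]*\].*/\1/p'
}

# Look up the device-mapper major number in a /proc/devices style listing.
# $1 = file to read (defaults to /proc/devices)
dm_major() {
    awk '$2 == "device-mapper" { print $1 }' "${1:-/proc/devices}"
}

# Example use on a live node (run as root):
#   for d in $(oracleasm listdisks); do
#       m=$(oracleasm querydisk -d "$d" | asm_disk_major)
#       [ "$m" = "$(dm_major)" ] || echo "$d is not on a multipath device"
#   done
```

If every label reports the device-mapper major, the scan picked up the dm devices rather than the underlying sd paths.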

Stop CRS:

[root@test bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'dgdb1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'dgdb1'
CRS-2673: Attempting to stop 'ora.crf' on 'dgdb1'
CRS-2677: Stop of 'ora.mdnsd' on 'dgdb1' succeeded
CRS-2677: Stop of 'ora.crf' on 'dgdb1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'dgdb1'
CRS-2677: Stop of 'ora.gipcd' on 'dgdb1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'dgdb1'
CRS-2677: Stop of 'ora.gpnpd' on 'dgdb1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'dgdb1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Start CRS:

[root@test1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[grid@test1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.DATA.dg
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.LISTENER.lsnr
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.OCR.dg
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.asm
               ONLINE  ONLINE       test1                    Started
               ONLINE  ONLINE       test2                    Started
ora.gsd
               OFFLINE OFFLINE      test1
               OFFLINE OFFLINE      test2
ora.net1.network
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.ons
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       test2
ora.cvu
      1        ONLINE  ONLINE       test2
ora.test1.vip
      1        ONLINE  ONLINE       test1
ora.test2.vip
      1        ONLINE  ONLINE       test2
ora.oc4j
      1        ONLINE  ONLINE       test2
ora.test.db
      1        ONLINE  ONLINE       test1                    Open
      2        ONLINE  ONLINE       test2                    Open
ora.scan1.vip
      1        ONLINE  ONLINE       test2

At this point all services on the node have started normally.
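Reading the resource table by eye is error-prone on a larger cluster. As a rough sketch, the `crsctl stat res -t` output can be filtered for resources whose TARGET is ONLINE but whose STATE is something else. The function name `unhealthy_resources` is hypothetical, and the parsing assumes the tabular layout shown in this post.

```shell
#!/bin/sh
# Read `crsctl stat res -t` output on stdin and print every resource whose
# TARGET is ONLINE but whose STATE is anything other than ONLINE.
unhealthy_resources() {
    awk '
        /^ora\./ { name = $1; next }    # remember the resource name line
        {
            for (i = 1; i < NF; i++) {
                if ($i == "ONLINE" || $i == "OFFLINE") {
                    # $i is TARGET, the following field is STATE
                    if ($i == "ONLINE" && $(i + 1) != "ONLINE")
                        print name, $(i + 1)
                    break
                }
            }
        }'
}

# Example use as the grid user:
#   crsctl stat res -t | unhealthy_resources
```

Resources such as ora.gsd, whose TARGET is OFFLINE, are skipped on purpose, so empty output means every resource that should be running is ONLINE.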
