JDBC connect SCAN IP

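SCAN (Single Client Access Name) was introduced in Oracle 11g R2. Through SCAN, clients can connect to a RAC database with load balancing. SCAN gives the cluster a single name, and that name can resolve to between one and three SCAN IPs (note: three at most). The name can be resolved through DNS or GNS. DNS is familiar to everyone, so it is not covered here. GNS (Grid Naming Service) is new in Oracle 11g R2; it uses a DHCP service to assign the node VIPs and the SCAN IPs. A further advantage is that when a node joins the cluster, GNS assigns its VIP address automatically and updates the cluster resources, and clients still reach the new node through SCAN with load balancing. Besides DNS and GNS, SCAN can also be resolved through the hosts file.

The customer previously ran a single-instance Oracle 10g R2 database and now runs Oracle 11g R2 RAC. Not every module of the customer's application connects to the database through a WebLogic JDBC data source; a few modules open their own JDBC connections, with code like this: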


		Connection con=null;
		try
		{
			Class.forName("oracle.jdbc.driver.OracleDriver");
			// host:port:NAME (colon before the name), so RLZY is read as a SID
			con=DriverManager.getConnection("jdbc:oracle:thin:@10.10.12.3:1521:RLZY","xxxxx","xxxxx");
		}
		catch(Exception e)
		{
			e.printStackTrace();
		}

Running the program produces the following error:


java.sql.SQLException: Listener refused the connection with the following error:
ORA-12505, TNS:listener does not currently know of SID given in connect descriptor
 
	at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:489)
	at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:553)
	at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:254)
	at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
	at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:528)
	at java.sql.DriverManager.getConnection(DriverManager.java:571)
	at java.sql.DriverManager.getConnection(DriverManager.java:215)
	at t.testdb.execute(testdb.java:19)
	at t.testdb.main(testdb.java:63)
Caused by: oracle.net.ns.NetException: Listener refused the connection with the following error:
ORA-12505, TNS:listener does not currently know of SID given in connect descriptor
 
	at oracle.net.ns.NSProtocol.connect(NSProtocol.java:399)
	at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1140)
	at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:340)
	... 8 more
Exception in thread "main" java.lang.NullPointerException
	at t.testdb.execute(testdb.java:30)
	at t.testdb.main(testdb.java:63)

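The error says that the listener does not know the SID given in the connect string. Why does it mention a SID when we meant to specify a SERVICE_NAME? The answer lies in how the JDBC connect string is written. With jdbc:oracle:thin:@10.10.12.3:1521:RLZY, JDBC treats RLZY as a SID; with 10.10.12.3:1521/RLZY, JDBC treats RLZY as a service name. On a single instance the SID and the SERVICE_NAME are usually the same, but on RAC they differ: each instance has its own SID, while the service name is shared by the instances. To connect to 11g RAC through a SCAN IP you have to use the SERVICE_NAME, so change the code to the following form: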


		Connection con=null;
		try
		{
			Class.forName("oracle.jdbc.driver.OracleDriver");
			// host:port/NAME (slash before the name), so RLZY is read as a service name
			con=DriverManager.getConnection("jdbc:oracle:thin:@10.10.12.3:1521/RLZY","xxx","xxx");
		}
		catch(Exception e)
		{
			e.printStackTrace();
		}

In summary, whether the database is a single instance or RAC, it is recommended to write the JDBC connect string as jdbc:oracle:thin:@IP:PORT/SERVICE_NAME, which avoids this problem.
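
The difference between the two connect string shapes can be made explicit with a small check. The following is a minimal Java sketch, not the Oracle driver's own parser: the class `ThinUrl` and its methods `kind` and `toServiceUrl` are hypothetical names introduced here for illustration, and the patterns only cover the host:port:SID and host:port/SERVICE_NAME forms discussed above (plus the same service form written with a leading //).

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; it is not part of the Oracle JDBC driver. */
final class ThinUrl {
    // host:port:NAME, a colon before the last token, is read as a SID
    private static final Pattern SID_FORM =
            Pattern.compile("jdbc:oracle:thin:@([^:/]+):(\\d+):([^:/]+)");
    // host:port/NAME or //host:port/NAME, a slash before the last token, is read as a service name
    private static final Pattern SERVICE_FORM =
            Pattern.compile("jdbc:oracle:thin:@(?://)?([^:/]+):(\\d+)/([^:/]+)");

    /** Returns "SID", "SERVICE_NAME", or "UNKNOWN" for shapes this sketch does not cover. */
    static String kind(String url) {
        if (SID_FORM.matcher(url).matches()) {
            return "SID";
        }
        if (SERVICE_FORM.matcher(url).matches()) {
            return "SERVICE_NAME";
        }
        return "UNKNOWN";
    }

    /** Builds the service-name form recommended in the summary. */
    static String toServiceUrl(String host, int port, String serviceName) {
        return "jdbc:oracle:thin:@" + host + ":" + port + "/" + serviceName;
    }

    public static void main(String[] args) {
        System.out.println(kind("jdbc:oracle:thin:@10.10.12.3:1521:RLZY"));
        System.out.println(kind(toServiceUrl("10.10.12.3", 1521, "RLZY")));
    }
}
```

A check like this could be run over the connect strings found in the application's own modules to spot any that still use the SID form before the move to RAC.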

redhat linux 11.2 rac grid infrastructure add scan ip

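The customer wants to keep using the original virtual IPs in the new production environment, so after the data is migrated from the original database to the new RAC environment we need to change the virtual IPs. The original system was a single instance that used two virtual IP addresses, whereas the new one is a RAC database. If we changed the RAC node VIPs instead, there would be no load balancing, because every connection would land on the same node. The SCAN was created through /etc/hosts, and only one IP address (10.138.130.155) was specified when the production environment was installed, so the solution here is to add SCAN IP addresses (10.138.130.156/157).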

Check the current /etc/hosts configuration:

[root@jyrac1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               jyrac1 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.138.130.151 jyrac1
10.138.130.152 jyrac2

10.138.130.153 jyrac1-vip
10.138.130.154 jyrac2-vip

10.10.10.1 jyrac1-priv
10.10.10.2 jyrac2-priv

10.138.130.155 jyrac-scan

Check which node hosts the SCAN IP; it is on node jyrac1:

[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.CRSDG.dg
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.DATADG.dg
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.LISTENER.lsnr
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.asm
               ONLINE  ONLINE       jyrac1                   Started             
               ONLINE  ONLINE       jyrac2                   Started             
ora.gsd
               OFFLINE OFFLINE      jyrac1                                       
               OFFLINE OFFLINE      jyrac2                                       
ora.net1.network
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.ons
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.registry.acfs
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jyrac1                                       
ora.cvu
      1        ONLINE  ONLINE       jyrac1                                       
ora.jyrac.db
      1        ONLINE  ONLINE       jyrac1                   Open                
      2        ONLINE  ONLINE       jyrac2                   Open                
ora.jyrac1.vip
      1        ONLINE  ONLINE       jyrac1                                       
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2                                       
ora.oc4j
      1        ONLINE  ONLINE       jyrac1                                       
ora.scan1.vip
      1        ONLINE  ONLINE       jyrac1            

Check which network interface carries the SCAN IP; it is on eth0:

[root@jyrac1 ~]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:50:56:B1:00:FD  
          inet addr:10.138.130.151  Bcast:10.138.130.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:feb1:fd/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1077473 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9663995 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:103934930 (99.1 MiB)  TX bytes:14096320534 (13.1 GiB)
          Base address:0x2800 Memory:fd5c0000-fd5e0000 

eth0:1    Link encap:Ethernet  HWaddr 00:50:56:B1:00:FD  
          inet addr:10.138.130.153  Bcast:10.138.130.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Base address:0x2800 Memory:fd5c0000-fd5e0000 

eth0:2    Link encap:Ethernet  HWaddr 00:50:56:B1:00:FD  
          inet addr:10.138.130.155  Bcast:10.138.130.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Base address:0x2800 Memory:fd5c0000-fd5e0000 

eth1      Link encap:Ethernet  HWaddr 00:50:56:B1:B6:3C  
          inet addr:10.10.10.1  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::250:56ff:feb1:b63c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:371756 errors:0 dropped:0 overruns:0 frame:0
          TX packets:290401 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:193253619 (184.3 MiB)  TX bytes:185131607 (176.5 MiB)
          Base address:0x2840 Memory:fd5a0000-fd5c0000 

eth1:1    Link encap:Ethernet  HWaddr 00:50:56:B1:B6:3C  
          inet addr:169.254.202.209  Bcast:169.254.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Base address:0x2840 Memory:fd5a0000-fd5c0000 

Check the SCAN configuration:

[grid@jyrac1 ~]$ srvctl config scan
SCAN name: jyrac-scan, Network: 1/10.138.130.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /jyrac-scan/10.138.130.155

Check the SCAN listener status:

[grid@jyrac1 ~]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node jyrac1

Stop the SCAN listener:

[grid@jyrac1 ~]$ srvctl stop scan_listener

Stop the SCAN:

[grid@jyrac1 ~]$ srvctl stop scan
[grid@jyrac1 ~]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is not running

Edit /etc/hosts on every node and add the two SCAN IP addresses:

[root@jyrac1 ~]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               jyrac1 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.138.130.151 jyrac1
10.138.130.152 jyrac2

10.138.130.153 jyrac1-vip
10.138.130.154 jyrac2-vip

10.10.10.1 jyrac1-priv
10.10.10.2 jyrac2-priv

10.138.130.155 jyrac-scan
10.138.130.156 jyrac-scan
10.138.130.157 jyrac-scan


[root@jyrac2 /]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               jyrac2 localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.138.130.151 jyrac1
10.138.130.152 jyrac2

10.138.130.153 jyrac1-vip
10.138.130.154 jyrac2-vip

10.10.10.1 jyrac1-priv
10.10.10.2 jyrac2-priv

10.138.130.155 jyrac-scan
10.138.130.156 jyrac-scan
10.138.130.157 jyrac-scan
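
Before running srvctl it can help to confirm that both nodes' hosts files list the same addresses for the SCAN name. The following is a minimal Java sketch under stated assumptions: `HostsFile.addressesFor` is a hypothetical helper that only reads hosts-format text and reports what the file says; it says nothing about how the operating system resolver or srvctl treat several entries for one name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration: lists the addresses a hosts file maps to a name. */
final class HostsFile {
    /** Returns every IP address mapped to the given name, in file order. */
    static List<String> addressesFor(String hostsText, String name) {
        List<String> result = new ArrayList<>();
        for (String line : hostsText.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);   // drop comments
            }
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                continue;                          // blank line or address without a name
            }
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equals(name)) {
                    result.add(fields[0]);         // fields[0] is the IP address
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String hosts = String.join("\n",
                "10.138.130.153 jyrac1-vip",
                "10.138.130.155 jyrac-scan",
                "10.138.130.156 jyrac-scan",
                "10.138.130.157 jyrac-scan");
        System.out.println(addressesFor(hosts, "jyrac-scan"));
    }
}
```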

Modify the SCAN configuration. This must be run as root:

[root@jyrac1 ~]# cd /u01/app/product/11.2.0/crs/bin
[root@jyrac1 bin]# ./srvctl modify scan -n jyrac-scan

Check the SCAN configuration:

[grid@jyrac1 ~]$ srvctl config scan
SCAN name: jyrac-scan, Network: 1/10.138.130.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /jyrac-scan/10.138.130.155

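When the SCAN IP is resolved through /etc/hosts, round-robin load balancing is not available, so in this case the SCAN can have only one IP.
On Oracle Linux 6.4, however, all of the SCAN IPs are shown, for example: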

[root@db1 ~]# cd /u01/app/11.2.0/grid/bin
[root@db1 bin]# ./srvctl modify scan -n scan-ip


[grid@db1 bin]$ srvctl config scan
SCAN name: scan-ip, Network: 1/10.138.129.0/255.255.255.0/eth2
SCAN VIP name: scan1, IP: /scan-ip/10.138.129.105
SCAN VIP name: scan2, IP: /scan-ip/10.138.129.120
SCAN VIP name: scan3, IP: /scan-ip/10.138.129.121

For the detailed steps, see: http://www.jydba.net/oracle-linux-11-2-rac-grid-infrastructure-add-scan-ip/

Restart the SCAN and the SCAN listener, then check the resources:

                                     
[grid@jyrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCHDG.dg
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.CRSDG.dg
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.DATADG.dg
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.LISTENER.lsnr
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.asm
               ONLINE  ONLINE       jyrac1                   Started             
               ONLINE  ONLINE       jyrac2                   Started             
ora.gsd
               OFFLINE OFFLINE      jyrac1                                       
               OFFLINE OFFLINE      jyrac2                                       
ora.net1.network
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.ons
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
ora.registry.acfs
               ONLINE  ONLINE       jyrac1                                       
               ONLINE  ONLINE       jyrac2                                       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jyrac2                                       
ora.cvu
      1        ONLINE  ONLINE       jyrac1                                       
ora.jyrac.db
      1        ONLINE  ONLINE       jyrac1                   Open                
      2        ONLINE  ONLINE       jyrac2                   Open                
ora.jyrac1.vip
      1        ONLINE  ONLINE       jyrac1                                       
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2                                       
ora.oc4j
      1        ONLINE  ONLINE       jyrac1                                       
ora.scan1.vip
      1        ONLINE  ONLINE       jyrac2          



Add the SCAN IPs manually:
[root@jyrac1 bin]# ./crsctl stat res ora.scan1.vip -p > /tmp/2.txt
[root@jyrac1 bin]# ./crsctl stat res ora.scan1.vip -p > /tmp/3.txt
[root@jyrac1 bin]# vi /tmp/2.txt
NAME=ora.scan2.vip
TYPE=ora.scan_vip.type
ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=1
CHECK_TIMEOUT=120
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=scan_vip) ELEMENT(HOSTING_MEMBERS=%HOSTING_MEMBERS%)
DEGREE=1
DESCRIPTION=Oracle SCAN VIP resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
GEN_USR_ORA_STATIC_VIP=
GEN_USR_ORA_VIP=
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=0
SCAN_NAME=jyrac-scan
SCRIPT_TIMEOUT=60
SERVER_POOLS=*
START_DEPENDENCIES=hard(ora.net1.network) dispersion:active(type:ora.scan_vip.type) pullup(global:ora.net1.network)
START_TIMEOUT=120
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)
STOP_TIMEOUT=0
TYPE_VERSION=1.1
UPTIME_THRESHOLD=1h
USR_ORA_ENV=
USR_ORA_VIP=10.138.130.156
VERSION=11.2.0.4.0

~
"/tmp/2.txt" 43L, 1092C written
[root@jyrac1 bin]# ./crsctl add resource ora.scan2.vip -type ora.scan_vip.type -file /tmp/2.txt
[root@jyrac1 bin]# vi /tmp/3.txt
NAME=ora.scan3.vip
TYPE=ora.scan_vip.type
ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=1
CHECK_TIMEOUT=120
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=scan_vip) ELEMENT(HOSTING_MEMBERS=%HOSTING_MEMBERS%)
DEGREE=1
DESCRIPTION=Oracle SCAN VIP resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
GEN_USR_ORA_STATIC_VIP=
GEN_USR_ORA_VIP=
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=0
SCAN_NAME=jyrac-scan
SCRIPT_TIMEOUT=60
SERVER_POOLS=*
START_DEPENDENCIES=hard(ora.net1.network) dispersion:active(type:ora.scan_vip.type) pullup(global:ora.net1.network)
START_TIMEOUT=120
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)
STOP_TIMEOUT=0
TYPE_VERSION=1.1
UPTIME_THRESHOLD=1h
USR_ORA_ENV=
USR_ORA_VIP=10.138.130.157
VERSION=11.2.0.4.0

~
"/tmp/3.txt" 43L, 1092C written
[root@jyrac1 bin]# ./crsctl add resource ora.scan3.vip -type ora.scan_vip.type -file /tmp/3.txt
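
Judging from the two listings, the edited files differ from each other only in the NAME and USR_ORA_VIP lines, so the edit made in vi is presumably limited to those two attributes of the exported ora.scan1.vip profile. The following is a minimal Java sketch of that text edit, assuming exactly this; `ScanVipProfile.derive` is a hypothetical helper that only rewrites the profile text and does not call crsctl, and the scan1 address in the example is the one reported earlier by srvctl config scan.

```java
/** Hypothetical helper for illustration: derives a new SCAN VIP profile from an exported one. */
final class ScanVipProfile {
    /** Rewrites NAME and USR_ORA_VIP; every other line is copied unchanged. */
    static String derive(String exportedProfile, int scanIndex, String ip) {
        StringBuilder out = new StringBuilder();
        for (String line : exportedProfile.split("\\R")) {
            if (line.startsWith("NAME=")) {
                line = "NAME=ora.scan" + scanIndex + ".vip";
            } else if (line.startsWith("USR_ORA_VIP=")) {
                // GEN_USR_ORA_VIP= starts with a different prefix, so it is left alone
                line = "USR_ORA_VIP=" + ip;
            }
            out.append(line).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String scan1 = String.join("\n",
                "NAME=ora.scan1.vip",
                "TYPE=ora.scan_vip.type",
                "GEN_USR_ORA_VIP=",
                "SCAN_NAME=jyrac-scan",
                "USR_ORA_VIP=10.138.130.155");
        System.out.print(derive(scan1, 2, "10.138.130.156"));
    }
}
```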

Add the SCAN listeners manually:

[root@jyrac1 bin]# ./crsctl stat res  ora.LISTENER_SCAN1.lsnr -p > /tmp/l_2.txt
[root@jyrac1 bin]# ./crsctl stat res  ora.LISTENER_SCAN1.lsnr -p > /tmp/l_3.txt

[root@jyrac1 bin]# vi /tmp/l_3.txt
NAME=ora.LISTENER_SCAN3.lsnr
TYPE=ora.scan_listener.type
ACL=owner:grid:rwx,pgrp:oinstall:r-x,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=60
CHECK_TIMEOUT=120
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=scan_listener) PROPERTY(LISTENER_NAME=PARSE(%NAME%, ., 2))
DEGREE=1
DESCRIPTION=Oracle SCAN listener resource
ENABLED=1
ENDPOINTS=TCP:1521
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PORT=1521
PROFILE_CHANGE_TEMPLATE=
REGISTRATION_INVITED_NODES=
REGISTRATION_INVITED_SUBNETS=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SERVER_POOLS=*
START_DEPENDENCIES=hard(ora.scan3.vip) dispersion:active(type:ora.scan_listener.type) pullup(ora.scan3.vip)
START_TIMEOUT=180
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.scan3.vip)
STOP_TIMEOUT=0
TYPE_VERSION=2.2
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_OPI=false
VERSION=11.2.0.4.0

"/tmp/l_3.txt" 44L, 1091C written
[root@jyrac1 bin]# vi /tmp/l_2.txt
NAME=ora.LISTENER_SCAN2.lsnr
TYPE=ora.scan_listener.type
ACL=owner:grid:rwx,pgrp:oinstall:r-x,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=60
CHECK_TIMEOUT=120
DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=scan_listener) PROPERTY(LISTENER_NAME=PARSE(%NAME%, ., 2))
DEGREE=1
DESCRIPTION=Oracle SCAN listener resource
ENABLED=1
ENDPOINTS=TCP:1521
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PORT=1521
PROFILE_CHANGE_TEMPLATE=
REGISTRATION_INVITED_NODES=
REGISTRATION_INVITED_SUBNETS=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SERVER_POOLS=*
START_DEPENDENCIES=hard(ora.scan2.vip) dispersion:active(type:ora.scan_listener.type) pullup(ora.scan2.vip)
START_TIMEOUT=180
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.scan2.vip)
STOP_TIMEOUT=0
TYPE_VERSION=2.2
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_OPI=false
VERSION=11.2.0.4.0

"/tmp/l_2.txt" 44L, 1091C written


[root@jyrac1 bin]# ./crsctl add resource ora.LISTENER_SCAN2.lsnr -type ora.scan_listener.type -file /tmp/l_2.txt
[root@jyrac1 bin]# ./crsctl add resource ora.LISTENER_SCAN3.lsnr -type ora.scan_listener.type -file /tmp/l_3.txt

Connecting to the database through an added SCAN IP fails:

C:\Users\Administrator>sqlplus "system/system"@10.138.130.156:1521/jyrac

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 22 22:18:58 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor


Enter user-name:
ERROR:
ORA-12560: TNS:protocol adapter error


Enter user-name:
ERROR:
ORA-12560: TNS:protocol adapter error


SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

C:\Users\Administrator>

Set the listener_networks parameter on every instance of the RAC database:

SQL> alter system set LISTENER_NETWORKS='((NAME=network1)(LOCAL_LISTENER=10.138.130.153:1521)(REMOTE_LISTENER=10.138.130.155:1521))','((NAME=network2)(LOCAL_LISTENER=10.138.130.153:1521)(REMOTE_LISTENER=10.138.130.156:1521))','((NAME=network3)(LOCAL_LISTENER=10.138.130.153:1521)(REMOTE_LISTENER=10.138.130.157:1521))' sid='jyrac1';

System altered.

SQL> alter system set LISTENER_NETWORKS='((NAME=network1)(LOCAL_LISTENER=10.138.130.154:1521)(REMOTE_LISTENER=10.138.130.155:1521))','((NAME=network2)(LOCAL_LISTENER=10.138.130.154:1521)(REMOTE_LISTENER=10.138.130.156:1521))','((NAME=network3)(LOCAL_LISTENER=10.138.130.154:1521)(REMOTE_LISTENER=10.138.130.157:1521))' sid='jyrac2';

System altered.

SQL> show parameter listener

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
listener_networks                    string      ((NAME=network1)(LOCAL_LISTENE
                                                 R=10.138.130.152:1521)(REMOTE_
                                                 LISTENER=10.138.130.155:1521))
                                                 , ((NAME=network2)(LOCAL_LISTE
                                                 NER=10.138.130.152:1521)(REMOT
                                                 E_LISTENER=10.138.130.156:1521
                                                 )), ((NAME=network3)(LOCAL_LIS
                                                 TENER=10.138.130.152:1521)(REM
                                                 OTE_LISTENER=10.138.130.157:15
                                                 21))
local_listener                       string       (ADDRESS=(PROTOCOL=TCP)(HOST=

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
                                                 10.138.130.153)(PORT=1521))
remote_listener                      string      jyrac-scan:1521


Connect again through the added SCAN IPs; the login now succeeds:

C:\Users\Administrator>sqlplus "system/system"@10.138.130.156:1521/jyrac

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 22 22:25:24 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,
OLAP,
Data Mining and Real Application Testing options

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,
OLAP,
Data Mining and Real Application Testing options

C:\Users\Administrator>sqlplus "system/system"@10.138.130.157:1521/jyrac

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 22 22:25:36 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,
OLAP,
Data Mining and Real Application Testing options

SQL>

oracle linux 11.2 rac grid infrastructure add scan ip

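In one project, the virtual IPs of the original production environment are used by several systems that interface with it, so the customer wants to keep using those virtual IPs in the new production environment. After the data is migrated from the original database to the new RAC environment we therefore need to change the virtual IPs. The original system was a single instance that used two virtual IP addresses, whereas the new one is a RAC database. If we changed the RAC node VIPs instead, there would be no load balancing, because every connection would land on the same node. The SCAN was created through /etc/hosts, and only one IP address (10.138.129.105) was specified when the production environment was installed, so the solution here is to add SCAN IP addresses (10.138.129.120/121).
1. Check and stop the SCAN and the scan_listener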

[grid@db1 ~]$ cd $ORACLE_HOME/bin
[grid@db1 bin]$ pwd
/u01/app/11.2.0/grid/bin
[grid@db1 bin]$ srvctl config scan
SCAN name: scan-ip, Network: 1/10.138.129.0/255.255.255.0/eth2
SCAN VIP name: scan1, IP: /scan-ip/10.138.129.105
[grid@db1 bin]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node db2
[grid@db1 bin]$ srvctl stop scan_listener
[grid@db1 bin]$ srvctl stop scan
[grid@db1 bin]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is not running

2. Edit /etc/hosts and add the required SCAN IP addresses

[root@db1 ~]# vi /etc/hosts
127.0.0.1   localhost


10.138.129.101  db1
10.138.129.102  db2

10.138.129.103  db1-vip
10.138.129.104  db2-vip

10.138.129.105  scan-ip
10.138.129.120  scan-ip
10.138.129.121  scan-ip

192.168.20.1    db1-priv
192.168.20.2    db2-priv

[root@db2 ~]# vi /etc/hosts
127.0.0.1   localhost


10.138.129.101  db1
10.138.129.102  db2

10.138.129.103  db1-vip
10.138.129.104  db2-vip

10.138.129.105  scan-ip
10.138.129.120  scan-ip
10.138.129.121  scan-ip

192.168.20.1    db1-priv
192.168.20.2    db2-priv

3. Update the SCAN as root

[root@db1 ~]# cd /u01/app/11.2.0/grid/bin
[root@db1 bin]# ./srvctl modify scan -n scan-ip

4. Check the SCAN configuration; the added IPs are now registered

[grid@db1 bin]$ srvctl config scan
SCAN name: scan-ip, Network: 1/10.138.129.0/255.255.255.0/eth2
SCAN VIP name: scan1, IP: /scan-ip/10.138.129.105
SCAN VIP name: scan2, IP: /scan-ip/10.138.129.120
SCAN VIP name: scan3, IP: /scan-ip/10.138.129.121
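
To verify this step from a script, the SCAN VIP addresses can be pulled out of the srvctl config scan output. The following is a minimal Java sketch, assuming the 11.2 output layout shown above (other versions may print different lines); `ScanConfig.vipAddresses` is a hypothetical helper that works on captured text and does not run srvctl itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration: extracts SCAN VIP addresses from srvctl output text. */
final class ScanConfig {
    // Matches lines such as: SCAN VIP name: scan1, IP: /scan-ip/10.138.129.105
    private static final Pattern VIP_LINE =
            Pattern.compile("^SCAN VIP name: (\\S+), IP: /([^/]+)/(\\S+)$", Pattern.MULTILINE);

    /** Returns the IP of every SCAN VIP line, in output order. */
    static List<String> vipAddresses(String srvctlOutput) {
        List<String> ips = new ArrayList<>();
        Matcher m = VIP_LINE.matcher(srvctlOutput);
        while (m.find()) {
            ips.add(m.group(3));
        }
        return ips;
    }

    public static void main(String[] args) {
        String output = String.join("\n",
                "SCAN name: scan-ip, Network: 1/10.138.129.0/255.255.255.0/eth2",
                "SCAN VIP name: scan1, IP: /scan-ip/10.138.129.105",
                "SCAN VIP name: scan2, IP: /scan-ip/10.138.129.120",
                "SCAN VIP name: scan3, IP: /scan-ip/10.138.129.121");
        System.out.println(vipAddresses(output));
    }
}
```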

5. Update and start the scan_listener

[grid@db1 ~]$ srvctl modify scan_listener -u
[grid@db1 ~]$ srvctl start scan_listener
[grid@db1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CWDATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.OCR.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.SBKDATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       db2
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       db1
ora.caiwu.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.chdyl.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.cvu
      1        ONLINE  ONLINE       db2
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.oc4j
      1        ONLINE  ONLINE       db2
ora.rlzy.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.scan1.vip
      1        ONLINE  ONLINE       db1
ora.scan2.vip
      1        ONLINE  ONLINE       db2
ora.scan3.vip
      1        ONLINE  ONLINE       db1

6. Test the connection through an added SCAN IP

C:\Users\Administrator>sqlplus "system/powersi"@10.138.129.120:1521/rlzy

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 22 21:35:24 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor


Enter user-name:
ERROR:
ORA-12560: TNS:protocol adapter error


Enter user-name:
ERROR:
ORA-12560: TNS:protocol adapter error


SP2-0157: unable to CONNECT to ORACLE after 3 attempts, exiting SQL*Plus

7. Set the listener_networks parameter on every instance of the RAC database, then test the connection through the added SCAN IPs again

SQL> alter system set LISTENER_NETWORKS='((NAME=network1)(LOCAL_LISTENER=10.138.129.104:1521)(REMOTE_LISTENER=10.138.129.105:1521))','((NAME=network2)(LOCAL_LISTENER=10.138.129.104:1521)(REMOTE_LISTENER=10.138.129.120:1521))','((NAME=network3)(LOCAL_LISTENER=10.138.129.104:1521)(REMOTE_LISTENER=10.138.129.121:1521))' sid='RLZY2';

System altered.



SQL> alter system set LISTENER_NETWORKS='((NAME=network1)(LOCAL_LISTENER=10.138.129.103:1521)(REMOTE_LISTENER=10.138.129.105:1521))','((NAME=network2)(LOCAL_LISTENER=10.138.129.103:1521)(REMOTE_LISTENER=10.138.129.120:1521))','((NAME=network3)(LOCAL_LISTENER=10.138.129.103:1521)(REMOTE_LISTENER=10.138.129.121:1521))' sid='RLZY1';

System altered.



C:\Users\Administrator>
C:\Users\Administrator>sqlplus "system/powersi"@10.138.129.120:1521/rlzy

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 22 21:43:13 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,
OLAP,
Data Mining and Real Application Testing options

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,
OLAP,
Data Mining and Real Application Testing options

C:\Users\Administrator>sqlplus "system/powersi"@10.138.129.121:1521/rlzy

SQL*Plus: Release 11.2.0.4.0 Production on Tue Nov 22 21:44:37 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management,
OLAP,
Data Mining and Real Application Testing options

SQL>

11g ASM asm_preferred_read_failure_groups

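ASM preferred read
The asm_preferred_read_failure_groups parameter specifies a list of failure groups; it determines which failure groups each node in a RAC reads from locally. Its format is:
asm_preferred_read_failure_groups=diskgroup_name.failuregroup_name,…
Each entry consists of diskgroup_name, the name of the disk group, and failuregroup_name, the name of a failure group in that disk group, with a period between the two. Multiple entries are separated by commas. The parameter can be changed dynamically.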
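
The value format is simple enough to compose and check in code. The following is a minimal Java sketch, assuming only the diskgroup_name.failuregroup_name,… layout described above; `PreferredReadGroups` and its methods are hypothetical names, and the second disk group in the example (DATA2 with failure group FGA) is invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration: composes and parses the parameter value. */
final class PreferredReadGroups {
    /** Joins {diskgroup, failgroup} pairs as diskgroup.failgroup, separated by commas. */
    static String format(List<String[]> entries) {
        List<String> parts = new ArrayList<>();
        for (String[] entry : entries) {
            parts.add(entry[0] + "." + entry[1]);
        }
        return String.join(",", parts);
    }

    /** Splits a value back into {diskgroup, failgroup} pairs and rejects malformed entries. */
    static List<String[]> parse(String value) {
        List<String[]> entries = new ArrayList<>();
        for (String item : value.split(",")) {
            String[] pair = item.trim().split("\\.", 2);
            if (pair.length != 2 || pair[0].isEmpty() || pair[1].isEmpty()) {
                throw new IllegalArgumentException("expected diskgroup.failgroup, got: " + item);
            }
            entries.add(pair);
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String[]> entries = new ArrayList<>();
        entries.add(new String[] {"DATA_NRML", "FG1"});
        entries.add(new String[] {"DATA2", "FGA"});   // invented second disk group
        System.out.println(format(entries));
    }
}
```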

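In an extended RAC, the failure groups named in asm_preferred_read_failure_groups should contain only disks that are local to the node. The preferred_read column of v$asm_disk indicates the preferred-read disks.

Note: when adding or dropping disks, the best practice is to run the add or drop command from the node whose storage is being changed. The rebalance is then more efficient, because extent relocation is localized within the same failure group, that is, on the same node.

The following example shows how to deploy the preferred-read feature and demonstrates some of its benefits. It shows the I/O pattern when asm_preferred_read_failure_groups is not set, and the effect on I/O after the parameter is changed.

1. Create an ASM disk group data_nrml with two failure groups. Disks raw5/6/7 are in the same machine room as node 1, and disks raw12/13/14 are in the other machine room with node 2: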

SQL> create diskgroup data_nrml normal redundancy
  2  failgroup fg1 disk '/dev/raw/raw5','/dev/raw/raw6','/dev/raw/raw7'
  3  failgroup fg2 disk '/dev/raw/raw12','/dev/raw/raw13','/dev/raw/raw14';
Diskgroup created.

2. On node 2, create the test table t1

SQL> create table t1 as select * from dba_tables;

Table created.

SQL> insert into t1 select * from t1;

3668 rows created.
.....
SQL> /

SQL> /
1467392 rows created.
SQL> commit;

Commit complete.

SQL> alter system flush buffer_cache;

System altered.


3. Query the cumulative read/write I/O statistics of the disk group. On instance 1 the reads from FG1 and FG2 are 169 and 176; on instance 2 they are 43 and 59.

SQL> set long 9999
SQL> set linesize 9999
SQL> select inst_id, failgroup, sum(reads), sum(writes) from gv$asm_disk where failgroup in ('FG1','FG2') group by inst_id, failgroup;

   INST_ID FAILGROUP                                                    SUM(READS) SUM(WRITES)
---------- ------------------------------------------------------------ ---------- -----------
         1 FG1                                                                 169        3809
         1 FG2                                                                 176        3809
         2 FG1                                                                  43       17201
         2 FG2                                                                  59       17201

4. Run the query:

SQL> show parameter asm_preferred_read_failure_groups

NAME                                 TYPE
------------------------------------ ----------------------
VALUE
------------------------------
asm_preferred_read_failure_groups    string

SQL> set timing on
SQL> select count(*) from jy.t1;

  COUNT(*)
----------
   1467392
Elapsed: 00:00:07.23

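5. Query the cumulative I/O statistics again. On instance 1 the reads from FG1 and FG2 are now 450 and 564, an increase of 450-169=281 and 564-176=388; on instance 2 they are 85 and 99, an increase of 85-43=42 and 99-59=40. These numbers show that the query read from both failure groups. Because the data was inserted from node 2, the reads from failure group FG2 are slightly higher than those from FG1. The query took 7 seconds.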

SQL> select inst_id, failgroup, sum(reads), sum(writes) from gv$asm_disk where failgroup in ('FG1','FG2') group by inst_id, failgroup;

   INST_ID FAILGROUP                                                    SUM(READS) SUM(WRITES)
---------- ------------------------------------------------------------ ---------- -----------
         1 FG1                                                                 450        8879
         1 FG2                                                                 564        8879
         2 FG1                                                                  85       38166
         2 FG2                                                                  99       38166
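
The subtraction in step 5 can be reproduced with a small helper. This is only a sketch: read_deltas is a hypothetical name, and the input rows are typed in by hand from the two query outputs above (inst_id, failgroup, reads before, reads after) rather than pulled from the database.

```shell
#!/bin/sh
# read_deltas (hypothetical helper): for each input row
#   inst_id failgroup reads_before reads_after
# print the increase in reads between the two snapshots.
read_deltas() {
  awk '{ printf "%s %s %d\n", $1, $2, $4 - $3 }'
}

# Values copied from the gv$asm_disk output in steps 3 and 5.
read_deltas <<'EOF'
1 FG1 169 450
1 FG2 176 564
2 FG1 43 85
2 FG2 59 99
EOF
```

With the numbers above this prints 281 and 388 for instance 1 and 42 and 40 for instance 2, matching the figures quoted in step 5.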

6. Set asm_preferred_read_failure_groups so that node 1 reads preferentially from failure group FG1 and node 2 from failure group FG2:

SQL> alter system set asm_preferred_read_failure_groups='DATA_NRML.FG1' scope=both sid='+ASM1';

System altered.

SQL> alter system set asm_preferred_read_failure_groups='DATA_NRML.FG2' scope=both sid='+ASM2';

System altered.

7. Check the preferred read settings. The query below shows that instance 1 prefers failure group FG1 and instance 2 prefers FG2:

SQL> select inst_id, failgroup, name, preferred_read from gv$asm_disk where failgroup in ('FG1','FG2') order by inst_id, failgroup;

   INST_ID FAILGROUP                                                    NAME                                                         PR
---------- ------------------------------------------------------------ ------------------------------------------------------------ --
         1 FG1                                                          DATA_NRML_0002                                               Y
         1 FG1                                                          DATA_NRML_0000                                               Y
         1 FG1                                                          DATA_NRML_0001                                               Y
         1 FG2                                                          DATA_NRML_0003                                               N
         1 FG2                                                          DATA_NRML_0005                                               N
         1 FG2                                                          DATA_NRML_0004                                               N
         2 FG1                                                          DATA_NRML_0001                                               N
         2 FG1                                                          DATA_NRML_0000                                               N
         2 FG1                                                          DATA_NRML_0002                                               N
         2 FG2                                                          DATA_NRML_0004                                               Y
         2 FG2                                                          DATA_NRML_0003                                               Y
         2 FG2                                                          DATA_NRML_0005                                               Y

12 rows selected.

8. Open a session on node 1 and run the query again:

SQL> alter system flush buffer_cache;

System altered.
SQL> set timing on
SQL> select count(*) from jy.t1;

  COUNT(*)
----------
   1467392
Elapsed: 00:00:03.26

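9. Query the cumulative I/O statistics again. On instance 1 the reads from FG1 and FG2 are now 867 and 567, increases of 867-450=417 and 567-564=3. On instance 2 they are 88 and 102, increases of 88-85=3 and 102-99=3. Instance 1 read almost entirely from failure group FG1 and only 3 times from FG2, while instance 2 read only 3 times from each failure group. The query time dropped from 7 seconds to 3 seconds.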

SQL> select inst_id, failgroup, sum(reads), sum(writes) from gv$asm_disk where failgroup in ('FG1','FG2') group by inst_id, failgroup;

   INST_ID FAILGROUP                                                    SUM(READS) SUM(WRITES)
---------- ------------------------------------------------------------ ---------- -----------
         1 FG1                                                                 867        8891
         1 FG2                                                                 567        8891
         2 FG1                                                                  88       38166
         2 FG2                                                                 102       38166
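
To put a number on how strongly instance 1 favoured FG1 after the change, the share of new reads served by the preferred failure group can be computed from the same counters. The sketch below uses a hypothetical helper, preferred_share, and the values are again copied by hand from the outputs above.

```shell
#!/bin/sh
# preferred_share (hypothetical helper)
#   $1 $2 = reads before/after on the preferred failure group
#   $3 $4 = reads before/after on the other failure group
# Prints: <preferred delta> <other delta> <percent of new reads from preferred group>
preferred_share() {
  awk -v pb="$1" -v pa="$2" -v ob="$3" -v oa="$4" 'BEGIN {
    p = pa - pb; o = oa - ob
    if (p + o == 0) { printf "%d %d n/a\n", p, o; exit }
    printf "%d %d %.1f\n", p, o, 100 * p / (p + o)
  }'
}

# Instance 1 before the parameter was set (steps 3 and 5): FG1 vs FG2.
preferred_share 169 450 176 564
# Instance 1 after the parameter was set (steps 5 and 9): FG1 vs FG2.
preferred_share 450 867 564 567
```

For instance 1 this gives roughly 42% of reads from FG1 before the change and roughly 99% afterwards.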

11.2 rac emctl start dbconsole OC4J Configuration issue

After a server reboot, the EM (dbconsole) of a RAC database could not be started manually. The error message was:

[oracle@jyrac1 ~]$ emctl start dbconsole
OC4J Configuration issue. /u01/app/oracle/product/11.2.0/db/oc4j/j2ee/OC4J_DBConsole_jyrac1_rac not found. 

The message says that OC4J_DBConsole_jyrac1_rac cannot be found.

Check the EM configuration:

[oracle@jyrac1 ~]$ emca -displayConfig dbcontrol -cluster

STARTED EMCA at Nov 23, 2016 2:28:21 PM
EM Configuration Assistant, Version 11.2.0.3.0 Production
Copyright (c) 2003, 2011, Oracle.  All rights reserved.

Enter the following information:
Database unique name: jyrac
Service name: jyrac
Do you wish to continue? [yes(Y)/no(N)]: y
Nov 23, 2016 2:28:32 PM oracle.sysman.emcp.EMConfig perform
INFO: This operation is being logged at /u01/app/oracle/cfgtoollogs/emca/jyrac/emca_2016_11_23_14_28_20.log.
Nov 23, 2016 2:28:35 PM oracle.sysman.emcp.EMDBPostConfig showClusterDBCAgentMessage
INFO: 
****************  Current Configuration  ****************
 INSTANCE            NODE           DBCONTROL_UPLOAD_HOST
----------        ----------        ---------------------

jyrac             jyrac1             jyrac1
jyrac             jyrac2             jyrac1


Enterprise Manager configuration completed successfully
FINISHED EMCA at Nov 23, 2016 2:28:35 PM

Next, check whether the OC4J_DBConsole_jyrac1_rac directory exists:

[oracle@jyrac1 ~]$ cd /u01/app/oracle/product/11.2.0/db/oc4j/j2ee/
[oracle@jyrac1 j2ee]$ ls
deploy_db_wf.ini  oc4j_applications  OC4J_DBConsole_jyrac1_jyrac  OC4J_Workflow_Component_Container   utilities
home              OC4J_DBConsole     OC4J_DBConsole_jyrac2_jyrac  OC4J_Workflow_Management_Container

It does not exist, but a similar directory (OC4J_DBConsole_jyrac1_jyrac) does, so make a copy of it:

[oracle@jyrac1 j2ee]$ cp -r OC4J_DBConsole_jyrac1_jyrac OC4J_DBConsole_jyrac1_rac

Start EM again. This time the jyrac1_rac directory is reported missing:

[oracle@jyrac1 j2ee]$ emctl start dbconsole
EM Configuration issue. /u01/app/oracle/product/11.2.0/db/jyrac1_rac not found. 

The directory /u01/app/oracle/product/11.2.0/db/ contains a similar directory, jyrac1_jyrac:

[oracle@jyrac1 db]$ ls -lrt
....
drwxr-----  3 oracle oinstall  4096 Nov 22 19:37 jyrac2_jyrac
drwxr-----  3 oracle oinstall  4096 Nov 22 19:38 jyrac1_jyrac
drwxr-xr-x  7 oracle oinstall  4096 Nov 22 19:39 install
drwxr-----  3 oracle oinstall  4096 Nov 22 19:40 jyrac1_jyrac1
drwxr-xr-x  2 oracle oinstall  4096 Nov 23 08:18 dbs
drwxr-----  3 oracle oinstall  4096 Nov 23 14:36 jyrac1_rac

So copy it to jyrac1_rac:

[oracle@jyrac1 db]$ cp -r jyrac1_jyrac jyrac1_rac

Start EM again:

[oracle@jyrac1 db]$ emctl status dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.4.0 
Copyright (c) 1996, 2013 Oracle Corporation.  All rights reserved.
https://jyrac1:1158/em/console/aboutApplication
Oracle Enterprise Manager 11g is not running. 
------------------------------------------------------------------
Logs are generated in directory /u01/app/oracle/product/11.2.0/db/jyrac1_rac/sysman/log 
[oracle@jyrac1 db]$ emctl start dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.4.0 
Copyright (c) 1996, 2013 Oracle Corporation.  All rights reserved.
https://jyrac1:1158/em/console/aboutApplication
Starting Oracle Enterprise Manager 11g Database Control ........ started. 
------------------------------------------------------------------
Logs are generated in directory /u01/app/oracle/product/11.2.0/db/jyrac1_rac/sysman/log 

EM now starts, and there was no need to recreate it.
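
Both errors above name a path built from the host name and a database name suffix (here jyrac1 and rac). A minimal sketch for listing which of the two directories are missing before running emctl follows. The function names are hypothetical, and the naming pattern is inferred only from the error messages in this case. The source does not show where emctl gets the suffix from; an ORACLE_UNQNAME or ORACLE_SID setting in the environment is one possibility worth checking.

```shell
#!/bin/sh
# em_state_dirs (hypothetical helper): print the two directories that the
# error messages above refer to.
#   $1 = Oracle home, $2 = host name, $3 = database name suffix
em_state_dirs() {
  printf '%s\n' "$1/oc4j/j2ee/OC4J_DBConsole_${2}_${3}" "$1/${2}_${3}"
}

# missing_em_dirs (hypothetical helper): print only the directories that do not exist.
missing_em_dirs() {
  em_state_dirs "$@" | while IFS= read -r d; do
    [ -d "$d" ] || printf '%s\n' "$d"
  done
}

# Home path, host name and suffix taken from the transcript above.
missing_em_dirs "${ORACLE_HOME:-/u01/app/oracle/product/11.2.0/db}" jyrac1 rac
```

An empty result means both directories are present.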

11g rac multipath asmlib ASM asm_open error Operation not permitted

On a production database (Oracle Linux, 11.2.0.4 RAC), one node failed to start normally after a reboot.

[root@test1 ~]# su - grid
[grid@test1 ~]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

The healthy node shows the following:

[grid@test2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       test2
ora.DATA.dg
               ONLINE  ONLINE       test2
ora.LISTENER.lsnr
               ONLINE  ONLINE       test2
ora.OCR.dg
               ONLINE  ONLINE       test2
ora.asm
               ONLINE  ONLINE       test2                    Started
ora.gsd
               OFFLINE OFFLINE      test2
ora.net1.network
               ONLINE  ONLINE       test2
ora.ons
               ONLINE  ONLINE       test2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       test2
ora.cvu
      1        ONLINE  ONLINE       test2
ora.dgdb1.vip
      1        ONLINE  INTERMEDIATE test2                    FAILED OVER
ora.dgdb2.vip
      1        ONLINE  ONLINE       test2
ora.oc4j
      1        ONLINE  ONLINE       test2
ora.test.db
      1        ONLINE  OFFLINE
      2        ONLINE  ONLINE       test2                    Open
ora.scan1.vip
      1        ONLINE  ONLINE       test2

[grid@test1 grid]$ crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Check the CSS service status. The connection fails:

[grid@test1 grid]$ crsctl check css
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

Check the cssd process. Only cssdmonitor is running; the cssd daemon itself has not started:

[grid@test1 grid]$ ps -ef |grep cssd
root      22124      1  0 19:37 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdmonitor
grid      22496  15743  0 19:40 pts/3    00:00:00 grep cssd
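
The same check can be scripted so that the grep process itself and cssdmonitor are not mistaken for the daemon. This is a sketch under the assumption that the CSS daemon appears in the process list as ocssd.bin; cssd_state is a hypothetical helper that reads `ps -ef` style output on standard input.

```shell
#!/bin/sh
# cssd_state (hypothetical helper): read process-list text on stdin and
# report whether a line mentioning ocssd.bin is present.
cssd_state() {
  if grep -q 'ocssd\.bin'; then
    echo "running"
  else
    echo "not running"
  fi
}

# Sample input: the two lines from the ps output above.
cssd_state <<'EOF'
root      22124      1  0 19:37 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdmonitor
grid      22496  15743  0 19:40 pts/3    00:00:00 grep cssd
EOF
```

On a live node the input would come from `ps -ef | cssd_state` instead of the sample lines.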


[grid@test1 grid]$ crs_stat -p ora.cssd
CRS-0184: Cannot communicate with the CRS daemon.

Check ocssd.log:

[root@dgdb1 grid]# tail -f /u01/app/11.2.0/grid/log/test1/cssd/ocssd.log

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR1:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR2:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR3:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:TEST_ARCH1:

2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:TEST_DATA1:

2016-11-21 16:51:34.870: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:TEST_DATA2:

2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
)
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
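
A quick way to see whether every fetched ASMLib disk failed to open is to count the two kinds of lines in the log. The helper below is a hypothetical sketch that matches only the two message texts visible in the excerpt above; the sample input is an abbreviated copy of those lines.

```shell
#!/bin/sh
# asmlib_log_summary (hypothetical helper): count "Fetching asmlib disk"
# lines and asm_open permission errors in ocssd.log text read from stdin.
asmlib_log_summary() {
  awk '/Fetching asmlib disk/ { f++ }
       /asm_open error Operation not permitted/ { e++ }
       END { printf "fetched=%d open_errors=%d\n", f, e }'
}

# Abbreviated sample lines copied from the log excerpt above.
asmlib_log_summary <<'EOF'
2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR1:
2016-11-21 16:51:34.869: [   SKGFD][2561705728]Fetching asmlib disk :ORCL:OCR2:
2016-11-21 16:51:34.870: [   SKGFD][2561705728]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
EOF
```

Against the real file the call would be `asmlib_log_summary < /u01/app/11.2.0/grid/log/test1/cssd/ocssd.log`. In the excerpt above, six disks are fetched and six open errors follow.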

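The errors above show that the ASMLib asm_open call failed with a permission error (Operation not permitted). ASMLib has to be told which disks to scan and which to ignore during disk discovery. This environment uses Multipath for its disks, so the scan must include devices whose names start with dm and exclude those whose names start with sd. A problem like this should only occur on disks managed by Multipath. Edit /etc/sysconfig/oracleasm: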

[root@test bin]# vi /etc/sysconfig/oracleasm
#
# This is a configuration file for automatic loading of the Oracle
# Automatic Storage Management library kernel driver.  It is generated
# By running /etc/init.d/oracleasm configure.  Please use that method
# to modify this file
#

# ORACLEASM_ENABLED: 'true' means to load the driver on boot.
ORACLEASM_ENABLED=true

# ORACLEASM_UID: Default user owning the /dev/oracleasm mount point.
ORACLEASM_UID=grid

# ORACLEASM_GID: Default group owning the /dev/oracleasm mount point.
ORACLEASM_GID=asmadmin

# ORACLEASM_SCANBOOT: 'true' means scan for ASM disks on boot.
ORACLEASM_SCANBOOT=true

# ORACLEASM_SCANORDER: Matching patterns to order disk scanning
ORACLEASM_SCANORDER="dm"    # pattern of disk names to scan

# ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
ORACLEASM_SCANEXCLUDE="sd"  # pattern of disk names to exclude from the scan

# ORACLEASM_USE_LOGICAL_BLOCK_SIZE: 'true' means use the logical block size
# reported by the underlying disk instead of the physical. The default
# is 'false'
ORACLEASM_USE_LOGICAL_BLOCK_SIZE=false
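
After editing, the two settings can be verified without opening the file. The sketch below assumes the values are written in double quotes, as in the listing above; asmlib_setting and check_multipath_scan are hypothetical helper names, and the demonstration runs against a temporary sample file rather than the real /etc/sysconfig/oracleasm.

```shell
#!/bin/sh
# asmlib_setting (hypothetical helper): print the double-quoted value of a variable.
#   $1 = config file, $2 = variable name
asmlib_setting() {
  sed -n "s/^$2=\"\([^\"]*\)\".*/\1/p" "$1"
}

# check_multipath_scan (hypothetical helper): confirm that dm is scanned and sd is excluded.
check_multipath_scan() {
  order=$(asmlib_setting "$1" ORACLEASM_SCANORDER)
  exclude=$(asmlib_setting "$1" ORACLEASM_SCANEXCLUDE)
  case " $order " in
    *" dm "*) ;;
    *) echo "SCANORDER does not include dm"; return 1 ;;
  esac
  case " $exclude " in
    *" sd "*) ;;
    *) echo "SCANEXCLUDE does not include sd"; return 1 ;;
  esac
  echo "ok"
}

# Demonstration on a temporary file holding the two lines from the listing above.
sample=$(mktemp)
printf '%s\n' 'ORACLEASM_SCANORDER="dm"' 'ORACLEASM_SCANEXCLUDE="sd"' > "$sample"
check_multipath_scan "$sample"
rm -f "$sample"
```

On the affected node the call would be `check_multipath_scan /etc/sysconfig/oracleasm`.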

Reload ASMLib:

[root@test1 bin]# oracleasm exit
Unmounting ASMlib driver filesystem: /dev/oracleasm
Unloading module "oracleasm": oracleasm
[root@test1 bin]# oracleasm init
Loading module "oracleasm": oracleasm
Configuring "oracleasm" to use device physical block size
Mounting ASMlib driver filesystem: /dev/oracleasm

Scan the disks:

[root@test1 ~]# /etc/init.d/oracleasm scandisks
Scanning the system for Oracle ASMLib disks: [  OK  ]
[root@test1 ~]# oracleasm listdisks
OCR1
OCR2
OCR3
TEST_ARCH1
TEST_DATA1
TEST_DATA2

Stop CRS:

[root@test bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'dgdb1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'dgdb1'
CRS-2673: Attempting to stop 'ora.crf' on 'dgdb1'
CRS-2677: Stop of 'ora.mdnsd' on 'dgdb1' succeeded
CRS-2677: Stop of 'ora.crf' on 'dgdb1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'dgdb1'
CRS-2677: Stop of 'ora.gipcd' on 'dgdb1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'dgdb1'
CRS-2677: Stop of 'ora.gpnpd' on 'dgdb1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'dgdb1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

Start CRS:

[root@test1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[grid@test1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.DATA.dg
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.LISTENER.lsnr
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.OCR.dg
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.asm
               ONLINE  ONLINE       test1                    Started
               ONLINE  ONLINE       test2                    Started
ora.gsd
               OFFLINE OFFLINE      test1
               OFFLINE OFFLINE      test2
ora.net1.network
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
ora.ons
               ONLINE  ONLINE       test1
               ONLINE  ONLINE       test2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       test2
ora.cvu
      1        ONLINE  ONLINE       test2
ora.test1.vip
      1        ONLINE  ONLINE       test1
ora.test2.vip
      1        ONLINE  ONLINE       test2
ora.oc4j
      1        ONLINE  ONLINE       test2
ora.test.db
      1        ONLINE  ONLINE       test1                    Open
      2        ONLINE  ONLINE       test2                    Open
ora.scan1.vip
      1        ONLINE  ONLINE       test2

All services on this node have now started normally.