Handling "rootcrs.pl execution failed" when running root.sh for Oracle 11gR2 Grid Infrastructure

While installing Oracle 11.2.0.4 on Red Hat Linux 6.1, running the /u01/app/product/11.2.0/crs/root.sh script reported the following error:

/u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n beiku1 ... failed
FirstNode configuration failed at /u01/app/product/11.2.0/crs/crs/install/crsconfig_lib.pm line 9379.
/u01/app/product/11.2.0/crs/perl/bin/perl -I/u01/app/product/11.2.0/crs/perl/lib -I/u01/app/product/11.2.0/crs/crs/install /u01/app/product/11.2.0/crs/crs/install/rootcrs.pl execution failed

The error output shows that srvctl start nodeapps -n beiku1 failed, so try running that command manually:

[grid@beiku1 bin]$ ./srvctl start nodeapps -n beiku1
PRCR-1013 : Failed to start resource ora.ons
PRCR-1064 : Failed to start resource ora.ons on node beiku1
CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log"
CRS-2674: Start of 'ora.ons' on 'beiku1' failed

The error is Start of 'ora.ons' on 'beiku1' failed, so next check the $ORACLE_HOME/cfgtoollogs/crsconfig/rootcrs_$HOSTNAME.log log file:

[grid@beiku1 crs]$ cd $ORACLE_HOME/cfgtoollogs/crsconfig/
[grid@beiku1 crsconfig]$ ls -lrt
total 332
-rwxrwxr-x 1 grid oinstall  81336 Aug 26 15:36 srvmcfg0.log
-rwxrwxr-x 1 grid oinstall  18719 Aug 26 15:36 srvmcfg1.log
-rwxrwxr-x 1 grid oinstall  23213 Aug 26 15:36 srvmcfg2.log
-rwxrwxr-x 1 grid oinstall  24700 Aug 26 15:36 srvmcfg3.log
-rwxrwxr-x 1 grid oinstall  10705 Aug 26 15:36 srvmcfg4.log
-rwxrwxr-x 1 grid oinstall  25594 Aug 26 15:37 srvmcfg5.log
-rwxrwxr-x 1 grid oinstall 132771 Aug 26 15:37 rootcrs_beiku1.log
[grid@beiku1 crsconfig]$ cat rootcrs_beiku1.log
2015-08-26 15:36:52: J2EE (OC4J) Container Resource Add Wallet ... passed ...
2015-08-26 15:36:52: Running as user grid: /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate
2015-08-26 15:36:52: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/qosctl -autogenerate '
2015-08-26 15:36:54: Removing file /tmp/fileoriV8Q
2015-08-26 15:36:54: Successfully removed file: /tmp/fileoriV8Q
2015-08-26 15:36:54: /bin/su successfully executed

2015-08-26 15:36:54: qosctl output: User qosadmin added successfully.

User oc4jadmin added successfully.

2015-08-26 15:36:54: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin
2015-08-26 15:36:54: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user oc4jadmin '
2015-08-26 15:36:55: Removing file /tmp/fileHsIIY7
2015-08-26 15:36:55: Successfully removed file: /tmp/fileHsIIY7
2015-08-26 15:36:55: /bin/su successfully executed

2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin
2015-08-26 15:36:55: s_run_as_user2: Running /bin/su grid -c ' /u01/app/product/11.2.0/crs/bin/crsctl query wallet -type APPQOSADMIN -user qosadmin '
2015-08-26 15:36:55: Removing file /tmp/fileQXtLZo
2015-08-26 15:36:55: Successfully removed file: /tmp/fileQXtLZo
2015-08-26 15:36:55: /bin/su successfully executed

2015-08-26 15:36:55: Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: trace file=/u01/app/product/11.2.0/crs/cfgtoollogs/crsconfig/srvmcfg5.log
2015-08-26 15:36:55: Running as user grid: /u01/app/product/11.2.0/crs/bin/srvctl add cvu
2015-08-26 15:36:55:   Invoking "/u01/app/product/11.2.0/crs/bin/srvctl add cvu" as user "grid"
2015-08-26 15:36:55: Executing /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:55: Executing cmd: /bin/su grid -c "/u01/app/product/11.2.0/crs/bin/srvctl add cvu"
2015-08-26 15:36:57: add cvu ... success
2015-08-26 15:36:57: starting nodeapps...
2015-08-26 15:36:57: DHCP_flag=0
2015-08-26 15:36:57: nodes_to_start=beiku1
2015-08-26 15:37:18: exit value of start nodeapps/vip is 1
2015-08-26 15:37:18: output for start nodeapps is  PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node beiku1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'beiku1' failed
2015-08-26 15:37:18: output of startnodeapp after removing already started mesgs is PRCR-1013 : Failed to start resource ora.ons PRCR-1064 : Failed to start resource ora.ons on node beiku1 CRS-5016: Process "/u01/app/product/11.2.0/crs/opmn/bin/onsctli" spawned by agent "/u01/app/product/11.2.0/crs/bin/oraagent.bin" for action "start" failed: details at "(:CLSN00010:)" in "/u01/app/product/11.2.0/crs/log/beiku1/agent/crsd/oraagent_grid/oraagent_grid.log" CRS-2674: Start of 'ora.ons' on 'beiku1' failed
2015-08-26 15:37:18: /u01/app/product/11.2.0/crs/bin/srvctl start nodeapps -n beiku1 ... failed

Check the $GRID_HOME/opmn/logs/ons.log.* files for one of the following errors:
1.

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:37:02+08:00] [internal] getaddrinfo(::0, 6200, 1) failed (Hostname and service name not provided or found): Connection timed out

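If this error is present, the cause is that localhost in /etc/hosts does not map to 127.0.0.1. The fix is to make sure localhost is set up correctly in DNS and in /etc/hosts. Whether DNS or /etc/hosts is consulted depends on the name resolution order configured in /etc/nsswitch.conf or /etc/netsvc.conf, depending on the platform. See MOS notes ID 942166.1 and ID 969254.1 for how to handle this.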

2.

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:37:02+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:37:02+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:39:42+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:39:42+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] any: BIND (Address already in use)

The cause is that another process is already using the ONS port:

[grid@beiku1 logs]$ grep port $ORACLE_HOME/opmn/conf/ons.config
localport=6100          # line added by Agent
remoteport=6200         # line added by Agent

[root@beiku1 /]# lsof | grep 6200 | grep LISTEN
ons       16413      grid    6u     IPv6     162533                  TCP *:6200 (LISTEN)

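The ons process with PID 16413 is holding port 6200. The fix is to make sure no other process is using this port. If the port is already taken before rootupgrade.sh is run for an upgrade, the likely cause is that the ons process of the old version is still running.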

3.

[grid@beiku1 oraagent_grid]$ cd $ORACLE_HOME/opmn/logs/
[grid@beiku1 logs]$ ls -lrt
total 8
-rw-r--r-- 1 grid oinstall 576 Aug 26 15:48 ons.log.beiku1
-rw-r--r-- 1 grid oinstall 267 Aug 26 15:48 ons.out
[grid@beiku1 logs]$ cat ons.log.beiku1
[2015-08-26T15:48:40+08:00] [ons] [NOTIFICATION:1] [104] [ons-internal] ONS server initiated
[2015-08-26T15:48:40+08:00] [ons] [ERROR:1] [17] [ons-listener] 0000:0000:0000:0000:0000:0000:0000:0001,6100: BIND (Cannot assign requested address)

This can happen when IPv6 is only partially configured; 11gR2 Grid Infrastructure does not support IPv6. The fix is to set the following parameter in the $GRID_HOME/opmn/conf/ons.config and ons.config. files:
interface=ipv4
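
The three log signatures above can be told apart mechanically. The following is only a sketch: classify_ons_log is a hypothetical helper name, not an Oracle tool, and its patterns are taken solely from the log excerpts shown in this article, so other ONS versions may word the messages differently.

```shell
# classify_ons_log FILE
# Hypothetical helper: reports which of the three cases above an ONS log matches.
# The patterns are copied from the log excerpts in this article.
classify_ons_log() {
    if grep -q 'getaddrinfo(.*) failed' "$1"; then
        echo "case 1: localhost does not resolve correctly"
    elif grep -q 'BIND (Address already in use)' "$1"; then
        echo "case 2: ONS port is already in use"
    elif grep -q 'BIND (Cannot assign requested address)' "$1"; then
        echo "case 3: IPv6 is partially configured"
    else
        echo "no known ONS startup error found"
    fi
}
```

For example, classify_ons_log "$ORACLE_HOME/opmn/logs/ons.log.beiku1" would report which case applies on this node.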

The error in this case is the second one: the ons process with PID 16413 is holding port 6200. The fix is to make sure no other process is using this port.

[root@beiku1 /]# lsof | grep 6200 | grep LISTEN
ons       16413      grid    6u     IPv6     162533                  TCP *:6200 (LISTEN)
[root@beiku1 /]# kill -9 16413
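
Filtering lsof output with grep 6200 can also match a PID or inode number that happens to contain 6200. A slightly stricter filter is sketched below; listener_pids is a hypothetical helper name, and it assumes lsof output in the column layout shown above, with the address in the second-to-last column and (LISTEN) in the last.

```shell
# listener_pids PORT
# Hypothetical helper: reads lsof-style output on stdin and prints the PID of
# every process that is in LISTEN state on exactly the given TCP port.
listener_pids() {
    awk -v port="$1" '
        $NF == "(LISTEN)" && $(NF - 1) ~ (":" port "$") { print $2 }
    ' | sort -u
}

# Possible usage as root (assumes an lsof that supports -i and -s):
#   lsof -nP -iTCP:6200 -sTCP:LISTEN | listener_pids 6200
```

It is usually worth trying a plain kill on the reported PID first and falling back to kill -9 only if the process does not exit.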

Then run the root.sh script again:

[root@beiku1 /]# ./u01/app/product/11.2.0/crs/root.sh
Performing root user operation for Oracle 11g

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/product/11.2.0/crs

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/product/11.2.0/crs/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
PRKO-2190 : VIP exists for node beiku1, VIP name beiku1-vip
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

After the process holding port 6200 was killed, the root.sh script completed successfully.

Red Hat Linux DNS configuration guide

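Oracle 11g RAC introduced the SCAN IP, and one way to provide SCAN IPs is through DNS. This guide describes in detail how to configure DNS on Red Hat Linux 5.4.
Red Hat Linux 5.4 DNS configuration
Change the host name before configuring DNS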

[root@beiku1 etc]# hostname beiku1.sbyy.com
[root@beiku1 etc]# vi /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               beiku1.sbyy.com localhost
::1             localhost6.localdomain6 localhost6
10.138.130.161 beiku1

[root@beiku1 etc]# vi /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=beiku1.sbyy.com
GATEWAY=10.138.130.254

I. Install the packages
On Red Hat Linux 5.4, the full set of bind packages for the DNS service is:

bind-9.3.6-4.P1.el5 
bind-libbind-devel-9.3.6-4.P1.el5 
kdebindings-devel-3.5.4-6.el5 
kdebindings-3.5.4-6.el5 
bind-devel-9.3.6-4.P1.el5 
bind-utils-9.3.6-4.P1.el5 
bind-chroot-9.3.6-4.P1.el5 
ypbind-1.19-12.el5 
system-config-bind-4.0.3-4.el5 
bind-libs-9.3.6-4.P1.el5 
bind-sdb-9.3.6-4.P1.el5 

Use rpm -qa | grep bind to check which of these packages are already installed:

[root@beiku1 soft]# rpm -qa | grep bind
bind-chroot-9.3.6-4.P1.el5
kdebindings-3.5.4-6.el5
ypbind-1.19-12.el5
bind-libs-9.3.6-4.P1.el5
bind-9.3.6-4.P1.el5
system-config-bind-4.0.3-4.el5
bind-utils-9.3.6-4.P1.el5

Install any package that is missing with the following commands:

[root@beiku1 soft]# rpm -ivh bind-9.3.6-4.P1.el5.i386.rpm
warning: bind-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
        package bind-9.3.6-4.P1.el5.i386 is already installed
[root@beiku1 soft]# rpm -ivh caching-nameserver-9.3.6-4.P1.el5.i386.rpm
warning: caching-nameserver-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:caching-nameserver     ########################################### [100%]

[root@beiku1 soft]# rpm -ivh install kdebindings-devel-3.5.4-6.el5.i386.rpm
error: open of install failed: No such file or directory
warning: kdebindings-devel-3.5.4-6.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
[root@beiku1 soft]# rpm -ivh kdebindings-devel-3.5.4-6.el5.i386.rpm
warning: kdebindings-devel-3.5.4-6.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:kdebindings-devel      ########################################### [100%]
[root@beiku1 soft]# rpm -ivh bind-sdb-9.3.6-4.P1.el5.i386.rpm
warning: bind-sdb-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:bind-sdb               ########################################### [100%]
[root@beiku1 soft]# rpm -ivh bind-libbind-devel-9.3.6-4.P1.el5.i386.rpm
warning: bind-libbind-devel-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:bind-libbind-devel     ########################################### [100%]
[root@beiku1 soft]# rpm -ivh bind-devel-9.3.6-4.P1.el5.i386.rpm
warning: bind-devel-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:bind-devel             ########################################### [100%]

The caching-nameserver-9.3.6-4.P1.el5 package also has to be installed manually. Without it the named service cannot start and reports an error such as:

[root@beiku1 ~]# service named start
Locating /var/named/chroot//etc/named.conf failed:
[FAILED]

[root@beiku1 soft]# rpm -ivh caching-nameserver-9.3.6-4.P1.el5.i386.rpm
warning: caching-nameserver-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:caching-nameserver     ########################################### [100%]

[root@beiku1 soft]# service named start
Starting named: [  OK  ]

II. Copy the template file
Because the chroot environment is installed, the main DNS configuration file belongs in the /var/named/chroot/etc directory:

[root@beiku1 soft]# cd /var/named/chroot/
[root@beiku1 chroot]# ls
dev  etc  proc  var
[root@beiku1 chroot]# cd etc
[root@beiku1 etc]# ls
localtime  named.caching-nameserver.conf  named.rfc1912.zones  rndc.key
[root@beiku1 etc]#

The named.caching-nameserver.conf file contains the following:

[root@beiku1 etc]# cat named.caching-nameserver.conf
//
// named.caching-nameserver.conf
//
// Provided by Red Hat caching-nameserver package to configure the
// ISC BIND named(8) DNS server as a caching only nameserver 
// (as a localhost DNS resolver only). 
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
// DO NOT EDIT THIS FILE - use system-config-bind or an editor
// to create named.conf - edits to this file will be lost on 
// caching-nameserver package upgrade.
//
options {
        listen-on port 53 { 127.0.0.1; };
        listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";

        // Those options should be used carefully because they disable port
        // randomization
        // query-source    port 53;
        // query-source-v6 port 53;

        allow-query     { localhost; };
        allow-query-cache { localhost; };
};
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};
view localhost_resolver {
        match-clients      { localhost; };
        match-destinations { localhost; };
        recursion yes;
        include "/etc/named.rfc1912.zones";
};

The comments in this file say not to edit it directly but to create a named.conf file and edit that instead. Once named.conf exists, this file is no longer read. So copy named.caching-nameserver.conf to named.conf:

[root@beiku1 etc]# cp -p named.caching-nameserver.conf named.conf
[root@beiku1 etc]# ls
localtime  named.caching-nameserver.conf  named.conf  named.rfc1912.zones  rndc.key

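The named.conf file has been created. It is best to pass the -p option to cp so that ownership and permissions are preserved; otherwise the service reports a permission denied error when it starts.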

III. Edit the named.conf file

[root@beiku1 etc]# vi named.conf
//
// named.caching-nameserver.conf
//
// Provided by Red Hat caching-nameserver package to configure the
// ISC BIND named(8) DNS server as a caching only nameserver
// (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
// DO NOT EDIT THIS FILE - use system-config-bind or an editor
// to create named.conf - edits to this file will be lost on
// caching-nameserver package upgrade.
//
options {
        listen-on port 53 { any; };
        listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";

        // Those options should be used carefully because they disable port
        // randomization
        // query-source    port 53;
        // query-source-v6 port 53;

        allow-query     { 10.138.130.0/24; };
        allow-query-cache { any; };
};
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};
view localhost_resolver {
        match-clients      { 10.138.130.0/24; };
        match-destinations { any; };
        recursion yes;
        include "/etc/named.rfc1912.zones";
};

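The meaning of each directive is as follows.
options
Global configuration.
listen-on port 53 { any; };
The DNS service listens on all IPv4 interfaces.
listen-on-v6 port 53 { ::1; };
For IPv6, the service listens only on the loopback interface.
directory "/var/named";
Directory that holds the zone files; inside the chroot environment this is /var/named/chroot/var/named.
dump-file "/var/named/data/cache_dump.db";
File to which the cache is dumped.
statistics-file "/var/named/data/named_stats.txt";
File to which query statistics are written.
memstatistics-file "/var/named/data/named_mem_stats.txt";
File to which memory usage statistics are written.
allow-query { 10.138.130.0/24; };
Clients that are allowed to query; changed here to the local subnet.
allow-query-cache { any; };
Clients that are allowed to query the cache; any allows everyone.
logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
Defines where the log is written: the /var/named/chroot/var/named/data/ directory.
view localhost_resolver {
match-clients { 10.138.130.0/24; };
match-destinations { any; };
recursion yes;
include "/etc/named.rfc1912.zones";
};

This block defines a view.
match-clients specifies the client addresses the view applies to.
match-destinations specifies the destination addresses the view applies to.
named.conf is now configured. The view ends with include "/etc/named.rfc1912.zones"; so that file is configured next. Different views can also be created to match different clients.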

IV. Define the zones

[root@beiku1 etc]# vi  named.rfc1912.zones
// named.rfc1912.zones:
//
// Provided by Red Hat caching-nameserver package 
//
// ISC BIND named zone configuration for zones recommended by
// RFC 1912 section 4.1 : localhost TLDs and address zones
// 
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
zone "." IN {
        type hint;
        file "named.ca";
};

zone "sbyy.com" IN {
        type master;
        file "sbyy.zone";
        allow-update { none; };
};

zone "130.138.10.in-addr.arpa" IN {
        type master;
        file "named.sbyy";
        allow-update { none; };
};

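The meaning of each directive is as follows.
zone "." The root zone.
zone "sbyy.com" Defines the forward lookup zone.
zone "130.138.10.in-addr.arpa" Defines the reverse lookup zone.
IN The Internet record class.
type hint The root zone is of type hint.
type master The zone is a master (primary) zone.
file "named.ca"; The zone file is named.ca.
file "sbyy.zone"; The forward lookup zone file is sbyy.zone.
file "named.sbyy"; The reverse lookup zone file is named.sbyy.
allow-update { none; }; Whether clients may update the zone dynamically; none disables it.
The named.ca file lists the 13 root servers.
The sbyy.zone file holds the forward lookup data.
The named.sbyy file holds the reverse lookup data.
The zone definitions are now complete; next, edit the zone data files.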
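
The name of a reverse lookup zone is the network part of the address with its octets reversed, followed by in-addr.arpa. As a small sketch of that rule for a /24 network (reverse_zone is a hypothetical helper name, not part of BIND):

```shell
# reverse_zone A.B.C[.D]
# Hypothetical helper: prints the in-addr.arpa zone name for the /24 network
# that contains the given IPv4 address.
reverse_zone() {
    echo "$1" | awk -F. '{ print $3 "." $2 "." $1 ".in-addr.arpa" }'
}
```

For the 10.138.130.0/24 subnet used here, reverse_zone 10.138.130.161 prints 130.138.10.in-addr.arpa, which is the zone name that has to appear in named.rfc1912.zones.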

V. Create the zone data files from the templates

[root@beiku1 etc]# cd /var/named/chroot/var/named/
[root@beiku1 named]# ls
data  localdomain.zone  localhost.zone  named.broadcast  named.ca  named.ip6.local  named.local  named.zero  slaves

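The /var/named/ directory inside the chroot environment contains several template files. named.ca is the zone data file for the root zone. Copy localhost.zone to sbyy.zone to serve as the forward lookup file, and copy named.local to named.sbyy to serve as the reverse lookup file. The file names must match the ones referenced in /etc/named.rfc1912.zones.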

[root@beiku1 named]# cp -p localhost.zone sbyy.zone
[root@beiku1 named]# cp -p named.local named.sbyy
[root@beiku1 named]# ls 
data              named.broadcast  named.local  sbyy.zone
localdomain.zone  named.ca         named.sbyy   slaves
localhost.zone    named.ip6.local  named.zero

The copies succeeded, so the forward and reverse lookup files now exist.

VI. Define the zone data files
1. Define the forward lookup zone file

[root@beiku1 named]# vi sbyy.zone
$TTL    86400
@               IN SOA  beiku1.sbyy.com.       root.sbyy.com. (
                                        44              ; serial (d. adams)
                                        3H              ; refresh
                                        15M              ; retry
                                        1W              ; expiry
                                        1D )            ; minimum

@              IN NS           beiku1.sbyy.com.


beikuscan      IN A            10.138.130.167
beikuscan      IN A            10.138.130.168
beikuscan      IN A            10.138.130.169
beiku2         IN A            10.138.130.162
beiku1         IN A            10.138.130.161

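Each line of the forward lookup zone file is explained below.
$TTL 86400
The default time to live for records in this zone is 86400 seconds (24 hours).

@ IN SOA beiku1.sbyy.com. root.sbyy.com. (
This is the SOA record; a zone may contain only one.
@ stands for the zone itself, that is, the zone's origin.
IN is the Internet record class.
SOA is the Start of Authority record; beiku1.sbyy.com. names the primary DNS server for the zone.
root.sbyy.com. is the administrator's mailbox, with the first dot standing in for @.

44 ; serial (d. adams)
3H ; refresh
15M ; retry
1W ; expiry
1D ) ; minimum

These values mainly control synchronization between the primary and secondary DNS servers.
44 Serial number. Whenever the data on the primary changes, the serial must be increased; the secondary compares serial numbers to decide whether to synchronize with the primary.
3H Refresh. The secondary checks the primary for changes every three hours.
15M Retry. If a refresh attempt fails, the secondary tries again every 15 minutes.
1W Expiry. If the secondary still cannot reach the primary after one week, it stops serving the zone.
1D Minimum. How long a "no such record" answer is cached.

@ IN NS beiku1.sbyy.com.
This is an NS record; it names beiku1.sbyy.com as the name server. A zone needs at least one NS record.

beiku1 IN A 10.138.130.161
Sets the IP address of beiku1 to 10.138.130.161.

beikuscan IN A 10.138.130.167
Sets an IP address of beikuscan to 10.138.130.167.

beikuscan IN A 10.138.130.168
Sets an IP address of beikuscan to 10.138.130.168.

beikuscan IN A 10.138.130.169
Sets an IP address of beikuscan to 10.138.130.169.

beiku2 IN A 10.138.130.162
Sets the IP address of beiku2 to 10.138.130.162.

The forward lookup zone file is complete. Next, define the reverse lookup zone file.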

2. Define the reverse lookup zone file

[root@beiku1 named]# vi named.sbyy
$TTL    86400
@       IN      SOA     beiku1.sbyy.com. root.sbyy.com.  (
                                      1997022702 ; Serial
                                      120      ; Refresh
                                      120      ; Retry
                                      3600000    ; Expire
                                      86400 )    ; Minimum
@        IN      NS     beiku1.sbyy.com.

167     IN      PTR     beikuscan.sbyy.com.
168     IN      PTR     beikuscan.sbyy.com.
169     IN      PTR     beikuscan.sbyy.com.
162     IN      PTR     beiku2.sbyy.com. 
161     IN      PTR     beiku1.sbyy.com.

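The reverse lookup zone file is configured in much the same way as the forward one: swap the positions of the IP address and the name, and replace A with PTR.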
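
That swap can be sketched as a small filter. a_to_ptr is a hypothetical helper name, and the sketch assumes A records written as NAME IN A ADDRESS inside a single /24 network, as in sbyy.zone above. Note the trailing dot it adds to each PTR target: without it, BIND appends the zone origin to the name.

```shell
# a_to_ptr DOMAIN
# Hypothetical helper: reads "NAME IN A ADDRESS" lines on stdin and prints the
# matching PTR line for a /24 reverse zone (last octet as the owner name).
a_to_ptr() {
    awk -v domain="$1" '
        $2 == "IN" && $3 == "A" {
            n = split($4, octet, ".")
            printf "%s\tIN\tPTR\t%s.%s.\n", octet[n], $1, domain
        }
    '
}

# Possible usage:
#   a_to_ptr sbyy.com < sbyy.zone
```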
The basic DNS configuration is now complete, so check whether DNS works.
First restart the DNS service:

[root@beiku1 etc]# service named restart
Stopping named: [  OK  ]
Starting named: [  OK  ]

The DNS service started successfully.
Before running any queries, the client has to be told which DNS server to use. This is set in /etc/resolv.conf:

[root@beiku1 etc]# vi /etc/resolv.conf
search sbyy.com
nameserver       10.138.130.161


[root@beiku1 etc]# service named restart
Stopping named: [  OK  ]
Starting named: [  OK  ]

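Parameters and their meaning:
nameserver The IP address of a DNS server. There can be several nameserver lines, each with one IP address.
Queries go to the name servers in the order they are listed in this file, and a later nameserver is tried only when the one before it does not respond.
domain Declares the host's domain name. Many programs use it, for example mail systems, and it is also used when a DNS query is made for a host name that has no domain part. If domain is not set, the domain is derived from the host name by removing everything before the first dot (.).
search Its arguments give the order in which domains are searched. A host name without a domain part is looked up in each of the domains listed by search.
domain and search are mutually exclusive; if both are present, the one that appears last is used.
sortlist Sorts the addresses returned for a name. Its arguments are network/netmask pairs, which can be listed in any order.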
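
Because the order of the nameserver lines decides which server is asked first, it can help to print them in file order. A minimal sketch (list_nameservers is a hypothetical helper name; it defaults to /etc/resolv.conf when no file is given):

```shell
# list_nameservers [FILE]
# Hypothetical helper: prints the nameserver addresses in the order in which
# the resolver will try them. FILE defaults to /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```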

Now query the server with the nslookup tool:

[root@beiku1 named]# nslookup beiku1.sbyy.com
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beiku1.sbyy.com
Address: 10.138.130.161

[root@beiku1 named]# nslookup beiku2.sbyy.com
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beiku2.sbyy.com
Address: 10.138.130.162

[root@beiku1 named]# nslookup beikuscan.sbyy.com
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beikuscan.sbyy.com
Address: 10.138.130.169
Name:   beikuscan.sbyy.com
Address: 10.138.130.167
Name:   beikuscan.sbyy.com
Address: 10.138.130.168

[root@beiku1 named]# nslookup beiku1
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beiku1.sbyy.com
Address: 10.138.130.161

[root@beiku1 named]# nslookup beiku2
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beiku2.sbyy.com
Address: 10.138.130.162

[root@beiku1 named]# nslookup beikuscan
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beikuscan.sbyy.com
Address: 10.138.130.168
Name:   beikuscan.sbyy.com
Address: 10.138.130.169
Name:   beikuscan.sbyy.com
Address: 10.138.130.167

[root@beiku1 named]# nslookup 10.138.130.161
Server:         10.138.130.161
Address:        10.138.130.161#53

161.130.138.10.in-addr.arpa     name = beiku1.sbyy.com.

[root@beiku1 named]# nslookup 10.138.130.162
Server:         10.138.130.161
Address:        10.138.130.161#53

162.130.138.10.in-addr.arpa     name = beiku2.sbyy.com.

[root@beiku1 named]# nslookup 10.138.130.167
Server:         10.138.130.161
Address:        10.138.130.161#53

167.130.138.10.in-addr.arpa     name = beikuscan.sbyy.com.

[root@beiku1 named]# nslookup 10.138.130.168
Server:         10.138.130.161
Address:        10.138.130.161#53

168.130.138.10.in-addr.arpa     name = beikuscan.sbyy.com.

[root@beiku1 named]# nslookup 10.138.130.169
Server:         10.138.130.161
Address:        10.138.130.161#53

169.130.138.10.in-addr.arpa     name = beikuscan.sbyy.com.
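
The SCAN name is expected to resolve to all three addresses defined for beikuscan, and nslookup returns them in a rotating order, so a sorted list is easier to compare. The filter below is only a sketch: answer_addresses is a hypothetical helper name, and it assumes the nslookup output format shown above, where the server's own address carries a #53 suffix.

```shell
# answer_addresses
# Hypothetical helper: reads nslookup output on stdin and prints the answer
# addresses in sorted order, skipping the "Address: ...#53" line that
# describes the DNS server itself.
answer_addresses() {
    awk '$1 == "Address:" && $2 !~ /#/ { print $2 }' | sort
}

# Possible usage:
#   nslookup beikuscan.sbyy.com | answer_addresses
```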

DNS resolution works as expected. So far only the primary DNS server has been configured, and it is working correctly. Next, configure a secondary DNS server.

Configuring the secondary DNS server
The setup on the secondary DNS server is largely the same as on the primary.
I. Install the packages

[root@beiku2 soft]# rpm -qa | grep bind
bind-chroot-9.3.6-4.P1.el5
kdebindings-3.5.4-6.el5
system-config-bind-4.0.3-4.el5
ypbind-1.19-12.el5
bind-libs-9.3.6-4.P1.el5
bind-9.3.6-4.P1.el5
bind-utils-9.3.6-4.P1.el5
[root@beiku2 soft]# rpm -ivh kdebindings-devel-3.5.4-6.el5.i386.rpm
warning: kdebindings-devel-3.5.4-6.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:kdebindings-devel      ########################################### [100%]
[root@beiku2 soft]# rpm -ivh caching-nameserver-9.3.6-4.P1.el5.i386.rpm
warning: caching-nameserver-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:caching-nameserver     ########################################### [100%]
[root@beiku2 soft]# rpm -ivh bind-sdb-9.3.6-4.P1.el5.i386.rpm
warning: bind-sdb-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:bind-sdb               ########################################### [100%]
[root@beiku2 soft]# rpm -ivh bind-libbind-devel-9.3.6-4.P1.el5.i386.rpm
warning: bind-libbind-devel-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:bind-libbind-devel     ########################################### [100%]
[root@beiku2 soft]# rpm -ivh bind-devel-9.3.6-4.P1.el5.i386.rpm
warning: bind-devel-9.3.6-4.P1.el5.i386.rpm: Header V3 DSA signature: NOKEY, key ID 37017186
Preparing...                ########################################### [100%]
   1:bind-devel             ########################################### [100%]

II. Copy the template file

[root@beiku2 /]# cd /var/named/chroot/etc
[root@beiku2 etc]# ls -lrt
total 24
-rw-r--r-- 1 root root  3519 Feb 27  2006 localtime
-rw-r----- 1 root named  955 Jul 30  2009 named.rfc1912.zones
-rw-r----- 1 root named 1230 Jul 30  2009 named.caching-nameserver.conf
-rw-r----- 1 root named  113 Nov 15  2014 rndc.key

[root@beiku2 etc]# cp -p named.caching-nameserver.conf named.conf

III. Edit the named.conf file

[root@beiku2 etc]# vi named.conf
//
// named.caching-nameserver.conf
//
// Provided by Red Hat caching-nameserver package to configure the
// ISC BIND named(8) DNS server as a caching only nameserver
// (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
// DO NOT EDIT THIS FILE - use system-config-bind or an editor
// to create named.conf - edits to this file will be lost on
// caching-nameserver package upgrade.
//
options {
        listen-on port 53 { any; };
        listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";

        // Those options should be used carefully because they disable port
        // randomization
        // query-source    port 53;
        // query-source-v6 port 53;

        allow-query     { 10.138.130.0/24; };
        allow-query-cache { any; };
};
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};
view localhost_resolver {
        match-clients      { 10.138.130.0/24; };
        match-destinations { any; };
        recursion yes;
        include "/etc/named.rfc1912.zones";
};

This is the same configuration as on the primary DNS server.

IV. Define the zones

[root@beiku2 etc]# vi named.rfc1912.zones
// named.rfc1912.zones:
//
// Provided by Red Hat caching-nameserver package
//
// ISC BIND named zone configuration for zones recommended by
// RFC 1912 section 4.1 : localhost TLDs and address zones
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//

zone "sbyy.com" IN {
        type slave;
        masters {10.138.130.161;};
        file "slaves/sbyy.com";
};

zone "0.138.10.in-addr.arpa" IN {
        type slave;
        masters {10.138.130.161;};
        file "slaves/named.sbyy";
};

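The zone definitions on the secondary DNS server differ slightly from those on the primary.
On the secondary, type must be set to slave.
masters { 10.138.130.161; }; The IP address of the primary DNS server must also be specified.
file "slaves/sbyy.com";
file "slaves/named.sbyy";
Why are the zone data files placed in the slaves directory? Because that directory is owned by user named and group named. The DNS service runs as named, and only named has permission to write there, so the zone data files have to go in this directory.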

[root@beiku2 etc]# cd /var/named/chroot/var/named/
[root@beiku2 named]# ls -lrt
total 44
drwxrwx--- 2 named named 4096 Jul 27  2004 slaves
drwxrwx--- 2 named named 4096 Aug 26  2004 data
-rw-r----- 1 root  named  427 Jul 30  2009 named.zero
-rw-r----- 1 root  named  426 Jul 30  2009 named.local
-rw-r----- 1 root  named  424 Jul 30  2009 named.ip6.local
-rw-r----- 1 root  named 1892 Jul 30  2009 named.ca
-rw-r----- 1 root  named  427 Jul 30  2009 named.broadcast
-rw-r----- 1 root  named  195 Jul 30  2009 localhost.zone
-rw-r----- 1 root  named  198 Jul 30  2009 localdomain.zone
[root@beiku2 named]# cd slaves
[root@beiku2 slaves]# ls -lrt
total 0

The slaves directory is owned by user and group named, and it is currently empty.
Now restart the DNS service:

[root@beiku2 slaves]# service named restart
Stopping named: [  OK  ]
Starting named: [  OK  ]

The service started successfully. Right after starting it, look at the system log to see what it reports:

[root@beiku2 slaves]# tail /var/log/messages
Aug 25 23:41:49 beiku2 named[30421]: the working directory is not writable
Aug 25 23:41:49 beiku2 named[30421]: running
Aug 25 23:41:49 beiku2 named[30421]: zone 0.138.10.in-addr.arpa/IN/localhost_resolver: Transfer started.
Aug 25 23:41:49 beiku2 named[30421]: transfer of '0.138.10.in-addr.arpa/IN' from 10.138.130.161#53: connected using 10.138.130.162#44647
Aug 25 23:41:49 beiku2 named[30421]: zone 0.138.10.in-addr.arpa/IN/localhost_resolver: transferred serial 1997022700
Aug 25 23:41:49 beiku2 named[30421]: transfer of '0.138.10.in-addr.arpa/IN' from 10.138.130.161#53: end of transfer
Aug 25 23:41:49 beiku2 named[30421]: zone sbyy.com/IN/localhost_resolver: Transfer started.
Aug 25 23:41:49 beiku2 named[30421]: transfer of 'sbyy.com/IN' from 10.138.130.161#53: connected using 10.138.130.162#56490
Aug 25 23:41:49 beiku2 named[30421]: zone sbyy.com/IN/localhost_resolver: transferred serial 42
Aug 25 23:41:49 beiku2 named[30421]: transfer of 'sbyy.com/IN' from 10.138.130.161#53: end of transfer

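The log shows the secondary transferring both zones from the primary and the serial number of each transfer; the transfers completed successfully. The log is very detailed.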
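
The serial at which each zone was transferred can be pulled out of such a log. This is a sketch only: transferred_serials is a hypothetical helper name, and it assumes the "zone NAME/IN/VIEW: transferred serial N" message format seen in the excerpt above.

```shell
# transferred_serials
# Hypothetical helper: reads named log lines on stdin and prints
# "ZONE SERIAL" for every "transferred serial" message.
transferred_serials() {
    awk '/transferred serial/ {
        zone = ""
        for (i = 1; i < NF; i++) {
            if ($i == "zone") { zone = $(i + 1); break }
        }
        sub(/\/IN.*/, "", zone)   # drop the class and view suffix
        print zone, $NF
    }'
}

# Possible usage:
#   transferred_serials < /var/log/messages
```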
Next, look in the slaves directory again:

[root@beiku2 slaves]# ls -lrt
total 8
-rw-r--r-- 1 named named 414 Aug 25 23:41 sbyy.com
-rw-r--r-- 1 named named 451 Aug 25 23:41 named.sbyy

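The slaves directory was empty a moment ago and now contains two files, sbyy.com and named.sbyy. These are the files named under slaves in the zone definitions. The file names are arbitrary, but the contents are the zone data held by the primary DNS server.
Look at the contents of the two files: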

[root@beiku2 slaves]# cat sbyy.com
$ORIGIN .
$TTL 86400      ; 1 day
sbyy.com                IN SOA  sbyy.com. root.sbyy.com. (
                                42         ; serial
                                10800      ; refresh (3 hours)
                                900        ; retry (15 minutes)
                                604800     ; expire (1 week)
                                86400      ; minimum (1 day)
                                )
                        NS      sbyy.com.
                        A       127.0.0.1
                        AAAA    ::1
$ORIGIN sbyy.com.
beiku1                  A       10.138.130.161
beikuscan1              A       10.138.130.167
beikuscan2              A       10.138.130.168
beikuscan3              A       10.138.130.169
beiku2                  A       10.138.130.162

[root@beiku2 slaves]# cat named.sbyy
$ORIGIN .
$TTL 86400      ; 1 day
0.138.10.in-addr.arpa   IN SOA  localhost. root.localhost. (
                                1997022700 ; serial
                                28800      ; refresh (8 hours)
                                14400      ; retry (4 hours)
                                3600000    ; expire (5 weeks 6 days 16 hours)
                                86400      ; minimum (1 day)
                                )
                        NS      localhost.
$ORIGIN 0.138.10.in-addr.arpa.
1                       PTR     localhost.
161                     PTR     beiku1.sbyy.com
167                     PTR     beikuscan1.sbyy.com
168                     PTR     beikuscan2.sbyy.com
169                     PTR     beikuscan3.sbyy.com
162                     PTR     beiku2.sbyy.com

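The records in these two files match those on the primary DNS server, and named has written them out in a tidy, normalized layout. Both files were generated automatically by the zone transfer.
Now test whether the primary and secondary DNS servers work correctly: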

[root@beiku2 slaves]# vi /etc/resolv.conf
search sbyy.com
nameserver 10.138.130.161
nameserver 10.138.130.162

Both the primary and the secondary DNS servers are now listed as nameservers. Next, test with nslookup:

[root@beiku2 slaves]# nslooup beiku1
-bash: nslooup: command not found
[root@beiku2 slaves]# nslookup beiku1
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beiku1.sbyy.com
Address: 10.138.130.161

[root@beiku2 slaves]# nslookup beiku2
Server:         10.138.130.161
Address:        10.138.130.161#53

Name:   beiku2.sbyy.com
Address: 10.138.130.162

Resolution works, and the answers come from the primary DNS server, 10.138.130.161.
Next, stop the primary DNS server at 10.138.130.161 and see whether the secondary at 10.138.130.162 still answers:

[root@beiku1 named]# service named stop
Stopping named: [  OK  ]

Test again with nslookup:

[root@beiku2 slaves]# nslookup beiku1
Server:         10.138.130.162
Address:        10.138.130.162#53

Name:   beiku1.sbyy.com
Address: 10.138.130.161

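Resolution still succeeds, but this time the answer comes from the secondary DNS server, 10.138.130.162, rather than from the primary, 10.138.130.161. When the primary DNS server on the network goes down, the secondary keeps answering queries. The same pair can also share load: point half of the clients at 10.138.130.161 as their first nameserver with 10.138.130.162 as the second, and point the other half at 10.138.130.162 first with 10.138.130.161 second. Both servers then answer queries in normal operation, and if either one goes down the other continues to serve every client, which gives a simple form of load balancing. At this point the primary and secondary DNS servers are both configured and working.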
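
The split described above comes down to swapping the order of the two nameserver lines in /etc/resolv.conf. A minimal sketch, assuming the clients use a plain resolv.conf like the one in the test earlier; the function name make_resolv_conf is hypothetical, and the resolver simply tries the servers in the order listed:

```shell
# make_resolv_conf SEARCH_DOMAIN FIRST_NS SECOND_NS
# Hypothetical helper: prints a resolv.conf body that lists FIRST_NS
# ahead of SECOND_NS, so the resolver tries FIRST_NS first.
make_resolv_conf() {
    printf 'search %s\nnameserver %s\nnameserver %s\n' "$1" "$2" "$3"
}

# Client group A: asks beiku1 first, falls back to beiku2.
make_resolv_conf sbyy.com 10.138.130.161 10.138.130.162

# Client group B: asks beiku2 first, falls back to beiku1.
make_resolv_conf sbyy.com 10.138.130.162 10.138.130.161
```

Redirect the output of the appropriate call into /etc/resolv.conf on each client. This is a static split rather than dynamic balancing: each client always prefers one server and only uses the other when the first does not answer.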

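Next, one more experiment: add a record on the primary DNS server and check whether the secondary picks it up. Adding the record on the secondary instead would be pointless, because zone data only flows from the primary to the secondary, so the primary would never see it.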

[root@beiku1 named]# vi sbyy.zone
$TTL    86400
@               IN SOA  @       root (
                                        43              ; serial (d. adams)
                                        2M              ; refresh
                                        2M              ; retry
                                        1W              ; expiry
                                        1D )            ; minimum

                IN NS           @
                IN A            127.0.0.1
                IN AAAA         ::1


beiku1          IN A            10.138.130.161
beikuscan      IN A            10.138.130.167
beikuscan      IN A            10.138.130.168
beikuscan      IN A            10.138.130.169
beiku2          IN A            10.138.130.162
www             IN A            10.138.130.170

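The record www IN A 10.138.130.170 has been added. After any change on the primary DNS server, the zone's serial number must be incremented; otherwise the secondary will not transfer the new data. The serial has been incremented here. The refresh interval was previously 3H, which makes the effect slow to observe, so it is changed to 2M, and the retry interval is changed to 2M as well. The secondary now checks the primary every two minutes, and if a check fails it tries again two minutes later. Next, make the corresponding changes in the reverse zone file: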

[root@beiku1 named]# vi named.sbyy
$TTL    86400
@       IN      SOA     beiku1.sbyy.com. root.sbyy.com.  (
                                      1997022703 ; Serial
                                      120      ; Refresh
                                      120      ; Retry
                                      3600000    ; Expire
                                      86400 )    ; Minimum
@        IN      NS     beiku1.sbyy.com.

167     IN      PTR     beikuscan.sbyy.com.
168     IN      PTR     beikuscan.sbyy.com.
169     IN      PTR     beikuscan.sbyy.com.
162     IN      PTR     beiku2.sbyy.com.
161     IN      PTR     beiku1.sbyy.com.
170     IN      PTR     www.sbyy.com.

The reverse zone is now updated as well. Restart the DNS service:

[root@beiku1 named]# service named restart
Stopping named: [  OK  ]
Starting named: [  OK  ]
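
The forward zone above writes the refresh and retry intervals as 2M, while the reverse zone writes 120. Both mean two minutes, because BIND accepts a unit suffix on time values and treats a bare number as seconds. A small sketch of that conversion, assuming a single trailing unit letter (BIND also accepts combined forms such as 1h30m, which this hypothetical helper bind_seconds does not handle):

```shell
# bind_seconds VALUE
# Hypothetical helper: converts a BIND-style time value with one unit
# suffix (S, M, H, D, W, upper or lower case) to seconds. A bare number
# is printed unchanged.
bind_seconds() {
    value=$1
    case $value in
        *[Ss]) mult=1 ;;
        *[Mm]) mult=60 ;;
        *[Hh]) mult=3600 ;;
        *[Dd]) mult=86400 ;;
        *[Ww]) mult=604800 ;;
        *) echo "$value"; return 0 ;;
    esac
    # Strip the trailing unit letter and multiply.
    echo $(( ${value%?} * mult ))
}

bind_seconds 2M   # refresh and retry in sbyy.zone
bind_seconds 1W   # expiry in sbyy.zone
```

These are the values the secondary writes into its own copy of the zone, where every interval is stored in seconds with a human-readable comment.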

The restart succeeded. Wait a few minutes, then check the result, starting with the forward zone file on the secondary DNS server:

[root@beiku2 slaves]# cat sbyy.com
$ORIGIN .
$TTL 86400      ; 1 day
sbyy.com                IN SOA  beiku1.sbyy.com. root.sbyy.com. (
                                45         ; serial
                                120        ; refresh (2 minutes)
                                120        ; retry (2 minutes)
                                604800     ; expire (1 week)
                                86400      ; minimum (1 day)
                                )
                        NS      beiku1.sbyy.com.
$ORIGIN sbyy.com.
beiku1                  A       10.138.130.161
beiku2                  A       10.138.130.162
beikuscan               A       10.138.130.167
                        A       10.138.130.168
                        A       10.138.130.169
www                     A       10.138.130.170

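The record just added on the primary has been transferred to the secondary, and the serial number, refresh interval, and retry interval on the secondary have been updated too. Next, check the reverse zone file on the secondary: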

[root@beiku2 slaves]# cat named.sbyy
$ORIGIN .
$TTL 86400      ; 1 day
0.138.10.in-addr.arpa   IN SOA  localhost. root.localhost. (
                                1997022702 ; serial
                                28800      ; refresh (8 hours)
                                14400      ; retry (4 hours)
                                3600000    ; expire (5 weeks 6 days 16 hours)
                                86400      ; minimum (1 day)
                                )
                        NS      localhost.
$ORIGIN 0.138.10.in-addr.arpa.
1                       PTR     localhost.
161                     PTR     beiku1.sbyy.com
167                     PTR     beikuscan1.sbyy.com
168                     PTR     beikuscan2.sbyy.com
169                     PTR     beikuscan3.sbyy.com
162                     PTR     beiku2.sbyy.com
170                     PTR     www.sbyy.com

The secondary DNS server has synchronized the reverse zone as well. This completes the DNS configuration.
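
Because the secondary only transfers a zone when the primary's serial is higher than its own, comparing the two serials is a quick way to check whether a change has propagated. A minimal sketch, assuming zone files laid out like the ones above, with the serial on its own line followed by a "; serial" comment. The names soa_serial and needs_transfer are hypothetical, and the comparison is a plain integer test rather than the wrap-around serial arithmetic that DNS defines:

```shell
# soa_serial ZONEFILE
# Hypothetical helper: prints the serial from a zone file whose serial
# line carries a "; serial" comment, as in sbyy.zone and slaves/sbyy.com.
soa_serial() {
    awk 'tolower($0) ~ /; *serial/ { print $1; exit }' "$1"
}

# needs_transfer PRIMARY_SERIAL SECONDARY_SERIAL
# Succeeds (exit status 0) when the primary's serial is ahead of the
# secondary's, meaning the secondary still has to pull the zone.
needs_transfer() {
    [ "$1" -gt "$2" ]
}
```

Run soa_serial against the zone file on the primary and against the copy under slaves on the secondary, then pass the two numbers to needs_transfer. On a live system the same numbers can also be read over the network, for example with dig +short SOA sbyy.com @10.138.130.161, where the third field of the answer is the serial.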

Bringing SQL Quality Audits into Software Development Avoids Unnecessary SQL Tuning

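Today I helped a sister department tune the program that sends data to the unified five-insurance premium collection system. The tuning itself was simple: it mainly eliminated a full table scan and a Cartesian product that should never have been executed. The real question is why they appeared at all. Did the Oracle optimizer choose the wrong execution plan? No. The cause was a flaw in the table design; had the tables been designed sensibly around the business, this tuning would not have been needed. I raised this issue back when I worked at the vendor, but it was not taken seriously. Now that I am on the customer side, I am the firefighter once again.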

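Below is the query that the social security system runs each month to send every organization's payable amounts for each insurance type to the five-insurance collection system: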

Select t.Pay_Object_Id,
       t.Pay_Object_Code,
       t.Pay_Object_Name,
       t.Insr_Detail_Code,
       t.asgn_tenet,
       t.asgn_order,
       t.use_pred_insr,
       Sum(t.Topay_Money) as topay_money,
       Sum(Pay_Money) as pay_money,
       Sum(Pred_Money) as pred_money,
       to_char(sysdate, 'yyyy-mm-dd') as pay_time,
       t.corp_type_code
  From (Select T1.Corp_Id As Pay_Object_Id,
               T1.Insr_Detail_Code,
               T1.Corp_Code As Pay_Object_Code,
               T1.Corp_Name As Pay_Object_Name,
               T1.asgn_tenet,
               T1.asgn_order,
               T1.use_pred_insr,
               Decode(Sign(T1.pay_Money),
                      -1,
                      T1.pay_Money,
                      Decode(Sign(T1.pay_Money -
                                  Decode(Sign(T1.pay_Money),
                                         -1,
                                         0,
                                         Nvl(T2.Pred_Money, 0))),
                             -1,
                             0,
                             T1.pay_Money -
                             Decode(Sign(T1.pay_Money),
                                    -1,
                                    0,
                                    Nvl(T2.Pred_Money, 0)))) As pay_Money,
               T1.toPay_Money,
               Nvl(T2.Pred_Money, 0) As Pred_Money,
               T1.corp_type_code
          from (select t11.Corp_Id,
                       t11.Corp_Code,
                       t11.Corp_Name,
                       t11.Insr_Detail_Code,
                       sum(t11.Topay_Money) as Topay_Money,
                       t11.corp_type_code,
                       sum(t11.Pay_Money) as Pay_Money,
                       t11.asgn_tenet,
                       t11.asgn_order,
                       t11.use_pred_insr
                  from (Select b.Corp_Id,
                               a.Corp_Code,
                               a.Corp_Name,
                               b.insr_detail_code,
                               a.corp_type_code,
                               Sum(b.Pay_Money - nvl(b.Payed_Money, 0)) As Topay_Money,
                               Sum(b.Pay_Money - nvl(b.Payed_Money, 0)) As Pay_Money,
                               c.asgn_tenet,
                               c.asgn_order,
                               c.use_pred_insr
                          From Bs_Corp a, Lv_Insr_Topay b, lv_scheme_detail c
                         Where a.Corp_Id = b.Corp_Id
                           and ((b.payed_flag = 0 and
                               nvl(b.busi_asg_no, 0) = 0) or
                               (b.payed_flag = 2))
                           and nvl(b.indi_pay_flag, 0) = 0
                           and c.scheme_id = 1
                           and b.insr_detail_code=c.insr_detail_code
                           and not exists
                         (select 'x'
                                  from lv_busi_bill lbb, lv_busi_record lbr
                                 where b.corp_id = lbr.pay_object_id
                                   and lbb.busi_bill_sn = lbr.busi_bill_sn
                                   and lbb.pay_object = 1
                                   and lbb.audit_flag = 0)
                           and c.insr_detail_code = b.insr_detail_code
                           and b.calc_prd <= '201508'
                           and b.insr_detail_code in
                               (select distinct insr_detail_code
                                  from lv_scheme_detail
                                 where scheme_id = 1)
                           and b.topay_type in
                               (select topay_type
                                  from lv_busi_type_topay
                                 where busi_type = 1)
                           and b.src_type = 1
                           and a.center_id = '430701'
                         Group By b.Corp_Id,
                                  b.Insr_Detail_Code,
                                  c.use_pred_insr,
                                  a.Corp_Code,
                                  a.Corp_Name,
                                  a.corp_type_code,
                                  c.asgn_tenet,
                                  c.asgn_order,
                                  c.use_pred_insr) t11
                 group by t11.Corp_Id,
                          t11.Corp_Code,
                          t11.Corp_Name,
                          t11.Insr_Detail_Code,
                          t11.corp_type_code,
                          t11.asgn_tenet,
                          t11.asgn_order,
                          t11.use_pred_insr) T1,
               (select t21.corp_id,
                       sum(t21.pred_money) as pred_money,
                       t21.Insr_Detail_Code
                  from (Select a.Corp_Id,
                               decode(c.use_pred_insr,
                                      null,
                                      b.insr_detail_code,
                                      c.use_pred_insr) as Insr_Detail_Code,
                               sum(decode(1, 0, 0, 1, b.Pred_Money)) as pred_money
                          From Bs_Corp a, Lv_Pred_Money b, lv_scheme_detail c
                         Where a.Corp_Id = b.Corp_Id
                           and c.insr_detail_code = b.insr_detail_code
                           and c.scheme_id = 1
                           and decode(c.use_pred_insr,
                                      null,
                                      c.insr_detail_code,
                                      c.use_pred_insr) = c.insr_detail_code
                         group by a.corp_id,
                                  c.use_pred_insr,
                                  b.insr_detail_code) t21
                 group by t21.corp_id, t21.Insr_Detail_Code) T2
         Where T1.Corp_Id = T2.Corp_Id(+)
           And T1.Insr_Detail_Code = T2.Insr_Detail_Code(+)) t
 where not exists (select 'X'
          from lv_busi_bill a, lv_busi_record b
         where a.busi_bill_sn = b.busi_bill_sn
           and a.audit_flag = 0
           and a.pay_object = 1
           and b.PAY_OBJECT_ID = t.PAY_OBJECT_ID
           and b.INSR_DETAIL_CODE = t.insr_detail_code)
 Group By t.pay_money,
          t.Pay_Object_Id,
          t.Pay_Object_Code,
          t.Pay_Object_Name,
          t.corp_type_code,
          t.insr_detail_code,
          t.asgn_tenet,
          t.asgn_order,
          t.use_pred_insr
Having Sum(t.pay_Money) = 0
 order by t.Pay_Object_Name, t.asgn_order
 

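The execution statistics for this query were as follows:
[Screenshot of the execution statistics not preserved]
The execution time was 1481 seconds, which is unacceptable.

The execution plan was as follows:
[Screenshot of the execution plan not preserved]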

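The plan performs a full table scan on lv_busi_record, which holds about 20 million rows. That is clearly wrong. Why was no index used? Because none was created when the table was designed and built. The table grows continuously: early on, with little data, the cost of a full scan was not noticeable, but as the system runs and data accumulates the query gets slower and slower. The second problem is the Cartesian product between lv_scheme_detail and Bs_Corp. Why is there a Cartesian product? The query has no join condition between the two tables. At first I assumed the developers had simply forgotten to write one, but after inspecting the tables I found that they have no column on which they could be joined, and the final GROUP BY is what removes the duplicates.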

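Here all I could do was create an index on lv_busi_record according to the business rules. I could not fix the Cartesian join between lv_scheme_detail and Bs_Corp, because changing the table structure would mean changing the application. After indexing lv_busi_record, the results were as follows.
[Screenshots of the new execution statistics and execution plan not preserved]
The execution time dropped to about 14 seconds; going from 1481 seconds to 14 is roughly a hundredfold improvement. The fix was simple, but my point is that the problem should never have arisen. If software vendors designed the schema carefully, wrote SQL carefully, and tested carefully during the design, development, and testing phases, in other words if SQL quality auditing were part of the process, problems like this could be avoided.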
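
The original does not show the index that was created. A minimal sketch, assuming a composite index on the lv_busi_record columns that the query's NOT EXISTS subqueries filter and join on (pay_object_id, insr_detail_code, busi_bill_sn). The index name, the column order, and the helper index_ddl are all hypothetical, and the right choice depends on the real data distribution:

```shell
# index_ddl INDEX_NAME TABLE_NAME COLUMN_LIST
# Hypothetical helper: prints a CREATE INDEX statement for SQL*Plus.
index_ddl() {
    printf 'CREATE INDEX %s ON %s (%s);\n' "$1" "$2" "$3"
}

# Assumed column choice, taken from the predicates on lv_busi_record
# in the query above; this is not the author's actual index.
index_ddl idx_lv_busi_record_obj lv_busi_record \
    'pay_object_id, insr_detail_code, busi_bill_sn'
```

The statement can be saved to a file and run from SQL*Plus. Comparing the execution plan before and after is the only reliable way to confirm that the optimizer actually uses the index.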

Oracle 10g Data Guard Broker: An ORA-16607 Troubleshooting Case

To make Data Guard easier to manage, you can configure Data Guard broker. The broker was configured as follows:

[oracle@oracle11g ~]$ dgmgrl xxx/xxxxx@xxx
DGMGRL for Linux: Version 10.2.0.5.0 - Production

Copyright (c) 2000, 2005, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> help

The following commands are available:

add            Add a standby database to the broker configuration
connect        Connect to an Oracle instance
create         Create a broker configuration
disable        Disable a configuration, a database, or Fast-Start Failover
edit           Edit a configuration, database, or instance
enable         Enable a configuration, a database, or Fast-Start Failover
exit           Exit the program
failover       Change a standby database to be the primary database
help           Display description and syntax for a command
quit           Exit the program
reinstate      Change a disabled database into a viable standby database
rem            Comment to be ignored by DGMGRL
remove         Remove a configuration, database, or instance
show           Display information about a configuration, database, or instance
shutdown       Shutdown a currently running Oracle instance
start          Start Fast-Start Failover observer
startup        Start an Oracle database instance
stop           Stop Fast-Start Failover observer
switchover     Switch roles between the primary database and a standby database

Use "help <command>" to see syntax for individual commands

DGMGRL> show configuration
Error: ORA-16532: Data Guard broker configuration does not exist

Configuration details cannot be determined by DGMGRL
DGMGRL> help create

Create a broker configuration

Syntax:

  CREATE CONFIGURATION <configuration name> AS
    PRIMARY DATABASE IS <database name>
    CONNECT IDENTIFIER IS <connect identifier>;

Create the broker configuration:

DGMGRL> create configuration 'broker_dg' as primary database is test connect identifier is test;
Configuration "broker_dg" created with primary database "test"
DGMGRL> show configuration

Configuration
  Name:                broker_dg
  Enabled:             NO
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    test - Primary database

Current status for "broker_dg":
DISABLED

DGMGRL> help show configuration

Display information about a configuration, database, or instance

Syntax:

  SHOW CONFIGURATION;

  SHOW DATABASE [VERBOSE] <database name> [<property name>];

  SHOW INSTANCE [VERBOSE] <instance name> [<property name>]
    [ON DATABASE <database name>];

DGMGRL> help add

Add a standby database to the broker configuration

Syntax:

  ADD DATABASE <database name> AS
    CONNECT IDENTIFIER IS <connect identifier>
    MAINTAINED AS {PHYSICAL|LOGICAL};

Add the standby database (the physical standby test_dg) to the configuration:

DGMGRL> add database test_dg as connect identifier is test_dg maintained as physical;
Database "test_dg" added
DGMGRL> show configuration

Configuration
  Name:                broker_dg
  Enabled:             NO
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    test    - Primary database
    test_dg - Physical standby database

Current status for "broker_dg":
DISABLED

Enable the broker configuration:

DGMGRL> enable configuration
Enabled.

Show the broker configuration; it now reports the following error:

DGMGRL> show configuration

Configuration
  Name:                broker_dg
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    test    - Primary database
    test_dg - Physical standby database

Current status for "broker_dg":
Warning: ORA-16607: one or more databases have failed

Show the status report for the primary database test:

DGMGRL> show database test statusreport
STATUS REPORT
       INSTANCE_NAME   SEVERITY ERROR_TEXT

Show the status report for the standby database test_dg; it returns the following error:

DGMGRL> show database test_dg statusreport
Error: ORA-16664: unable to receive the result from a remote database

Show the detailed information for the primary database test:

DGMGRL> show database verbose test

Database
  Name:            test
  Role:            PRIMARY
  Enabled:         YES
  Intended State:  ONLINE
  Instance(s):
    test

  Properties:
    InitialConnectIdentifier        = 'test'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    Dependency                      = ''
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '180'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '10'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = '/u03/app/oracle/oradata/test/, /u01/app/oracle/oradata/test/, /u03/app/oracle/oradata/test_ldg/, /u01/app/oracle/oradata/test/'
    LogFileNameConvert              = '/u03/app/oracle/oradata/test/, /u01/app/oracle/oradata/test/, /u03/app/oracle/oradata/test_ldg/, /u01/app/oracle/oradata/test/'
    FastStartFailoverTarget         = ''
    StatusReport                    = '(monitor)'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'xxxxxx'
    SidName                         = 'test'
    LocalListenerAddress            = '(ADDRESS=(PROTOCOL=tcp)(HOST=xxxxxx)(PORT=1521))'
    StandbyArchiveLocation          = '/u02/archive/'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    LatestLog                       = '(monitor)'
    TopWaitEvents                   = '(monitor)'

Current status for "test":
SUCCESS

Show the detailed information for the standby database test_dg; it reports the following error:

DGMGRL> show database verbose test_dg

Database
  Name:            test_dg
  Role:            PHYSICAL STANDBY
  Enabled:         YES
  Intended State:  ONLINE
  Instance(s):
    test_dg

  Properties:
    InitialConnectIdentifier        = 'test_dg'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    Dependency                      = ''
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '180'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '2'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = '/u03/app/oracle/oradata/test_ldg/, /u03/app/oracle/oradata/test/, /u01/app/oracle/oradata/test/, /u03/app/oracle/oradata/test/'
    LogFileNameConvert              = '/u03/app/oracle/oradata/test_ldg/, /u03/app/oracle/oradata/test/, /u01/app/oracle/oradata/test/, /u03/app/oracle/oradata/test/'
    FastStartFailoverTarget         = ''
    StatusReport                    = '(monitor)'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'jingyong1'
    SidName                         = 'test_dg'
    LocalListenerAddress            = '(ADDRESS=(PROTOCOL=tcp)(HOST=jingyong1)(PORT=1521))'
    StandbyArchiveLocation          = '/u03/app/oracle/archive/'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    LatestLog                       = '(monitor)'
    TopWaitEvents                   = '(monitor)'

Current status for "test_dg":
Error: ORA-16664: unable to receive the result from a remote database

Clearly the physical standby test_dg is at fault. Check the standby's broker log, drctest_dg.log, which Oracle 10g stores in the bdump directory:

DG 2015-08-04-17:07:48        0 2 0 NSV0: Failed to connect to remote database test. Error is ORA-12514
DG 2015-08-04-17:07:48        0 2 0 NSV0: Failed to send message to site test. Error code is ORA-12514.
DG 2015-08-04-17:07:48        0 2 0 DMON: Database test returned ORA-12514
DG 2015-08-04-17:07:48        0 2 0       for opcode = CTL_GET_STATUS, phase = BEGIN, req_id = 1.1.886847999
DG 2015-08-04-17:07:59        0 2 0 RSM 0 received GETPROP request: rid=0x02010000, pid=54
DG 2015-08-04-17:07:59        0 2 0 Database Resource: Get Property InconsistentProperties
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'ArchiveLagTarget' has inconsistent values:METADATA='0', SPFILE='', DATABASE='0'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property ArchiveLagTarget is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveMaxProcesses' has inconsistent values:METADATA='2', SPFILE='', DATABASE='2'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveMaxProcesses is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveMinSucceedDest' has inconsistent values:METADATA='1', SPFILE='', DATABASE='1'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveMinSucceedDest is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 SPFILE is missing value for property 'LogArchiveTrace' with sid='test_dg'
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveTrace' has inconsistent values:METADATA='0', SPFILE='(missing)', DATABASE='0'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 SPFILE is missing value for property 'LogArchiveFormat' with sid='test_dg'
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveFormat' has inconsistent values:METADATA='%t_%s_%r.dbf', SPFILE='(missing)', DATABASE='%t_%s_%r.dbf'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveFormat is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 Database Resource GetProperty succeeded
DG 2015-08-04-17:07:59  2010000 4 886848003 DMON: MON_PROPERTY operation completed
DG 2015-08-04-17:07:59        0 2 0 NSV0: Failed to connect to remote database test. Error is ORA-12514
DG 2015-08-04-17:07:59        0 2 0 NSV0: Failed to send message to site test. Error code is ORA-12514.
DG 2015-08-04-17:07:59        0 2 0 DMON: Database test returned ORA-12514
DG 2015-08-04-17:07:59        0 2 0       for opcode = MON_PROPERTY, phase = NULL, req_id = 1.1.886848003
DG 2015-08-04-17:08:03        0 2 0 DRCX: could not find task req_id=1.1.886847999 for PROBE.

The following entries stand out in the log above:

RSM Warning: Property 'ArchiveLagTarget' has inconsistent values:METADATA='0', SPFILE='', DATABASE='0'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property ArchiveLagTarget is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveMaxProcesses' has inconsistent values:METADATA='2', SPFILE='', DATABASE='2'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveMaxProcesses is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveMinSucceedDest' has inconsistent values:METADATA='1', SPFILE='', DATABASE='1'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveMinSucceedDest is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 SPFILE is missing value for property 'LogArchiveTrace' with sid='test_dg'
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveTrace' has inconsistent values:METADATA='0', SPFILE='(missing)', DATABASE='0'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveTrace is inconsistent with the database setting
DG 2015-08-04-17:07:59        0 2 0 SPFILE is missing value for property 'LogArchiveFormat' with sid='test_dg'
DG 2015-08-04-17:07:59        0 2 0 RSM Warning: Property 'LogArchiveFormat' has inconsistent values:METADATA='%t_%s_%r.dbf', SPFILE='(missing)', DATABASE='%t_%s_%r.dbf'
DG 2015-08-04-17:07:59        0 2 0 RSM0: HEALTH CHECK WARNING: ORA-16714: the value of property LogArchiveFormat is inconsistent with the database setting

The log reports:
'ArchiveLagTarget' has inconsistent values:METADATA='0', SPFILE='', DATABASE='0'
This means the spfile value of archive_lag_target (empty) differs from the database and metadata values, which are both 0.

'LogArchiveMaxProcesses' has inconsistent values:METADATA='2', SPFILE='', DATABASE='2'
This means the spfile value of log_archive_max_processes (empty) differs from the database and metadata values, which are both 2.

'LogArchiveMinSucceedDest' has inconsistent values:METADATA='1', SPFILE='', DATABASE='1'
This means the spfile value of log_archive_min_succeed_dest (empty) differs from the database and metadata values, which are both 1.

'LogArchiveTrace' has inconsistent values:METADATA='0', SPFILE='(missing)', DATABASE='0'
This means log_archive_trace is missing from the spfile, while the database and metadata values are both 0.

'LogArchiveFormat' has inconsistent values:METADATA='%t_%s_%r.dbf', SPFILE='(missing)', DATABASE='%t_%s_%r.dbf'
This means log_archive_format is missing from the spfile, while the database and metadata values are both '%t_%s_%r.dbf'.
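
In a long broker log these warnings are easy to miss, so it can help to extract them mechanically. A minimal sketch, assuming the "RSM Warning: Property ... has inconsistent values" line format shown in the log above; the helper name list_inconsistent_props is hypothetical:

```shell
# list_inconsistent_props [LOGFILE...]
# Hypothetical helper: reads a broker log (or stdin when no file is
# given) and prints one line per property that the log flags as
# inconsistent, with the three values it reports. Duplicates from
# repeated health checks are collapsed by sort -u.
list_inconsistent_props() {
    sed -n "s/.*RSM Warning: Property '\([^']*\)' has inconsistent values:METADATA='\([^']*\)', SPFILE='\([^']*\)', DATABASE='\([^']*\)'.*/\1 METADATA=\2 SPFILE=\3 DATABASE=\4/p" "$@" | sort -u
}
```

Run against drctest_dg.log, this gives a short list of properties whose spfile value is empty or missing, which is the list of parameters to review before issuing any ALTER SYSTEM statements.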

Correct the inconsistent parameters listed above by setting them in the spfile:

SQL> alter system set log_archive_max_processes=2 scope=spfile;

System altered.

SQL> alter system set archive_lag_target=0 scope=spfile;

System altered.

SQL> alter system set log_archive_min_succeed_dest=1 scope=spfile;

System altered.

SQL> alter system set log_archive_trace=0 scope=spfile;

System altered.


SQL> alter system set log_archive_format='%t_%s_%r.dbf' scope=spfile;

System altered.

Check the broker configuration again:

DGMGRL> show database verbose test_dg

Database
  Name:            test_dg
  Role:            PHYSICAL STANDBY
  Enabled:         YES
  Intended State:  ONLINE
  Instance(s):
    test_dg

  Properties:
    InitialConnectIdentifier        = 'test_dg'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'ASYNC'
    Dependency                      = ''
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '180'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '2'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = '/u03/app/oracle/oradata/test_ldg/, /u03/app/oracle/oradata/test/, /u01/app/oracle/oradata/test/, /u03/app/oracle/oradata/test/'
    LogFileNameConvert              = '/u03/app/oracle/oradata/test_ldg/, /u03/app/oracle/oradata/test/, /u01/app/oracle/oradata/test/, /u03/app/oracle/oradata/test/'
    FastStartFailoverTarget         = ''
    StatusReport                    = '(monitor)'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'jingyong1'
    SidName                         = 'test_dg'
    LocalListenerAddress            = '(ADDRESS=(PROTOCOL=tcp)(HOST=jingyong1)(PORT=1521))'
    StandbyArchiveLocation          = '/u03/app/oracle/archive/'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    LatestLog                       = '(monitor)'
    TopWaitEvents                   = '(monitor)'

Current status for "test_dg":
SUCCESS
DGMGRL> show configuration

Configuration
  Name:                broker_dg
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    test    - Primary database
    test_dg - Physical standby database

Current status for "broker_dg":
SUCCESS

The broker configuration now displays the database information successfully.
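
For routine checks, the only line that matters in the SHOW CONFIGURATION output is the one after "Current status for". A minimal sketch that extracts it, assuming the output layout shown in the sessions above; the helper name broker_status is hypothetical:

```shell
# broker_status
# Hypothetical helper: reads SHOW CONFIGURATION (or SHOW DATABASE)
# output on stdin and prints the line that follows the first
# "Current status for ..." heading.
broker_status() {
    awk '/^Current status for/ { if ((getline line) > 0) print line; exit }'
}
```

DGMGRL accepts a command on its command line, so something like dgmgrl -silent xxx/xxxxx@xxx "show configuration" can be piped into broker_status; the connect string is the same placeholder used at the start of this session. A result other than SUCCESS, such as the ORA-16607 warning seen earlier, is the cue to look at the per-database status and the broker log.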