oracle 11g health monitor健康监控

health monitor健康监控
从oracle 11g开始,oracle数据库包含对数据库运行诊断检查的健康监控框架

health check健康检查是对数据库的各层和组件乾地检查.健康检查会检测文件错误,物理和逻辑块错误,undo和redo错误,数据字典错误等等.健康检查会生成一个检查报告在许多情况下包含了问题解决的建议.健康检查有两种运行方式:
.reactive—故障诊断架构为了响应一个严重错误会自动运行健康检查.

.manual—–作为一个dba,可以使用dbms_hm包或企业管理器接口来手工运行健康检查.

健康检查会存储调查结果,建议并在ADR中存储其它的信息

健康检查能以以下两种方式来运行:
DB-online模式意味着当数据库处于打开状态(open或mount)时可以运行健康检查

DB-offline模式意味着当实例不可用且数据库处于关闭状态(nomount)时也能运行健康检查

所有的健康检查都能在DB-online模式下运行.只有重做日志完整性检查和数据库结构完整检查可以在DB-offline模式下运行.

健康检查的类型
健康监控运行以下检查:
DB Structure Integrity Check(数据库结构完整性检查)–这种检查验证数据文件的完整性,如果这些文件不能访问,文件错误
或者不一致时会报告这些错误信息.如果数据库在mount或者open状态,这种检查会检查控制文件中所列出的重做日志文件和数据文件.如果数据库在nomount状态,只会检查控制文件.

Data Block Integrity Check(数据块完整性检查)–这种检查会检查磁盘镜像块错误比如checksum故障,head/tail mismatch和数据块的逻辑不一致性.大多数的错误可以通过使用block media recovery来进行修复.错误块信息也会被v$database_block_corruption视图所捕获.这种检查不会检测inter-block或inter-segment错误.

Redo Integrity Check(重做完整性检查)–这种检查将会扫描重做日志内容的可访问性和错误信息,也能对归档日志文件进行检查.
这种检查会报告归档日志或重做日志的错误信息.

Undo Segment Integrity Check(回滚段完整性检查)–这种检查会发现逻辑undo错误.在定位一个undo错误之后,这种检查将使用PMON和SMON来尝试恢复这个错误事务.如果恢复失败,health monitor将会存储关于这个错误的信息到v$corrupt_xid_list中.大多数undo错误都可以通过强制提交来解决.

Transaction Interity Check–这种检查与undo segment integrity check是相同的只是它只检查一特定的事务.

Dictionary Integrity Check–这种检查会检查核心字典对象比如tab$或col$的完整性.它将执行以下操作:
.验证每一个字典对象的字典条目内容

.执行cross-row级别的检查.它将对字典中的行强制执行逻辑约束验证

.执行对象的关系检查,它将在字典对象之间强制执行父子关系验证

Dictionary Integrity Check操作会检查以下字典对象:
tab$, clu$, fet$, uet$, seg$, undo$, ts$, file$, obj$, ind$, icol$, col$, user$,
con$, cdef$, ccol$, bootstrap$, objauth$, ugroup$, tsq$, syn$, view$, typed_
view$, superobj$, seq$, lob$, coltype$, subcoltype$, ntab$, refcon$, opqtype$,
dependency$, access$, viewcon$, icoldep$, dual$, sysauth$, objpriv$, defrole$,ecol$.

手工运行health check
健康监控提供了两种方式来手工运行健康检查:
.使用dbms_hm包

.使用企业管理器接口

使用dbms_hm来运行健康检查
dbms_hm包中有一个run_check过程用来运行健康检查.为了调用run_check需要提供检查的名称和运行的名字比如:

SQL> exec dbms_hm.run_check('Dictionary Integrity Check','jy_dict_run');

PL/SQL procedure successfully completed.

为了获得一个健康检查名称列表执行以下查询:

SQL> select name from v$hm_check where internal_check='N';
NAME
----------------------------------------------------------------
DB Structure Integrity Check

CF Block Integrity Check

Data Block Integrity Check

Redo Integrity Check

Transaction Integrity Check

Undo Segment Integrity Check

Dictionary Integrity Check

ASM Allocation Check

大多数的健康检查接受输入参数.可以通过查询v$hm_check_param来查看参数名和描述.一些参数是强制性的其它的是可选的.
如果一个可选参数被忽略将会使用缺省值.下面的查询将会显示所有健康检查的参数信息:

SQL> SELECT c.name check_name, p.name parameter_name, p.type,
  2  p.default_value, p.description
  3  FROM v$hm_check_param p, v$hm_check c
  4  WHERE p.check_id = c.id and c.internal_check = 'N'
  5  ORDER BY c.name;

CHECK_NAME                       PARAMETER_NAME       TYPE                 DEFAULT_VALUE    DESCRIPTION
-------------------------------- -------------------- -------------------- ---------------- ------------------------------
ASM Allocation Check             ASM_DISK_GRP_NAME    DBKH_PARAM_TEXT                       ASM 组名
CF Block Integrity Check         CF_BL_NUM            DBKH_PARAM_UB4                        控制文件块号
Data Block Integrity Check       BLC_DF_NUM           DBKH_PARAM_UB4                        文件号
Data Block Integrity Check       BLC_BL_NUM           DBKH_PARAM_UB4                        块号
Dictionary Integrity Check       CHECK_MASK           DBKH_PARAM_TEXT      ALL              检查掩码
Dictionary Integrity Check       TABLE_NAME           DBKH_PARAM_TEXT      ALL_CORE_TABLES  表名
Redo Integrity Check             SCN_TEXT             DBKH_PARAM_TEXT      0                最新良好重做的 SCN (如果已知)
Transaction Integrity Check      TXN_ID               DBKH_PARAM_TEXT                       事务处理 ID
Undo Segment Integrity Check     USN_NUMBER           DBKH_PARAM_TEXT                       还原段号

输入参数通过input_params参数以name/value对用分号来分隔进行传递.下面的例子用事务ID作为参数进行事务完整性检查:

SQL>BEGIN
DBMS_HM.RUN_CHECK (
check_name => 'Transaction Integrity Check',
run_name => 'my_trans_run',
input_params => 'TXN_ID=8.31.4');
END;
/

PL/SQL procedure successfully completed.

查看检查报告
在一个检查运行完后可以查看它的执行报告.这个可报告包括了调查结果,建议和其它的信息.也可以使用企业管理器,ADRCI工具,或者
dbms_hm包.其中企业管理器查看的报告格式为html,dbms_hm包查看的报告格式为html,xml和text,检查DRCL工个查看报告的格式为XML

检查运行的结果被存储在ADR,但报告不会立即生成.当你请求查看报告时可以使用dbms_hm或企业管理器来生成.如果报告不存在,首先
得用ADR中的检查结果数据来生成并以xml或html格式来存储.如果使用ADRCI工具如果报告文件不存必须首先运行命令来生成报告文件然后运行其它的命令来显示它的内容.

使用dbms_hm来查看检查报告
使用dbms_hm.get_run_report函数可以查看健康检查报告.这个函数可以以html,xml或text格式来显示.缺省格式为text:

SQL> SET LONG 100000
SQL> SET LONGCHUNKSIZE 1000
SQL> SET PAGESIZE 1000
SQL> SET LINESIZE 512
SQL> select dbms_hm.get_run_report('jy_dict_run') from dual;

DBMS_HM.GET_RUN_REPORT('JY_DICT_RUN')
--------------------------------------------------------------------------------
Run Name : JY_DICT_RUN
Run Id : 1061
Check Name : Data Block Integrity Check
Mode : REACTIVE
Status : COMPLETED
Start Time : 2007-05-12 22:11:02.032292 -07:00
End Time : 2007-05-12 22:11:20.835135 -07:00
Error Encountered : 0
Source Incident Id : 7418
Number of Incidents Created : 0
Input Paramters for the Run
BLC_DF_NUM=1
BLC_BL_NUM=64349
Run Findings And Recommendations
Finding
Finding Name : Media Block Corruption
Finding ID : 1065
Type : FAILURE
Status : OPEN
Priority : HIGH
Message : Block 64349 in datafile 1:
'/u01/app/oracle/dbs/t_db1.f' is media corrupt
Message : Object BMRTEST1 owned by SYS might be unavailable
Finding
Finding Name : Media Block Corruption
Finding ID : 1071
Type : FAILURE
Status : OPEN
Priority : HIGH
Message : Block 64351 in datafile 1:
'/u01/app/oracle/dbs/t_db1.f' is media corrupt
Message : Object BMRTEST2 owned by SYS might be unavailable

使用ADRCI工具来查看检查报告
1.确保操作系统环境变量(比如 ORACLE_HOME)已经被设置.并输入以下命令:

[oracle@jyrac1 ~]$ adrci

ADRCI: Release 11.2.0.1.0 - Production on Wed Jun 11 17:18:53 2014

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/u01/app/oracle"

工具启动后会显示以下提示符
adrci>

可以改变当前的ADR home目录.使用show homes命令可以列出所有的ADR home目录,set homepath命令用来改变当前ADR home目录.

adrci> show homes
ADR Homes:
diag/tnslsnr/jyrac1/listener
diag/rdbms/cs/cs
diag/rdbms/jytest/jytest
diag/rdbms/jy/jy
diag/rdbms/jycs/jycs
adrci> set homepath diag/rdbms/jycs/jycs

2.输入以下命令:
show hm_run
这个命令会列出所有运行过并(存储在v$hm_run)注册到ADR档案库中的检查

adrci> show hm_run

ADR Home = /u01/app/oracle/diag/tnslsnr/jyrac1/listener:
*************************************************************************
0 rows fetched
<adr_relation name="">
<adr_home name="/u01/app/oracle/diag/tnslsnr/jyrac1/listener">

ADR Home = /u01/app/oracle/diag/rdbms/cs/cs:
*************************************************************************
0 rows fetched

<adr_home name="/u01/app/oracle/diag/rdbms/cs/cs">

ADR Home = /u01/app/oracle/diag/rdbms/jytest/jytest:
*************************************************************************
0 rows fetched

<adr_home name="/u01/app/oracle/diag/rdbms/jytest/jytest">

ADR Home = /u01/app/oracle/diag/rdbms/jy/jy:
*************************************************************************
0 rows fetched

<adr_home name="/u01/app/oracle/diag/rdbms/jy/jy">

ADR Home = /u01/app/oracle/diag/rdbms/jycs/jycs:
*************************************************************************

**********************************************************
HM RUN RECORD 1
**********************************************************
   RUN_ID                        1
   RUN_NAME                      HM_RUN_1
   CHECK_NAME                    DB Structure Integrity Check
   NAME_ID                       2
   MODE                          2
   START_TIME                    2014-03-24 13:32:58.872509 +08:00
   RESUME_TIME                   
   END_TIME                      2014-03-24 13:33:01.710518 +08:00
   MODIFIED_TIME                 2014-03-24 13:33:01.710518 +08:00
   TIMEOUT                       0
   FLAGS                         0
   STATUS                        5
   SRC_INCIDENT_ID               0
   NUM_INCIDENTS                 0
   ERR_NUMBER                    0
   REPORT_FILE                   

**********************************************************
HM RUN RECORD 2
**********************************************************
   RUN_ID                        61
   RUN_NAME                      HM_RUN_61
   CHECK_NAME                    DB Structure Integrity Check
   NAME_ID                       2
   MODE                          2
   START_TIME                    2014-04-08 10:22:43.295203 +08:00
   RESUME_TIME                   
   END_TIME                      2014-04-08 10:22:43.723241 +08:00
   MODIFIED_TIME                 2014-04-08 10:22:43.723241 +08:00
   TIMEOUT                       0
   FLAGS                         0
   STATUS                        5
   SRC_INCIDENT_ID               0
   NUM_INCIDENTS                 0
   ERR_NUMBER                    0
   REPORT_FILE                   

**********************************************************
HM RUN RECORD 3
**********************************************************
   RUN_ID                        81
   RUN_NAME                      jy_dict_run
   CHECK_NAME                    Dictionary Integrity Check
   NAME_ID                       24
   MODE                          0
   START_TIME                    2014-06-11 16:42:00.675293 +08:00
   RESUME_TIME                   
   END_TIME                      2014-06-11 16:42:02.950141 +08:00
   MODIFIED_TIME                 2014-06-11 17:14:33.658642 +08:00
   TIMEOUT                       0
   FLAGS                         0
   STATUS                        5
   SRC_INCIDENT_ID               0
   NUM_INCIDENTS                 0
   ERR_NUMBER                    0
   REPORT_FILE                   /u01/app/oracle/diag/rdbms/jycs/jycs/hm/HMREPORT_jy_dict_run.hm

3.定位你要创建报告的检查并注意检查运行的名称.如果检查报告已经存在那么report_file字段就会包含一个文件名比如上面的
/u01/app/oracle/diag/rdbms/jycs/jycs/hm/HMREPORT_jy_dict_run.hm.否则使用下面的命令来生成一个报告.

adrci> create report hm_run jy_dict_run_1
DIA-48448: This command does not support multiple ADR homes

这是因为当前存在多个ADR home目录

adrci> show homes
ADR Homes:
diag/tnslsnr/jyrac1/listener
diag/rdbms/cs/cs
diag/rdbms/jytest/jytest
diag/rdbms/jy/jy
diag/rdbms/jycs/jycs
adrci> set homepath diag/rdbms/jycs/jycs
adrci> create report hm_run jy_dict_run_1

4.执行以下命令来查看报告:

adrci> show report hm_run jy_dict_run_1
< ?xml version="1.0" encoding="US-ASCII"?>HM Report: jy_dict_run_1
    <run_info>
        <check_name>Dictionary Integrity Check
        <run_id>101
        <run_name>jy_dict_run_1
        <run_mode>MANUAL
        <run_status>COMPLETED
        <run_error_num>0
        <source_incident_id>0
        <num_incidents_created>0
        <run_start_time>2014-06-11 17:27:13.477462 +08:00
        <run_end_time>2014-06-11 17:27:14.734166 +08:00
    
    <run_parameters>
        <run_parameter>TABLE_NAME=ALL_CORE_TABLES
        <run_parameter>CHECK_MASK=ALL
    
    

health monitor视图
可以用视图来代替检查报告来查看特定检查的结果.可用的视图有v$hm_run,v$hm_finding,v$hm_recommendation
下面查询v$hm_run来判断运行检查的确良历史信息:

SQL> SELECT run_id, name, check_name, run_mode, src_incident FROM v$hm_run;

    RUN_ID NAME                             CHECK_NAME                       RUN_MODE SRC_INCIDENT
---------- -------------------------------- -------------------------------- -------- ------------
        61 HM_RUN_61                        DB Structure Integrity Check     REACTIVE            0
        81 jy_dict_run                      Dictionary Integrity Check       MANUAL              0
       101 jy_dict_run_1                    Dictionary Integrity Check       MANUAL              0
         1 HM_RUN_1                         DB Structure Integrity Check     REACTIVE            0

下面查询v$hm_finding来获得RUN_ID 61的详细信息:

SELECT type, description FROM v$hm_finding WHERE run_id = 61;
TYPE          DESCRIPTION
------------- -----------------------------------------------------------------------------
FAILURE      Block 64349 in datafile 1: '/u01/app/oracle/dbs/t_db1.f' is media corrupt
FAILURE      Block 64351 in datafile 1: '/u01/app/oracle/dbs/t_db1.f' is media corrupt

health check参数

Table 9–6 Parameters for Data Block Integrity Check
------------------------------------------------------------------------------------------------------------------
Parameter Name                Type                  Default Value           Description
------------------------------------------------------------------------------------------------------------------
BLC_DF_NUM                    Number                (none)                  Block datafile number
BLC_BL_NUM                    Number                (none)                  Data block number
------------------------------------------------------------------------------------------------------------------


Table 9–7 Parameters for Redo Integrity Check
------------------------------------------------------------------------------------------------------------------
Parameter Name                Type                   Default Value          Description
------------------------------------------------------------------------------------------------------------------
SCN_TEXT                      Text                   0                      SCN of the latest good redo (if known)
------------------------------------------------------------------------------------------------------------------


Table 9–8 Parameters for Undo Segment Integrity Check
------------------------------------------------------------------------------------------------------------------
Parameter Name                Type                   Default Value          Description
------------------------------------------------------------------------------------------------------------------
USN_NUMBER                    Text                   (none)                 Undo segment number
------------------------------------------------------------------------------------------------------------------


Table 9–9 Parameters for Transaction Integrity Check
------------------------------------------------------------------------------------------------------------------
Parameter Name                Type                   Default Value          Description
------------------------------------------------------------------------------------------------------------------              
TXN_ID                        Text                   (none)                 Transaction ID
------------------------------------------------------------------------------------------------------------------

Table 9–10 Parameters for Dictionary Integrity Check
------------------------------------------------------------------------------------------------------------------
Parameter Name                Type                   Default Value          Description
------------------------------------------------------------------------------------------------------------------
CHECK_MASK                    Text                   ALL                    Possible values are:
                                                                            COLUMN_CHECKS—Run column
                                                                            checks only. Verify column-level
                                                                            constraints in the core tables.

                                                                            ROW_CHECKS—Run row checks only.
                                                                            Verify row-level constraints in the
                                                                            core tables.

                                                                            REFERENTIAL_CHECKS—Run
                                                                            referential checks only. Verify
                                                                            referential constraints in the core
                                                                            tables.
                       
                                                                            ALL—Run all checks.

TABLE_NAME                    Text                   ALL_CORE_TABLES        Name of a single core table to check. If
                                                                            omitted, all core tables are checked.
------------------------------------------------------------------------------------------------------------------

发表评论

电子邮件地址不会被公开。