诊断oracle high version count(高版本游标)问题

什么是high version cursor(高版本游标)?
对于一个特定的游标有多少个版本就属于高版本游标是没有明确定义的.对于不同的系统有不同的数量界定.然而在awr报告中对于一个父游标超过20个子游标个数时就会被报告出来
1

然而当一个游标的版本数据达到成百上千,那么这些绝对是高版本游标.所以要检查这些sql有高版本的原因要尽量使用这些sql能够被共享.

什么是共享sql?
首先要记住的是所有sql语句都是式共享的.当一个sql语句被输入时,oracle将会对一个语句的文本创建一个hash value,oracle将使用这个hash value很容易地在共享池中查找是否已经存在有相同hash value的sql存在.

例如:select count*) from emp语句有一个hash value为 4085390015
那么oracle就会对这个sql语句创建一个父游标和一个子游标.如果一个sql语句永远也不会被共享也没关系-当它第一次被解析时会创建一个父游标和一个子游标.可以简单地认为这个父游标代表这个hash value,子游标代表sql的元数据.

什么是元数据?
元数据是能让sql语句运行的所有信息.例如,在上面的例子中给定的emp表属于scott用户,因此它有一个object_id来指示这个emp表.当scott用户登录时,对于运行这个语句的会话优化器参数会被初始化,所以优化器的使用也属于元数据.

当scott用户重新登录后运行相同的命令(相同的sql语句),这时在共享池中已经存在相同的sql,(但是我们是不知道的),将对这个sql生成hash value并且在共享池搜索这个hash value.如果找到这个hash value,就会通过子游标来进行搜索来判断是否存在子游标可以被重用(元数据相同).如果是那么就可以共享这个sql语句.

现在在共享池中这个sql语句只有一个子游标,因为元数据相同能让我们使用已经存在的子游标来共享sql语句.父游标不是判断能不能共享的基础子游标才能决定是否共享.

现在如果另一个用户test也有一个emp表.如果这个用户也运行上面的查询语句将会发生什么:
1. 会对这个语句创建一个hash value.它的hash value为4085390015
2. 这个sql在共享池中被找到
3. 搜索子游标(在这时已经有一个子游标了)
4. 因为test用户的emp表的object_id与scott用户的emp表的object_id是不同的所以会有一个’mismatch’
(本质上这里会依次搜索子游标链表,使用所有的子游标与当前sql的元数据进行比较.如果已经搜索了100个子游标直到找到一个可以共享的子游标为止.如果没有找到可以共享的子游标那么就会重新创建一个子游标

5. 因此创建一个新的子游标所以现在有1个父游标和2个子游标.

为什么要关注high version cursor(高版本游标)
可以共享但没有被共享的sql和合成版本的sql是造成library cache竞争的主要原因.竞争会降低数据库的性能.在极端情况下会使用数据库hang住.当一个游标有太多个不必要的版本时,每次游标被执行时,这个解析引擎为了找到你所想要的游标不得不搜索整个游标链表.这是非常消耗CPU资源的.

怎样查看high version cursor(高版本游标)以及为何不能被共享
一种最简单查看high version cursor(高版本游标)的方法是使用脚本High SQL Version Counts – Script to determine reason(s) (文档 ID 438755.1)
现在这个脚本的版本为 version_rpt3_23.sql
下载这个脚本后需要进行安装
SQ>conn / as sysdba
SQL>@F:\ version_rpt3_23.sql
使用方法如下:
对于10g及以后的版本来收集version超过100的所有游标

SQL> set pages 2000 lines 100
SQL> 
SQL> SELECT b.*
  2  FROM v$sqlarea a ,
  3    TABLE(version_rpt(a.sql_id)) b
  4  WHERE loaded_versions >=100;
 
COLUMN_VALUE
---------------------------------------------------------------------

根据hash value来收集version超过100的所有游标

SQL> set pages 2000 lines 100
SQL> SELECT b.*
  2  FROM v$sqlarea a ,
  3    TABLE(version_rpt(NULL,a.hash_value)) b
  4  WHERE loaded_versions>=100;
 
COLUMN_VALUE
---------------------------------------------------------------------

使用sql_id来收集游标报告

SQL> set pages 2000 lines 100
SQL> SELECT * FROM TABLE(version_rpt('g7vpupcuqd9zz'));
 
COLUMN_VALUE
--------------------------------------------------------------------------------
Note:438755.1 Version Count Report Version 3.2.3 -- Today's Date 22-4月 -14 15:44
RDBMS Version :10.2.0.4.0 Host: IBMP740-1 Instance 1 : RLZY
==================================================================
Addr: 070000066F6659E8  Hash_Value: 895920127  SQL_ID g7vpupcuqd9zz
Sharable_Mem: 135775 bytes   Parses: 110924   Execs:621954
Stmt:
0 insert into mt_fee( hospital_id,serial_no,serial_fee,stat_type,f
1 ee_batch,medi_item_type,item_code,item_name,his_item_code,his_it
2 em_name,serial_apply,fee_date,model,factory,standard,unit,price,
3 dosage,money,reduce_money,usage_flag,usage_days,opp_serial_fee,i
4 nput_staff,input_man,input_date,calc_flag,frozen_flag,frozen_ser
5 ial_fee,trans_date,recipe_no,hos_serial,doctor_no,doctor_name,au
6 dit_flag,trans_flag,defray_type ) values ( :1,:2,:3,:4,:5,:6,:7,
7 :8,:9,:10,:11,:12,:13,:14,:15,:16,:17,:18,:19,:20,:21,:22,:23,:2
8 4,:25,:26,:27,:28,:29,:30,:31,:32,:33,:34,:35,:36,:37)
9
Versions Summary
----------------
AUTH_CHECK_MISMATCH :1
BIND_MISMATCH :13
TRANSLATION_MISMATCH :1
ROLL_INVALID_MISMATCH :3
Total Versions:12
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cursor_sharing = exact
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Plan Hash Value Summary
-----------------------
Plan Hash Value Count
=============== =====
              0 8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for AUTH_CHECK_MISMATCH :
  # of Ver PARSING_USER_ID PARSING_SCHEMA_ID PARSING_SCHEMA_NAME
========== =============== ================= ===================
         8              211               211 INSUR_CHANGDE
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for BIND_MISMATCH :
Consolidated details for BIND* columns:
BIND_MISMATCH,USER_BIND_PEEK_MISMATCH,BIND_UACS_DIFF,BIND_LENGTH_UPGRADEABLE,etc
BIND_EQUIV_FAILURE (Mislabled as ROW_LEVEL_SEC_MISMATCH BY bug 6964441 in 11gR1)
from v$sql_bind_capture
COUNT(*) POSITION MIN(MAX_LENGTH) MAX(MAX_LENGTH) DATATYPE BIND GRADUATION (PREC
======== ======== =============== =============== ======== =============== =====
      13        1              32              32        1     No           (,)
      13        2              32              32        1     No           (,)
      13        3              32              32        1     No           (,)
      13        4              32              32        1     No           (,)
      13        5              32              32        1     No           (,)
      13        6              32              32        1     No           (,)
      13        7              32              32        1     No           (,)
      13        8              32             128        1     Yes          (,)
      13        9              32             128        1     Yes          (,)
      13       10              32             128        1     Yes          (,)
      13       11              32              32        1     No           (,)
      13       12              11              11      180     No           (,)
      13       13              32              32        1     No           (,)
      13       14              32             128        1     Yes          (,)
      13       15              32             128        1     Yes          (,)
      13       16              32             128        1     Yes          (,)
       9       17              32             128        1     Yes          (,)
       4       17              22              22        2     No           (,)
      13       18              32             128        1     Yes          (,)
      13       19              32              32        1     No           (,)
      13       20              32              32        1     No           (,)
      13       21              32              32        1     No           (,)
      13       22              32              32        1     No           (,)
      13       23              32              32        1     No           (,)
      13       24              32              32        1     No           (,)
      13       25              32              32        1     No           (,)
      13       26              11              11      180     No           (,)
      13       27              32              32        1     No           (,)
      13       28              32              32        1     No           (,)
      13       29              32              32        1     No           (,)
      13       30               7               7       12     No           (,)
      13       31              32              32        1     No           (,)
      13       32              32             128        1     Yes          (,)
      13       33              32              32        1     No           (,)
      13       34              32              32        1     No           (,)
      13       35              32              32        1     No           (,)
      13       36              32              32        1     No           (,)
      13       37              32              32        1     No           (,)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for TRANSLATION_MISMATCH :
No objects in the plans with same name and different owner were found.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for ROLL_INVALID_MISMATCH :
No details available
####
To further debug Ask Oracle Support for the appropiate level LLL.
alter session set events
 'immediate trace name cursortrace address 895920127, level LLL';
To turn it off do use address 1, level 2147483648
================================================================

如果不能使用这个脚本可以使用下面的方法从基本视图中来查询相同的信息
下面使用scott用户来运行select count(*) from emp 语句,并运行下面的查询来查看这个语句的父游标和它的hash value和address

SQL>select sql_text, hash_value,address from v$sqlarea where sql_text like 'select count(*) from emp%'; 

SQL_TEXT                 HASH_VALUE    ADDRESS 
------------------------ ------------ ---------------- 
select count(*) from emp 4085390015   0000000386BC2E58

为了查看子游标:
对于oracle 9.2.x.x及以下版本
SQL>select * from v$sql_shared_cursor where kglhdpar = ‘0000000386BC2E58’;
对于oracle 10.0.x.x及以上版本
SQL> select * from v$sql_shared_cursor where address = ‘0000000386BC2E58’;
对于oracle 9.2.x.x及以下版本查询的输出如下:

ADDRESS          KGLHDPAR         U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F 
---------------- ---------------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
0000000386BC2D08 0000000386BC2E58 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 

可以看到有一个子游标(address: 0000000386BC2D080).mismatch信息都为N因为这是第一个子游标.如果另一个用户运行相同的语句(select count(*) from emp)再次执行上面查询输出如下:

ADDRESS          KGLHDPAR         U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F 
---------------- ---------------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
0000000386BC2D08 0000000386BC2E58 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N 
0000000386A91AA0 0000000386BC2E58 N N N N N N N N N N N N N Y N N N Y N N N N N N N N N N N N N 

现在可以看到第二个子游标(address: 0000000386A91AA0)且为什么与第一个子游标不能共享(‘Y’表示不匹配).原因如下:
(1) AUTH_CHECK_MISMATCH and (2) TRANSLATION_MISMATCH
这是因为新用户下的emp对象与scott用户下的emp对象不匹配.当不能访问scott用户的对象时且因为在每个用户方案下有一个emp对象而object_id不同所以翻译失败发生了一次mismatch.

在v$SQL_SHARED_CURSOR中给出了不能共享游标的原因
下面介绍一些游标不能共享的原因:
.UNBOUND_CURSOR–现有的子游标没有完全创建(换句话说不能被优化)
.SQL_TYPE_MISMATCH—sql类型与现有的子游标不匹配
.OPTIMIZER_MISMATCH—优化器环境与现有的子游标不匹配
例如:
SQL>select count(*) from emp; ->> 1 PARENT, 1 CHILD
SQL>alter session set optimizer_mode=ALL_ROWS
SQL>select count(*) from emp; ->> 1 PARENT, 2 CHILDREN
因为 optimizer_mode被改变,因为现有的子游标不能被共享
如果使用10046跟踪事件将会得到optimizer_mismatch和第三个子游标
使用cursortrace将会看到更详细的原因比如:
Optimizer mismatch(12)
其中括号内的数字给出了原因

1 = Degree used is not the default DOP
2 = In (RAC) cases where instance count is not the same, or session CPU count is not the same, or thread count is not the same
3 =  _parallel_syspls_obey_force is FALSE
4 = The PQ mode does not match.
5 = The degree does not match.
6 = The parallel degree policy does not match.
7 = The session limit is not the same as the cursor limit but the cursor limit is the same as the degree used.
8 = The cursor limit is greater than the degree used and the session limit is less than the cursor limit
9 = The cursor limit is less than the degree used and the session limit is not the same as the cursor limit
10 = Optimizer mode difference
11 =  Materialized View mismatch
12 = Optimizer environment mismatch (ie an optimizer parameter is different)
13 = Cardinality Feedback is use
.OUTLINE_MISMATCH—The outlines do not match the existing child cursor 

If my user had created stored outlines previously for this command and they were stored in seperate categories (say "OUTLINES1" and "OUTLINES2") running:-
SQL>alter session set use_stored_outlines = OUTLINES1; 
SQL>select count(*) from emp; 
SQL>alter session set use_stored_oulines= OUTLINES2; 
SQL>select count(*) from emp;

.STATS_ROW_MISMATCH—The existing statistics do not match the existing child cursor. Check that 10046/sql_trace is not set on all sessions as this can cause this.
.LITERAL_MISMATCH—Non-data literal values do not match the existing child cursor
.SEC_DEPTH_MISMATCH—Security level does not match the existing child cursor
.EXPLAIN_PLAN_CURSOR—The child cursor is an explain plan cursor and should not be shared. Explain plan statements will generate a new child by default - the mismatch will be this
.BUFFERED_DML_MISMATCH—Buffered DML does not match the existing child cursor
.PDML_ENV_MISMATCH—PDML environment does not match the existing child cursor
.INST_DRTLD_MISMATCH—Insert direct load does not match the existing child cursor
.SLAVE_QC_MISMATCH—The existing child cursor is a slave cursor and the new one was issued by the coordinator (or, the existing child cursor was issued by the coordinator and the new one is a slave cursor).
.TYPECHECK_MISMATCH—The existing child cursor is not fully optimized
.AUTH_CHECK_MISMATCH— Authorization/translation check failed for the existing child cursor 
The user does not have permission to access the object in any previous version of the cursor. A typical example would be where each user has it's own copy of a table
.BIND_MISMATCH—The bind metadata does not match the existing child cursor. For example:
SQL>variable a varchar2(100); 
SQL>select count(*) from emp where ename = :a ->> 1 PARENT, 1 CHILD 
SQL>variable a varchar2(400); 
SQL>select count(*) from emp where ename = :a ->> 1 PARENT, 2 CHILDREN 

.DESCRIBE_MISMATCH—The typecheck heap is not present during the describe for the child cursor
.LANGUAGE_MISMATCH—The language handle does not match the existing child cursor
.TRANSLATION_MISMATCH—The base objects of the existing child cursor do not match. 
The definition of the object does not match any current version. Usually this is indicative of the same issue as "AUTH_CHECK_MISMATCH" where the object is different.
.ROW_LEVEL_SEC_MISMATCH—The row level security policies do not match
.INSUFF_PRIVS— Insufficient privileges on objects referenced by the existing child cursor
.INSUFF_PRIVS_REM-- Insufficient privileges on remote objects referenced by the existing child cursor
.REMOTE_TRANS_MISMATCH—The remote base objects of the existing child cursor do not match  
USER1: select count(*) from table@remote_db 
USER2: select count(*) from table@remote_db 

  (Although the SQL is identical, the dblink pointed to 
   by remote_db may be a private dblink which resolves 
   to a different object altogether)
.LOGMINER_SESSION_MISMATCH
.INCOMP_LTRL_MISMATCH
.OVERLAP_TIME_MISMATCH—error_on_overlap_time_msimatch
.SQL_REDIRECT_MISMATCH—sql redirection mismatch
.MV_QUERY_GEN_MISMATCH—materialized view query generation
.USER_BIND_PEEK_MISMATCH—user bind peek mismatch
.TYPCHK_DEP_MISMATCH—cursor has typecheck dependencies
.NO_TRIGGER_MISMATCH— no trigger mismatch
.FLASHBACK_CURSOR—No cursor sharing for flashback
.ANYDATA_TRANSFORMATION - anydata transformation change
.INCOMPLETE_CURSOR - incomplete cursor.
When bind length is upgradeable (i.e. we found a child cursor that matches everything
else except that the bind length is not long enough), we mark the old cursor is not usable
and build a new one.  This means the version can be ignored.
.TOP_LEVEL_RPI_CURSOR - top level/rpi cursor 
In a Parallel Query invocation this is expected behaviour (we purposely do not share)
.DIFFERENT_LONG_LENGTH - different long length
.LOGICAL_STANDBY_APPLY - logical standby apply mismatch
.DIFF_CALL_DURN - different call duration
.BIND_UACS_DIFF - bind uacs mismatch
.PLSQL_CMP_SWITCHS_DIFF - plsql compiler switches mismatch
.CURSOR_PARTS_MISMATCH - cursor-parts executed mismatch
.STB_OBJECT_MISMATCH - STB object different (now exists)
.ROW_SHIP_MISMATCH - row shipping capability mismatch
.PQ_SLAVE_MISMATCH - PQ slave mismatch 
Check you want to be using PX with this reason code, as the problem could be caused by running lots of small SQL statements which do not really need PX. If you are on < 11i you may be hitting Bug:4367986
.TOP_LEVEL_DDL_MISMATCH - top-level DDL cursor
.MULTI_PX_MISMATCH - multi-px and slave-compiled cursor
.BIND_PEEKED_PQ_MISMATCH - bind-peeked PQ cursor
.MV_REWRITE_MISMATCH - MV rewrite cursor
.ROLL_INVALID_MISMATCH - rolling invalidation window exceeded
.OPTIMIZER_MODE_MISMATCH - optimizer mode mismatch
.PX_MISMATCH - parallel query mismatch 
If running 11.1.0.6 and RAC see Bug:7352775. Check that if (on each instance) parallel_instance_groups is set then instance_groups is set to the same.
.MV_STALEOBJ_MISMATCH - mv stale object mismatch
.FLASHBACK_TABLE_MISMATCH - flashback table mismatch
.LITREP_COMP_MISMATCH - literal replacement compilation mismatch 
New in 11g :
PLSQL_DEBUG - debug mismatch Session has debugging parameter plsql_debug set to true
.LOAD_OPTIMIZER_STATS  - Load optimizer stats for cursor sharing
.ACL_MISMATCH   -  Check ACL mismatch
.FLASHBACK_ARCHIVE_MISMATCH  - Flashback archive mismatch
.LOCK_USER_SCHEMA_FAILED  - Failed to lock user and schema
.REMOTE_MAPPING_MISMATCH  - Remote mapping mismatch
.LOAD_RUNTIME_HEAP_FAILED  - Runtime heap mismatch
.HASH_MATCH_FAILED  - Hash mismatch
Set to "Y" if sharing fails due to a hash mismatch, such as the case with mismatched histogram data or a range predicate marked as unsafe by literal replacement (See Bug 3461251)
New in 11.2  :
PURGED_CURSOR - cursor marked for purging
         The cursor has been marked for purging with dbms_shared_pool.purge

.BIND_LENGTH_UPGRADEABLE - bind length upgradeable
          Could not be shared because a bind variable size was smaller than the new value beiing inserted    (marked as BIND_MISMATCH in earlier versions).

.USE_FEEDBACK_STATS - cardinality feedback
         Cardinality feedback is being used and therefore a new plan could be formed for the current execution.

.BIND_EQUIV_FAILURE - The bind value's selectivity does not match that used to optimize the existing child cursor. When adaptive cursor sharing is used and the cursor is bind aware, then if the selectivity is outside of the current ranges and a new plan is desirable then a new child is raised with this as the reason code for non-sharing of the previous plan. For an example, see Document 836256.1. After each execution in the example, run:
select sql_id, address, child_address, child_number, BIND_EQUIV_FAILURE from v$sql_shared_cursor where sql_id='19sxt3v07nzm4';
         ... once the cursor is marked as bind aware and a second plan is seen then the following will be the resultant output:

SQL_ID         ADDRESS          CHILD_ADDRESS    CHILD_NUMBER     B
------------- ---------------- ---------------- ------------ -
19sxt3v07nzm4 000000007A1C0DE0 000000007A1BF980            0    N
19sxt3v07nzm4 000000007A1C0DE0 000000007A10DDB0            1    Y

       As can be seen, the new version is created due to BIND_EQUIV_FAILURE

There is no longer  ROW_LEVEL_SEC_MISMATCH in 11.2.

可以进一步跟踪
在oracle10g及以上版本中可以使用cursortrace来查找游标不能被共享的原因.

SQL>alter system set events 
'immediate trace name cursortrace level 577, address hash_value';

其中可以使用三个level,level 1为577,level 2为578,level 3为580
当重用这个游标时将会向user_dump_dest目录中写一个跟踪文件.
关闭cursortrace:

SQL>alter system set events 
'immediate trace name cursortrace level 2147483648, address 1';

注意在10.2.0.4以下版本存在Bug 5555371导致cursortrace无法彻底关闭的情况.最终导致其trace文件不停的增长,从而可能导致oracle文件系统被撑爆的现象

在11.2中有了cursordump可以使用如下方式进行cursor dump:

SQL>alter system set events ‘immediate trace name cursordump level 16’;

这种方式收集的信息比较全:例如它可以采集部分别的方式无法看到的px_mismatch以及它会进一步展开optimizer_mismatch的信息等

尽管使用绑定变量还是会存在high version cursor
当cursor_sharing为similar时
select /* TEST */ * from emp where sal > 100;
select /* TEST */ * from emp where sal > 101;
select /* TEST */ * from emp where sal > 102;
select /* TEST */ * from emp where sal > 103;
select /* TEST */ * from emp where sal > 104;

SELECT sql_text,version_count,address
FROM V$SQLAREA
WHERE sql_text like ‘select /* TEST */%’;

SELECT * FROM V$SQL_SHARED_CURSOR WHERE kglhdpar = ‘&my_addr’;
将会出现多个子游标
Cursor_sharing设置为similar或force都可能导致high version count可以参考:
High Version Count with CURSOR_SHARING=SIMILAR or FORCE(文档ID 261020.1)

在11g中引入的adaptive cursor sharing特性很容易导致high version count的问题
可以参考:Bug 12334286 High version count with CURSOR_SHARING=FORCE(BIND_MISMATCH and INCOMP_LTRL_MISMATCH)
Document 740052.1 Adaptive Cursor Sharing Overview
Document 7213010.8 Bug 7213010 – Adaptive cursor sharing generates lots of child cursors

Document 8491399.8 Bug 8491399 – Adaptive Cursor Sharing does not match the correct cursor version for queries using CHAR datatype

在oracle 11g中可以通过其它的一些手段限制child cursor的数量
Document 10187168.8 Enhancement to obsolete parent cursors if Version Count exceeds a threshold
引入了一个隐含参数_cursor_obsolete_threshold该参数用来限制单个parent cursor下的child cursor的数量,默认值为100.如果child cursor的数量超过这个阈值就会触发cursor obsolescence特性.这个时候parent cursor就会被丢弃并同时重新创建一个新的parent cursor.

1. If 11.2.0.3 and above, set the following parameters:
“_cursor_obsolete_threshold” to 100 (this is the number of child cursor after which we obsolete it)
2. If 11.2.0.2.2, then set:
SQL>alter system set “_cursor_features_enabled”=1026 scope=spfile;
SQL>alter system set event=’106001 trace name context forever,level 1024’ scope=spfile;

3.If 11.2.0.1:
SQL>alter system set “_cursor_features_enabled”=34 scope=spfile;
SQL>alter system set event=’106001 trace name context forever,level 1024’ scope=spfile;

发表评论

电子邮件地址不会被公开。