Lin Hong's TECH Blog! 刀不磨要生锈,人不学习要落后 - Thinking ahead

Oracle Index - Statics Tips

2021-07-06

Oracle Index - Statics Tips

统计信息

一般关闭自带JOB收集统计信息,自定义收集统计信息

OLAP一般不用job去收集统计信息(数量太大,消耗大量IO和CPU资源),一般用HINT去固定执行计划

Part V Optimizer Statistics

Optimizer statistics include the following:

  • Table statistics
    Number of rows
    Number of blocks
    Average row length
    
  • Column statistics
    Number of distinct values (NDV) in a column
    Number of nulls in a column
    Data distribution (histogram)
    Extended statistics
    
  • Index statistics
    Number of leaf blocks
    Number of levels
    Index clustering factor
    
  • System statistics
    I/O performance and utilization
    CPU performance and utilization
    

dbms_stats包可以收集schema/table/index/temporarytable(从12c开始支持)

具体参数说明: Table 169-59 GATHER_TABLE_STATS Procedure Parameters

常用参数: estimate_percent:采样率百分比 大表可以设置为30%等,系统强悍无所谓,默认值即可(也可以用 dbms_stats.AUTO_SAMPLE_SIZE)

method_opt:收集直方图方法,如果是稳定系统收集统计信息推荐使用repeat方式。 repeat表示以前收集过直方图,现在收集统计信息时候就收集统计信息;如果以前没有收集过直方图信息,现在统计信息收集时候就不收集直方图信息。

auto表示通过谓词过滤条件信息来自动判断是否收集统计信息

METHOD_OPT => 'FOR ALL COLUMNS SIZE REPEAT'

数据倾斜收集直方图信息例子:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS( 'sh','customers',
  METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY ' ||
                'FOR COLUMNS SIZE SKEWONLY (cust_state_province,country_id)' );
END;
/

degree: 并行度,根据cpu资源

cascade:表示收集表的统计信息时候同时收集索引信息

no_invalidate: 表示收集统计信息之后共享池中的相关SQL游标是否失效。设置为false表示立即失效。

表信息和列信息:

Table Statistics history:

TABLE_NAME           OWNER           VERSION SAVTIME              ANALYZETI     ROWCNT SAMPLESIZE       PERC     BLKCNT     AVGRLN                                                                                                                        
-------------------- --------------- ------- -------------------- --------- ---------- ---------- ---------- ---------- ----------                                                                                                                        
SALES                SH              CURRENT                      21-MAY-18     918843     918843        100       1876         29                                              

Table Columns info:

COLUMN_NAME                      NDV NUL  NUM_NULLS DATA_TYPE  LOW_VAL_25                HIGH_VAL_25               LAST_ANALYZED        HISTOGRAM                                   
------------------------- ---------- --- ---------- ---------- ------------------------- ------------------------- -------------------- ---------                                   
AMOUNT_SOLD                     3586 N            0 NUMBER     C10729                    C2125349                  2018-05-21 18:00:55  NONE                                        
CHANNEL_ID                         4 N            0 NUMBER     C103                      C10A                      2018-05-21 18:00:55  NONE                                        
CUST_ID                         7059 N            0 NUMBER     C103                      C30B0B                    2018-05-21 18:00:55  NONE                                        
PROD_ID                           72 N            0 NUMBER     C10E                      C20231                    2018-05-21 18:00:55  FREQUENCY                                   
PROMO_ID                           4 N            0 NUMBER     C122                      C20A64                    2018-05-21 18:00:55  NONE                                        
QUANTITY_SOLD                      1 N            0 NUMBER     C102                      C102                      2018-05-21 18:00:55  NONE                                        
TIME_ID                         1460 N            0 DATE       77C60101010101            78650C1F010101            2018-05-21 18:00:55  NONE    

method_opt如果是FOR COLUMNS SIZE SKEWONLY不管where条件是否有对应列都收集直方图信息,有点浪费cpu资源,而且也可能导致绑定变量窥探

method_opt如果是FOR COLUMNS SIZE AUTO的话,只要执行过sql,出现在where条件的列才收集直方图

FLUSH_DATABASE_MONITORING_INFO Procedure

The database immediately updates corresponding entries in the *_TAB_MODIFICATIONS, *_TAB_STATISTICS and *_IND_STATISTICS views.

测试case:

create table test01 as select * from dba_objects;
column table_name format a20
column index_name format a20
select table_name, num_rows, blocks, avg_row_len, to_char(last_analyzed, 'YYYY/MM/DD HH24:MI:SS'), stattype_locked, stale_stats from dba_tab_statistics where table_name = 'TEST01';


CTAS: 加载数据时候会自动收集统计信息:

TABLE_NAME             NUM_ROWS     BLOCKS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ STATT STA
-------------------- ---------- ---------- ----------- ------------------- ----- ---
TEST01                    91857       1576         115 2021/07/06 22:07:26       NO

exec dbms_stats.flush_database_monitoring_info; -- 保存现有统计信息
select count(*) from test01;
delete from test01 where rownum <= 91857*0.11;


LYN@DB01> select count(*) from test01;

  COUNT(*)
----------
     91857

LYN@DB01> delete from test01 where rownum <= 91857*0.11;

10104 rows deleted.

LYN@DB01> 

删除了11%的数据量,统计信息依旧,stale_statsYES表示失效了

select table_name, num_rows, blocks, avg_row_len, to_char(last_analyzed, 'YYYY/MM/DD HH24:MI:SS'), stattype_locked, stale_stats from dba_tab_statistics where table_name = 'TEST01';


TABLE_NAME             NUM_ROWS     BLOCKS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ STATT STA
-------------------- ---------- ---------- ----------- ------------------- ----- ---
TEST01                    91857       1576         115 2021/07/06 22:07:26       YES

检查哪些对象是没有收集统计信息或者哪些对象统计信息失效一般通过下面条件来过滤和判断

超过10%以上收据有变更就表示失效

(stale_stats='YES' or last_analyzed is null)

查看全部对象的统计信息情况:

column table_name format a30
column owner format a20
SELECT OWNER,TABLE_NAME,OBJECT_TYPE,STALE_STATS,LAST_ANALYZED 
FROM DBA_TAB_STATISTICS 
WHERE (STALE_STATS='YES' OR LAST_ANALYZED IS NULL)
AND OWNER NOT IN 
('SYS', 'SYSTEM', 'SYSMAN', 'DMSYS', 'OLAPSYS', 'XDB','EXFSYS', 'CTXSYS','WMSYS', 'DBSNMP', 'ORDSYS', 'OUTLN', 'TSMSYS', 'MDSYS') AND TABLE_NAME NOT LIKE 'BIN%';

除了stale_stats和last_analyzed,还需要关注统计信息收集的采样率:ROWCNT/SAMPLESIZE/PERC

另外一个知识点:

stattype_locked : all表示对象统计信息锁住了。

select table_name from user_tab_statistics where stattype_locked is not null
exec dbms_stats.unlock_table_stats('user_name','table_name');
->更新统计信息之后再锁住
exec dbms_stats.lock_table_stats('user_name','table_name');

另外一个知识点:

Oracle 11g 增加了系统信息的备份和恢复功能,分别是在系统级别,用户级别,表级别 当我们对oracle 系统对象进行统计信息收集时,为了防止新的统计信息之后的执行计划不如之前的,我们可以利用下面package进行导入导出:

dbms_stats.export_(database/schema/table)_stats --导出统计信息到创建的表中

dbms_stats.import_(database/schema/table)_stats --导入统计信息到系统中

-- 数据库的统计信息备份恢复(只能是sys用户下)
exec dbms_stats.create_stat_table('sys','stat_sys_tab');
exec dbms_stats.export_database_stats('stat_sys_tab');
exec dbms_stats.import_database_stats('stat_sys_tab');

-- schema的统计信息备份恢复(只能在方案schema的用户下,user01为用户名)
exec dbms_stats.create_stat_table('user01','stat_user01_tab');
exec dbms_stats.export_schema_stats('user01','stat_user01_tab');
exec dbms_stats.import_schema_stats('user01','stat_user01_tab');

-- 表的统计信息备份恢复(只能在表的用户下) TEST1表
exec dbms_stats.create_stat_table('user01','stat_user01_tab');
exec dbms_stats.export_table_stats('user01','TEST1',null,'stat_user01_tab');
exec dbms_stats.import_table_stats('user01','TEST1',null,'stat_user01_tab');

网上相关脚本:

[DBA任务—确保统计信息准确性]

sosi.txt

脚本:sosi.sql

@sosi schema_name table_name

set echo off
set scan on
set lines 277
set pages 9999
set verify off
set feedback off
set termout off
column uservar new_value Table_Owner noprint
select user uservar from dual;
set termout on
--column TABLE_NAME heading "Tables owned by &Table_Owner" format a30
--select table_name from dba_tables where owner=upper('&Table_Owner') order by 1;

undefine table_name
undefine owner
--accept owner prompt 'Please enter Name of Table Owner (Null = &Table_Owner): '
--accept table_name  prompt 'Please enter Table Name to show Statistics for: '
set termout off
column uservar1 new_value Owner noprint
column uservar2 new_value Table_name noprint
select '&1' uservar1 , '&2' uservar2 from dual;
set termout on
set newp 0

column TABLE_NAME heading "Table|Name" format a15
column PARTITION_NAME heading "Partition|Name" format a15
column SUBPARTITION_NAME heading "SubPartition|Name" format a15
column NUM_ROWS heading "Number|of Rows" format 9,999,999,990
column BLOCKS heading "Blocks" format 999,999,990
column EMPTY_BLOCKS heading "Empty|Blocks" format 999,999,990

column AVG_SPACE heading "Average|Space" format 999,990
column CHAIN_CNT heading "Chain|Count" format 999,990
column AVG_ROW_LEN heading "Average|Row Len" format 999,990
column COLUMN_NAME  heading "Column|Name" format a25
column NULLABLE heading Null|able format a4
column NUM_DISTINCT heading "Distinct|Values" format 999,999,990
column NUM_NULLS heading "Number|Nulls" format 9,999,990
column NUM_BUCKETS heading "Number|Buckets" format 990
column DENSITY heading "Density" format 0.09999999
column INDEX_NAME heading "Index|Name" format a25
column UNIQUENESS heading "Unique" format a9
column BLEV heading "B|Tree|Level" format 90
column LEAF_BLOCKS heading "Leaf|Blks" format 99000
column DISTINCT_KEYS heading "Distinct|Keys" format 9,999,999,990
column AVG_LEAF_BLOCKS_PER_KEY heading "Average|Leaf Blocks|Per Key" format 99,990
column AVG_DATA_BLOCKS_PER_KEY heading "Average|Data Blocks|Per Key" format 99,990
column CLUSTERING_FACTOR heading "Cluster|Factor" format 999,999,990
column COLUMN_POSITION heading "Col|Pos" format 990
column col heading "Column|Details" format a24
column COLUMN_LENGTH heading "Col|Len" format 9,990
column GLOBAL_STATS heading "Global|Stats" format a6
column USER_STATS heading "User|Stats" format a6
column SAMPLE_SIZE heading "Sample|Size" format 9,999,999,990
--column to_char(t.last_analyzed,'YYYY-MM-DD') heading "Last_Analyzed|MM-DD-YYYY" format a10
--column to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS') heading "Last_Analyzed|YYYY-MM-DD" format a20
--column LAST_ANALYZED heading "Last_Analyzed|YYYY-MM-DD" format a20

prompt
prompt ******************************************
prompt Table Level  Parameter: schema tablename
prompt ******************************************
prompt
select
    TABLE_NAME,
    NUM_ROWS,
    BLOCKS,
    EMPTY_BLOCKS,
    AVG_SPACE,
    CHAIN_CNT,
    AVG_ROW_LEN,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
        t.last_analyzed
--    to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from dba_tables t
where
    owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
/
prompt
select
    COLUMN_NAME,
    decode(t.DATA_TYPE,
           'NUMBER',t.DATA_TYPE||'('||
           decode(t.DATA_PRECISION,
                  null,t.DATA_LENGTH||')',
                  t.DATA_PRECISION||','||t.DATA_SCALE||')'),
                  'DATE',t.DATA_TYPE,
                  'LONG',t.DATA_TYPE,
                  'LONG RAW',t.DATA_TYPE,
                  'ROWID',t.DATA_TYPE,
                  'MLSLABEL',t.DATA_TYPE,
                  t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '||
    decode(t.nullable,
              'N','NOT NULL',
              'n','NOT NULL',
              NULL) col,
    NUM_DISTINCT,
    DENSITY,
    NUM_BUCKETS,
    NUM_NULLS,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
        t.last_analyzed,
        HISTOGRAM
--    to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from dba_tab_columns t
where
    table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
/

prompt

select
    INDEX_NAME,
    UNIQUENESS,
    BLEVEL BLev,
    LEAF_BLOCKS,
    DISTINCT_KEYS,
    NUM_ROWS,
    AVG_LEAF_BLOCKS_PER_KEY,
    AVG_DATA_BLOCKS_PER_KEY,
    CLUSTERING_FACTOR,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from
    dba_indexes t
where
    table_name = upper('&Table_name')
and table_owner = upper(nvl('&Owner',user))
/
prompt
break on index_name
select
    i.INDEX_NAME,
    i.COLUMN_NAME,
    i.COLUMN_POSITION,
    decode(t.DATA_TYPE,
           'NUMBER',t.DATA_TYPE||'('||
           decode(t.DATA_PRECISION,
                  null,t.DATA_LENGTH||')',
                  t.DATA_PRECISION||','||t.DATA_SCALE||')'),
                  'DATE',t.DATA_TYPE,
                  'LONG',t.DATA_TYPE,
                  'LONG RAW',t.DATA_TYPE,
                  'ROWID',t.DATA_TYPE,
                  'MLSLABEL',t.DATA_TYPE,
                  t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '||
           decode(t.nullable,
                  'N','NOT NULL',
                  'n','NOT NULL',
                  NULL) col
from
    dba_ind_columns i,
    dba_tab_columns t
where
    i.table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
and i.table_owner=t.owner
and i.table_name = t.table_name
and i.column_name = t.column_name
order by index_name,column_position
/

prompt
prompt ***************
prompt Partition Level
prompt ***************

select
    PARTITION_NAME,
    NUM_ROWS,
    BLOCKS,
    EMPTY_BLOCKS,
    AVG_SPACE,
    CHAIN_CNT,
    AVG_ROW_LEN,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'MM-DD-YYYY')
from
    dba_tab_partitions t
where
    table_owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
order by partition_position
/


break on partition_name
select
    PARTITION_NAME,
    COLUMN_NAME,
    NUM_DISTINCT,
    DENSITY,
    NUM_BUCKETS,
    NUM_NULLS,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'MM-DD-YYYY')
from
    dba_PART_COL_STATISTICS t
where
    table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
/

break on partition_name
select
    t.INDEX_NAME,
    t.PARTITION_NAME,
    t.BLEVEL BLev,
    t.LEAF_BLOCKS,
    t.DISTINCT_KEYS,
    t.NUM_ROWS,
    t.AVG_LEAF_BLOCKS_PER_KEY,
    t.AVG_DATA_BLOCKS_PER_KEY,
    t.CLUSTERING_FACTOR,
    t.GLOBAL_STATS,
    t.USER_STATS,
    t.SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from
    dba_ind_partitions t,
    dba_indexes i
where
    i.table_name = upper('&Table_name')
and i.table_owner = upper(nvl('&Owner',user))
and i.owner = t.index_owner
and i.index_name=t.index_name
/


prompt
prompt ***************
prompt SubPartition Level
prompt ***************

select
    PARTITION_NAME,
    SUBPARTITION_NAME,
    NUM_ROWS,
    BLOCKS,
    EMPTY_BLOCKS,
    AVG_SPACE,
    CHAIN_CNT,
    AVG_ROW_LEN,
    GLOBAL_STATS,
    USER_STATS,
    SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from
    dba_tab_subpartitions t
where
    table_owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
order by SUBPARTITION_POSITION
/
break on partition_name
select
    p.PARTITION_NAME,
    t.SUBPARTITION_NAME,
    t.COLUMN_NAME,
    t.NUM_DISTINCT,
    t.DENSITY,
    t.NUM_BUCKETS,
    t.NUM_NULLS,
    t.GLOBAL_STATS,
    t.USER_STATS,
    t.SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from
    dba_SUBPART_COL_STATISTICS t,
    dba_tab_subpartitions p
where
    t.table_name = upper('&Table_name')
and t.owner = upper(nvl('&Owner',user))
and t.subpartition_name = p.subpartition_name
and t.owner = p.table_owner
and t.table_name=p.table_name
/

break on partition_name
select
    t.INDEX_NAME,
    t.PARTITION_NAME,
    t.SUBPARTITION_NAME,
    t.BLEVEL BLev,
    t.LEAF_BLOCKS,
    t.DISTINCT_KEYS,
    t.NUM_ROWS,
    t.AVG_LEAF_BLOCKS_PER_KEY,
    t.AVG_DATA_BLOCKS_PER_KEY,
    t.CLUSTERING_FACTOR,
    t.GLOBAL_STATS,
    t.USER_STATS,
    t.SAMPLE_SIZE,
        t.last_analyzed
    --to_char(t.last_analyzed,'YYYY-MM-DD HH24:MI:SS')
from
    dba_ind_subpartitions t,
    dba_indexes i
where
    i.table_name = upper('&Table_name')
and i.table_owner = upper(nvl('&Owner',user))
and i.owner = t.index_owner
and i.index_name=t.index_name
/

prompt
clear breaks
set echo off

Have a good work&life! 2021/07 via LinHong


Similar Posts

Comments