一、什么是统计信息
oracle数据库里的统计信息是如下的一组数据:他们存储在数据字典里,且从多个维度描述了oracle数据库数据对象的详细信息。
oracle数据库里的统计信息主要分为以下6种情况:
(1)表的统计信息。
(2)列的统计信息。
(3)索引的统计信息。
(4)系统统计信息。
(5)数据字典统计信息。
(6)内部对象统计信息。
二、oracle收集和查看统计信息的方法
oracle数据库收集统计信息一般有以下2种方法:
(1)analyze命令。
(2)dbms_stats包。
针对以上6种统计信息,其中“表的统计信息”,“索引统计信息”,“列统计信息”,“数据字典统计信息”使用analyze或dbms_stats包收集均可以,但是“系统统计信息”和“内部对象统计信息”必须要dbms_stats包来收集才可以。
1、使用analyze命令收集统计信息
从oralce7开始,analyze命令就用来收集表、索引和列的统计信息。从oracle10g开始,创建索引后oracle会自动为您收集目标索引统计信息。analyze命令收集统计信息不会抹掉之间analyze结果。
创建测试表:
SQL>create table t1 as select * from dba_objects; SQL>create index idx_t1 on t1(object_id);
(1)analyze索引统计信息:
SQL>analyze index idx_t1 delete statistics;
(2)对表收集统计信息,并且以估算模式,采样比为15%:
SQL>analyze table t1 estimate statistics sample 15 percent for table;
(3)对表收集统计信息,以统计模式:
SQL>analyze table t1 compute statistics for table;
(4)对列收集统计信息,以计算模式:
SQL>analyze table t1 compute statistics for columns object_name,object_id;
(5)以计算模式对表和列同时收集统计信息:
SQL>analyze table t1 compute statistics for t1 for columns object_name,object_id;
(6)以计算模式对索引收集统计信息:
SQL>analyze index idx_t1 compute statistics;
(7)删除表、表上的索引、表的所有列的统计信息:
SQL>analyze table t1 delete statistics;
(8)以计算模式,同时收集表、表上的列、表上的索引的统计信息:
SQL>analyze table t1 compute statistics;
2、使用dbms_stats包收集统计信息
从oracle 8.1.5开始,dbms_stats包就被广泛用于统计信息的收集,用dbms_stats包收集统计信息也是oracle官方推荐的方式。在收集CBO所需要的统计信息方面,可以简单的将dbms_stats包理解成是analyze命令的增强版。
DBMS_STATS包最常见的4个存储过程:
(1)dbms_stats.gather_table_stats:用于收集目标表,目标表上列及目标表上索引的统计信息。
(2)dbms_stats.gather_index_stats:用于收集指定索引的统计信息。
(3)dbms_stats.gather_schema_stats:用于收集schema下所有对象的统计信息。
(4)dbms_stats.gather_database_stats:用于收集全库统计对象的统计信息。
以下是dbms_stats包的具体用法:
(1)对表收集统计信息,并且以估算模式,采样比为15%:
SQL>exec dbms_stats.gather_table_stats(ownname=>'SCOTT',tabname=>'T1',estimate_percent=>15,method_opt=>'FOR TABLE',cascade=>FALSE);
注意:method_opt参数指定了FOR TABLE不是在所有版本oracle下都是好用的。
(2)对表收集统计信息,以计算模式:
SQL>exec dbms_stats.gather_table_stats(ownname=>'SCOTT',tabname=>'T1',estimate_percent=>100,method_opt=>'FOR TABLE',cascade=>FALSE);
或
SQL>exec dbms_stats.gather_table_stats(ownname=>'SCOTT',tabname=>'T1',estimate_percent=>NULL,method_opt=>'FOR TABLE',cascade=>FALSE);
(3)对列收集统计信息,以计算模式:
SQL>exec dbms_stats.gather_table_stats(ownname=>'SCOTT',tabname=>'T1',estimate_percent=>100,method_opt=>'FOR ALL CULUMNS SIZE 1 OBJECT_NAME OBJECT_ID',cascade=>FALSE);
注意:以上方法收集了列objec_name、object_id的统计信息,同时也会收集表的统计信息。
(4)以计算模式对索引收集统计信息:
SQL>exec dbms_stats.gather_index_stats(ownname=>'SCOTT',indname=>'INDEX_T1',estimate_percent=>100);
(5)删除表、表上的索引、表的所有列的统计信息:
SQL>exec dbms_stats.delete_table_stats(ownname=>'SCOTT',tabname=>'T1');
(6)以计算模式,同时收集表、表上的列、表上的索引的统计信息:
SQL>exec dbms_stats.gather_table_stats(ownname=>'SCOTT',tabname=>'T1',estimate_percent=>15 ,cascade=>TRUE);
3、analyze和dbms_stats的区别
(1)analyze命令不能正确的收集分区表的统计信息,而dbms_stats包缺可以。
(2)analyze命令不能以并行收集统计信息,而dbms_stats包缺可以。
SQL>exec dbms_stats.gather_table_stats(ownname=>'SCOTT',tabname=>'T1',estimate_percent=>100, cascade=>FALSE,degree=>4);
(3)dbms_stats包只能收集与CBO相关的统计信息,而与CBO无关的额外信息,比如行迁移/行链接的数量(chain_cnt),校验表和索引的结构信息等,dbms_stats包就无能为力了,而analyze命令是可以用来分析和收集上述额外信息。比如:
SQL>analyze table XXX list chained rows into YYY; --用来分析和收集行迁移/行链接的数量。 SQL>analyze index XXX validate structure; --用来分析索引结构。
4、查看统计信息
oracle里的统计信息存储在数据字典表中,可以通过脚本来查询对象的统计信息。
sosi.sh脚本如下(可以查看表、索引、列的统计信息):
set echo off set scan on set lines 150 set pages 66 set verify off set feedback off set termout off column uservar new_value Table_Owner noprint select user uservar from dual; set termout on column TABLE_NAME heading "Tables owned by &Table_Owner" format a30 select table_name from dba_tables where owner=upper('&Table_Owner') order by 1 / undefine table_name undefine owner prompt accept owner prompt 'Please enter Name of Table Owner (Null = &Table_Owner): ' accept table_name prompt 'Please enter Table Name to show Statistics for: ' column TABLE_NAME heading "Table|Name" format a15 column PARTITION_NAME heading "Partition|Name" format a15 column SUBPARTITION_NAME heading "SubPartition|Name" format a15 column NUM_ROWS heading "Number|of Rows" format 9,999,999,990 column BLOCKS heading "Blocks" format 999,990 column EMPTY_BLOCKS heading "Empty|Blocks" format 999,999,990 column AVG_SPACE heading "Average|Space" format 9,990 column CHAIN_CNT heading "Chain|Count" format 999,990 column AVG_ROW_LEN heading "Average|Row Len" format 990 column COLUMN_NAME heading "Column|Name" format a25 column NULLABLE heading Null|able format a4 column NUM_DISTINCT heading "Distinct|Values" format 999,999,990 column NUM_NULLS heading "Number|Nulls" format 9,999,990 column NUM_BUCKETS heading "Number|Buckets" format 990 column DENSITY heading "Density" format 990 column INDEX_NAME heading "Index|Name" format a15 column UNIQUENESS heading "Unique" format a9 column BLEV heading "B|Tree|Level" format 90 column LEAF_BLOCKS heading "Leaf|Blks" format 990 column DISTINCT_KEYS heading "Distinct|Keys" format 9,999,999,990 column AVG_LEAF_BLOCKS_PER_KEY heading "Average|Leaf Blocks|Per Key" format 99,990 column AVG_DATA_BLOCKS_PER_KEY heading "Average|Data Blocks|Per Key" format 99,990 column CLUSTERING_FACTOR heading "Cluster|Factor" format 999,999,990 column COLUMN_POSITION heading "Col|Pos" format 990 column col heading "Column|Details" format a24 column COLUMN_LENGTH heading "Col|Len" format 9,990 column GLOBAL_STATS heading "Global|Stats" format a6 column USER_STATS heading "User|Stats" format a6 column SAMPLE_SIZE heading "Sample|Size" format 9,999,999,990 column to_char(t.last_analyzed,'MM-DD-YYYY') heading "Date|MM-DD-YYYY" format a10 prompt prompt *********** prompt Table Level prompt *********** prompt select TABLE_NAME, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN, GLOBAL_STATS, USER_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_tables t where owner = upper(nvl('&&Owner',user)) and table_name = upper('&&Table_name') / select COLUMN_NAME, decode(t.DATA_TYPE, 'NUMBER',t.DATA_TYPE||'('|| decode(t.DATA_PRECISION, null,t.DATA_LENGTH||')', t.DATA_PRECISION||','||t.DATA_SCALE||')'), 'DATE',t.DATA_TYPE, 'LONG',t.DATA_TYPE, 'LONG RAW',t.DATA_TYPE, 'ROWID',t.DATA_TYPE, 'MLSLABEL',t.DATA_TYPE, t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '|| decode(t.nullable, 'N','NOT NULL', 'n','NOT NULL', NULL) col, NUM_DISTINCT, DENSITY, NUM_BUCKETS, NUM_NULLS, GLOBAL_STATS, USER_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_tab_columns t where table_name = upper('&Table_name') and owner = upper(nvl('&Owner',user)) / select INDEX_NAME, UNIQUENESS, BLEVEL BLev, LEAF_BLOCKS, DISTINCT_KEYS, NUM_ROWS, AVG_LEAF_BLOCKS_PER_KEY, AVG_DATA_BLOCKS_PER_KEY, CLUSTERING_FACTOR, GLOBAL_STATS, USER_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_indexes t where table_name = upper('&Table_name') and table_owner = upper(nvl('&Owner',user)) / break on index_name select i.INDEX_NAME, i.COLUMN_NAME, i.COLUMN_POSITION, decode(t.DATA_TYPE, 'NUMBER',t.DATA_TYPE||'('|| decode(t.DATA_PRECISION, null,t.DATA_LENGTH||')', t.DATA_PRECISION||','||t.DATA_SCALE||')'), 'DATE',t.DATA_TYPE, 'LONG',t.DATA_TYPE, 'LONG RAW',t.DATA_TYPE, 'ROWID',t.DATA_TYPE, 'MLSLABEL',t.DATA_TYPE, t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '|| decode(t.nullable, 'N','NOT NULL', 'n','NOT NULL', NULL) col from dba_ind_columns i, dba_tab_columns t where i.table_name = upper('&Table_name') and owner = upper(nvl('&Owner',user)) and i.table_name = t.table_name and i.column_name = t.column_name order by index_name,column_position / prompt prompt *************** prompt Partition Level prompt *************** select PARTITION_NAME, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN, GLOBAL_STATS, USER_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_tab_partitions t where table_owner = upper(nvl('&&Owner',user)) and table_name = upper('&&Table_name') order by partition_position / break on partition_name select PARTITION_NAME, COLUMN_NAME, NUM_DISTINCT, DENSITY, NUM_BUCKETS, NUM_NULLS, GLOBAL_STATS, USER_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_PART_COL_STATISTICS t where table_name = upper('&Table_name') and owner = upper(nvl('&Owner',user)) / break on partition_name select t.INDEX_NAME, t.PARTITION_NAME, t.BLEVEL BLev, t.LEAF_BLOCKS, t.DISTINCT_KEYS, t.NUM_ROWS, t.AVG_LEAF_BLOCKS_PER_KEY, t.AVG_DATA_BLOCKS_PER_KEY, t.CLUSTERING_FACTOR, t.GLOBAL_STATS, t.USER_STATS, t.SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_ind_partitions t, dba_indexes i where i.table_name = upper('&Table_name') and i.table_owner = upper(nvl('&Owner',user)) and i.owner = t.index_owner and i.index_name=t.index_name / prompt prompt *************** prompt SubPartition Level prompt *************** select PARTITION_NAME, SUBPARTITION_NAME, NUM_ROWS, BLOCKS, EMPTY_BLOCKS, AVG_SPACE, CHAIN_CNT, AVG_ROW_LEN, GLOBAL_STATS, USER_STATS, SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_tab_subpartitions t where table_owner = upper(nvl('&&Owner',user)) and table_name = upper('&&Table_name') order by SUBPARTITION_POSITION / break on partition_name select p.PARTITION_NAME, t.SUBPARTITION_NAME, t.COLUMN_NAME, t.NUM_DISTINCT, t.DENSITY, t.NUM_BUCKETS, t.NUM_NULLS, t.GLOBAL_STATS, t.USER_STATS, t.SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_SUBPART_COL_STATISTICS t, dba_tab_subpartitions p where t.table_name = upper('&Table_name') and t.owner = upper(nvl('&Owner',user)) and t.subpartition_name = p.subpartition_name and t.owner = p.table_owner and t.table_name=p.table_name / break on partition_name select t.INDEX_NAME, t.PARTITION_NAME, t.SUBPARTITION_NAME, t.BLEVEL BLev, t.LEAF_BLOCKS, t.DISTINCT_KEYS, t.NUM_ROWS, t.AVG_LEAF_BLOCKS_PER_KEY, t.AVG_DATA_BLOCKS_PER_KEY, t.CLUSTERING_FACTOR, t.GLOBAL_STATS, t.USER_STATS, t.SAMPLE_SIZE, to_char(t.last_analyzed,'MM-DD-YYYY') from dba_ind_subpartitions t, dba_indexes i where i.table_name = upper('&Table_name') and i.table_owner = upper(nvl('&Owner',user)) and i.owner = t.index_owner and i.index_name=t.index_name / clear breaks set echo on
附:查看表历史收集的统计信息情况
SELECT b.OWNER, b.OBJECT_NAME TABLE_NAME, TO_CHAR(a.ANALYZETIME, 'YYYY-MM-DD HH24:MI:SS') LAST_ANALYZETIME, TO_CHAR(a.SAVTIME, 'YYYY-MM-DD HH24:MI:SS') CURR_ANALYZETIME, a.ROWCNT FROM SYS.WRI$_OPTSTAT_TAB_HISTORY a, DBA_OBJECTS b WHERE a.OBJ# = b.OBJECT_ID AND b.OBJECT_NAME ='ZB_WHOLE_ORDERS_KAFKA_DISPATCH' and b.OWNER='ZJOPEN' ORDER BY a.OBJ#, a.SAVTIME;