Thursday, May 18, 2017

A Tour To AWR Configuration | Disabling AWR data purging


Question:  I want to keep my AWR data long-term and disable the jobs that periodically purge the AWR snapshots.  What are the best practices for keeping AWR data for many months and years?
Answer:  There are several ways to keep long-term AWR performance data.  Keeping AWR data long-term involves these steps:
1 - Truncate the SQL tables periodically
2 - Modify the snapshot retention period
3 - Remove the automatic flush job

One question is whether or not to compress the AWR data by removing the SQL table data periodically.

Compressing AWR data

If you choose to compress the AWR data, you will need to periodically truncate these "unnecessary" tables.  These tables contain SQL that is rarely useful after a month.  Note:  You must be connected as SYSDBA to truncate these tables.
  • wrh$_sqlstat contains a history for SQL execution statistics and stores snapshots of v$sql view.
  • wrh$_sqltext stores actual text for SQL statements captured from v$sql.
  • wrh$_sql_plan stores execution plans for SQL statements available in dba_hist_sqlstat.
connect sys/manager as sysdba;
truncate table wrh$_sqlstat;
truncate table wrh$_sqltext;
truncate table wrh$_sql_plan;

You can schedule this periodic truncate of the SQL tables using either dbms_scheduler or with a crontab job.

Change retention period for snapshots:

The AWR has a special background process, MMON, which is responsible for gathering regular snapshots. The DBA is able to specify the frequency at which MMON gathers snapshots via the dbms_workload_repository.modify_snapshot_settings procedure:

SQL> desc dbms_workload_repository
PROCEDURE MODIFY_SNAPSHOT_SETTINGS
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 RETENTION                      NUMBER                  IN     DEFAULT
 INTERVAL                       NUMBER                  IN     DEFAULT
 DBID                           NUMBER                  IN     DEFAULT
The interval parameter sets the time interval, in minutes, between the snapshots. The default interval between snapshots is 60 minutes. The valid range of values for this parameter ranges from 10 minutes to 52,560,000 minutes (100 years).

The dbms_workload_repository package has the global variables min_interval and max_interval, which set the lower and upper limits for this parameter. If the value specified for the interval is zero, automatic and manual snapshots will be prohibited.
The first dbms_workload_repository.modify_snapshot_settings parameter, retention, allows the DBA to specify, in minutes, how long the AWR will preserve each snapshot in the repository. The default value for this parameter is 10,080 minutes (seven days).

The valid range of values for this parameter also ranges from 10 minutes to 52,560,000 minutes (100 years). The dbms_workload_repository package has global variables min_retention and max_retention, which set up the lower and upper limits for the retention parameter.
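As a quick sanity check on these units, the conversions quoted above are easy to verify. Here is a small Python sketch (illustrative arithmetic only, not Oracle code):

```python
# Convert a human-friendly duration to the minutes expected by
# dbms_workload_repository.modify_snapshot_settings.
def to_minutes(days=0, hours=0, minutes=0):
    return days * 24 * 60 + hours * 60 + minutes

# Values quoted in the text:
print(to_minutes(hours=1))         # default interval: 60 minutes
print(to_minutes(days=7))          # default retention: 10080 minutes
print(to_minutes(days=100 * 365))  # upper bound: 52,560,000 minutes
```

The 45-day retention used later in this post (64,800 minutes) falls out of the same formula.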

NOTE:  If a zero value is specified for retention, snapshots will be stored for an unlimited time.
The current settings for the AWR retention and interval parameters can be viewed using the dba_hist_wr_control data dictionary view.

select
   extract( day from snap_interval) *24*60+
   extract( hour from snap_interval) *60+
   extract( minute from snap_interval ) "Snapshot Interval",
   extract( day from retention) *24*60+
   extract( hour from retention) *60+
   extract( minute from retention ) "Retention Interval"
from
   dba_hist_wr_control;
 This script returns the current AWR interval values in minutes:
Snapshot Interval Retention Interval
----------------- ------------------
60                10080
Here is an example of modifying the snapshot collection settings (a retention of zero keeps snapshots indefinitely):
exec dbms_workload_repository.modify_snapshot_settings(retention => 0);
AWR flushing and the MMON background process
The MMON Oracle background process is responsible for periodically purging the oldest AWR data, first-in, first-out. Here, we see the flush_level for an AWR installation:
SQL> desc dbms_workload_repository

PROCEDURE CREATE_SNAPSHOT
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 FLUSH_LEVEL                    VARCHAR2                IN     DEFAULT

FUNCTION CREATE_SNAPSHOT RETURNS NUMBER
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 FLUSH_LEVEL                    VARCHAR2                IN     DEFAULT
The only parameter listed in the procedures is the flush_level , which can have either the default value of TYPICAL or a value of ALL. When the statistics level is set to ALL, the AWR gathers the maximum amount of performance data.
The MMON background process is responsible for removing old historical data from the AWR. The amount of retention time after which data will be removed from database is determined by the retention setting.
However, data can be cleared from the AWR tables by using the dbms_workload_repository.drop_snapshot_range procedure, supplying the starting and ending snapshot IDs for the history to be removed from the AWR:
desc dbms_workload_repository

PROCEDURE DROP_SNAPSHOT_RANGE
 Argument Name                  Type                    In/Out Default?
 ------------------------------ ----------------------- ------ --------
 LOW_SNAP_ID                    NUMBER                  IN
 HIGH_SNAP_ID                   NUMBER                  IN
 DBID                           NUMBER                  IN     DEFAULT


There are a number of AWR views (all starting with DBA_HIST_) which I will go into in other posts, but the one we are interested in for displaying the AWR settings is DBA_HIST_WR_CONTROL. This view has only four columns, described later in this post.

Using dbms_workload_repository.modify_snapshot_settings one can modify retention, interval and topnsql.
-- get the dbid, which is needed to pass to dbms_workload_repository.modify_snapshot_settings
SQL> select dbid from v$database;

      DBID
----------
1992878807

-- retention => value in minutes (45 days * 24 hours per day * 60 minutes per hour = 64800);
--   passing a value of 0 means snapshots are kept forever
-- interval => 60 (snap at this interval, in minutes); a value of 0 will turn off AWR snapshots
-- topnsql => top N SQL size; specifying NULL will keep the current setting
SQL> exec dbms_workload_repository.modify_snapshot_settings(retention=>64800, interval=>60, topnsql=>100, dbid=>1992878807);
PL/SQL procedure successfully completed.
-- shows retention and interval after the modification
SQL> select extract( day from snap_interval) *24*60+extract( hour from snap_interval) *60+extract( minute from snap_interval ) snapshot_interval,
     extract( day from retention) *24*60+extract( hour from retention) *60+extract( minute from retention ) retention_interval,
     topnsql
     from dba_hist_wr_control;

SNAPSHOT_INTERVAL RETENTION_INTERVAL TOPNSQL
----------------- ------------------ ----------
               60              64800 100
-- Change snapshot interval to 30 minutes
SQL> EXEC dbms_workload_repository.modify_snapshot_settings(interval=>30);
PL/SQL procedure successfully completed.

SQL> desc dba_hist_wr_control
 Name                           Null?    Type
 ------------------------------ -------- ----------------------------
 DBID                           NOT NULL NUMBER
 SNAP_INTERVAL                  NOT NULL INTERVAL DAY(5) TO SECOND(1)
 RETENTION                      NOT NULL INTERVAL DAY(5) TO SECOND(1)
 TOPNSQL                                 VARCHAR2(10)
If we select from it we can see the defaults
SQL> set linesize 100
SQL> col snap_interval format a20
SQL> col retention format a20
SQL> col topnsql format a20
SQL> select * from dba_hist_wr_control;
      DBID SNAP_INTERVAL        RETENTION            TOPNSQL
---------- -------------------- -------------------- --------------------
3566081556 +00000 01:00:00.0    +00007 00:00:00.0    DEFAULT
So, the defaults are automatic snapshots are taken every 1 hour and snapshots are kept for 7 days.
So, what if we want to change these? Oracle has supplied a package, DBMS_WORKLOAD_REPOSITORY, which is used to take manual snapshots, create baselines and, most importantly for us, change snapshot settings. For example, if we wanted to take snapshots every 10 minutes and keep them for 10 days, we would issue the following SQL:
execute dbms_workload_repository.modify_snapshot_settings
( interval => 10,
retention => 14400);
All the values are in minutes; for the retention, 14400 minutes is 10 days. If we now look at dba_hist_wr_control we can see our change has been reflected:
SQL> select * from dba_hist_wr_control;
      DBID SNAP_INTERVAL        RETENTION            TOPNSQL
---------- -------------------- -------------------- --------------------
3566081556 +00000 00:10:00.0    +00010 00:00:00.0    DEFAULT
It is always worth reviewing the size of your SYSAUX tablespace before you make any changes. Oracle has provided a handy script for this, called utlsyxsz.sql (held in $ORACLE_HOME/rdbms/admin), which will estimate the required size for the SYSAUX tablespace based on the retention period and snapshot interval you are going to use, plus a few other variables such as the number of active sessions and the number of tables.
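Before reaching for utlsyxsz.sql you can also do the dominant piece of the arithmetic yourself: the number of snapshots kept online is simply retention divided by interval, and total AWR space scales roughly with that count. A hedged Python sketch (the per-snapshot size is a made-up placeholder, not an Oracle figure):

```python
def snapshots_retained(retention_min, interval_min):
    # How many snapshots are online at any one time.
    return retention_min // interval_min

def rough_awr_mb(retention_min, interval_min, mb_per_snapshot):
    # mb_per_snapshot is an assumption; measure it from your own
    # SYSAUX growth rather than trusting this placeholder.
    return snapshots_retained(retention_min, interval_min) * mb_per_snapshot

print(snapshots_retained(14400, 10))   # the 10-day/10-minute example: 1440
print(snapshots_retained(10080, 60))   # the defaults: 168
```

Use utlsyxsz.sql for the authoritative estimate; this only shows why a short interval combined with a long retention multiplies space usage so quickly.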
Disabling and enabling automatic AWR flushing
You can disable the AWR automatic data flushing mechanism at either the system level or disable flushing for individual tables.
For system-wide disable of AWR table flushing you can use these commands but beware that the AWR tablespace will continue to grow ad-infinitum:
alter session set events 'immediate trace name awr_flush_table_off level 99';
alter session set events 'immediate trace name awr_flush_table_off level 106';
To return to the default of weekly flushing you can issue these commands:
alter session set events 'immediate trace name awr_flush_table_on level 99';
alter session set events 'immediate trace name awr_flush_table_on level 106';
If you need to disable flushing the run time statistics for an AWR workload table, you can get the underlying WRH tables with this query:
select
   table_id_kewrtb,
   table_name_kewrtb
from
   x$kewrtb
order by
   table_id_kewrtb;
Once you identify specific AWR tables to disable flushing, you can use an ALTER SYSTEM command:
alter system set "_awr_disabled_flush_tables"=WRH$_FILESTATXS_BL;
alter system set "_awr_disabled_flush_tables"=WRH$_TEMPSTATXS;
alter system set "_awr_disabled_flush_tables"=WRH$_DATAFILE;
alter system set "_awr_disabled_flush_tables"=WRH$_TEMPFILE;
alter system set "_awr_disabled_flush_tables"=WRH$_COMP_IOSTAT;
alter system set "_awr_disabled_flush_tables"=WRH$_SQLSTAT_BL;
You can run a script like this to mark all of the important AWR tables:
WARNING: Setting undocumented parameters requires notifying MOSC.
set pages 9999;
spool runme.sql
select
   'alter system set "_awr_disabled_flush_tables"='||table_name||';'
from
   dba_tables
where
   tablespace_name = 'SYSAUX'
and
   table_name like 'WRH$%';
spool off
@runme.sql
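The spooled script above is plain string generation: one ALTER SYSTEM statement per WRH$ table name returned by the dba_tables query. The same logic, sketched in Python with illustrative table names (not a definitive list):

```python
# Build the ALTER SYSTEM statements that the spooled runme.sql contains.
def build_disable_flush_statements(table_names):
    template = 'alter system set "_awr_disabled_flush_tables"={};'
    return [template.format(name) for name in table_names]

for stmt in build_disable_flush_statements(["WRH$_DATAFILE", "WRH$_TEMPFILE"]):
    print(stmt)
```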
Removing and Disabling AWR Information
There may be times when a DBA might desire to disable AWR (Automatic Workload Repository).
One reason might be to avoid licensing issues, because AWR isn't included with either Standard Edition or the base Enterprise Edition license - it requires the optional (extra cost) Oracle Enterprise Manager (OEM) Diagnostics Pack. So even though your database automatically collects AWR data every sixty minutes and retains it for a week, you cannot legally use the Oracle-supplied PL/SQL packages (i.e. DBMS_WORKLOAD_REPOSITORY), the OEM screens for AWR, ADDM and ASH, or even the AWR data dictionary views (i.e. DBA_HIST_*) if you're not licensed.
If you query the DBA_HIST_* data dictionary views, you better have purchased the OEM Diagnostic pack! For those of you who prefer to directly access the SYS data dictionary tables – that means don’t even select from tables with names like WRM$*, WRH$* or WRI$*!
So assuming that you prefer to disable AWR so as not to accidentally (or purposefully) violate your Oracle licensing agreement, here are some ways to disable AWR for a given database (you’ll need to do one of these to every database you manage):
Many Ways to Disable AWR:
1. Download Meta-Link script dbms_awr.plb, compile this package, then execute the PL/SQL package dbms_awr.disable_awr() [Metalink Note 436386.1].
2. Set your init.ora parameter STATISTICS_LEVEL = BASIC
3. Execute the Oracle provided PL/SQL package: dbms_workload_repository.modify_snapshot_settings(interval=>0)
4. Execute the Oracle provided PL/SQL package: dbms_scheduler.disable('GATHER_STATS_JOB')
5. You can use Toad for #3: Main Menu->Database->Monitor->ADDM/AWR Reports screen, choose the Snapshot Management tab, set the interval to all zeroes, and then press the green checkmark in upper left corner to commit the change.
6. You can use Toad for #4: Main Menu->Schema Browser, choose the Sched. Job tab and disable the GATHER_STATS_JOB job.
7. You can use OEM for #3: Main Menu->Workload->Automatic Workload Repository, select the "Edit" button and then select the last radio group item labeled: Turn off Snapshot Collection, finally press OK
8. You can use OEM for #4: Main Menu->Scheduler->Jobs, select the data grid row for GATHER_STATS_JOB, choose the disable drop-down action, then finally press OK
9. Create your own database creation scripts (i.e. do not use DBCA) and make sure not to run the CATAWRTB.sql script [Note – Oracle upgrade process may undo this]
10. Run the $ORACLE_HOME\rdbms\admin\catnoawr.sql script to drop the AWR Repository tables [Note – Oracle upgrade process may undo this]
If you want to rebuild the AWR Repository Tables later, you need to perform the following:
– Execute (again) the script $ORACLE_HOME/rdbms/admin/catnoawr.sql
– Execute the script $ORACLE_HOME/rdbms/admin/catawrtb.sql
– Bounce the database.
– On re-start of the database instance, the AWR tables will be populated with the required data.

References from MOSC:
- General Guidelines for SYSAUX Space Issues [Document 552880.1]

- SYSAUX Tablespace Grows Heavily Due To AWR [Document 852028.1]

- SYSAUX Grows Because Optimizer Stats History is Not Purged [Document 1055547.1]

- Space issue in Sysaux tablespace due to Unexpected AWR size [Document 1218413.1]

- Space Management In Sysaux Tablespace with AWR in Use [Document 287679.1]

- Usage and Storage Management of SYSAUX tablespace occupants SM/AWR, SM/ADVISOR, SM/OPTSTAT and SM/OTHER [Document 329984.1]

Sunday, April 23, 2017

SPID | SID | PID Explained


--------------------------------
SPID | SID | PID Explained
--------------------------------


To dump trace and errorstacks, either the Operating system process id or Oracle process id for a slave process must be determined.
Assuming the Oracle SID for the process is known then the following select can be
used to find the Operating system process id :
SELECT p.pid, p.SPID,s.SID
FROM v$process p, v$session s
WHERE s.paddr = p.addr
AND s.SID = &SID;

SPID is the operating system identifier
SID is the Oracle session identifier
PID is the Oracle process identifier

and then do the following:

Let's assume that the slave process to be dumped has an O/S PID of 9834
and an Oracle PID of 34

login to SQL*Plus:
connect / as sysdba
ALTER SESSION SET tracefile_identifier = 'STACK_10046';
oradebug setospid 9834
oradebug unlimit

oradebug event 10046 trace name context forever,level 12

oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug dump errorstack 3
oradebug tracefile_name

To use the Oracle PID instead, replace:
oradebug setospid 9834

with
oradebug setorapid 34

Remember to change the PIDs to the actual values on your system!

The final line outputs the trace file name which will include the string 'STACK_10046' for easy identification.
To disable the tracing once tracing is finished:
oradebug event 10046 trace name context off

This will produce a trace file in the relevant trace destination matching the process traced.

Gathering Errorstacks on Your Current session

If you are trying to gather stacks on your current session, you can determine the PID values by running selects such as the following in that session:
SELECT p.pid, p.SPID,s.SID
FROM v$process p, v$session s
WHERE s.paddr = p.addr
AND s.audsid = userenv('SESSIONID') ;

or
SELECT p.pid, p.SPID,s.SID
FROM v$process p,v$session s
WHERE s.paddr = p.addr
AND s.SID =
(SELECT DISTINCT SID
FROM V$MYSTAT);

Command Explanation

The following is a brief outline of the purpose of the commands:

  • connect / as sysdba
    This provides the simplest way of gathering the required access to dump the information.
  • ALTER SESSION SET tracefile_identifier = 'STACK_10046';
    Adds a search-able string to the trace file for easy identification later.
  • oradebug setospid XXXX
    Connects the oradebug tool to a particular O/S (operating system) PID (Process ID)
  • oradebug setorapid XXXX
    Connects the oradebug tool to a particular oracle PID (Process ID)
  • oradebug unlimit
    Unrestricts the trace file size so as not to lose valuable diagnostics.
  • oradebug event 10046 trace name context forever,level 12
    Collects 10046 trace showing waits and binds. This trace is often useful when collected in conjunction with errorstacks, providing context. This is optional.
  • oradebug dump errorstack 3
    Dumps the process stack and process state for the current process.
  • oradebug tracefile_name
    Displays the name of the trace file being written, which will include the identifier set earlier.

CURSOR_SHARING

Here's another question that was submitted during the OpenWorld Optimizer Roundtable. It's a common question that we've discussed a little bit in a couple other posts, but we wanted to summarize everything in one place. First, let's quickly review what the different values for the cursor_sharing parameter mean. We discussed this behavior in some detail in an earlier post about cursor_sharing. Below is a summary of the behavior of the different values in different cases (copied from the earlier post):

CURSOR_SHARING = EXACT (no literal replacement)
  Space used in shared pool: worst possible case - each statement issued has its own parent cursor.
  Query performance: best possible case, as each statement has its own plan generated for it based on the literal value present in the statement.

CURSOR_SHARING = FORCE
  Space used in shared pool: best possible case, as there is only one parent and child cursor for each distinct statement.
  Query performance: potentially the worst case, as only one plan will be used for each distinct statement and all occurrences of that statement will use that plan.

CURSOR_SHARING = SIMILAR, without a histogram present
  Space used in shared pool: best possible case, as there is only one parent and child cursor for each distinct statement.
  Query performance: potentially the worst case, as only one plan will be used for each distinct statement and all occurrences of that statement will use that plan.

CURSOR_SHARING = SIMILAR, with a histogram present
  Space used in shared pool: not quite as much space used as with EXACT, but close. Instead of each statement having its own parent cursor, each will have its own child cursor (which uses less space).
  Query performance: best possible case, as each statement has its own plan generated for it based on the literal value present in the statement.

Adaptive cursor sharing (ACS) is another feature we've blogged about before, which allows the optimizer to generate a set of plans that are optimal for different sets of bind values. A common question is how the two interact, and whether users should consider changing the value of cursor_sharing when upgrading to 11g to take advantage of ACS. The simplest way to think about the interaction between the two features for a given query is to first consider whether literal replacement will take place for a query. Consider a query containing a literal:

select * from employees where job = 'Clerk' 

As we see from the table above, the treatment of this query by literal replacement will depend on the value of the cursor_sharing parameter and whether there is a histogram on the job column. Here are the interesting cases:

  1. Cursor_sharing = exact. No literal replacement will take place, and the optimizer will see the query as is.
  2. Cursor_sharing = force. Whether there is a histogram or not, literal replacement will take place, and the optimizer will optimize the query as if it were: select * from employees where job = :b. Bind peeking will take place, so that the value "Clerk" is used to generate cardinality estimates for the query. Subsequent executions of this query differing only in the literal value will share the same plan.
  3. Cursor_sharing = similar. There are two different cases for this:
    1. There is a histogram on the job column. In this case, literal replacement will not take place. The presence of a histogram indicates that the column is skewed, and the optimal plan may depend on the literal value. Hence, the optimizer sees the query as: select * from employees where job = 'Clerk' and subsequent executions with a different literal will not necessarily use the same plan.
    2. There is no histogram on the job column. This indicates that the column is not skewed, and the optimizer is likely to choose the same plan no matter the literal, so literal replacement takes place.
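Cases 1 through 3.2 reduce to a small decision function. This Python sketch is my own encoding of the rules above (not an Oracle API), returning whether literal replacement takes place:

```python
def literal_replacement(cursor_sharing, histogram_present):
    # Encodes cases 1-3 above: exact never replaces, force always does,
    # and similar replaces only when no histogram marks the column as skewed.
    setting = cursor_sharing.lower()
    if setting == "exact":
        return False
    if setting == "force":
        return True
    if setting == "similar":
        return not histogram_present
    raise ValueError("unknown cursor_sharing setting: " + cursor_sharing)

print(literal_replacement("force", histogram_present=True))    # True
print(literal_replacement("similar", histogram_present=True))  # False
```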
Now that we know when literal replacement will take place, and what the query looks like to the optimizer, we can consider adaptive cursor sharing. If literal replacement takes place, and the query that the optimizer optimizes contains a bind, then adaptive cursor sharing can take place. To adaptive cursor sharing, a bind variable is a bind variable, whether it comes from the user query or is inserted by literal replacement. On the other hand, if the query contains only literals (no binds), adaptive cursor sharing will not take place. In our example above, adaptive cursor sharing can be considered for cases 2 and 3.2. For case 3.2, it is likely that the optimizer will choose the same plan for different values of the literal. In case 2, if there is a histogram, then the optimizer may choose different plans depending on how popular the literal value is. 

This example shows that if you use histograms, and want the optimizer to choose an optimal plan for different literal values using ACS, then you should set cursor_sharing to force. If it is set to similar, then literal replacement will not take place, and a child cursor will be created for each value of the literal. Setting cursor_sharing to similar effectively disables ACS for these kinds of queries. By setting cursor_sharing to force and letting adaptive cursor sharing kick in, the optimizer can choose optimal plans for different values, but if the same plan is appropriate for several values, they will share a single child cursor. Historically, cursor_sharing=similar has been recommended as a middle ground between no literal replacement (which causes a lot of cursors to be generated) and forced literal replacement (which causes a potentially sub-optimal plan to be shared for all literals). We now recommend using adaptive cursor sharing along with cursor_sharing=force instead. 

So far we have only discussed cursor sharing in the presence of histograms. There are other cases where the optimizer's choice of plan can depend on the specific literal that appears in the query, for instance when binds appear in range predicates or when a bind value falls outside of a column's range (according to the optimizer statistics). Binds appearing in these kinds of predicates are also considered by adaptive cursor sharing, whereas they are not considered by cursor_sharing=similar.

Monday, April 3, 2017

CLUSTERING FACTOR DEMYSTIFIED PART – I

In this post, I will explain what the clustering factor is, how and why it affects performance, and then demonstrate it with the help of an example.
What is Clustering Factor?
The index clustering factor is a number which represents the degree to which the data in the table is synchronized with the entries in the index. It gives a rough measure of how many I/Os the database would perform if it were to read every row in that table via the index, in index order. If the rows of a table on disk are sorted in the same order as the index keys, the database will perform a minimum number of I/Os on the table to read the entire table via the index.
Let’s try to understand it by taking an analogy.  I want to locate all the books (records)  by (for)  an author (a key value)  in the library (table)  using the catalog card (index) .  If books (records) are arranged authorwise (sorted by key)  in racks (table blocks) , my catalog card (index)  will show me information like this :
Book1(Record1)    Rack1(Block1)
Book2(Record2)    Rack1(Block1)
Book3(Record3)    Rack1(Block1)
Bookn(Recordn)    Rack1(Block1)
It means that I need to visit only one rack (block), provided all the books (records) fit in one rack (block), i.e. Rack1 (Block1), to get all the required books (records). If one rack (block) is insufficient to hold all the books (records), I will need to visit more racks (blocks). Hence,
No. of racks (blocks) to be visited = number of distinct racks (blocks in rowids) listed in the card (index)
Consider a scenario where books (records) of the same author (key value) are scattered across various racks (blocks) . In the worst scenario i.e. each book is in a different rack, my catalog card (index)  will be  like this:
Book1(Record1)    Rack1(Block1)
Book2(Record2)    Rack2(Block2)
Book3(Record3)    Rack3(Block3)
Bookn(Recordn)    Rackn(Blockn)
Now, to get all the books (records) I will have to gather them from multiple racks (blocks) .
In the worst case,  no. of racks (blocks)  visited = no. of books  (records)
To calculate the clustering factor of an index during the gathering of index statistics, Oracle does the following :
For each entry in the index
(
   Compare the entry’s table rowid block with the block of the previous index entry.
If the block is different, Oracle increments the clustering factor by 1.
)
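The loop above is easy to simulate. Given the table block number of each index entry, visited in index key order, this Python sketch computes the clustering factor exactly as described (the first entry counts as one block change):

```python
def clustering_factor(blocks_in_index_order):
    # blocks_in_index_order: the table block number holding each row,
    # listed in index key order.
    cf = 0
    previous = None
    for block in blocks_in_index_order:
        if block != previous:
            cf += 1  # block changed: one more table I/O needed
        previous = block
    return cf

# Well-clustered: rows for adjacent keys share blocks -> cf = distinct blocks.
print(clustering_factor([1, 1, 1, 2, 2, 3]))  # 3
# Scattered: nearly every entry points at a different block -> cf = entries.
print(clustering_factor([1, 2, 1, 3, 2, 3]))  # 6
```

The first list behaves like the 'organized' table in the demonstration below; the second behaves like the 'unorganized' table.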
The minimum possible clustering factor is equal to the number of distinct  blocks identified through the index’s list of rowid’s.  An index with a low clustering_factor is closely aligned with the table and related rows reside together inside each data block, making indexes very desirable for optimal access.
The maximum clustering factor is the number of entries in the index i.e. each rowid points to a different block in the table.  An index with a high clustering factor is out-of-sequence with the rows in the table and large index range scans will consume lots of I/O.
How to find the clustering factor of an index 
Oracle provides a column called clustering_factor in the dba_indexes view that provides information on how the table rows are synchronized with the index. A low value is desirable.
Let’s demonstrate the concept:
- Create a table called organized which contains two columns - id (number) and txt (char)
- Populate the table by inserting 34 records for each value of id, where id ranges from 1 to 100
- In this case, as records are added sequentially, records for a key value are stored together
SQL> drop table organized purge;

create table organized (id number, txt char(900));

begin
   for i in 1..100 loop
      insert into organized
      select i, lpad('x', 900, 'x')
      from dba_objects
      where rownum < 35;
   end loop;
end;
/
-  create another table unorganized which is a replica of the ‘organized’ table but records are inserted in a random manner so that records for a key value may be scattered across different blocks (order by dbms_random.random).
SQL> drop table unorganized purge;

create table unorganized as
select * from organized order by dbms_random.random;
- Find out no. of blocks across which records of a key value are spread in the two tables.
- Note that in the 'unorganized' table, records are scattered, whereas in the 'organized' table, records are clustered
SQL> select org.id, org.cnt organized_blocks, unorg.cnt unorganized_blocks
     from
       (select id, count(distinct(dbms_rowid.rowid_block_number(rowid))) cnt
        from organized
        group by id) org,
       (select id, count(distinct(dbms_rowid.rowid_block_number(rowid))) cnt
        from unorganized
        group by id) unorg
     where org.id = unorg.id
     order by id;
        ID ORGANIZED_BLOCKS UNORGANIZED_BLOCKS
---------- ---------------- ------------------
         1                5                 34
         2                6                 34
         3                6                 32
       ...
        98                5                 32
        99                5                 34
       100                6                 33
-- Create an index on the id column of both tables and gather statistics for both tables
create index organized_idx on organized(id);
create index unorganized_idx on unorganized(id);
exec dbms_stats.gather_table_stats(USER, 'organized', estimate_percent => 100, method_opt => 'for all indexed columns size 254');
exec dbms_stats.gather_table_stats(USER, 'unorganized', estimate_percent => 100, method_opt => 'for all indexed columns size 254');
- Check that both tables contain the same number of rows and blocks.
SQL> select table_name, blocks, num_rows
           from user_tables
           where table_name like '%ORGANIZED'
           order by 1;
TABLE_NAME                     BLOCKS   NUM_ROWS
------------------------- ---------- ----------
ORGANIZED                        488       3400
UNORGANIZED                      486       3400
- Check the index statistics.
- Note that the index on the organized table has a clustering factor (488) equal to the number of blocks in the table, i.e. when fetching all the records for the various key values via the index, the database does not switch table blocks until all the records in the current block have been fetched.
- Note that the index on the unorganized table has a clustering factor (3299) close to the number of rows in the table (3400), i.e. almost every record for a key value is in a different block, so fetching all the records for a key value via the index requires a block switch for almost every record.
SQL> set line 500
     col table_name for a15
     col index_name for a15
     select blevel, leaf_blocks, table_name, index_name, clustering_factor
     from user_indexes
     where table_name like '%ORGANIZED%'
     order by 1;
    BLEVEL LEAF_BLOCKS TABLE_NAME      INDEX_NAME      CLUSTERING_FACTOR
---------- ----------- --------------- --------------- -----------------
         1           7 ORGANIZED       ORGANIZED_IDX                 488
         1           7 UNORGANIZED     UNORGANIZED_IDX              3299
The clustering factor is a measure of the number of I/Os the database will perform against the table in order to read every row via the index. We can verify this by executing a query, with tracing enabled, that does in fact read every row of the table via the index. We'll use an index hint to force the optimizer to use the index and count the non-null occurrences of a nullable column (txt) that is not in the index. That forces the database to go from index to table for every single row:
SQL> alter session set tracefile_identifier = 'cluster_factor';
     alter session set sql_trace = true;
     select /*+ index(organized organized_idx) */ count(txt) from organized where id=id;
     select /*+ index(unorganized unorganized_idx) */ count(txt) from unorganized where id=id;
     alter session set sql_trace = false;
- Find the name of the trace file generated
SQL> col trace_file for a100
     select value trace_file from v$diag_info
     where upper(name) like '%TRACE FILE%';
TRACE_FILE
----------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_29116_cluster_factor.trc
- Run the tkprof utility on the generated trace file
cd /u01/app/oracle/diag/rdbms/orcl/orcl/trace
   tkprof orcl_ora_29116_cluster_factor.trc cluster_factor.out
   vi cluster_factor.out
- Let's analyze the tkprof output
********************************************************************************
SQL ID: c3r8241qas3rn
Plan Hash: 2296712572
select /*+ index(organized organized_idx) */ count(txt)
from
 organized where id=id
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=496 pr=0 pw=0 time=0 us)
   3400   TABLE ACCESS BY INDEX ROWID ORGANIZED (cr=496 pr=0 pw=0 time=37892 us cost=496 size=3073600 card=3400)
   3400    INDEX FULL SCAN ORGANIZED_IDX (cr=8 pr=0 pw=0 time=9189 us cost=8 size=0 card=3400)(object id 75116)
********************************************************************************
As you can see:
- Total number of I/Os performed against the index on the organized table = 8 (cr=8 in the INDEX FULL SCAN ORGANIZED_IDX row source)
- Total I/Os performed by the query = 496 (cr=496 in the TABLE ACCESS BY INDEX ROWID ORGANIZED row source)
- Hence, the number of I/Os made against the table = 496 - 8 = 488, which equals the clustering factor of the index.
********************************************************************************
SQL ID: d979q0sqdbabb
Plan Hash: 3317439903
select /*+ index(unorganized unorganized_idx) */ count(txt)
from
 unorganized where id=id
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=3307 pr=272 pw=0 time=0 us)
   3400   TABLE ACCESS BY INDEX ROWID UNORGANIZED (cr=3307 pr=272 pw=0 time=40536 us cost=3308 size=3073600 card=3400)
   3400    INDEX FULL SCAN UNORGANIZED_IDX (cr=8 pr=0 pw=0 time=9693 us cost=8 size=0 card=3400)(object id 75117)
********************************************************************************
Doing the same analysis on the UNORGANIZED index:
- Total number of I/Os performed against the index on the unorganized table = 8 (cr=8 in the INDEX FULL SCAN UNORGANIZED_IDX row source)
- Total I/Os performed by the query = 3307 (cr=3307 in the TABLE ACCESS BY INDEX ROWID UNORGANIZED row source)
- Hence, the number of I/Os made against the table = 3307 - 8 = 3299, which equals the clustering factor of the index.
So, for one table the database performs 496 total I/Os to retrieve exactly the same data that required 3307 I/Os for the other table.
Obviously, one of these indexes is going to be more useful than the other for retrieving a larger number of rows.
We can verify this by using autotrace on a query against both tables:
SQL> set autotrace traceonly explain
     select /*+ index(organized organized_idx) */ count(txt)
     from organized where id=id;
     set autotrace off
----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     1 |   904 |   496   (0)| 00:00:06 |
|   1 |  SORT AGGREGATE              |               |     1 |   904 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| ORGANIZED     |  3400 |  3001K|   496   (0)| 00:00:06 |
|*  3 |    INDEX FULL SCAN           | ORGANIZED_IDX |  3400 |       |     8   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Note the cost of 8 for using the index on the ORGANIZED table: about 8 I/Os against the index, i.e. the query hits one root block (1) plus the leaf blocks (7).
The query will then do 488 more I/Os against the table (= the number of blocks in the table), because the rows needed are all next to each other on a few database blocks, for a total cost of 496.
SQL> set autotrace traceonly explain
     select /*+ index(unorganized unorganized_idx) */ count(txt)
     from unorganized where id=id;
     set autotrace off;
------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |     1 |   904 |  3308   (1)| 00:00:40 |
|   1 |  SORT AGGREGATE              |                 |     1 |   904 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| UNORGANIZED     |  3400 |  3001K|  3308   (1)| 00:00:40 |
|*  3 |    INDEX FULL SCAN           | UNORGANIZED_IDX |  3400 |       |     8   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
In the case of the UNORGANIZED index, the plan still shows the same 8 I/Os against the index. But because the rows needed from the table are not next to each other, the optimizer estimates that the query will have to switch table blocks for almost every row it retrieves: its estimated cost for the 3400 rows is 3299 (the clustering factor) + 8 index I/Os = 3307, which is almost the listed cost of 3308.
As you can see, both plans expect to return the same number of rows (1), and both use an index full scan. Yet the two plans have radically different costs: one a low cost of 496, the other a much higher cost of 3308, even though both go after exactly the same set of rows from two tables containing the same data!
The reason is clear: the data in the unorganized table is not sorted in the same order as the data in the index. This raises the clustering factor, and the database must read and re-read table blocks, increasing the cost. With the organized table, the table data and the index data are sorted identically.
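The plan costs line up with the classic I/O cost approximation for an index scan followed by table access by rowid: roughly blevel + leaf blocks visited + table-block switches (the clustering factor), scaled by selectivity. A quick sketch using the statistics gathered above (the function name is my own; the optimizer also adds a small CPU component, which is why the UNORGANIZED plan shows 3308 rather than 3307):

```python
import math

def index_access_cost(blevel, leaf_blocks, clustering_factor, selectivity=1.0):
    """Rough I/O cost of an index scan plus table access by rowid:
    root/branch blocks + leaf blocks visited + table-block switches."""
    return (blevel
            + math.ceil(leaf_blocks * selectivity)
            + math.ceil(clustering_factor * selectivity))

# Statistics from user_indexes above: blevel=1, leaf_blocks=7
print(index_access_cost(1, 7, 488))   # -> 496, the ORGANIZED plan's cost
print(index_access_cost(1, 7, 3299))  # -> 3307; the plan lists 3308 (small CPU add-on)
```

For a selective predicate the same formula scales all three terms down, which is why the clustering factor dominates index range scan costs on large row counts.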
How to resolve performance issues due to a high clustering factor?
To improve the clustering factor, the table must be rebuilt (and reordered). Data retrieval can be sped up considerably by physically sequencing the rows in the same order as the key column: if the rows for a key value are grouped together, all of them can be fetched with a single block read. Various methods can be used to achieve this:
- Single table index clusters: cluster related rows together in the same data block.
- Manual row re-sequencing (CTAS with an order by): pre-order the data to avoid expensive disk sorts after retrieval.
- dbms_redefinition, for an in-place table reorganization.
In my post Clustering Factor Demystified : Part – III, I have demonstrated the use of single table index and hash clusters to improve the clustering factor of an unorganized table.
Note:
- Rebuilding an index cannot improve the clustering factor.
- If the table has multiple indexes, careful consideration must be given to which index the table should be ordered by.
- Row re-sequencing does not help queries that perform full scans or index unique scans, so pay careful attention to the nature of the queries against the target table.
- The degree to which re-sequencing improves performance depends on how far out of sequence the rows are to begin with and how many rows you will be accessing in sequence.
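The effect of re-sequencing (e.g. CTAS with an order by) can be simulated outside the database. This sketch (the block capacity and helper names are my assumptions, not Oracle internals) packs 3,400 rows into fixed-size blocks, first in shuffled insertion order and then sorted by key, and compares the resulting clustering factors:

```python
import random

ROWS_PER_BLOCK = 7   # assumed block capacity (mirrors ~900-byte rows per block above)

def pack_and_index(rows):
    """Place rows into blocks in insertion order and return (key, block)
    pairs sorted by key -- the order an index on the key would visit them."""
    placed = [(key, pos // ROWS_PER_BLOCK) for pos, key in enumerate(rows)]
    return sorted(placed)

def clustering_factor(entries):
    factor, prev = 0, None
    for _key, block in entries:
        if block != prev:
            factor, prev = factor + 1, block
    return factor

random.seed(42)                                       # reproducible shuffle
rows = [k for k in range(1, 101) for _ in range(34)]  # 34 rows per key, 3400 rows
random.shuffle(rows)                                  # the "unorganized" heap

before = clustering_factor(pack_and_index(rows))         # scattered rows
after = clustering_factor(pack_and_index(sorted(rows)))  # like "CTAS ... order by id"
print(before, after)   # before lands near the row count; after == 486, the block count
```

Re-packing the sorted rows drives the clustering factor down to the block count, which is exactly the "good" value described in the summary below.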
Summary:
- A good clustering factor is equal (or close) to the number of blocks in the table.
- A bad clustering factor is equal (or close) to the number of rows in the table.
- A low clustering factor is good and reflects strong clustering.
- A high clustering factor is bad and reflects weak clustering.
- The clustering factor may be lower than the number of blocks if there are empty blocks below the HWM and/or many rows have null values for the indexed column(s).
- The clustering factor can never be greater than the number of rows in the table.
In my next post, I will demonstrate how to use CTAS with an order by to re-sequence the rows physically in the table.
Thanks for your time! I hope you found the post useful. Your comments and suggestions are always welcome!