Undo Space  

Posted by Mawahid

Overview

Undo logs. Traditionally, transaction undo information was stored in rollback segments until a COMMIT or ROLLBACK statement was issued. Automatic undo management allows the DBA to specify how long undo information should be retained after commit, preventing "snapshot too old" (ORA-01555) errors on long-running queries.

This is done by setting the UNDO_RETENTION parameter. The default is 900 seconds (15 minutes), and you can set this parameter to guarantee that Oracle keeps undo logs for extended periods of time.

Rather than having to define and manage rollback segments, you can simply define an undo tablespace and let Oracle take care of the rest. Turning on automatic undo management is easy: create an undo tablespace and set UNDO_MANAGEMENT = AUTO.
With the information gathered by the SQL queries below, you should have a better idea of how to keep undo space from running out so quickly.
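As a sketch of the setup (the datafile path and sizes here are illustrative, not recommendations):

```sql
-- Datafile path and sizes are illustrative.
CREATE UNDO TABLESPACE undotbs1
  DATAFILE '/u01/oradata/orcl/undotbs01.dbf' SIZE 2G
  AUTOEXTEND ON MAXSIZE 8G;

-- UNDO_MANAGEMENT is static: set it in the spfile and restart.
ALTER SYSTEM SET undo_management = AUTO SCOPE = SPFILE;
ALTER SYSTEM SET undo_tablespace = undotbs1;
ALTER SYSTEM SET undo_retention  = 900;   -- seconds

-- Optionally guarantee the retention period (at the risk of DML
-- failing when the tablespace fills):
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;
```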

Undo Segments

An undo segment contains three types of extents:

Unexpired Extents – Undo data whose age is less than the undo retention period.
Expired Extents – Undo data whose age is greater than the undo retention period.
Active Extents – Undo data that is part of an active transaction.

The sequence for using extents is as follows:

1. A new extent will be allocated from the undo tablespace when the requirement arises. As undo is being written to an undo segment, if the undo reaches the end of the current extent and the next extent contains expired undo then the new undo (generated by the current transaction) will wrap into that expired extent, in preference to grabbing a free extent from the undo tablespace free extent pool.

2. If this fails because of no available free extents and we cannot autoextend the datafile, then Oracle tries to steal an expired extent from another undo segment.

3. If it still fails because there are no extents with expired status then Oracle tries to reuse an unexpired extent from the current undo segment.

4. If even that fails, Oracle tries to steal an unexpired extent from another undo segment.

5. If all the above fails, an out-of-space error (ORA-30036) will be reported.
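The steal and failure steps above are instrumented in V$UNDOSTAT, so you can check whether they are actually happening on your system; for example:

```sql
SELECT begin_time, end_time,
       expstealcnt,    -- expired extents stolen from other segments (step 2)
       unxpstealcnt,   -- unexpired extents stolen (steps 3-4)
       nospaceerrcnt,  -- out-of-space errors (step 5)
       ssolderrcnt     -- "snapshot too old" (ORA-01555) errors
  FROM v$undostat
 ORDER BY begin_time;
```

Non-zero unexpired-steal or error counts in recent intervals suggest the undo tablespace or UNDO_RETENTION is undersized.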

Check the overall status of the undo extents:

-- 8192 = db_block_size; adjust if your block size differs
select tablespace_name, status, sum(blocks) * 8192/1024/1024/1024 GB
  from dba_undo_extents
 group by tablespace_name, status;
 
TABLESPACE_NAME                STATUS            GB
------------------------------ --------- ----------
UNDOTBS1                       UNEXPIRED 2.29626465
UNDOTBS2                       UNEXPIRED 11.0892944
UNDOTBS1                       EXPIRED   7.20245361
UNDOTBS2                       EXPIRED   1.80932617
UNDOTBS2                       ACTIVE       .015625
 
 
Undo Blocks per Second:

SELECT MAX(undoblks/((end_time-begin_time)*3600*24))
      "UNDO_BLOCK_PER_SEC"
  FROM v$undostat;
 
Optimal Undo Retention:

SELECT d.undo_size/(1024*1024) "ACTUAL UNDO SIZE [MByte]",
       SUBSTR(e.value,1,25) "UNDO RETENTION [Sec]",
       ROUND((d.undo_size / (to_number(f.value) *
       g.undo_block_per_sec))) "OPTIMAL UNDO RETENTION [Sec]"
  FROM (
       SELECT SUM(a.bytes) undo_size
          FROM v$datafile a,
               v$tablespace b,
               dba_tablespaces c
         WHERE c.contents = 'UNDO'
           AND c.status = 'ONLINE'
           AND b.name = c.tablespace_name
           AND a.ts# = b.ts#
       ) d,
       v$parameter e,
       v$parameter f,
       (
       SELECT MAX(undoblks/((end_time-begin_time)*3600*24))
              undo_block_per_sec
         FROM v$undostat
       ) g
WHERE e.name = 'undo_retention'
  AND f.name = 'db_block_size';
 
 
Calculate the Needed UNDO Size for a Given Database Activity:

SELECT d.undo_size/(1024*1024) "ACTUAL UNDO SIZE [MByte]",
       SUBSTR(e.value,1,25) "UNDO RETENTION [Sec]",
       (TO_NUMBER(e.value) * TO_NUMBER(f.value) *
       g.undo_block_per_sec) / (1024*1024)
      "NEEDED UNDO SIZE [MByte]"
  FROM (
       SELECT SUM(a.bytes) undo_size
         FROM v$datafile a,
              v$tablespace b,
              dba_tablespaces c
        WHERE c.contents = 'UNDO'
          AND c.status = 'ONLINE'
          AND b.name = c.tablespace_name
          AND a.ts# = b.ts#
       ) d,
       v$parameter e,
       v$parameter f,
       (
       SELECT MAX(undoblks/((end_time-begin_time)*3600*24))
         undo_block_per_sec
         FROM v$undostat
       ) g
 WHERE e.name = 'undo_retention'
  AND f.name = 'db_block_size';
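As a sanity check on the formula above, here is a back-of-the-envelope calculation with illustrative numbers only (900 s retention, an 8 KB block size, and a peak rate of 100 undo blocks per second):

```sql
-- Needed undo = UNDO_RETENTION * DB_BLOCK_SIZE * undo blocks/sec.
-- All three input values below are illustrative.
SELECT (900 * 8192 * 100) / (1024*1024) AS needed_undo_mb FROM dual;
-- roughly 703 MB
```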

Library Cache  

Posted by Mawahid

Oracle's library cache is nothing more than an area in memory, specifically one of three parts inside the shared pool. The library cache is composed of shared SQL areas, PL/SQL packages and procedures, various locks and handles, and, in a shared server configuration, private SQL areas. Whenever an application wants to execute SQL or PL/SQL (collectively called code), that code must first reside inside the library cache.

When applications run and reference code, Oracle first searches the library cache to see if that code already exists in memory. If the code already exists, Oracle can reuse it (also known as a soft parse). If the code does not exist, Oracle must load it into memory (also known as a hard parse, or library cache miss). There are various criteria for whether the code being requested actually matches code already in the library cache, but that is beyond the scope of this article. Just be aware that the library cache, since it is allocated a specific amount of memory, can only hold so much code before it must age some out to make room for code required by applications. This is not necessarily a bad thing, but we must be aware of the size of the library cache as well as how many misses (hard parses) are occurring. If too many occur, we may need to increase the amount of memory allocated to the library cache.
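One quick way to gauge how much hard parsing the instance is doing overall is the parse counters in V$SYSSTAT; a sketch:

```sql
-- Instance-wide parse activity since startup; a high ratio of
-- hard parses to total parses suggests poor cursor reuse.
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('parse count (total)', 'parse count (hard)');
```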

To monitor and gain an understanding of how your current library cache has performed since your last startup of the Oracle database you can issue the following SQL. Obviously if you are experiencing immediate performance problems you will want to take a look at the results of this query over a period of time. Regardless, each of the rows returned represents a specific type of code (namespace) kept in the library cache and their particular performance activity.

SQL> select namespace,
            pins,
            pinhits,
            reloads,
            invalidations,
            pinhitratio * 100                     pinhitratio
       from v$librarycache;

NAMESPACE             PINS    PINHITS    RELOADS INVALIDATIONS PINHITRATIO
--------------- ---------- ---------- ---------- ------------- -----------
SQL AREA           9510054    9475115       1920          1731  99.6326099
TABLE/PROCEDURE    2173571    2158422       2525             0  99.3030363
BODY                977001     976796         16             0  99.9790174
TRIGGER              28566      28491         11             0  99.7374501
INDEX                 9735       8359          5             0   85.865434
CLUSTER               2294       2281          2             0  99.4333043
OBJECT                   0          0          0             0         100
PIPE                     0          0          0             0         100
JAVA SOURCE              0          0          0             0         100
JAVA RESOURCE            0          0          0             0         100
JAVA DATA              493        489          0             0   99.188641

The way to look at these results is to first look at how many times a particular namespace was executed (PINS), then at how many times something tried to execute but wasn't in the library cache (RELOADS). Another very important statistic is the number of INVALIDATIONS that have occurred for a particular namespace: invalidations are pieces of code that for some reason, typically a DDL operation, have become invalid and required a reparse. All of this is summarized in the hit ratio (PINHITRATIO). So in the example above we can see that this particular library cache seems to be doing quite well, and the applications are reusing SQL quite effectively.

You can also get an overall feel for the library cache with the following SQL. Again we can see that our library cache is performing well.

select sum(pins) pins,
       sum(pinhits) pinhits,
       sum(reloads) reloads,
       sum(invalidations) invalidations,
       100-(sum(pinhits)/sum(pins)) *100 reparsing
 from v$librarycache;

      PINS    PINHITS    RELOADS INVALIDATIONS  REPARSING
---------- ---------- ---------- ------------- ----------
  12703182   12651415       4479          1731 .407512071


It is easy to say, when code is required by an application, "just put the code in the library cache". It is another thing to actually have it done. There are internal locking mechanisms that must be adhered to for all this to happen, all in the name of making sure that the queries and code that get executed are actually valid and reference valid objects. In a nutshell, those locks, and subsequently wait events, are the following:

library cache load lock

As the name states, the library cache load lock is concerned with obtaining a lock for a database object so that it can load it into the library cache. This lock is always obtained in an EXCLUSIVE mode so that no one else can load the same object at the same time. If your session is trying to obtain this lock and someone else has already obtained the lock, you will wait until they are done loading that object in the library cache.

Library cache lock

Once objects have been loaded into the library cache, this lock controls concurrent access to objects between clients. The lock is acquired on an object to serialize access, allowing a single application to use the object; how long it is held depends on what the application or user is doing. This lock is also obtained simply to locate an object in the library cache.

Library cache pin

The library cache pin event is responsible for concurrent access within the library cache. A library cache pin must be acquired in order to load an object's heap into memory. Also, if someone wants to modify or examine an object, they must acquire this pin.

In order for Oracle to move objects into the library cache it uses both locks and pins to access or load object handles and heaps. Locks manage the serialization and concurrency between different users or application processes. This means that, if needed, a process can lock out and prevent other processes from accessing an object. A lock must also be acquired to locate an object in the cache. After acquiring a lock on the handle of an object, if the process actually wants to examine or modify the object, it must then acquire a pin on the object. Pinning an object loads the object's heap into memory if it isn't already there. Both locks and pins are required if compilation or parsing of code is ever going to happen, all for the purpose of making sure that no changes to an object's definition occur. So for any type of code that needs to be loaded into the library cache, the session must first acquire a library cache lock on the objects being queried; after the lock, the session must acquire a library cache pin to pin the object's heap into the library cache.

The monitoring of Oracle's library cache is essential to making sure that objects and SQL code are executing efficiently and are available for subsequent applications to execute. The library cache can be under stress if specific wait events are occurring and limiting concurrent access to code. Most of the problems that occur in the library cache are application or user induced, but that will be addressed next time. Until then, run the aforementioned SQL and get an idea of the overall and general health of your library cache. Next time we will look at where an issue can exist, how to find it, and, obviously, offer solutions for fixing it.



The areas that give occasional trouble are various locks that occur, and subsequently are the following wait events:

library cache load lock – As the name states, the library cache load lock is concerned with obtaining a lock for a database object so that the object can be loaded into the library cache. When a user is experiencing this form of lock, someone else holds the load lock for the object, and the user must wait until the lock becomes available.

Library cache lock – Once objects have been loaded into the library cache, this lock controls concurrent access to objects between clients. This allows one client to have sole access to an object and block out other clients, for instance when two clients want to compile the same piece of code.

Library cache pin – The library cache pin event is responsible for concurrent access within the library cache. When a pin is acquired it causes the object's heap to be loaded into memory if it is not already there. A wait for a "library cache pin" implies some other session holds that pin in an incompatible mode.

Regardless of the type of wait within the library cache, diagnosing who is waiting on a resource, and what that resource is, can be done by querying the V$SESSION_WAIT view. For instance, to find the sessions waiting on "library cache pin" you could execute the following query. For a library cache pin, the key part of this query is the P1RAW column, which gives the handle address of the object that is blocking this particular session. For other types of waits you should reference the Oracle documentation to find which P value corresponds to the object or resource being waited on.


SELECT sid, event, p1raw
  FROM sys.v_$session_wait
 WHERE event = 'library cache pin'
   AND state = 'WAITING';

We can then execute the following query to find the library cache object being waited for. If you don’t currently have any particular waits occurring, this query is just fun to execute without the where clause to see what is in the library cache.


SELECT kglnaown AS owner, kglnaobj as Object
  FROM sys.x$kglob
 WHERE kglhdadr='&P1RAW';

To find the users that are waiting on the object in question, you can use the DBA_WAITERS view and issue the following query. This is a very simple query but can be used to find the blocking session quite nicely. Basically, find the waiting session that matches the session we got from the V$SESSION_WAIT query above, and then see what the HOLDING_SESSION is. It is also good to note how many other sessions are waiting behind this blocking session; if there are a lot, you may need to take action quickly.


SELECT waiting_session, holding_session FROM dba_waiters;
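To see more about the holder in the same pass, the view can be joined to V$SESSION; a sketch:

```sql
-- Who is the blocker? Join DBA_WAITERS to V$SESSION for context.
SELECT w.waiting_session, w.holding_session,
       s.username, s.program
  FROM dba_waiters w,
       v$session   s
 WHERE s.sid = w.holding_session;
```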


If you want to investigate the actual SQL being done by the holding session you can issue the following SQL statement.


select sesion.sid,
       sql_text
  from v$sqlarea sqlarea, v$session sesion
 where sesion.sql_hash_value = sqlarea.hash_value
   and sesion.sql_address    = sqlarea.address
   and sesion.sid            = &holding_session;  -- restrict to the holder

This is sometimes good for seeing what the originating issue really is. It might not be the mix of application code but actually one particular statement that is giving Oracle a problem. For instance, I have had problems with SQL that referenced objects through a view that had a database link and queried an external non-Oracle database. This type of problem can only be seen and remedied by tracing the originating SQL statement.


Now that you have determined the actual session and object being waited on, along with the session causing the problem and its SQL, it is time to determine what to do. If the wait has been happening for quite some time you could be experiencing a bug or malfunction within the library cache. I have seen this happen occasionally, and the only remedy was to kill off the processes that were holding the lock. To do this within Oracle you can issue the ALTER SYSTEM KILL SESSION command. Depending on the connection type this may or may not work. More than once the connection has had to be terminated through an operating-system kill command, or by shutting down a set of applications. You will have to investigate and see what works best for your system under a complete lockup in the library cache. The only advice I can give is to always attempt the least obtrusive kills first, before bringing down systems and databases.
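For reference, killing the holder from within Oracle requires the SID and SERIAL# of the session (the values below are illustrative, not real):

```sql
-- Look up the serial# for the holding session found earlier.
SELECT sid, serial# FROM v$session WHERE sid = &holding_session;

-- '123,4567' is an illustrative sid,serial# pair.
ALTER SYSTEM KILL SESSION '123,4567';
```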

When working with the library cache, just remember it is nothing more than a section of memory that Oracle must load code into before it can execute. Getting that code into the library cache can be limited and cause wait events that literally bring a system to a standstill. Quick determination of the SQL causing the hang is needed in order to take action, whether by killing a session or process or by altering code. But don't lose sight of the fact that this is memory, and we might just need to allocate more of it so Oracle can work more efficiently.

Flushing a cursor out of the Library Cache  

Posted by Mawahid

In order to demonstrate the problem and the fix we need to have something to play around with.
SQL> create table skew(c1 number(6), c2 char(20));

Table created.

SQL> insert into skew select 1,1 from dual connect by level <= 10000;

10000 rows created.

SQL> update skew set c1 = 2 where rownum <= 10;

10 rows updated.

SQL> create index skew_idx on skew(c1);

Index created.

SQL> exec dbms_stats.gather_table_stats(null,'skew', -
> method_opt => 'for all columns size 2')

PL/SQL procedure successfully completed.

SQL> select c1, count(*) from skew group by c1;

        C1   COUNT(*)
---------- ----------
         1       9990
         2         10

We now have an indexed table containing skewed data, with current object statistics in place, including a histogram on the skewed column. Let's execute a query using a bind variable on the skewed column and see what the query optimizer expects and what execution plan it considers optimal.
SQL> var x number
SQL> exec :x := 1;

PL/SQL procedure successfully completed.

SQL> select count(c2) from skew where c1 = :x;

 COUNT(C2)
----------
      9990

SQL> select *
  2  from table(dbms_xplan.display_cursor(null,null,'basic rows'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select count(c2) from skew where c1 = :x

Plan hash value: 568322376

-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|   2 |   TABLE ACCESS FULL| SKEW |  9990 |
-------------------------------------------

The example above shows that the query optimizer predicted the cardinality correctly and chose the optimal execution plan based upon this information. It could do so because there is a histogram available that describes the data skew in the table. Now see what happens if we bind the value 2 and execute the query again:
SQL> exec :x := 2;

PL/SQL procedure successfully completed.

SQL> select count(c2) from skew where c1 = :x;

 COUNT(C2)
----------
        10

SQL> select *
  2  from table(dbms_xplan.display_cursor(null,null,'basic rows'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select count(c2) from skew where c1 = :x

Plan hash value: 568322376

-------------------------------------------
| Id  | Operation          | Name | Rows  |
-------------------------------------------
|   0 | SELECT STATEMENT   |      |       |
|   1 |  SORT AGGREGATE    |      |     1 |
|   2 |   TABLE ACCESS FULL| SKEW |  9990 |
-------------------------------------------

Because the statement is not hard parsed again, the same execution plan, based on the prediction of 9990 rows, is reused. Because the query returns only 10 rows, this execution plan is no longer optimal for the given bind value. If this query gets executed many times with this value of the bind variable, we have a performance problem for as long as this execution plan remains in the library cache. If this is indeed the case, it might be beneficial to flush this cursor out of the shared pool. Starting with 10.2.0.4.0 this can be done using the PURGE procedure in the DBMS_SHARED_POOL package, as demonstrated below:
SQL> @?/rdbms/admin/dbmspool

Package created.

Grant succeeded.

View created.

Package body created.

SQL> select address, hash_value from v$sqlarea
  2  where sql_text = 'select count(c2) from skew where c1 = :x';

ADDRESS  HASH_VALUE
-------- ----------
27308318 2934790721

SQL> exec sys.dbms_shared_pool.purge('&address, &hash_value','c')

PL/SQL procedure successfully completed.

Because the DBMS_SHARED_POOL package is not installed at database creation time, it has to be installed manually as shown above. The PURGE procedure needs the ADDRESS and HASH_VALUE of the cursor being flushed and the flag 'C' to indicate that we are flushing a cursor; this knowledge comes from the dbmspool.sql script. The ADDRESS and HASH_VALUE can be retrieved from V$SQLAREA as shown in the example. A successful execution of the PURGE procedure indicates that the parent cursor is gone along with its children. The next execution of the query will force a hard parse and the creation of a new execution plan, as we can see below:
SQL> select count(c2) from skew where c1 = :x;

 COUNT(C2)
----------
        10

SQL> select *
  2  from table(dbms_xplan.display_cursor(null,null,'basic rows'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select count(c2) from skew where c1 = :x

Plan hash value: 3493361220

---------------------------------------------------------
| Id  | Operation                    | Name     | Rows  |
---------------------------------------------------------
|   0 | SELECT STATEMENT             |          |       |
|   1 |  SORT AGGREGATE              |          |     1 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SKEW     |    10 |
|   3 |    INDEX RANGE SCAN          | SKEW_IDX |    10 |
---------------------------------------------------------

This time the query optimizer predicted the correct number of rows for the given value of the bind variable and selected the optimal execution plan for the situation. The difficulty, of course, is to detect these situations before we can correct them. An indication could be a difference between the predicted number of rows and the actual number of rows in an execution plan, but for that we need to set the STATISTICS_LEVEL parameter to ALL or add the GATHER_PLAN_STATISTICS hint to all possibly affected statements, which might be difficult to do. Once a possibly affected statement has been found, we can see the bind value that was used for the plan by adding the PEEKED_BINDS option to the format specifier in the call to DBMS_XPLAN.
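Under the GATHER_PLAN_STATISTICS approach just mentioned, DBMS_XPLAN can show estimated versus actual row counts side by side via the ALLSTATS LAST format; a sketch:

```sql
SELECT /*+ gather_plan_statistics */ count(c2) FROM skew WHERE c1 = :x;

-- A large gap between E-Rows and A-Rows in the output flags a
-- bad cardinality estimate for the last execution.
SELECT * FROM table(dbms_xplan.display_cursor(null, null, 'allstats last'));
```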
SQL> exec :x := 1;

PL/SQL procedure successfully completed.

SQL> select count(c2) from skew where c1 = :x;

 COUNT(C2)
----------
      9990

SQL> select *
  2  from table(dbms_xplan.display_cursor(null,null,'basic rows peeked_binds'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select count(c2) from skew where c1 = :x

Plan hash value: 3493361220

---------------------------------------------------------
| Id  | Operation                    | Name     | Rows  |
---------------------------------------------------------
|   0 | SELECT STATEMENT             |          |       |
|   1 |  SORT AGGREGATE              |          |     1 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SKEW     |    10 |
|   3 |    INDEX RANGE SCAN          | SKEW_IDX |    10 |
---------------------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

   1 - :X (NUMBER): 2

In this final example we bound the value 1 again and executed the query, which retrieved 9990 rows whilst the execution plan shows a prediction of only 10 rows. By using PEEKED_BINDS we receive extra information from DBMS_XPLAN telling us that this particular execution plan is based on the value 2 of the first bind variable in the statement, which is named ':x' and is of NUMBER data type.

Conclusion: By using the PURGE procedure in the DBMS_SHARED_POOL package we can flush a cursor out of the library cache when its execution plan causes performance problems due to an unlucky bind variable value. However, this is only a temporary fix. The definitive solution is Adaptive Cursor Sharing, introduced in Oracle 11g.

Dynamic Sampling in Oracle  

Posted by Mawahid

In Oracle9i R2, dynamic sampling was introduced: the optimizer gained the ability to sample a table and collect better statistics dynamically during a hard parse. This sampling takes place only at hard parse time and is used to dynamically generate better statistics for the optimizer to use, hence the name dynamic sampling.

The optimizer falls back on default statistic values when a table has no statistics and the dynamic sampling feature is disabled.

How do we set dynamic sampling? There are a couple of ways:

1. At the session or instance level, via the OPTIMIZER_DYNAMIC_SAMPLING parameter.
2. At the query level, via the DYNAMIC_SAMPLING hint.

SQL> alter session set OPTIMIZER_DYNAMIC_SAMPLING = 2;

Session altered.

SQL> alter system set OPTIMIZER_DYNAMIC_SAMPLING = 2;

System altered.

SQL> select /*+ dynamic_sampling(emp 0) */ * from scott.emp;   -- level 0 disables sampling for EMP


When is dynamic sampling useful?
1. When table data changes drastically and statistics become stale very often.
2. Global temporary tables are a good place to use the dynamic sampling feature.
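For instance, a global temporary table typically carries no statistics, so dynamic sampling lets the optimizer look at the session's actual rows at parse time. A sketch (the table name and columns below are hypothetical):

```sql
-- Hypothetical table for illustration.
CREATE GLOBAL TEMPORARY TABLE gtt_orders (
  order_id NUMBER,
  status   VARCHAR2(10)
) ON COMMIT PRESERVE ROWS;

-- With no statistics on the table, a level of 2 or higher samples
-- it during the hard parse of this query.
SELECT /*+ dynamic_sampling(t 2) */ COUNT(*) FROM gtt_orders t;
```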

Dynamic sampling levels? OPTIMIZER_DYNAMIC_SAMPLING has 11 levels, from 0 to 10.

In Oracle9i R2, the default setting for dynamic sampling level is 1. At this setting, the optimizer will tend to use dynamic sampling only in some cases when a table is unanalyzed. This includes the case of global temporary tables without statistics but doesn’t ensure that dynamic sampling will be used against all unanalyzed tables. In Oracle Database 10g Release 1 and above, the default setting is 2. This setting ensures that the optimizer will have a good estimate whenever it optimizes a query that accesses a table with no statistics.
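To check the level currently in effect on your instance:

```sql
SELECT value FROM v$parameter WHERE name = 'optimizer_dynamic_sampling';
```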



Cursor Sharing in Oracle  

Posted by Mawahid

Before Oracle 8.1.6, when an application sent a SQL statement, Oracle first checked the shared pool to see whether an exact match of the statement was already there. If the exact statement is not found, a hard parse is necessary to generate the executable form of the statement. Setting CURSOR_SHARING to EXACT allows SQL statements to share a SQL area only when their texts match exactly. This is the default behavior. With this setting, similar statements cannot be shared; only textually exact statements can be shared.

Let me demonstrate CURSOR_SHARING=EXACT

SQL> select count(*),empno from emphist group by empno;

  COUNT(*)      EMPNO
---------- ----------
         1       7369
         1       7902
   4194304       7934

SQL> create index idxemphist on emphist(empno);

Index created.

SQL> execute DBMS_STATS.GATHER_TABLE_STATS(OWNNAME => 'SCOTT', TABNAME => 'EMPHIST',ESTIMATE_PERCENT
=> 10, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1',CASCADE => TRUE);

PL/SQL procedure successfully completed.

SQL> alter system flush shared_pool;

System altered.
SQL> set autotrace traceonly explain

SQL> alter session set cursor_sharing='EXACT';

Session altered.

SQL> select count(ename) from emphist where empno=7934;

COUNT(ENAME)
------------
4194304

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=2297 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'EMPHIST' (Cost=2297 Card=4193270 Bytes=46125970)

SQL> select count(ename) from emphist where empno=7902;

COUNT(ENAME)
------------
1
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'EMPHIST' (Cost=4 Card=1 Bytes=11)
   3    2       INDEX (RANGE SCAN) OF 'IDXEMPHIST' (NON-UNIQUE) (Cost=3 Card=1)

SQL> select count(ename) from emphist where empno=7902;

COUNT(ENAME)
------------
1

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'EMPHIST' (Cost=4 Card=1 Bytes=11)
   3    2       INDEX (RANGE SCAN) OF 'IDXEMPHIST' (NON-UNIQUE) (Cost=3 Card=1)

SQL> select count(ename) from emphist where empno=7369;

COUNT(ENAME)
------------
1

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'EMPHIST' (Cost=4 Card=1 Bytes=11)
   3    2       INDEX (RANGE SCAN) OF 'IDXEMPHIST' (NON-UNIQUE) (Cost=3 Card=1)

SQL> select count(ename) from emphist where empno=7369;

COUNT(ENAME)
------------
1

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'EMPHIST' (Cost=4 Card=1 Bytes=11)
   3    2       INDEX (RANGE SCAN) OF 'IDXEMPHIST' (NON-UNIQUE) (Cost=3 Card=1)

SQL> set autotrace off
SQL> select sql_text
from v$sql
where sql_text like 'select count(ename) from emphist where empno=%';

SQL_TEXT
--------------------------------------------------------------------------------
select count(ename) from emphist where empno=7369
select count(ename) from emphist where empno=7902
select count(ename) from emphist where empno=7934

Conclusion: Oracle does not use bind variables, and for every unique SQL statement a new execution plan is generated. Oracle is forced to hard parse every unique SQL statement, which consumes a lot of CPU cycles. Oracle cannot hard parse hundreds of SQL statements concurrently, and sessions end up waiting on the shared pool. To overcome this problem, Oracle introduced the FORCE setting to use system-generated bind variables.

In Oracle 8.1.6, Oracle introduced the new CURSOR_SHARING value FORCE. Sharing is accomplished by transforming the statement into a new one, replacing the literals with system-generated bind variables. The problem with bind variables is that the optimizer cannot predict precise selectivity.

With CURSOR_SHARING = FORCE, Oracle parses the FIRST SQL statement loaded into memory and uses its execution plan for subsequent similar SQL statements. This can be a problem if the first statement's literals have good selectivity but those of the subsequent statements don't: the statements with poor selectivity will also use the index, whereas a full table scan would perform better. Conversely, if the first statement run doesn't have good selectivity and a full table scan is used, the subsequent similar statements will also use the full table scan.

Let me demonstrate CURSOR_SHARING=FORCE

SQL> select count(ename) from employee where empno=7934;

COUNT(ENAME)
------------
2097152
SQL>

The plan below is extracted from the tkprof output file. The optimizer uses a full table scan, which is the right choice.

   Rows  Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE
2097152   TABLE ACCESS FULL OBJ#(30800)

Let us run another query and look at the execution plan.

SQL> select count(ename) from employee where empno=7902;

COUNT(ENAME)
------------
1

SQL>

The plan below is extracted from the tkprof output file. The optimizer again uses the full table scan for this subsequent query, which is a bad choice.

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
1 TABLE ACCESS FULL OBJ#(30800)

SQL> select sql_text from v$sql
where sql_text like 'select count(ename) from employee where empno=%';

SQL_TEXT
--------------------------------------------------------------------------------
select count(ename) from employee where empno=:"SYS_B_0"

SQL>

Conclusion: Oracle generates only one plan and uses it for all of these SQL statements. This turns the hard parse into a soft parse and requires fewer resources. The FORCE option is good when the queried table is not skewed and all the queries need the same execution plan. But in the real world, that is often not the case.

To overcome this problem, Oracle9i introduced the CURSOR_SHARING=SIMILAR setting, which chooses the right plan based on the statistics.

Let me demonstrate CURSOR_SHARING=SIMILAR

SQL> ALTER SYSTEM FLUSH SHARED_POOL;
System altered.
SQL>

SQL> select count(ename) from employee where empno=7934;

COUNT(ENAME)
------------
2097152

Let us run the query and see the actual execution plan. The optimizer uses a full table scan, which is the right choice.

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
2097152 TABLE ACCESS FULL OBJ#(30800)


Let us run another query and see the actual execution plan.

SQL> select count(ename) from employee where empno=7902;

COUNT(ENAME)
------------
1

SQL>

Rows Row Source Operation
------- ---------------------------------------------------
1 SORT AGGREGATE
1 TABLE ACCESS BY INDEX ROWID OBJ#(30800)
1 INDEX RANGE SCAN OBJ#(30801) (object id 30801)


SQL> select sql_text
from v$sql
where sql_text like 'select count(ename)%';

SQL_TEXT
--------------------------------------------------------------------------------
select count(ename) from employee where empno=:"SYS_B_0"
select count(ename) from employee where empno=:"SYS_B_0"
SQL>

The optimizer uses an index range scan, which is the right choice. The optimizer makes the right choice based on the statistics.

Conclusion: The optimizer replaces the literal with a bind variable and checks whether the query needs a different plan based on the current statistics. If so, the optimizer marks the bind variable as unsafe and stores the value in the signature of the query. The next time a user submits the query with the same bind variable value, Oracle reuses the plan; if the user submits the same query with a different literal, the optimizer chooses the execution plan appropriate for that value based on the statistics.

The SIMILAR option resolves the issues we had with EXACT: EXACT does not use bind variables, but SIMILAR does. With the FORCE option, the optimizer uses the same execution plan for the same query with different literals. With the SIMILAR option, it can use a different plan for the same query with different literal values, based on the current statistics.
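To experiment with these settings yourself, CURSOR_SHARING can be changed at the session or system level. A minimal sketch (parameter values as documented for Oracle 9i and later; SCOPE=BOTH assumes an SPFILE):

```sql
-- Try a setting for the current session only
ALTER SESSION SET cursor_sharing = SIMILAR;

-- Or change it instance-wide
ALTER SYSTEM SET cursor_sharing = FORCE SCOPE = BOTH;

-- Verify the current value
SELECT value FROM v$parameter WHERE name = 'cursor_sharing';
```

Session-level testing is the safer way to compare plans before committing to an instance-wide change.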

Please read the below link if you need more info.

http://www.oracle.com/technology/oramag/oracle/06-jan/o16asktom.html

When does oracle use Full Table Scan?  


A full table scan reads all the blocks below the high water mark and filters out those rows that do not meet the selection criteria. Each row is examined to determine whether it satisfies the statement’s WHERE clause.

During a full table scan, the blocks are read sequentially. Because the blocks are adjacent, I/O calls larger than a single block can be used to speed up the process. The size of the read calls ranges from one block to the number of blocks set by the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT. Using multiblock reads means a full table scan can be performed very efficiently.
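You can check the multiblock read size on your own instance with a quick query against v$parameter:

```sql
-- Number of blocks Oracle may read in one I/O call during a full table scan
SELECT value
FROM v$parameter
WHERE name = 'db_file_multiblock_read_count';
```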

Full table scans are cheaper than index range scans when accessing a large fraction of the blocks in a table. This is because full table scans can use larger I/O calls, and making fewer large I/O calls is cheaper than making many smaller calls.

When the Optimizer Uses Full Table Scans

Lack of Index
If the query is unable to use any existing index, then the optimizer uses a full table scan. For example, if a function is applied to the indexed column in the query, the optimizer cannot use the index and instead uses a full table scan, as below:

SELECT last_name, first_name
FROM employees
WHERE UPPER(last_name) LIKE :b1

If you need to use the index for case-independent searches, then either do not permit mixed-case data in the search columns or create a function-based index, such as UPPER(last_name), on the search column.
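For example, a function-based index on UPPER(last_name) lets the optimizer use an index for the query above (the index name here is illustrative):

```sql
-- Function-based index; the optimizer can now match the
-- predicate UPPER(last_name) LIKE :b1 against it
CREATE INDEX emp_upper_last_name_idx
ON employees (UPPER(last_name));
```

Note that in older releases (8i) the optimizer also needed QUERY_REWRITE_ENABLED=TRUE and the QUERY REWRITE privilege to use function-based indexes.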

Large Amount of Data
If the optimizer estimates that the query will access most of the blocks in the table, then it uses a full table scan even though an index is available. Oracle chooses an FTS when the matching data is spread across most of the blocks, which is the case when the clustering factor is high.

Small Table
If a table contains fewer blocks below the high water mark than DB_FILE_MULTIBLOCK_READ_COUNT, so that it can be read in a single I/O call, then a full table scan may be cheaper than an index range scan, regardless of the fraction of the table being accessed or the indexes present.

Full Table Scan Hints
Use the FULL(table_alias) hint if you want to force the use of a full table scan:

SELECT /*+ FULL(e) */ employee_id, last_name
FROM employees e
WHERE last_name LIKE :b1;

Nested Loop, Hash Join, Sort Merge Join, Cartesian join difference  


Nested Loop Joins: The optimizer uses nested loop joins when joining a small number of rows with a good driving condition between the two tables. It drives from the outer loop to the inner loop: the inner loop is iterated for every row returned from the outer loop, ideally via an index scan. It is inefficient when the join returns a large number of rows (typically, more than 10,000 rows is considered large), and the optimizer might then choose not to use it.

The cost is calculated as below.
cost = access cost of A + (access cost of B * no_of_rows from A)

A nested loop join involves the following steps:

1. The optimizer determines the driving table and designates it as the outer table.
2. The other table is designated as the inner table.
3. For every row in the outer table, Oracle accesses all the rows in the inner table.

For instance,
scott@DB1.US.ORACLE.COM> SELECT emp.empno, dept.dname
FROM EMP , DEPT
WHERE dept.deptno = 10
AND emp.deptno = dept.deptno
/

EMPNO DNAME
---------- --------------
7782 ACCOUNTING
7839 ACCOUNTING
7934 ACCOUNTING

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=3 Card=5 Bytes=85)
1 0 NESTED LOOPS (Cost=3 Card=5 Bytes=85)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (Cost=1 Card=1 Bytes=10)
3 2 INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)
4 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=5 Bytes=35)
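Plugging this plan's numbers into the cost formula above (DEPT is the outer table with access cost 1 returning 1 row; EMP is the inner table with access cost 2), the result agrees with the Cost=3 shown in the plan:

```sql
-- cost = access cost of DEPT + (access cost of EMP * rows from DEPT)
SELECT 1 + (2 * 1) AS nested_loop_cost FROM dual;
-- NESTED_LOOP_COST = 3
```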

We can also force the nested loop join with a hint, as below.

SELECT /*+ USE_NL(emp dept) */ emp.empno, dept.dname
FROM EMP , DEPT WHERE dept.deptno = 10
AND emp.deptno = dept.deptno

Hash Join: Hash joins are used for joining large data sets. The optimizer uses the smaller of the two tables or data sources to build a hash table on the join key in memory. It then scans the larger table, probing the hash table to find the joined rows. This method works best when the smaller table fits in available memory. The optimizer uses a hash join when two tables are joined with an equijoin (a join with equality predicates) and a large amount of data needs to be joined.

The cost of a hash join is calculated by the following formula:
cost=(access cost of A*no_of_hash partitions of B) + access cost of B

For instance,
scott@DB1.US.ORACLE.COM> ;
1 SELECT emp.empno, dept.dname
2 FROM EMP , DEPT
3* WHERE emp.deptno = dept.deptno
scott@DB1.US.ORACLE.COM>
/
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=14 Bytes=238)
1 0 HASH JOIN (Cost=5 Card=14 Bytes=238)
2 1 TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=7 Bytes=70)
3 1 TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=98)

We can also force the hash join with a hint, as below.
SELECT /*+USE_HASH(emp dept) */ emp.empno, dept.dname
FROM EMP , DEPT
WHERE emp.deptno = dept.deptno

Sort-Merge Join: Sort merge joins can be used to join rows from two independent sources. Hash joins generally perform better than sort merge joins. On the other hand, sort merge joins can perform better than hash joins if both of the following conditions exist:

1. The row sources are sorted already.
2. A sort operation does not have to be done.

Sort merge joins are almost exclusively used for non-equi joins (>, <, BETWEEN). Sort merge joins perform better than nested loop joins for large data sets. (You cannot use hash joins unless there is an equality condition). In a merge join, there is no concept of a driving table.
The join consists of two steps:
Sort join operation: Both the inputs are sorted on the join key.
Merge join operation: The sorted lists are merged together.

The optimizer can choose a sort merge join over a hash join for joining large amounts of data if any of the following conditions are true:

1. The join condition between two tables is not an equi-join.
2. OPTIMIZER_MODE is set to RULE.
3. HASH_JOIN_ENABLED is false.
4. Because of sorts already required by other operations, the optimizer finds it is cheaper to use a sort merge than a hash join.
5. The optimizer thinks that the cost of a hash join is higher, based on the settings of HASH_AREA_SIZE and SORT_AREA_SIZE.

The cost of a sort merge join is calculated by the following formula:
cost = access cost of A + access cost of B + (sort cost of A + sort cost of B)

For instance,
scott@DB1.US.ORACLE.COM> SELECT emp.empno, dept.dname
FROM EMP , DEPT
WHERE emp.deptno < dept.deptno

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=5 Bytes=85)
1 0 MERGE JOIN (Cost=6 Card=5 Bytes=85)
2 1 SORT (JOIN) (Cost=2 Card=7 Bytes=70)
3 2 TABLE ACCESS (BY INDEX ROWID) OF 'DEPT' (Cost=2 Card=7 Bytes=70)
4 3 INDEX (FULL SCAN) OF 'PK_DEPT' (UNIQUE) (Cost=1 Card=7)
5 1 SORT (JOIN) (Cost=4 Card=14 Bytes=98)
6 5 TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=98)
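Checking the formula against this plan (access plus sort cost of DEPT is 2 and of EMP is 4, per the two SORT (JOIN) lines), the result agrees with the Cost=6 shown:

```sql
-- cost = (access + sort cost of DEPT) + (access + sort cost of EMP)
SELECT 2 + 4 AS sort_merge_cost FROM dual;
-- SORT_MERGE_COST = 6
```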

We can also force the sort merge join with a hint, as below.

scott@DB1.US.ORACLE.COM> SELECT /*+USE_MERGE(emp dept) */
emp.empno, dept.dname
FROM EMP , DEPT
WHERE emp.deptno < dept.deptno

Cartesian Join: A Cartesian join is used when one or more of the tables does not have any join condition to any other table in the statement. The optimizer joins every row from one data source with every row from the other data source, creating the Cartesian product of the two sets.

For instance,
scott@DB1.US.ORACLE.COM> select * from emp,dept;
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=98 Bytes=5194)
1 0 MERGE JOIN (CARTESIAN) (Cost=16 Card=98 Bytes=5194)
2 1 TABLE ACCESS (FULL) OF 'DEPT' (Cost=2 Card=7 Bytes=112)
3 1 BUFFER (SORT) (Cost=14 Card=14 Bytes=518)
4 3 TABLE ACCESS (FULL) OF 'EMP' (Cost=2 Card=14 Bytes=518)