Optimising Oracle Queries - How To Do It

How to optimize oracle queries?

Table access.

There are three ways that Oracle can find a row in a table, these are :

1) Scan each row in the table checking whether to select the row for  output.  This is commonly known as a full table scan (FTS), it  is also the least efficient method of locating a row.

2) By using an index to locate the row.

3) By using the rowid to directly access  the row on disc. This is the most efficient method  of accessing  a  row. Do not use this  method  across transactions  as  it is possible for  a  rowid  to change, use the primary key of the table in  these circumstances.

Unfortunately  we  do not normally  know  the rowid,  so the fastest method of access is by  the use of indexes.

The general rules are as follows:

Indexes   can  be  used  in   the   following circumstances. 
If the indexed column is  mentioned  in the where clause.

2) If the indexed column is not  modified by a function or, an arithmetic operation, or both of them. 
The following exception is applicable,a MIN(column), MIN(column+constant) or MAX(column), MAX(column+constant).

Indexes  cannot  be  used  in  the following circumstances.

There is no 'where' clause.

2) The indexed column is modified in any way.

3) The search is for 'null' or 'not null'.

Indexes and Null.

If  a record has no value (i.e. NULL) in  the column specified as the index key, then there will not  be  an index entry for the record.  For  this  reason  the following query will not use an  index on column COMM 

SELECT    *
                                 FROM      emp 
                                 WHERE     comm is NULL

Oracle  assumes that the majority of  records in  a  table  will contain  values  for   indexed columns,  because of this a FTS will  be  executed  for the following query.

SELECT    *
                                 FROM      emp
                                 WHERE     comm is not NULL

A FTS may not be appropriate if  the  column COMM is sparsely populated. There is an example of how to alter the query to make use of the index.

SELECT    *
                                 FROM      emp
                                 WHERE     comm > -0.01

Indexes and 'NOT=' predicates.

When  a  predicate  contains  a  "not  equal" condition,  Oracle 
 will not  consider  an  index, however, Oracle  will  interpret 
  other   'not'  predicates so as to use any indexes available 

e.g. >
                                 'not sal > 50' => 'sal <= 50'
                                 'not sal <= 50' => 'sal > 50'

Group by and predicate clauses.

The  performance of group by queries  can  be improved by eliminating unwanted rows early in the selection  process.  The  following  two   queries  return  the  same  data, however,  the  second  is potentially quicker, since rows will be eliminated before the set operators are applied.

SELECT    job,
avg(sal)
FROM      emp
GROUP BY  job
HAVING    job = 'president'
OR        job = 'manager'
SELECT    job,
avg(sal)
FROM      emp
WHERE     job = 'president'
OR        job = 'manager'
GROUP BY  job

Multiple index queries.

When  a  query  has  two  or  more   equality  predicate  clauses, multiple indexes may be  used. Oracle will  merge  the  indexes at  run   time, returning  rows  that  are in  both  indexes,  for example.

SELECT    ename
FROM      emp
WHERE     deptno=20
AND       job='manager'

Will merge the following indexes.
non unique index on job
non unique index on deptno

When indexes cannot be merged.

Where a query can use both range and equality  predicates,  as in the following  example,  oracle cannot merge the indexes. The index on job will be  used to locate the 'manager' rows, these will then be checked to ensure deptno is greater than 10.

SELECT    *
                                 FROM      emp
                                 WHERE     job='manager'
                                 AND       deptno > 10

Indexes:
non unique index on job
non unique index on deptno

When there is no clear preference as to which index to use, see the following query, Oracle will use  only one index, because it is inefficient  to merge them. Note that a scan of the second  table, will have to be one for each row returned in  the  first table.

Since  both  indexes are non  unique,  Oracle will  choose the index which, it encounters  first in   the   DC_INDEXES  cache.  In practice, as developers cannot see the cache, the  choice  is arbitrary.

SELECT    ename
                                 FROM      emp
                                 WHERE     sal > 1
                                 AND       empno > 1

Indexes :
non unique index on empno
non unique index on sal

Suppression of indexes for performance

When  there  is a choice  between  merging  a unique  index  with a non unique one,  the  kernel will  always  use  the  unique  index and avoid performing the merge. The reasoning behind this is that  the unique index will return only  one  row. 

See the following example:
SELECT    ename
                                 FROM      emp
                                 WHERE     sal = 3000
                                 AND       empno = 7902

Indexes : 
unique index on empno
non unique index on sal

Only  the  empno  index will be  used,  if  a record  is found then the row will be  checked  to see if the sal = 3000.
Oracle will only use 5 indexes in  a  single sql statement. After 5 indexes have been merged, the kernel will check those rows returned for the values in the remaining predicates.

SELECT    *
                                 FROM      emp
                                 WHERE     empno =7844
                                 AND       job = 'salesman'
                                 AND       deptno = 30
                                 AND       sal = 1500
                                 AND       comm = 0
                                 AND       ename = 'turner'

Indexes : 
non unique index on empno
non unique index on job
non unique index on deptno 
non unique index on sal
non unique index on comm
non unique index on ename

Because  all  the predicates  score  equally, according  to APPENDIX A, only five of  the  above indexes will be used. Rows that are returned  will  be checked by the kernel to see if the last  value is correct.

Concatenated indexes

Concatenated   indexes   are   indexes   that reference more than one column. The index will  be available  for  scoring, provided that  the  first part of the index is present and usable.

In the following examples assume a concatenated index on job and deptno.

SELECT    *
FROM      emp
                             WHERE     job = 'president'
                              AND       deptno = 10

The full index will be used.
SELECT    *
FROM      emp
                                  WHERE     deptno = 10
                                  AND       job = 'president'

The full index will be used.

SELECT    *
FROM      emp
WHERE     deptno = 10

The index will not be used, the leading  part of the index is missing from the statement.

 SELECT    *
FROM      emp
WHERE     job = 'analyst'

The  leading part of the index will be  used, developers should check the selectivity  of  this part  of the index, it may not be selective,  and  needs to be suppressed.

SELECT    *
FROM      emp
WHERE     job != 'clerk'
AND       deptno = 10

The index  cannot  be used  because  the  != prevents the use of the leading part of the index.

Or optimisation

Unlike the 'AND' operator which requires that a  row passes both predicates, the  'OR'  operator requires  a row to pass either of the  predicates. 

See the following example

SELECT    ename,
sal,
                                           job
FROM      emp
WHERE     sal = 3000
OR        job = 'clerk'

Index : 
non unique on job

If the job index is used to identify the rows, then it would only return those employees who  are 'clerks',  it  would not  return those  employees where the sal is 3000 and job is not 'clerk'.   A single concatenated index on job,  sal  or sal,job cannot be used because records with values for the second key, must be found regardless of the value of the first key. If we have an index on sal,job, then with the index on job the kernel can optimise the query.  A single column index on sal could also be used.

If  there are indexes available for both  the predicates that are or'd, then the query will  be processed conceptually, as the union of two select statements. This is shown in the example below.

SELECT    ename,
                                           sal,
                                           job
                                 FROM      emp
                                 WHERE     sal = 3000
                                 OR        job = 'clerk'

Indexes :
non unique on job
non unique on sal

Becomes : 

SELECT    ename,
                                           sal,
                                           job
FROM      emp
WHERE     job = 'clerk'

UNION
SELECT    ename,
                                           sal,
                                           job
FROM      emp
WHERE     sal = 3000
AND       job != 'clerk'

Notice  that  the kernel has  used  the  last predicate   in  the or  clause  for  the   single condition   query,  and  it  has used  the   same predicate  to  form  the  "!="  condition  in the  double condition query. Notice that if we  rewrite the query as:

SELECT    ename,
sal,
job
FROM      emp
WHERE     job = 'clerk'
OR        sal = 3000

Indexes : 
non unique on job
non unique on sal

Becomes :

SELECT    ename, sal,job
FROM      emp
WHERE     sal = 3000

UNION

SELECT    ename,sal,job
                                 FROM      emp
                                 WHERE     job = 'clerk'
                                 AND       sal != 3000

From  this,  we can see that it  is  best  to place  the  predicate associated  with  the  most selective index first in the where clause, and the least selective last. This minimises the number of checks for '!='.

These  'or' optimisations cannot  take  place when  the sql query contains a connect by,  or  an outer join.

Non correlated sub queries.

There are two cases to consider here, firstly queries which use the IN operator, and  secondly, queries which  use the NOT IN operator.  We will deal with the IN operator first.

The following rules apply to optimising  the query:

1) The  main and subqueries  are  optimised separately.

2) The same rules for optimisation apply  to the main and subqueries, for example, in the query below  there  is no optimisation of  the subquery  because there is no where clause.

3)  The driving table is the table from  the  subquery. In the example below this will be job.

4)  The subquery is transformed into a  join by  the following method. The rows  frturned  from the subquery are sorted and duplicates removed.  A full  table scan of these sorted rows is used  for access, finally the  table in the main  query  is joined to the sorted rows from the subquery  using column in the main query's where clause.

SELECT    distinct name 
FROM      men
WHERE     pin in 
(SELECT pin 
FROM job)
indexes :

Gives the following explain plan.

SORT(UNIQUE)
MERGE JOIN
TABLE ACCESS (FULL) OF 'MEN'
SORT(JOIN)
TABLE ACCESS (FULL) OF 'JOB'

Execute time 4759

Note that the SORT(UNIQUE) is caused by  the distinct.

If we now put a dummy where clause on the sub query we get SELEC 
distinct name FROM men                                 WHERE     pin in 
(SELECT pin 
 FROM job )

Indexes : 
unique on job(jobno)
non unique on job(pin)

gives the following explain plan

SORT(UNIQUE)
NESTED LOOPS
TABLE ACCESS (FULL) OF 'MEN'
INDEX(RANGE SCAN) OF 'JOB_1' (NON_UNIQUE)

Note that an index has been used on job. This is  the one case where an index can be  used  even though there is no where clause on the subquery.

The sql is executed conceptually as,  SELECT distinct  pin FROM job ORDER BY pin, these  sorted rows are then joined with the rows from men  using the index job(pin).

The  use  of  NOT  IN  is  not   recommended, developers should consider rewording queries using this  construct  to use outer joins,  however,  it should  be  noted that outer joins are  an  oracle extension, and are not available on other RDBMS's. 

See below for example:

SELECT    *
                                 FROM      dept
                                 WHERE     deptno not in
                                           (SELECT   deptno
                                           FROM      emp)

This can be rewritten as the following  outer join.

SELECT    d.*
                                 FROM      dept d,emp e
                                 WHERE     d.deptno = e.deptno(+)
                                 AND       e.rowid is NULL

This  relies on the fact that each row has  a unique rowid, and that, rowid is never NULL.  Note also  that  any  NOT IN is  equivelent to  a  NOT EXISTS. Therefoe any NOT EXISTS can be transformed into an outer join.

Correlated sub queries

All  Correlated  subqueries follow  the  same execution  path. The main and the  subqueries  are separately   optimised.  The  driving table   in selected  from  the  main  query.  For  each   row returned  in  the  main  query,  the  subquery  is executed.  Indexes can  be  used  for  the  where clauses on both the main and subquery.

Have a Oracle Question
Do you have an Oracle Question?

Oracle Books
Oracle Certification, Database Administration, SQL, Application, Programming Reference Books

Oracle Application
Oracle Application Hints and Tips

Oracle Home
Oracle Database, SQL, Application, Programming Tips

All the site contents are Copyright © www.erpgreat.com and the content authors. All rights reserved.
All product names are trademarks of their respective companies.
The site www.erpgreat.com is not affiliated with or endorsed by any company listed at this site.
Every effort is made to ensure the content integrity.  Information used on this site is at your own risk.
 The content on this site may not be reproduced or redistributed without the express written permission of
www.erpgreat.com or the content authors.