Optimising Oracle Queries

Optimising Oracle Queries - How To Do It

How to optimize oracle queries?

Table access.

There are three ways that Oracle can find a row in a table, these are :

1) Scan each row in the table checking whether to select the row for output. This is commonly known as a full table scan (FTS), it is also the least efficient method of locating a row.

2) By using an index to locate the row.

3) By using the rowid to directly access the row on disc. This is the most efficient method of accessing a row. Do not use this method across transactions as it is possible for a rowid to change, use the primary key of the table in these circumstances.

Unfortunately we do not normally know the rowid, so the fastest method of access is by the use of indexes.

The general rules are as follows:

Indexes can be used in the following circumstances.
If the indexed column is mentioned in the where clause.

2) If the indexed column is not modified by a function or, an arithmetic operation, or both of them.
The following exception is applicable,a MIN(column), MIN(column+constant) or MAX(column), MAX(column+constant).

Indexes cannot be used in the following circumstances.

There is no 'where' clause.

2) The indexed column is modified in any way.

3) The search is for 'null' or 'not null'.

Indexes and Null.

If a record has no value (i.e. NULL) in the column specified as the index key, then there will not be an index entry for the record. For this reason the following query will not use an index on column COMM

SELECT    *
                                 FROM      emp
                                 WHERE     comm is NULL

Oracle assumes that the majority of records in a table will contain values for indexed columns, because of this a FTS will be executed for the following query.

SELECT    *
                                 FROM      emp
                                 WHERE     comm is not NULL

A FTS may not be appropriate if the column COMM is sparsely populated. There is an example of how to alter the query to make use of the index.

SELECT    *
                                 FROM      emp
                                 WHERE     comm > -0.01

Indexes and 'NOT=' predicates.

When a predicate contains a "not equal" condition, Oracle
will not consider an index, however, Oracle will interpret
other 'not' predicates so as to use any indexes available

e.g. >
'not sal > 50' => 'sal <= 50'
'not sal <= 50' => 'sal > 50'

Group by and predicate clauses.

The performance of group by queries can be improved by eliminating unwanted rows early in the selection process. The following two queries return the same data, however, the second is potentially quicker, since rows will be eliminated before the set operators are applied.

SELECT    job,
avg(sal)
FROM      emp
GROUP BY job
HAVING    job = 'president'
OR        job = 'manager'
SELECT    job,
avg(sal)
FROM      emp
WHERE     job = 'president'
OR        job = 'manager'
GROUP BY job

Multiple index queries.

When a query has two or more equality predicate clauses, multiple indexes may be used. Oracle will merge the indexes at run time, returning rows that are in both indexes, for example.

SELECT    ename
FROM      emp
WHERE     deptno=20
AND       job='manager'

Will merge the following indexes.
non unique index on job
non unique index on deptno

When indexes cannot be merged.

Where a query can use both range and equality predicates, as in the following example, oracle cannot merge the indexes. The index on job will be used to locate the 'manager' rows, these will then be checked to ensure deptno is greater than 10.

SELECT    *
                                 FROM      emp
                                 WHERE     job='manager'
                                 AND       deptno > 10

Indexes:
non unique index on job
non unique index on deptno

When there is no clear preference as to which index to use, see the following query, Oracle will use only one index, because it is inefficient to merge them. Note that a scan of the second table, will have to be one for each row returned in the first table.

Since both indexes are non unique, Oracle will choose the index which, it encounters first in the DC_INDEXES cache. In practice, as developers cannot see the cache, the choice is arbitrary.

SELECT    ename
                                 FROM      emp
                                 WHERE     sal > 1
                                 AND       empno > 1

Indexes :
non unique index on empno
non unique index on sal

Suppression of indexes for performance

When there is a choice between merging a unique index with a non unique one, the kernel will always use the unique index and avoid performing the merge. The reasoning behind this is that the unique index will return only one row.

See the following example:
SELECT    ename
                                 FROM      emp
                                 WHERE     sal = 3000
                                 AND       empno = 7902

Indexes :
unique index on empno
non unique index on sal

Only the empno index will be used, if a record is found then the row will be checked to see if the sal = 3000.
Oracle will only use 5 indexes in a single sql statement. After 5 indexes have been merged, the kernel will check those rows returned for the values in the remaining predicates.

SELECT    *
                                 FROM      emp
                                 WHERE     empno =7844
                                 AND       job = 'salesman'
                                 AND       deptno = 30
                                 AND       sal = 1500
                                 AND       comm = 0
                                 AND       ename = 'turner'

Indexes :
non unique index on empno
non unique index on job
non unique index on deptno
non unique index on sal
non unique index on comm
non unique index on ename

Because all the predicates score equally, according to APPENDIX A, only five of the above indexes will be used. Rows that are returned will be checked by the kernel to see if the last value is correct.

Concatenated indexes

Concatenated indexes are indexes that reference more than one column. The index will be available for scoring, provided that the first part of the index is present and usable.

In the following examples assume a concatenated index on job and deptno.

SELECT    *
FROM      emp
                             WHERE     job = 'president'
                              AND       deptno = 10

The full index will be used.
SELECT    *
FROM      emp
                                  WHERE     deptno = 10
                                  AND       job = 'president'

The full index will be used.

SELECT    *
FROM      emp
WHERE     deptno = 10

The index will not be used, the leading part of the index is missing from the statement.

SELECT    *
FROM      emp
WHERE     job = 'analyst'

The leading part of the index will be used, developers should check the selectivity of this part of the index, it may not be selective, and needs to be suppressed.

SELECT    *
FROM      emp
WHERE     job != 'clerk'
AND       deptno = 10

The index cannot be used because the != prevents the use of the leading part of the index.

Or optimisation

Unlike the 'AND' operator which requires that a row passes both predicates, the 'OR' operator requires a row to pass either of the predicates.

See the following example

SELECT    ename,
sal,
                                           job
FROM      emp
WHERE     sal = 3000
OR        job = 'clerk'

Index :
non unique on job

If the job index is used to identify the rows, then it would only return those employees who are 'clerks', it would not return those employees where the sal is 3000 and job is not 'clerk'. A single concatenated index on job, sal or sal,job cannot be used because records with values for the second key, must be found regardless of the value of the first key. If we have an index on sal,job, then with the index on job the kernel can optimise the query. A single column index on sal could also be used.

If there are indexes available for both the predicates that are or'd, then the query will be processed conceptually, as the union of two select statements. This is shown in the example below.

SELECT    ename,
                                           sal,
                                           job
                                 FROM      emp
                                 WHERE     sal = 3000
                                 OR        job = 'clerk'

Indexes :
non unique on job
non unique on sal

Becomes :

SELECT    ename,
                                           sal,
                                           job
FROM      emp
WHERE     job = 'clerk'

UNION
SELECT    ename,
                                           sal,
                                           job
FROM      emp
WHERE     sal = 3000
AND       job != 'clerk'

Notice that the kernel has used the last predicate in the or clause for the single condition query, and it has used the same predicate to form the "!=" condition in the double condition query. Notice that if we rewrite the query as:

SELECT    ename,
sal,
job
FROM      emp
WHERE     job = 'clerk'
OR        sal = 3000

Indexes :
non unique on job
non unique on sal

Becomes :

SELECT    ename, sal,job
FROM      emp
WHERE     sal = 3000

UNION

SELECT    ename,sal,job
                                 FROM      emp
                                 WHERE     job = 'clerk'
                                 AND       sal != 3000

From this, we can see that it is best to place the predicate associated with the most selective index first in the where clause, and the least selective last. This minimises the number of checks for '!='.

These 'or' optimisations cannot take place when the sql query contains a connect by, or an outer join.

Non correlated sub queries.

There are two cases to consider here, firstly queries which use the IN operator, and secondly, queries which use the NOT IN operator. We will deal with the IN operator first.

The following rules apply to optimising the query:

1) The main and subqueries are optimised separately.

2) The same rules for optimisation apply to the main and subqueries, for example, in the query below there is no optimisation of the subquery because there is no where clause.

3) The driving table is the table from the subquery. In the example below this will be job.

4) The subquery is transformed into a join by the following method. The rows frturned from the subquery are sorted and duplicates removed. A full table scan of these sorted rows is used for access, finally the table in the main query is joined to the sorted rows from the subquery using column in the main query's where clause.

SELECT    distinct name
FROM      men
WHERE     pin in
(SELECT pin
FROM job)
indexes :

Gives the following explain plan.

SORT(UNIQUE)
MERGE JOIN
TABLE ACCESS (FULL) OF 'MEN'
SORT(JOIN)
TABLE ACCESS (FULL) OF 'JOB'

Execute time 4759

Note that the SORT(UNIQUE) is caused by the distinct.

If we now put a dummy where clause on the sub query we get SELEC
distinct name FROM men WHERE pin in
(SELECT pin
FROM job )

Indexes :
unique on job(jobno)
non unique on job(pin)

gives the following explain plan

SORT(UNIQUE)
NESTED LOOPS
TABLE ACCESS (FULL) OF 'MEN'
INDEX(RANGE SCAN) OF 'JOB_1' (NON_UNIQUE)

Note that an index has been used on job. This is the one case where an index can be used even though there is no where clause on the subquery.

The sql is executed conceptually as, SELECT distinct pin FROM job ORDER BY pin, these sorted rows are then joined with the rows from men using the index job(pin).

The use of NOT IN is not recommended, developers should consider rewording queries using this construct to use outer joins, however, it should be noted that outer joins are an oracle extension, and are not available on other RDBMS's.

See below for example:

SELECT    *
                                 FROM      dept
                                 WHERE     deptno not in
                                           (SELECT   deptno
                                           FROM      emp)

This can be rewritten as the following outer join.

SELECT    d.*
                                 FROM      dept d,emp e
                                 WHERE     d.deptno = e.deptno(+)
                                 AND       e.rowid is NULL

This relies on the fact that each row has a unique rowid, and that, rowid is never NULL. Note also that any NOT IN is equivelent to a NOT EXISTS. Therefoe any NOT EXISTS can be transformed into an outer join.

Correlated sub queries

All Correlated subqueries follow the same execution path. The main and the subqueries are separately optimised. The driving table in selected from the main query. For each row returned in the main query, the subquery is executed. Indexes can be used for the where clauses on both the main and subquery.

Have a Oracle Question
Do you have an Oracle Question?

Oracle Books
Oracle Certification, Database Administration, SQL, Application, Programming Reference Books

Oracle Application
Oracle Application Hints and Tips

Oracle Home
Oracle Database, SQL, Application, Programming Tips

All the site contents are Copyright © www.erpgreat.com and the content authors. All rights reserved.
All product names are trademarks of their respective companies.
The site www.erpgreat.com is not affiliated with or endorsed by any company listed at this site.
Every effort is made to ensure the content integrity. Information used on this site is at your own risk.
The content on this site may not be reproduced or redistributed without the express written permission of
www.erpgreat.com or the content authors.