
multi-table inserts in oracle 9i

Multi-table insert is a new feature of Oracle 9i Release 1 (9.0). An extension to INSERT..SELECT, this feature enables us to
define multiple insert targets for a source dataset. Until the introduction of this feature, only SQL*Loader had a similar
capability.

This article provides an overview of multi-table inserts and how they are used.

syntax overview

There are two types of multi-table insert as follows:

• INSERT FIRST; and


• INSERT ALL.

Multi-table inserts are an extension to INSERT..SELECT. Syntax is of the following form:

INSERT ALL|FIRST

[WHEN condition THEN] INTO target [VALUES]

[WHEN condition THEN] INTO target [VALUES]

...

[ELSE] INTO target [VALUES]

SELECT ...

FROM source_query;

We define multiple INTO targets between the INSERT ALL/FIRST and the SELECT. The inserts can be conditional or
unconditional and if the record structure of the datasource matches the target table, the VALUES clause can be omitted. We
will describe the various permutations in this article.

setup

For the examples in this article, we shall use the ALL_OBJECTS view as our source data. For simplicity, we will create four
tables with the same structure as follows.

SQL> CREATE TABLE t1

2 ( owner VARCHAR2(30)

3 , object_name VARCHAR2(30)

4 , object_type VARCHAR2(30)

5 , object_id NUMBER

6 , created DATE
7 );

Table created.

SQL> CREATE TABLE t2 AS SELECT * FROM t1;

Table created.

SQL> CREATE TABLE t3 AS SELECT * FROM t1;

Table created.

SQL> CREATE TABLE t4 AS SELECT * FROM t1;

Table created.

These tables will be our targets for the ALL_OBJECTS view data.

simple multi-table insert

To begin, we will unconditionally INSERT ALL the source data into every target table. The source records and target tables
are all of the same structure so we will omit the VALUES clause from each INSERT.

SQL> SELECT COUNT(*) FROM all_objects;

COUNT(*)

----------

28981

1 row selected.

SQL> INSERT ALL

2 INTO t1

3 INTO t2
4 INTO t3

5 INTO t4

6 SELECT owner

7 , object_type

8 , object_name

9 , object_id

10 , created

11 FROM all_objects;

115924 rows created.

SQL> SELECT COUNT(*) FROM t1;

COUNT(*)

----------

28981

1 row selected.

Note the feedback from SQL*Plus and compare it with the count of ALL_OBJECTS. The feedback reports the total number of records
inserted across all of the targets; here it is distributed evenly between the four tables because the insert is unconditional
(in practice, conditional inserts will usually distribute the rows unevenly between the target tables).

Before we continue with extended syntax, note that multi-table inserts can turn single source records into multiple target
records (i.e. re-direct portions of records to different tables). We can see this in the previous example where we insert four
times the number of source records. We can also generate multiple records for a single table (i.e. the same table is
repeatedly used as a target) whereby each record picks a different set of attributes from the source record, similar to
pivoting, as the following sketch illustrates.
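
As a hedged illustration of the pivoting case (SALES_SRC and SALES are hypothetical tables, not part of our setup), a single
source row holding four quarterly amounts can be split into four target rows as follows.

INSERT ALL
   INTO sales (product, quarter, amount) VALUES (product, 1, q1_amt)
   INTO sales (product, quarter, amount) VALUES (product, 2, q2_amt)
   INTO sales (product, quarter, amount) VALUES (product, 3, q3_amt)
   INTO sales (product, quarter, amount) VALUES (product, 4, q4_amt)
SELECT product, q1_amt, q2_amt, q3_amt, q4_amt
FROM   sales_src;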

conditional multi-table insert

Multi-table inserts can also be conditional (i.e. we do not need to insert every record into every table in the list). There are
some key points to note about conditional multi-table inserts as follows.

• we cannot mix conditional with unconditional inserts. This means that in situations where we need a conditional
insert on a subset of target tables, we will often need to "pad out" the unconditional inserts with a dummy condition
such as "WHEN 1=1";
• we can optionally include an ELSE clause in our INSERT ALL|FIRST target list for when none of the explicit
conditions are satisfied;
• an INSERT ALL conditional statement will evaluate every insert condition for each record. With INSERT FIRST,
each record will stop being evaluated on the first condition it satisfies;
• the conditions in an INSERT FIRST statement will be evaluated in order from top to bottom. Oracle makes no such
guarantees with an INSERT ALL statement.

With these restrictions in mind, we can now see an example of a conditional INSERT FIRST statement. Each source record
will be directed to one target table at most. Note that for demonstration purposes, the following example includes varying
column lists and an ELSE clause.

SQL> INSERT FIRST

2 --<>--

3 WHEN owner = 'SYSTEM'

4 THEN

5 INTO t1 (owner, object_name)

6 VALUES (owner, object_name)

7 --<>--

8 WHEN object_type = 'TABLE'

9 THEN

10 INTO t2 (owner, object_name, object_type)

11 VALUES (owner, object_name, object_type)

12 --<>--

13 WHEN object_name LIKE 'DBMS%'

14 THEN

15 INTO t3 (owner, object_name, object_type)

16 VALUES (owner, object_name, object_type)

17 --<>--

18 ELSE

19 INTO t4 (owner, object_type, object_name, created, object_id)

20 VALUES (owner, object_type, object_name, created, object_id)

21 --<>--

22 SELECT owner

23 , object_type

24 , object_name
25 , object_id

26 , created

27 FROM all_objects;

28981 rows created.

SQL> SELECT COUNT(*) FROM t1;

COUNT(*)

----------

362

1 row selected.

SQL> SELECT COUNT(*) FROM t2;

COUNT(*)

----------

844

1 row selected.

SQL> SELECT COUNT(*) FROM t3;

COUNT(*)

----------

266

1 row selected.

SQL> SELECT COUNT(*) FROM t4;


COUNT(*)

----------

27509

1 row selected.

We can see that each source record was inserted into one table only. INSERT FIRST is a good choice for performance
when each source record is intended for one target only, but in practice, INSERT ALL is much more common.

Remember that we cannot mix conditional with unconditional inserts. The following example shows the unintuitive error
message we receive if we try.

SQL> INSERT ALL

2 --<>--

3 INTO t1 (owner, object_name) --<-- unconditional

4 VALUES (owner, object_name)

5 --<>--

6 WHEN object_type = 'TABLE' --<-- conditional

7 THEN

8 INTO t2 (owner, object_name, object_type)

9 VALUES (owner, object_name, object_type)

10 --<>--

11 SELECT owner

12 , object_type

13 , object_name

14 , object_id

15 , created

16 FROM all_objects;

INTO t1 (owner, object_name) --<-- unconditional

ERROR at line 3:

ORA-00905: missing keyword


The workaround to this, as stated earlier, is to include a dummy TRUE condition as follows.

SQL> INSERT ALL

2 --<>--

3 WHEN 1 = 1 --<-- dummy TRUE condition

4 THEN

5 INTO t1 (owner, object_name)

6 VALUES (owner, object_name)

7 --<>--

8 WHEN object_type = 'TABLE' --<-- conditional

9 THEN

10 INTO t2 (owner, object_name, object_type)

11 VALUES (owner, object_name, object_type)

12 --<>--

13 SELECT owner

14 , object_type

15 , object_name

16 , object_id

17 , created

18 FROM all_objects;

29958 rows created.

Counter-intuitive to this is the fact that in a conditional multi-table insert, each INTO clause inherits the current condition until
it changes. We can see this below by loading T1, T2 and T3 from a single condition in an INSERT ALL statement. The T4
table will be loaded from the ELSE clause.

SQL> INSERT ALL

2 WHEN owner = 'SYSTEM'

3 THEN

4 INTO t1 (owner, object_name)

5 VALUES (owner, object_name)

6 --<>--

7 INTO t2 (owner, object_name, object_type) --<-- owner = 'SYSTEM'


8 VALUES (owner, object_name, object_type)

9 --<>--

10 INTO t3 (owner, object_name, object_type) --<-- owner = 'SYSTEM'

11 VALUES (owner, object_name, object_type)

12 ELSE

13 INTO t4 (owner, object_type, object_name, created, object_id)

14 VALUES (owner, object_type, object_name, created, object_id)

15 SELECT owner

16 , object_type

17 , object_name

18 , object_id

19 , created

20 FROM all_objects;

29705 rows created.

SQL> SELECT COUNT(*) FROM all_objects WHERE owner = 'SYSTEM';

COUNT(*)

----------

362

1 row selected.

SQL> SELECT COUNT(*) FROM t1;

COUNT(*)

----------

362

1 row selected.
SQL> SELECT COUNT(*) FROM t2;

COUNT(*)

----------

362

1 row selected.

SQL> SELECT COUNT(*) FROM t3;

COUNT(*)

----------

362

1 row selected.

multi-table inserts and triggers

As we might expect, multi-table inserts will cause insert-event triggers to fire. We can see this quite simply with the following
example. We create two insert triggers (one for T1 and one for T2) and run a conditional INSERT ALL statement. Each
trigger will output a message to the screen on firing.

SQL> CREATE OR REPLACE TRIGGER t1_insert_trigger

2 AFTER INSERT ON t1

3 BEGIN

4 DBMS_OUTPUT.PUT_LINE('T1 trigger fired...');

5 END insert_trigger;

6 /

Trigger created.

SQL> CREATE OR REPLACE TRIGGER t2_insert_trigger


2 AFTER INSERT ON t2

3 BEGIN

4 DBMS_OUTPUT.PUT_LINE('T2 trigger fired...');

5 END insert_trigger;

6 /

Trigger created.

SQL> INSERT ALL

2 --<>--

3 WHEN owner = 'SYSTEM'

4 THEN

5 INTO t1 (owner, object_name)

6 VALUES (owner, object_name)

7 --<>--

8 WHEN object_type = 'TABLE'

9 THEN

10 INTO t2 (owner, object_name, object_type)

11 VALUES (owner, object_name, object_type)

12 --<>--

13 SELECT owner

14 , object_type

15 , object_name

16 FROM all_objects;

T1 trigger fired...

T2 trigger fired...

1339 rows created.

multi-table inserts and sequences


Sequences can be used directly in multi-table inserts but their placement can be counter-intuitive. They are referenced in the
relevant VALUES clause(s) and not in the source query (as we might expect). Further to this, when referencing a single
sequence in multiple VALUES clauses, we might consider it necessary to be "smart" with our use of the NEXTVAL and
CURRVAL pseudo-columns. This is not the case, as the following example demonstrates. We will create a sequence and
use it in multiple INTO..VALUES clauses.

SQL> CREATE SEQUENCE multi_table_seq;

Sequence created.

SQL> INSERT ALL

2 --<>--

3 INTO t1 (owner, object_id)

4 VALUES (owner, multi_table_seq.NEXTVAL)

5 --<>--

6 INTO t1 (owner, object_id)

7 VALUES (owner, multi_table_seq.NEXTVAL)

8 --<>--

9 INTO t1 (owner, object_id)

10 VALUES (owner, multi_table_seq.NEXTVAL)

11 --<>--

12 INTO t1 (owner, object_id)

13 VALUES (owner, multi_table_seq.NEXTVAL)

14 --<>--

15 SELECT owner

16 , object_type

17 , object_name

18 , object_id

19 , created

20 FROM all_objects

21 WHERE ROWNUM <= 50;

200 rows created.


SQL> SELECT COUNT(*) AS record_count

2 , COUNT(DISTINCT(object_id)) AS sequence_numbers_assigned

3 FROM t1;

RECORD_COUNT SEQUENCE_NUMBERS_ASSIGNED

------------ -------------------------

200 50

1 row selected.

We can see from this example that the sequence NEXTVAL expression is used in each VALUES clause, but the sequence is
incremented only once per source row, so every NEXTVAL reference in the INTO list returns the same value for that row. Given
that Oracle doesn't guarantee the execution order of an INSERT ALL statement, this sequence behaviour actually makes sense
(we couldn't otherwise guarantee a NEXTVAL before a CURRVAL, for example).
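
This behaviour is what makes it safe to share a surrogate key between related INTO clauses. In the following sketch
(ORDER_HDR, ORDER_DTL, ORDERS_STG and ORDERS_SEQ are hypothetical objects, not part of our setup), both NEXTVAL references
resolve to the same value for a given source row, so the child row automatically inherits the key generated for its parent row.

INSERT ALL
   INTO order_hdr (order_id, customer)  VALUES (orders_seq.NEXTVAL, customer)
   INTO order_dtl (order_id, item, qty) VALUES (orders_seq.NEXTVAL, item, qty)
SELECT customer, item, qty
FROM   orders_stg;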

multi-table inserts and referential constraints

It has been stated that Oracle does not guarantee the insert order of an INSERT ALL statement, despite the fact that we will
usually observe ordered behaviour. This is critical when the INTO target tables have parent-child relationships between them.
We might assume that simply ordering the INTO clauses so that the parent is inserted before the child is sufficient; most of
the time this will be the case, but Oracle cannot guarantee it. This author has had to use deferrable constraints to work
around this issue in a large, six-table parallel insert where the INTO ordering was not maintained (a sketch of the approach
follows). Reproducing the problem for this article has not been possible, but it is important to be aware of the potential issue.
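
The workaround itself is simple to set up. The following is a minimal sketch, assuming hypothetical PARENT and CHILD tables:
by making the foreign key deferrable, the constraint is only checked at COMMIT, by which time both tables are fully populated
and the ordering of the INTO clauses within the statement no longer matters.

ALTER TABLE child ADD CONSTRAINT child_parent_fk
   FOREIGN KEY (parent_id) REFERENCES parent (parent_id)
   DEFERRABLE INITIALLY DEFERRED;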

performance considerations

The performance of multi-table inserts can be improved in several ways.

First, we can use INSERT FIRST if it makes sense to do so and if the insert volumes are high (though in practice this will
make only a small difference).

Second, we can use hints such as PARALLEL or APPEND for "brute-force" loading in parallel or direct-path. Hints are
added between the INSERT and the ALL/FIRST as follows:

INSERT /*+ hint */ ALL|FIRST

With regard to parallel insert, the documentation states that the entire statement will be parallelised if we use a PARALLEL
hint for any of the target tables (even if the target tables haven't been created with PARALLEL). If no hint is supplied, then
the insert will not be performed in parallel unless every table in the statement has its PARALLEL attribute set.
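
For example, the following is a sketch only (it assumes we are happy with the side effects of direct-path loading, such as
data being inserted above the high-water mark): it requests a parallel, direct-path load of two of our targets. Note that
parallel DML must also be enabled at session level for the insert itself to run in parallel.

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t1, 4) */ ALL
   INTO t1
   INTO t2
SELECT owner, object_type, object_name, object_id, created
FROM   all_objects;

COMMIT;
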
Third, we can tune the source query, as this is likely to be the most "expensive" part of the operation. The benefit of
multi-table inserts over pre-9i solutions is that we need only generate the source dataset once. Of course, large SQL
statements often provide numerous opportunities for tuning, so we can benefit in two ways: first by reducing the work to a
single statement; and second by tuning that single statement itself.

We can compare a multi-table insert with a pre-9i solution of loading each table separately. We will load the ALL_OBJECTS
source data into our four target tables, first with multi-table insert (INSERT ALL) and second as four separate statements.
We will use a variation on Tom Kyte's RUNSTATS utility to measure the time and resource differences between the two
methods.

We will begin with the multi-table solution.

SQL> exec runstats_pkg.rs_start;

PL/SQL procedure successfully completed.

SQL> INSERT ALL

2 INTO t1

3 INTO t2

4 INTO t3

5 INTO t4

6 SELECT owner

7 , object_type

8 , object_name

9 , object_id

10 , created

11 FROM all_objects;

115924 rows created.

Now we can run the pre-9i solution by executing four separate statements.

SQL> exec runstats_pkg.rs_middle;

PL/SQL procedure successfully completed.


SQL> INSERT INTO t1

2 SELECT owner, object_name, object_type, object_id, created

3 FROM all_objects;

28981 rows created.

SQL> INSERT INTO t2

2 SELECT owner, object_name, object_type, object_id, created

3 FROM all_objects;

28981 rows created.

SQL> INSERT INTO t3

2 SELECT owner, object_name, object_type, object_id, created

3 FROM all_objects;

28981 rows created.

SQL> INSERT INTO t4

2 SELECT owner, object_name, object_type, object_id, created

3 FROM all_objects;

28981 rows created.

Finally we can report the time and resource differences as follows.

SQL> exec runstats_pkg.rs_stop(1000);

Run1 ran in 233 hsecs

Run2 ran in 663 hsecs

Run1 ran in 35.14% of the time


Name                                  Run1        Run2        Diff
STAT..bytes received via SQL*N       1,010       2,453       1,443
STAT..dirty buffers inspected            0       1,536       1,536
STAT..free buffer inspected              0       1,537       1,537
STAT..hot buffers moved to hea           0       2,194       2,194
LATCH.checkpoint queue latch            73       4,047       3,974
STAT..index fetch by key             1,496       5,952       4,456
STAT..rows fetched via callbac       1,488       5,952       4,464
LATCH.cache buffers lru chain           89       4,731       4,642
STAT..consistent gets - examin       3,899      12,816       8,917
LATCH.simulator hash latch           6,396      25,049      18,653
STAT..redo size                  7,444,524   7,421,848     -22,676
STAT..index scans kdiixs1           44,658     178,600     133,942
STAT..table fetch by rowid          47,796     191,184     143,388
STAT..buffer is not pinned cou      70,673     282,660     211,987
STAT..buffer is pinned count        71,067     284,268     213,201
STAT..no work - consistent rea      98,650     394,552     295,902
STAT..session logical reads        107,016     411,832     304,816
STAT..consistent gets              103,457     408,276     304,819
LATCH.cache buffers chains         220,008     826,083     606,075

Run1 latches total versus run2 -- difference and pct

Run1        Run2        Diff        Pct
234,021     868,611     634,590     26.94%

PL/SQL procedure successfully completed.

We can see that the multi-table insert is considerably quicker in our example. This is because the cost of generating the
source dataset is borne only once with this solution. The overall level of resources used by the inserts is very similar (i.e.
we write the same volume of data regardless of the approach).

multi-table insert restrictions


There are several restrictions with multi-table inserts. The online documentation lists the following:

• we cannot have views or materialized views as targets;


• we cannot use remote tables as targets;
• we cannot load more than 999 columns (all INTO clauses combined);
• we cannot parallel insert in RAC environments;
• we cannot parallel insert into an IOT or a table with a bitmap index;
• we cannot use plan stability (outlines) for multi-table insert statements;
• we cannot use TABLE collection expressions; and
• we cannot use a sequence in the source query.

Note in addition to these the previous section on referential constraints and target table ordering.

external tables in oracle 9i

External tables enable us to read flat-files (stored on the O/S) using SQL. They were introduced in Oracle 9i as an
alternative to SQL*Loader. External tables are essentially stored SQL*Loader control files, but because they are defined as
tables, we can access our flat-file data using all available read-only SQL and PL/SQL operations. We can also read flat-files
in parallel and join files to other files or tables, views and so on.

This article is a short introduction to external tables and how we can benefit from them.

creating an external table

We will begin by creating a simple external table that mimics the EMP table. Before we can create the table, however, we
require an Oracle directory. A directory (also new in 9i) is an Oracle object that "points" to a filesystem location where our
source flat-files are stored (as an aside, note that directory objects can also be used to replace utl_file_dir in UTL_FILE
read-write operations). In the following example, we will create a directory named XT_DIR.

SQL> CREATE DIRECTORY xt_dir AS 'd:\oracle\dir';

Directory created.

Note that to create a directory, we require the CREATE ANY DIRECTORY system privilege. Directories are system-wide, so
if we cannot get the privilege, a DBA can create the directory and grant READ or WRITE to our users as required (READ for
external table users and READ/WRITE as appropriate for UTL_FILE users).
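
For example (SCOTT is simply a placeholder grantee here), the grants might look like the following.

GRANT READ ON DIRECTORY xt_dir TO scott;
GRANT WRITE ON DIRECTORY xt_dir TO scott;  -- WRITE is only needed for UTL_FILE read-write users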

We will reference this directory in our external table definition as follows. Some explanation of the new syntax follows the
table creation.

SQL> CREATE TABLE emp_xt

2 ( empno NUMBER(4)

3 , ename VARCHAR2(10)
4 , job VARCHAR2(9)

5 , mgr NUMBER(4)

6 , hiredate DATE

7 , sal NUMBER(7,2)

8 , comm NUMBER(7,2)

9 , deptno NUMBER(2)

10 )

11 ORGANIZATION EXTERNAL

12 (

13 TYPE ORACLE_LOADER

14 DEFAULT DIRECTORY xt_dir

15 ACCESS PARAMETERS

16 (

17 RECORDS DELIMITED by NEWLINE

18 BADFILE 'emp_xt.bad'

19 LOGFILE 'emp_xt.log'

20 NODISCARDFILE

21 FIELDS TERMINATED BY ','

22 ( empno

23 , ename

24 , job

25 , mgr

26 , hiredate CHAR(20) DATE_FORMAT DATE MASK "DD/MM/YYYY"

27 , sal

28 , comm

29 , deptno

30 )

31 )

32 LOCATION ('emp.dat')

33 )

34 REJECT LIMIT UNLIMITED;


Table created.

We have now created an external table and we can see a wide range of new and extended syntax. In particular, we can see
something that looks similar to (but not quite the same as) a SQL*Loader control file. Some points to note are as follows.

• Line 11: the ORGANIZATION EXTERNAL clause tells Oracle that we are creating an external table;
• Line 13: the ORACLE_LOADER driver is a new type that "powers" external tables;
• Line 14: we can set a DEFAULT DIRECTORY once for the entire table if we wish. This states that, unless we tell
Oracle otherwise, all files created or read as part of this table will reside in this named directory. In most cases, we
will wish to write log/bad/discardfiles to a logging directory and read our incoming data files from a data directory.
For simplicity, we have used a single XT_DIR in our examples for all files;
• Lines 15-31: the ACCESS PARAMETERS clause contains the SQL*Loader-style reference to enable Oracle to
parse the flat-file into rows and columns. At this time, external tables do not offer the extensive range of parse
options that SQL*Loader provides, yet still cater for most loading requirements. Note that if we have made any
syntactical errors in our ACCESS PARAMETERS, we can still create the external table. Access parameters
themselves are not parsed until we issue a SELECT against the external table;
• Lines 22-30: in this example, we have listed each of the fields in the file and table. Like SQL*Loader, the
ORACLE_LOADER external table driver assumes that all incoming fields are CHAR(255) unless we state
otherwise. In the EMP_XT table, we have a DATE column that needs to be converted as on line 26, hence the
need to list all fields. Otherwise, we can simply use the default parsing specified in the FIELDS clause, which in our
simple example only lists the field delimiter;
• Line 32: the LOCATION clause is where we specify the input file(s). Note that we don't have a file named
"emp.dat" at the time of table creation; this is not necessary (we shall discuss this later); and
• Line 34: similar to the ERRORS= clause of SQL*Loader, we can specify a REJECT LIMIT for an external table.
This is the number of bad records we will tolerate before the load is failed.

There are numerous additions to the example table we created above. It is not possible to describe all of the various
permutations (they are in the online documentation) but other options include: altering global field-parsing properties;
altering field-specific properties (as we saw with the DATE formatting above); LOAD WHEN clauses/DISCARD files;
specifying characterset/endianness; specifying delimiters and changing field trimming behaviour.
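
As a flavour of these options, the following ACCESS PARAMETERS fragment is a hedged sketch (not a complete syntax reference):
it filters out department 40 with a LOAD WHEN clause, writes the filtered records to a discardfile and trims whitespace from
both ends of each field.

ACCESS PARAMETERS
(
   RECORDS DELIMITED BY NEWLINE
   LOAD WHEN (deptno != '40')
   BADFILE     'emp_xt.bad'
   LOGFILE     'emp_xt.log'
   DISCARDFILE 'emp_xt.dsc'
   FIELDS TERMINATED BY ',' LRTRIM
)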

External tables are read-only. We cannot create indexes on them, nor can we issue DML against them (although they can
be sourced in DML statements, they cannot be the target).

using external tables

Now we have created our EMP_XT external table, we can use it as follows. Remember at this stage we don't have the
emp.dat file that the table expects (as per the LOCATION clause). We can generate a simple csv-file from the existing EMP
table as follows (using the oracle-developer.net DATA_DUMP utility).

SQL> BEGIN

2 data_dump( query_in => 'SELECT * FROM emp',

3 file_in => 'emp.dat',

4 directory_in => 'XT_DIR',


5 nls_date_fmt_in => 'DD/MM/YYYY' );

6 END;

7 /

PL/SQL procedure successfully completed.

SQL> host dir d:\oracle\dir\emp.dat

Volume in drive D is USER

Volume Serial Number is 7476-8930

Directory of d:\oracle\dir

04/08/2002 18:57 633 emp.dat

1 File(s) 633 bytes

0 Dir(s) 26,696,744,960 bytes free

Now we are ready to select from our external table.

SQL> SELECT * FROM emp_xt;

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

---------- ---------- --------- ---------- ---------- ---------- ---------- ----------

7369 SMITH CLERK 7902 17/12/1980 800 20

7499 ALLEN SALESMAN 7698 20/02/1981 1600 300 30

7521 WARD SALESMAN 7698 22/02/1981 1250 500 30

7566 JONES MANAGER 7839 02/04/1981 2975 20

7654 MARTIN SALESMAN 7698 28/09/1981 1250 1400 30

7698 BLAKE MANAGER 7839 01/05/1981 2850 30

7782 CLARK MANAGER 7839 09/06/1981 2450 10

7788 SCOTT ANALYST 7566 09/12/1982 3000 20

7839 KING PRESIDENT 17/11/1981 5000 10

7844 TURNER SALESMAN 7698 08/09/1981 1500 0 30


7876 ADAMS CLERK 7788 12/01/1983 1100 20

7900 JAMES CLERK 7698 03/12/1981 950 30

7902 FORD ANALYST 7566 03/12/1981 3000 20

7934 MILLER CLERK 7782 23/01/1982 1300 40

14 rows selected.

It is clear from the above that if we can query a flat-file via an external table, then we have fundamentally changed the way
in which we can load external data into our database! In fact, we may not even need to load the file at all as we can query it
directly (why bother to stage data that will be used once and replaced every day?).
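
For example, assuming the standard DEPT demo table is also present (a sketch only), the flat-file can be joined and
aggregated directly without ever being staged.

SELECT d.dname
,      COUNT(*)   AS headcount
,      SUM(e.sal) AS total_sal
FROM   emp_xt e
,      dept   d
WHERE  d.deptno = e.deptno
GROUP  BY d.dname;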

external table metadata

We can see information about our EMP_XT table in several dictionary views as follows (note that XXX is a placeholder for
USER, ALL and DBA).

• XXX_TABLES;
• XXX_ALL_TABLES;
• XXX_EXTERNAL_TABLES; and
• XXX_EXTERNAL_LOCATIONS.

The last two are specific to external tables only. For general information on our external table we can query
USER_EXTERNAL_TABLES, which is structured as follows.

SQL> desc USER_EXTERNAL_TABLES

Name Null? Type

----------------------------------- -------- -----------------

TABLE_NAME NOT NULL VARCHAR2(30)

TYPE_OWNER CHAR(3)

TYPE_NAME NOT NULL VARCHAR2(30)

DEFAULT_DIRECTORY_OWNER CHAR(3)

DEFAULT_DIRECTORY_NAME NOT NULL VARCHAR2(30)

REJECT_LIMIT VARCHAR2(40)

ACCESS_TYPE VARCHAR2(7)

ACCESS_PARAMETERS VARCHAR2(4000)

As its name suggests, the XXX_EXTERNAL_LOCATIONS views list the file(s) that an external table is currently "pointed to".
We can read multiple files via a single external table and these can be in different directories. Currently, our EMP_XT
location is as follows.
SQL> SELECT *

2 FROM user_external_locations

3 WHERE table_name = 'EMP_XT';

TABLE_NAME LOCATION DIRECTORY_OWNER DIRECTORY_NAME

------------ ------------ ----------------- ----------------

EMP_XT emp.dat SYS XT_DIR

1 row selected.

modifying location

Remember that the emp.dat file did not exist at the time we created the EMP_XT table. If we try to query an external table
that has an incorrect location clause, we receive the following error.

SQL> host del d:\oracle\dir\emp.dat

SQL> host dir d:\oracle\dir\emp.dat

Volume in drive D is USER

Volume Serial Number is 7476-8930

Directory of d:\oracle\dir

File Not Found

SQL> SELECT * FROM emp_xt;

SELECT * FROM emp_xt

ERROR at line 1:

ORA-29913: error in executing ODCIEXTTABLEOPEN callout

ORA-29400: data cartridge error

KUP-04040: file emp.dat in XT_DIR not found

ORA-06512: at "SYS.ORACLE_LOADER", line 14


ORA-06512: at line 1

Our EMP_XT external table currently has a single incoming flat-file (emp.dat) and this doesn't exist. In operational systems,
we are more likely to receive new files every day and these are often uniquely named to distinguish one day's delivery from
another (e.g. with a business date of some format in the file name). The location clause of an external table can be modified
to cater for this without invalidating any dependencies (such as views or packages).

In the following example, we will create an emp_20020804.dat file and modify the EMP_XT table to reference this new file.
We will complete the example by selecting from it.

SQL> BEGIN

2 data_dump( query_in => 'SELECT * FROM emp',

3 file_in => 'emp_20020804.dat',

4 directory_in => 'XT_DIR',

5 nls_date_fmt_in => 'DD/MM/YYYY' );

6 END;

7 /

PL/SQL procedure successfully completed.

SQL> ALTER TABLE emp_xt LOCATION ('emp_20020804.dat');

Table altered.

SQL> SELECT * FROM emp_xt;

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

---------- ---------- --------- ---------- ---------- ---------- ---------- ----------

7369 SMITH CLERK 7902 17/12/1980 800 20

7499 ALLEN SALESMAN 7698 20/02/1981 1600 300 30

7521 WARD SALESMAN 7698 22/02/1981 1250 500 30

7566 JONES MANAGER 7839 02/04/1981 2975 20

7654 MARTIN SALESMAN 7698 28/09/1981 1250 1400 30

7698 BLAKE MANAGER 7839 01/05/1981 2850 30


7782 CLARK MANAGER 7839 09/06/1981 2450 10

7788 SCOTT ANALYST 7566 09/12/1982 3000 20

7839 KING PRESIDENT 17/11/1981 5000 10

7844 TURNER SALESMAN 7698 08/09/1981 1500 0 30

7876 ADAMS CLERK 7788 12/01/1983 1100 20

7900 JAMES CLERK 7698 03/12/1981 950 30

7902 FORD ANALYST 7566 03/12/1981 3000 20

7934 MILLER CLERK 7782 23/01/1982 1300 40

14 rows selected.

investigating errors

External tables can optionally generate various log files in the same manner as SQL*Loader. In our EMP_XT table,
remember that we opted to create a logfile and badfile but didn't need a discardfile (as we do not have a LOAD WHEN
clause). For developers who are unfamiliar with these three SQL*Loader-style files, their short descriptions are:

• logfile: contains information on how the input files were parsed, the positions and error messages for any rejected
records and some other general information on the load such as the number of rows successfully read;
• badfile: contains the rejected records (i.e. the records that couldn't be loaded for the reasons given in the logfile);
and
• discardfile: contains the records that failed the LOAD WHEN clause (this clause is a simple filter to prevent records
with certain data characteristics from being loaded).

To investigate errors with an external table read or load, therefore, we have the same information available to us as we did
with SQL*Loader. It is likely that, when putting together access parameters for the first few times, we will make mistakes.
Oracle will not parse the parameters when we create the table; rather they will be invoked when we try to read from the
external table (i.e. SELECT from it). Any syntactical errors will show up as a KUP-% message and will need to be
investigated in line with the online documentation (link provided at the end of this article). Errors with the data, however, can
be investigated by reference to the output files.

Note that logfiles are appended to on each select from an external table, so in a regular batch system we might wish to uniquely
name each output file for support and diagnostic purposes. Oracle provides two switches to append to the file names in the
LOGFILE, BADFILE and DISCARDFILE clauses. These are %p (process ID) and %a (agent number for parallel query) and
are specified in the access parameters (e.g. LOGFILE 'emp_xt_%p.log'). Assuming each select from the external table is
performed in a new session, each output file is generated with a new process ID.
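
For example (illustrative only), we could alter the table so that each session writes its own logfile and badfile. Note that
the whole ACCESS PARAMETERS clause has to be restated; we cannot change just the file names, which leads to the point
developed below.

ALTER TABLE emp_xt ACCESS PARAMETERS
(
   RECORDS DELIMITED BY NEWLINE
   BADFILE 'emp_xt_%p.bad'
   LOGFILE 'emp_xt_%p.log'
   NODISCARDFILE
   FIELDS TERMINATED BY ','
   ( empno
   , ename
   , job
   , mgr
   , hiredate CHAR(20) DATE_FORMAT DATE MASK "DD/MM/YYYY"
   , sal
   , comm
   , deptno
   )
);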

It is unfortunate that we cannot use ALTER TABLE directly to modify the output file names as we are more likely to want to
append a date stamp to them (we can't do this because the file names are part of the ACCESS PARAMETERS clause). We
can, however, take advantage of the fact that we can alter the access parameters to build a utility based somewhat loosely
on the following example. This is a very rough demonstration of the principles we can adopt to make a log/bad/discardfile
naming utility (obviously a "real" utility would be far more flexible and parameterised). The following anonymous block finds
the existing logfile name and replaces it with "today's" logfile before using the external table.

SQL> DECLARE

3 v_access VARCHAR2(32767);

4 v_replace VARCHAR2(128);

5 v_oldfile VARCHAR2(128);

6 v_newfile VARCHAR2(128) := 'emp_20020804.log';

8 BEGIN

10 SELECT access_parameters INTO v_access

11 FROM user_external_tables

12 WHERE table_name = 'EMP_XT';

13

14 IF INSTR(v_access,'NOLOGFILE') > 0 THEN

15 v_access := REPLACE(

16 v_access, 'NOLOGFILE',

17 'LOGFILE ''' || v_newfile || ''''

18 );

19 ELSE

20 v_oldfile := SUBSTR(

21 v_access,

22 INSTR(v_access,'LOGFILE')+8,

23 (INSTR(v_access,'.log')+4)-(INSTR(v_access,'LOGFILE')+7)

24 );

25 v_replace := REPLACE(REPLACE(v_oldfile, '"', ''''), '''');

26 v_access := REPLACE(v_access, v_replace, v_newfile);

27 END IF;

28

29 EXECUTE IMMEDIATE
30 'ALTER TABLE emp_xt ACCESS PARAMETERS (' || v_access || ')';

31

32 END;

33 /

PL/SQL procedure successfully completed.

SQL> SELECT access_parameters

2 FROM user_external_tables

3 WHERE table_name = 'EMP_XT';

ACCESS_PARAMETERS

-----------------------------------------------------------------------

RECORDS DELIMITED by NEWLINE

BADFILE 'emp_xt.bad'

LOGFILE 'emp_20020804.log'

NODISCARDFILE

FIELDS TERMINATED BY ','

( empno

, ename

, job

, mgr

, hiredate CHAR(20) DATE_FORMAT DATE MASK "DD/MM/YYYY"

, sal

, comm

, deptno

1 row selected.

SQL> SELECT COUNT(*) FROM emp_xt;


COUNT(*)

----------

14

1 row selected.

SQL> host dir d:\oracle\dir\emp*

Volume in drive D is USER

Volume Serial Number is 7476-8930

Directory of d:\oracle\dir

04/08/2002 19:09 633 emp_20020804.dat

04/08/2002 21:54 2,558 emp_20020804.log

04/08/2002 19:04 11,850 emp_xt.log

3 File(s) 15,041 bytes

0 Dir(s) 26,696,704,000 bytes free

advantages over sql*loader

We have seen some examples of external table syntax but have not yet explored why we might use them over SQL*Loader.
It is the case that SQL*Loader can parse and load almost any flat-file we wish to throw at it. External tables, on the other
hand, cater for the more common processing requirements. Despite this, their advantages over SQL*Loader are numerous,
some of which are as follows:

• ease of use: external tables can be selected, sorted, filtered, joined, intersected, minused, unioned and so on
using SQL, the language most familiar to Oracle developers. Anything we can do in a SELECT
statement with a "normal" table can be done with an external table. This makes working with external flat-files very
simple;
• performance (1): reads/loads involving external tables can be parallelised. When combined with direct path (in the
case of INSERTs), this dramatically outperforms SQL*Loader which has to load in serial mode. Parallelism can be
set at the external table level or more selectively in hints;
• performance (2): as seen above, loading a table from an external table is faster than SQL*Loader due to parallel
query and DML. In addition to this, ETL processes that read directly from external tables rather than pre-loaded
staging tables are faster because they do not incur the SQL*Loader step (i.e. data is only read once). For example,
if an ETL process takes 10 minutes in SQL*Loader and 10 minutes in SQL or PL/SQL loading, using external
tables directly in the latter process can eradicate up to 50% of the ETL time;
• disk space: external tables do not require any database space; only the space consumed by the flat-files on the
filesystem. SQL*Loader requires two copies of each file (one inside the database in a staging table), so external
tables are "cheaper" on disk;
• error-trapping: SQL*Loader returns different codes based on its outcome. For batch systems, this can be tricky
because the error code is sometimes ambiguous. For example, exit code 2 means "a bad file has been created".
Of course, this may or may not be a cause for concern. If we allow 50 bad records (errors=50), then the fact that a
bad file contains 1-49 bad records should not signal a batch failure. We therefore need to write some clever code
to interrogate the badfile or logfile to determine whether this is a "bad code 2" or "acceptable code 2". With external
tables, it is much simpler: until we reach the REJECT LIMIT, the SQL statement continues or completes
successfully. If we reach this limit, the statement fails;
• debugging and support: external tables are equally useful as debugging and support aids. For example, we can
create further external tables over logfiles and badfiles to investigate errors easily with SQL. DBAs can also create
external tables over critical files such as the alert log (see the sketch below).
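
As a rough sketch of the last point (the directory path and file name are purely illustrative, and note that with the default
comma delimiter each line is only read up to its first comma, which is usually enough for spotting ORA- errors), the alert
log can be exposed as a one-column external table and searched with ordinary SQL.

CREATE DIRECTORY bdump_dir AS 'd:\oracle\admin\db9i\bdump';

CREATE TABLE alert_log_xt
( text_line VARCHAR2(4000) )
ORGANIZATION EXTERNAL
(
   TYPE ORACLE_LOADER
   DEFAULT DIRECTORY bdump_dir
   ACCESS PARAMETERS
   (
      RECORDS DELIMITED BY NEWLINE
      NOBADFILE
      NOLOGFILE
      NODISCARDFILE
      FIELDS
      REJECT ROWS WITH ALL NULL FIELDS
   )
   LOCATION ('alert_db9i.log')
)
REJECT LIMIT UNLIMITED;

SELECT text_line FROM alert_log_xt WHERE text_line LIKE '%ORA-%';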

We can see that the advantages of external tables are compelling. In our batch systems, we can use them in one of two
ways and still achieve their benefits. First, and for maximum benefit, we can completely remove the need for staging tables
altogether and use the tables directly in our ETL code. Second, and for medium benefit, we can use them as a straight
replacement for SQL*Loader and therefore load our staging tables in parallel direct-path.

There are, however, a few issues to be aware of that might limit the scope of how we use these tables. For example,
external tables can only have one location set at a time (i.e. different sessions cannot share a table but set different
locations). This means that their use for loading is serialised for the length of time a specific location is required and in use
(and probably should be protected as such). If multiple sessions need to share the same table but load different files at the
same time, then either multiple tables must be created or some form of locking will be required. For the latter scenario, the
length of time a table is locked should be reduced to a minimum to maximise concurrency. For this reason, multi-user
external tables will probably not figure in long-running batch processes.
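
If a shared table is unavoidable, one possible pattern (a rough sketch only; the lock name and file name below are made up)
is to serialise the location change and the subsequent read with a user lock via DBMS_LOCK.

DECLARE
   v_lockhandle VARCHAR2(128);
   v_result     INTEGER;
BEGIN
   -- allocate (or look up) a named user lock and take it in exclusive mode
   DBMS_LOCK.ALLOCATE_UNIQUE('EMP_XT_LOCATION_LOCK', v_lockhandle);
   v_result := DBMS_LOCK.REQUEST(v_lockhandle, DBMS_LOCK.X_MODE);
   IF v_result = 0 THEN
      -- point the table at this session's file, use it, then release the lock
      EXECUTE IMMEDIATE 'ALTER TABLE emp_xt LOCATION (''emp_20020805.dat'')';
      -- ... read from or load via EMP_XT here ...
      v_result := DBMS_LOCK.RELEASE(v_lockhandle);
   END IF;
END;
/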

In addition to this, external tables cannot be indexed. For most staging tables, this won't be an issue as we would prefer
hash joins for ETL batch queries, but in some cases indexes might be required. For these, it would probably be sensible to
use external tables to load the physical staging tables rather than use them directly in the ETL queries.

external tables and the cbo

Because external tables are likely to be used in multi-table queries (possibly involving other external tables in addition to
base tables, lookup tables etc), it is critical that the optimizer knows something of their contents. Fortunately, we can
compute statistics using DBMS_STATS (ANALYZE will not work). Note that the use of the word "compute" in the previous
sentence was deliberate as we must gather statistics on the entire flat-file(s) and not a sample of records.

The following example shows the "before and after" effects of gathering statistics on our EMP_XT external table. Note how
we instruct a full computation in DBMS_STATS (using the estimate_percent parameter).

SQL> set autotrace traceonly explain

SQL> SELECT * FROM emp_xt;


Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 EXTERNAL TABLE ACCESS (FULL) OF 'EMP_XT'

SQL> BEGIN

2 DBMS_STATS.GATHER_TABLE_STATS(user,'EMP_XT',estimate_percent=>NULL);

3 END;

4 /

PL/SQL procedure successfully completed.

SQL> SELECT * FROM emp_xt;

Execution Plan

----------------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=14 Bytes=518)

1 0 EXTERNAL TABLE ACCESS (FULL) OF 'EMP_XT' (Cost=2 Card=14 Bytes=518)

SQL> SELECT num_rows, blocks, last_analyzed

2 FROM user_tables

3 WHERE table_name = 'EMP_XT';

NUM_ROWS BLOCKS LAST_ANALYZED

---------- ---------- -------------

14 1 04/08/2002

1 row selected.
Most flat-files will remain reasonably static in size or perhaps grow slowly over time. It is probably sufficient in most cases to
gather statistics on a representative file (or set of files) for each external table once. These can then be locked until such
time that a re-gather is deemed necessary.

generating from sql*loader

Earlier in this article, a number of advantages of external tables over SQL*Loader were listed. It was also noted that in some
cases we might use external tables as a replacement for SQL*Loader but continue to use staging tables. A new feature of
SQL*Loader helps us to develop this quickly; this is the EXTERNAL_TABLE option. This option generates an external table
"load" script from an existing SQL*Loader control file. Once we have this, the script can be cleaned and modified as
required. For existing loads, this dramatically reduces the development time of external tables and removes much of the
time spent learning the new syntax through the inevitable "trial and error" phases with the access parameters.

In the following example, we'll create a small SQL*Loader control file to load a staging table named EMP_STG. We will use
the new option EXTERNAL_TABLE=GENERATE_ONLY in the control file and run it to generate a SQL script. Note that
invoking SQL*Loader for this file will not load any data. First, we can see the control file.

options (external_table=generate_only)

load data

infile 'd:\oracle\dir\emp.dat'

badfile 'd:\oracle\dir\emp.bad'

truncate

into table emp_stg

fields terminated by ','

trailing nullcols

( empno

, ename

, job

, mgr

, hiredate date "dd/mm/yyyy"

, sal

, comm

, deptno

)
Note the options clause. The external_table=generate_only clause makes SQL*Loader run "silently" and generate a logfile
only. SQL*Loader is invoked as normal. The resulting logfile contains the following SQL statements based on the contents of
the control file. These statements support an external table load to replace our SQL*Loader job.

CREATE DIRECTORY statements needed for files

------------------------------------------------------------------------

CREATE DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000 AS 'd:\oracle\dir\'

CREATE TABLE statement for external table:

------------------------------------------------------------------------

CREATE TABLE "SYS_SQLLDR_X_EXT_EMP_STG"

EMPNO NUMBER(4),

ENAME VARCHAR2(10),

JOB VARCHAR2(9),

MGR NUMBER(4),

HIREDATE DATE,

SAL NUMBER(7,2),

COMM NUMBER(7,2),

DEPTNO NUMBER(2)

ORGANIZATION external

TYPE oracle_loader

DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000

ACCESS PARAMETERS

RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252

BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'emp.bad'

LOGFILE 'emp.log_xt'

READSIZE 1048576
FIELDS TERMINATED BY "," LDRTRIM

MISSING FIELD VALUES ARE NULL

REJECT ROWS WITH ALL NULL FIELDS

EMPNO CHAR(255)

TERMINATED BY ",",

ENAME CHAR(255)

TERMINATED BY ",",

JOB CHAR(255)

TERMINATED BY ",",

MGR CHAR(255)

TERMINATED BY ",",

HIREDATE CHAR(255)

TERMINATED BY ","

DATE_FORMAT DATE MASK "dd/mm/yyyy",

SAL CHAR(255)

TERMINATED BY ",",

COMM CHAR(255)

TERMINATED BY ",",

DEPTNO CHAR(255)

TERMINATED BY ","

location

'emp.dat'

)REJECT LIMIT UNLIMITED

INSERT statements used to load internal tables:


------------------------------------------------------------------------

INSERT /*+ append */ INTO EMP_STG
(
  EMPNO,
  ENAME,
  JOB,
  MGR,
  HIREDATE,
  SAL,
  COMM,
  DEPTNO
)
SELECT
  EMPNO,
  ENAME,
  JOB,
  MGR,
  HIREDATE,
  SAL,
  COMM,
  DEPTNO
FROM "SYS_SQLLDR_X_EXT_EMP_STG"

statements to cleanup objects created by previous statements:

------------------------------------------------------------------------

DROP TABLE "SYS_SQLLDR_X_EXT_EMP_STG"

DROP DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000

Oracle has done the work for us and provided a script to create our EMP_XT external table. We will probably choose to
clean up certain elements of the generated code (such as the object names, for example), but the hard work of converting a
SQL*Loader load to an external table load is done.

more on performance

All external table reads are direct path and with direct path inserts and parallel DML, external tables will usually be quicker
than SQL*Loader (which is serial). We've also seen that ETL processes involving external tables can be faster because they
do not require the "lead-in" time of loading a staging table first. However, we have not yet looked at external tables
compared with "internal" tables (note that this how Oracle described such tables in the SQL*Loader logfile above). In the
following simple example, we will generate 1 million records in a heap table and a flat-file and compare the time taken to
scan this data.

First we will create a large table based on 1 million EMP records as follows.

SQL> CREATE TABLE million_emps

2 NOLOGGING

3 AS

4 SELECT e1.*

5 FROM emp e1

6 , emp, emp, emp, emp, emp

7 WHERE ROWNUM <= 1000000;

Table created.

We will write these records to a flat-file for our existing EMP_XT table to use.

SQL> BEGIN

2 data_dump( query_in => 'SELECT * FROM million_emps',

3 file_in => 'million_emps.dat',

4 directory_in => 'XT_DIR',

5 nls_date_fmt_in => 'DD/MM/YYYY' );

6 END;

7 /

PL/SQL procedure successfully completed.

SQL> ALTER TABLE emp_xt LOCATION ('million_emps.dat');

Table altered.
Using the wall-clock and autotrace, we will compare a simple fetch of the data from both the MILLION_EMPS and the
EMP_XT tables, starting with the "internal" table.

SQL> set timing on

SQL> set autotrace traceonly statistics

SQL> SELECT * FROM million_emps;

1000000 rows selected.

Elapsed: 00:00:18.05

Statistics

----------------------------------------------------------

0 recursive calls

0 db block gets

72337 consistent gets

5675 physical reads

0 redo size

13067451 bytes sent via SQL*Net to client

733825 bytes received via SQL*Net from client

66668 SQL*Net roundtrips to/from client

0 sorts (memory)

0 sorts (disk)

1000000 rows processed

Next we can test the EMP_XT external table.

SQL> SELECT * FROM emp_xt;

1000000 rows selected.

Elapsed: 00:00:22.08
Statistics

----------------------------------------------------------

192 recursive calls

0 db block gets

6282 consistent gets

0 physical reads

0 redo size

13067451 bytes sent via SQL*Net to client

733825 bytes received via SQL*Net from client

66668 SQL*Net roundtrips to/from client

6 sorts (memory)

0 sorts (disk)

1000000 rows processed

We can see that the external table is approximately 22% slower than the internal table on a single read. If we re-run the
fetches several times, similar timings are recorded. Note that autotrace does not show any physical reads in its statistics for
the external table (this is possibly a bug).

projection

Finally for this introduction, we will look at column projection and how it can affect the results of an external table read. In the
following example, we will modify the EMP_XT table to shrink the maximum size of the ENAME column. When we select
from the table, this modified column will be the source of some bad records (as the data will be too wide). We will then select
from the EMP_XT table but without the "bad column" and compare the record counts.

First we will modify the ENAME column and set the location of EMP_XT to use the original small flat-file.

SQL> ALTER TABLE emp_xt MODIFY ename VARCHAR2(5);

Table altered.

SQL> ALTER TABLE emp_xt LOCATION ('emp_20020804.dat');

Table altered.
We will now attempt to read all of the data from the flat-file as follows.

SQL> SELECT * FROM emp_xt;

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

---------- ----- --------- ---------- ---------- ---------- ---------- ----------

7369 SMITH CLERK 7902 17/12/1980 800 20

7499 ALLEN SALESMAN 7698 20/02/1981 1600 300 30

7521 WARD SALESMAN 7698 22/02/1981 1250 500 30

7566 JONES MANAGER 7839 02/04/1981 2975 20

7698 BLAKE MANAGER 7839 01/05/1981 2850 30

7782 CLARK MANAGER 7839 09/06/1981 2450 10

7788 SCOTT ANALYST 7566 09/12/1982 3000 20

7839 KING PRESIDENT 17/11/1981 5000 10

7876 ADAMS CLERK 7788 12/01/1983 1100 20

7900 JAMES CLERK 7698 03/12/1981 950 30

7902 FORD ANALYST 7566 03/12/1981 3000 20

11 rows selected.

Knowing the EMP table as well as we all do, we can see that 3 records are missing (the bad and log files verify this as being
related to ENAME data). What is interesting about external tables is that we have seemingly erratic results depending on
how we fetch from them. For example, if we run a COUNT(*) as follows, we see that we have 14 records in our file as
expected and not 11 as fetched in the previous example.

SQL> SELECT COUNT(*) FROM emp_xt;

COUNT(*)

----------

14

1 row selected.
These supposedly erratic results are not, in fact, erratic at all. This behaviour is related to column projection. Oracle will not
bother to project data unless it is actually required (it would be wasted effort). With EMP_XT in its current format, if we don't
ask Oracle to project the ENAME column, we won't encounter any bad data.

We must also consider parsing and the difference between a field and a column (we have both in an external table). In our
example, Oracle can still parse the bad data. This is because our access parameters clause defaults the ENAME field in the
file to CHAR(255). This is easily wide enough for all ENAME values in our flat-file. The ENAME column, however, is now
defined as VARCHAR2(5), so it is only when Oracle moves from parsing into reading and projecting that errors are
encountered. In other words, Oracle can parse every record of the ENAME field into a CHAR(255), but it can't squeeze
every record into a VARCHAR2(5) during projection.

To continue, as COUNT(*) is a special case, we should validate the projection behaviour with a specific column. We will use
COUNT(empno) as follows (EMPNO does not have any bad data).

SQL> SELECT COUNT(empno) FROM emp_xt;

COUNT(EMPNO)

------------

14

1 row selected.

Finally, if we project the ENAME column, we see the effects of the bad data once more.

SQL> SELECT COUNT(ename) FROM emp_xt;

COUNT(ENAME)

------------

11

1 row selected.

Incidentally, we haven't seen the effects of a violated reject limit as yet in this article. Seeing as we have manufactured some
bad data for the projection example, we can easily demonstrate this. In the final example, we will set a 0 reject limit for
EMP_XT and issue a SELECT against the table.

SQL> ALTER TABLE emp_xt REJECT LIMIT 0;

Table altered.
SQL> SELECT * FROM emp_xt;

SELECT * FROM emp_xt

ERROR at line 1:

ORA-29913: error in executing ODCIEXTTABLEFETCH callout

ORA-30653: reject limit reached

ORA-06512: at "SYS.ORACLE_LOADER", line 14

ORA-06512: at line 1

Rather than a KUP error message, this time we get a meaningful, precise Oracle error that tells us the issue with our
external table read.

user-defined aggregate functions in oracle 9i

In this article, we will take a brief look at user-defined aggregate functions, a new feature of Oracle 9i Release 1 (9.0). This
feature enables us to create our own aggregation rules and define them as a function for use in aggregate or analytic SQL
statements in the same way as built-ins such as MAX, SUM or AVG.

We will begin the article with a brief overview of these functions and then see a "real-life" example of an aggregate that I
wrote when SQL alone could not fulfil the business requirements.

overview of user-defined aggregate functions

User-defined aggregate functions are possible due to Oracle's Data Cartridge model which takes advantage of object types
and other extensibility features. According to the online documentation a data-cartridge is "the mechanism for extending the
capabilities of the Oracle server". What this actually means is that we can create a set of data rules inside one or more
object types and "plug them into" the server to use as indexing schemes, aggregate functions or even CBO extensions.
Oracle provides the development framework for us to do this and it is actually a lot simpler than it sounds (as we shall
see).

With the development framework for data cartridges mapped out for us by Oracle, user-defined aggregate functions are
actually very simple to create. We need two components to implement aggregate functions as follows.

• an object type specification and body; and


• a PL/SQL function.

There are actually other ways of creating user-defined aggregate functions (for example using Java or OCI directly), but we
will concentrate purely on "standard" object types and PL/SQL.
As stated, there is a development framework for data cartridges (think of this as a template) provided by Oracle. This
provides the structure of the object type and its methods (down to the detail of how we name them) and also defines how we
create our PL/SQL function. For example, our object type will contain the following:

• attribute(s) for holding state information. These can be of any existing built-in or user-defined datatype;
• a mandatory ODCIAggregateInitialize static method to reset the state attributes at the start of an aggregation;
• a mandatory ODCIAggregateIterate member method to apply each input value to the running aggregate value;
• a mandatory ODCIAggregateTerminate member method to return the final result of the aggregate; and
• an optional ODCIAggregateMerge member method, used to combine the results of more than one stream of
aggregation (for example, with parallel query) before returning a result.

The PL/SQL function itself is very simple. It simply declares itself as an aggregate function and defines the object type that
implements its rules.

The best way to visualise a user-defined aggregate function is through an example. In the following section we will see an
aggregate function used to support the rules of a data-summary routine. This function is named "BREAKSUM".

implementing breaksum

The BREAKSUM function responds to a particular business problem that cannot be solved with SQL and analytic functions.
The requirement is simply to keep a running total of account transactions but to reset the value to 0 whenever the addition of
a negative transaction value makes the overall running total negative.

sample data

We will start by creating a sample table and data.

SQL> CREATE TABLE accounts ( account NUMBER, cycle NUMBER, val NUMBER(5,2) );

Table created.

SQL> INSERT ALL

2 INTO accounts VALUES ( 999, 1, 0.11 )

3 INTO accounts VALUES ( 999, 2, 0.18 )

4 INTO accounts VALUES ( 999, 3, 0.27 )

5 INTO accounts VALUES ( 999, 4, 0.35 )

6 INTO accounts VALUES ( 999, 5, 0.52 )

7 INTO accounts VALUES ( 999, 6, 0.61 )

8 INTO accounts VALUES ( 999, 7, -1.51 )

9 INTO accounts VALUES ( 999, 8, 0.63 )


10 INTO accounts VALUES ( 999, 9, 92.00 )

11 INTO accounts VALUES ( 999, 10, 88.00 )

12 INTO accounts VALUES ( 999, 11, -400 )

13 INTO accounts VALUES ( 999, 12, 0.8 )

14 SELECT NULL

15 FROM dual;

12 rows created.

To help us to visualise the problem that the aggregate function will solve, we can query the data and keep a running total of
the VAL column as follows.

SQL> SELECT account

2 , cycle

3 , val

4 , SUM(val) OVER (ORDER BY cycle) AS running_total

5 FROM accounts;

ACCOUNT CYCLE VAL RUNNING_TOTAL

---------- ---------- ---------- -------------

999 1 .11 .11

999 2 .18 .29

999 3 .27 .56

999 4 .35 .91

999 5 .52 1.43

999 6 .61 2.04

999 7 -1.51 .53

999 8 .63 1.16

999 9 92 93.16

999 10 88 181.16

999 11 -400 -218.84

999 12 .8 -218.04
12 rows selected.

Note the running total result at cycle 11. Our requirement specifies that the running total must reset to 0 at this point. It
seems quite a simple requirement, but is actually very difficult to implement in a SQL statement that will perform well at high
volume.

breaksum type specification

As stated earlier, to create an aggregate function we need an object type and a PL/SQL function. We can begin with the
TYP_OBJ_BREAKSUM object type specification as follows. Note that we will code all four available ODCIAggregate
methods in our supporting type (we will wish to enable the function for parallel query later on; hence we include the optional
ODCIAggregateMerge function). We require only one attribute to maintain the running total for our ACCOUNTS.VAL
column.

SQL> CREATE TYPE typ_obj_breaksum AS OBJECT

2 (

3 sum NUMBER,

5 STATIC FUNCTION ODCIAggregateInitialize (

6 sctx IN OUT typ_obj_breaksum

7 ) RETURN NUMBER,

9 MEMBER FUNCTION ODCIAggregateIterate (

10 self IN OUT typ_obj_breaksum,

11 value IN NUMBER

12 ) RETURN NUMBER,

13

14 MEMBER FUNCTION ODCIAggregateTerminate (

15 self IN typ_obj_breaksum,

16 retval OUT NUMBER,

17 flags IN NUMBER

18 ) RETURN NUMBER,

19

20 MEMBER FUNCTION ODCIAggregateMerge (


21 self IN OUT typ_obj_breaksum,

22 ctx2 IN typ_obj_breaksum

23 ) RETURN NUMBER

24 );

25 /

Type created.

The formats for the ODCIAggregate methods are fixed by the data cartridge framework that Oracle provides. We do not
need to call any of these methods directly in any code we write, but we do need to set the datatypes to allow our object type
to be passed between methods. Each method returns a success constant of type NUMBER.

We do, however, have to determine the attributes to include in the object type. Remember that these are state attributes
used to store our intermediate program data for the duration of the aggregate call. The BREAKSUM function is simply a
variation on a SUM aggregate so only requires a single numeric state attribute to hold a running total (this is the attribute
named "sum" in our type specification).

breaksum type body

The type body is where we implement our aggregation rules for each of the methods. To make the implementation of
BREAKSUM easier to follow, we will step through the code one function at a time, beginning with the initialisation method as
follows.

SQL> CREATE TYPE BODY typ_obj_breaksum IS

3 STATIC FUNCTION ODCIAggregateInitialize (

4 sctx IN OUT typ_obj_breaksum

5 ) RETURN NUMBER IS

6 BEGIN

7 sctx := typ_obj_breaksum(0);

8 RETURN ODCIConst.Success;

9 END;

10

As we are summing, initialisation is very simple; we start at 0. We do this by initialising an instance of
TYP_OBJ_BREAKSUM as above. Note that the function returns a constant to indicate success to the calling context.

The ODCIAggregateIterate function that follows is where we code the majority of our logic. This is the method that instructs
Oracle on how to aggregate the individual elements that are passed to it. In the case of BREAKSUM, each element is a
number that must be added to keep a running sum. The rules for this sum dictate that if the running sum drops below 0, then
it must be reset to 0. We can see this logic below.

11 MEMBER FUNCTION ODCIAggregateIterate (

12 self IN OUT typ_obj_breaksum,

13 value IN NUMBER

14 ) RETURN NUMBER IS

15 BEGIN

16 self.sum := CASE

17 WHEN value >= 0

18 OR (value < 0 AND self.sum + value > 0)

19 THEN self.sum + value

20 ELSE 0

21 END;

22 RETURN ODCIConst.Success;

23 END;

24

The iterate method quite simply states that if the incoming value is positive or if the running sum added to a negative value
remains positive, then continue with the addition. Otherwise, reset the running total to 0. This will be invoked when the
incoming value is a negative number with a greater absolute value than the running total so far.

Next we can see the terminate method. This method is used to piece together the state attributes (in our case, just the "sum"
attribute) and return a final value for the aggregation. Our logic is simple as we just need to return the sum value itself.

25 MEMBER FUNCTION ODCIAggregateTerminate (

26 self IN typ_obj_breaksum,

27 retval OUT NUMBER,

28 flags IN NUMBER

29 ) RETURN NUMBER IS

30 BEGIN

31 retval := self.sum;

32 RETURN ODCIConst.Success;

33 END;

34
This completes the mandatory methods for a user-defined aggregate function. We can optionally include an
ODCIAggregateMerge method if we are likely to have more than one stream of aggregation (such as in parallel query). As
its name suggests, the merge method is where we aggregate the results of two separate streams of processing. The merge
logic for BREAKSUM is as follows.

35 MEMBER FUNCTION ODCIAggregateMerge (

36 self IN OUT typ_obj_breaksum,

37 ctx2 IN typ_obj_breaksum

38 ) RETURN NUMBER IS

39 BEGIN

40 self.sum := CASE

41 WHEN self.sum + ctx2.sum > 0

42 THEN self.sum + ctx2.sum

43 ELSE 0

44 END;

45 RETURN ODCIConst.Success;

46 END;

47

48 END;

49 /

Type body created.

For some aggregations, the merge method can be quite complex, but for a simple addition like BREAKSUM, we simply need
to repeat our iteration logic as above. The main difference is that the two values to be aggregated are both running totals,
rather than a running total and iteration value as with the ODCIAggregateIterate method.

This completes our implementation type for BREAKSUM. We can see that it is very simple to code, given that Oracle prescribes the format and development framework and we only need to supply the aggregation logic itself.

breaksum aggregate function

Now we have our aggregation rules implemented via the TYP_OBJ_BREAKSUM type, we must create the SQL interface to
this type. We do this with the following function declaration.

SQL> CREATE FUNCTION breaksum (input NUMBER) RETURN NUMBER

2 PARALLEL_ENABLE
3 AGGREGATE USING typ_obj_breaksum;

4 /

Function created.

We can immediately see that there is no logic in our aggregate function. We simply declare it to be implemented by the
TYP_OBJ_BREAKSUM type. Note in particular the following:

• aggregate functions can only accept one parameter. In the case of BREAKSUM, this is a NUMBER (as we are
adding account transactions), but this can be an object type if more than a single input value from each record
needs to contribute to the aggregate result (see the sketch after this list);
• we have declared the function to be parallel-query enabled (hence the need for the merge method in the
TYP_OBJ_BREAKSUM object type); and
• we declare the function an aggregate using the AGGREGATE keyword and supply the name of the implementing
object type with the USING [type_name] syntax.
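
The following is a minimal sketch of how a multi-input aggregate might be declared, assuming (purely for illustration) that we wanted to weight each transaction before summing. The names TYP_WEIGHTED_INPUT, WEIGHTED_BREAKSUM and TYP_OBJ_WEIGHTED_BREAKSUM are hypothetical, and the supporting implementation type is not shown here; it would be coded along the same lines as TYP_OBJ_BREAKSUM.

-- hypothetical input type carrying more than one value per source record
CREATE TYPE typ_weighted_input AS OBJECT
( val    NUMBER
, weight NUMBER
);
/

-- hypothetical aggregate declaration; the implementation type is not shown
CREATE FUNCTION weighted_breaksum (input typ_weighted_input) RETURN NUMBER
   AGGREGATE USING typ_obj_weighted_breaksum;
/

-- usage sketch: both values are wrapped into the single object-type parameter
-- SELECT account, weighted_breaksum(typ_weighted_input(val, weight))
-- FROM   accounts
-- GROUP  BY account;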

using breaksum

We can now see the results of using BREAKSUM. We can begin with a simple aggregate query; i.e. a GROUP BY on the
account number. We only have one account (with twelve cycles) in our sample data, so we will simply output the overall
result of the BREAKSUM as follows.

SQL> SELECT account

2 , BREAKSUM (val) AS end_of_cycle_total

3 FROM accounts

4 GROUP BY

5 account;

ACCOUNT END_OF_CYCLE_TOTAL

---------- ------------------

999 .8

1 row selected.

Note that user-defined aggregate functions are also enabled for use as analytic functions. We can take advantage of this to show the intermediate workings of BREAKSUM (i.e. the running total for each cycle) as follows.

SQL> SELECT account

2 , cycle
3 , val

4 , BREAKSUM (val) OVER

5 (PARTITION BY account ORDER BY cycle) AS running

6 FROM accounts;

ACCOUNT CYCLE VAL RUNNING

---------- ---------- ---------- ----------

999 1 .11 .11

999 2 .18 .29

999 3 .27 .56

999 4 .35 .91

999 5 .52 1.43

999 6 .61 2.04

999 7 -1.51 .53

999 8 .63 1.16

999 9 92 93.16

999 10 88 181.16

999 11 -400 0

999 12 .8 .8

12 rows selected.

Compare this to our original running total before we created BREAKSUM. We saw earlier that the running total at cycle 11
was negative and that BREAKSUM was needed to reset this total to 0. We can see above that BREAKSUM has done
exactly this.

We can test BREAKSUM against multiple accounts. In the following example we add ten more accounts to our sample data,
each with twelve cycles of data. We then aggregate each account using BREAKSUM.

SQL> INSERT INTO accounts

2 SELECT MOD(ROWNUM,10) AS account

3 , MOD(ROWNUM,12)+1 AS cycle

4 , DBMS_RANDOM.VALUE AS val

5 FROM all_objects
6 WHERE ROWNUM <= 120;

120 rows created.

SQL> SELECT account

2 , BREAKSUM (val) AS end_of_cycle_total

3 FROM accounts

4 GROUP BY

5 account;

ACCOUNT END_OF_CYCLE_TOTAL

---------- ------------------

0 5.25

1 7.1

2 6.63

3 4.34

4 6.82

5 5.11

6 7.5

7 5.67

8 6.41

9 5.87

999 .8

11 rows selected.

Finally, we can test our merge method by invoking parallel query. In the following example, we run the previous SQL
statement with parallel query enabled and then compare the results with the serial output from above.

SQL> SELECT /*+ PARALLEL(accounts 2) */

2 account

3 , BREAKSUM (val) AS end_of_cycle_total


4 FROM accounts

5 GROUP BY

6 account;

ACCOUNT END_OF_CYCLE_TOTAL

---------- ------------------

0 5.25

2 6.63

7 5.67

8 6.41

9 5.87

1 7.1

3 4.34

4 6.82

5 5.11

6 7.5

999 .8

11 rows selected.

For a final "sanity check", we can confirm that we invoked parallel query as follows.

SQL> SELECT *

2 FROM v$pq_sesstat

3 WHERE statistic = 'Queries Parallelized';

STATISTIC LAST_QUERY SESSION_TOTAL

------------------------------ ---------- -------------

Queries Parallelized 1 1

1 row selected.

further reading

For more information on user-defined aggregate functions and other possibilities for extending Oracle's functionality, read the Data Cartridge Developer's Guide. Possibly the best-known example of a user-defined aggregate on the web is Tom Kyte's STRAGG function for concatenating strings from multiple rows; the same discussion also includes a CONCAT_ALL function by James Padfield, essentially a re-factored STRAGG with additional flexibility over delimiters.

decomposing sql%rowcount for merge

The MERGE statement (AKA "UPSERT") released in Oracle 9i is possibly one of the most useful ETL-enabling technologies
built into the Oracle kernel. For those who have missed Oracle's headlines for the last year and a half and are unaware of
what the MERGE statement does, it simply enables us to either UPDATE or INSERT a row into a target table in one
statement. You simply tell Oracle your rules for determining whether a target row should be updated or inserted from the
source, and Oracle does the rest.

Prior to 9i, the alternative in SQL was to perform two individual DML statements (one UPDATE and one INSERT, each with
opposing predicates). The alternative in PL/SQL was even worse - either try to INSERT a row and if it failed with a
DUP_VAL_ON_INDEX exception, then UPDATE the row instead, or try to UPDATE a row, only inserting in the event of a
SQL%NOTFOUND.

rowcounts

In these days of extensive auditing, many ETL tools keep track of the number of inserts and the number of updates being
performed during batch data loads. Unfortunately, Oracle is as yet unable to provide individual rowcounts for the UPDATE
and INSERT components of the MERGE statement (who's betting this will be a 10g feature?). Instead, we still get the
SQL%ROWCOUNT attribute, which tells us the overall number of records merged.

What I am demonstrating in this short article is my attempt at decomposing SQL%ROWCOUNT into its component DML
counts. To enable this I am simply maintaining package variables to keep track of either the number of updates, inserts or
both. Incrementing the package counters is done by simply "piggy-backing" one of the UPDATE SET columns or one of the
INSERT VALUES columns. This will become clearer when you see the code.

demo setup

First, the setup. I am going to create two variations on the EMP table. EMP_SOURCE will be a full copy of the EMP table (all
14 rows of it!). EMP_TARGET is going to contain just eight of these rows. I am then going to MERGE EMP_SOURCE into
EMP_TARGET, such that we will expect eight rows to be updated and six records to be inserted.

SQL> CREATE TABLE emp_source

2 AS

3 SELECT * FROM emp;

Table created.
SQL> SELECT COUNT(*) FROM emp_source;

COUNT(*)

----------

14

SQL> CREATE TABLE emp_target

2 AS

3 SELECT * FROM emp WHERE ROWNUM <= 8;

Table created.

SQL> SELECT COUNT(*) FROM emp_target;

COUNT(*)

----------

8

example merge

I'll now MERGE the EMP_SOURCE data into EMP_TARGET.

SQL> BEGIN

2 MERGE INTO emp_target et

3 USING ( SELECT * FROM emp_source ) es

4 ON ( et.empno = es.empno )

5 WHEN MATCHED THEN

6 UPDATE

7 SET et.ename = es.ename

8 , et.sal = es.sal

9 , et.mgr = es.mgr

10 , et.deptno = es.deptno

11 WHEN NOT MATCHED THEN


12 INSERT

13 ( et.empno, et.ename, et.sal, et.mgr, et.deptno )

14 VALUES

15 ( es.empno, es.ename, es.sal, es.mgr, es.deptno );

16

17 DBMS_OUTPUT.PUT_LINE(TO_CHAR(SQL%ROWCOUNT) || ' rows merged.');

18 END;

19 /

14 rows merged.

PL/SQL procedure successfully completed.

An important caveat to note is that Oracle will generate an "ORA-30926: unable to get a stable set of rows in the source
tables" error if there is either a many-to-one or many-to-many relationship between the source and target tables. This is not
as serious as it sounds, because we would normally MERGE a one-to-one or one-to-zero relationship, with the join
condition protected by the target's primary key. A sketch of the failure case follows.
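
The following sketch (not run here) shows how the error could arise. Duplicating every source row means more than one source row matches each target row, so the MERGE cannot determine which update should "win".

-- deliberately duplicate the source rows to break the one-to-one relationship
MERGE INTO emp_target et
USING ( SELECT * FROM emp_source
        UNION ALL
        SELECT * FROM emp_source ) es
ON ( et.empno = es.empno )
WHEN MATCHED THEN
   UPDATE SET et.sal = es.sal
WHEN NOT MATCHED THEN
   INSERT ( et.empno, et.ename, et.sal, et.mgr, et.deptno )
   VALUES ( es.empno, es.ename, es.sal, es.mgr, es.deptno );
-- expected to fail with ORA-30926 (many source rows per target row)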

etl package

As can be seen, the SQL%ROWCOUNT attribute provides us with the total MERGE count, but we have no idea of how
many updates or inserts were performed, which doesn't help us when we have a production batch run to audit or debug. To
enable us to keep track of this, I created a small package named ETL with three functions (plus two overloads) and one
procedure. The source code for this is as follows.

CREATE OR REPLACE PACKAGE etl AS

c_inserting CONSTANT PLS_INTEGER := 0;

c_updating CONSTANT PLS_INTEGER := 1;

FUNCTION merge_counter (

action_in IN PLS_INTEGER DEFAULT c_inserting

) RETURN PLS_INTEGER;

FUNCTION get_merge_update_count RETURN PLS_INTEGER;


FUNCTION get_merge_update_count (

merge_count_in IN PLS_INTEGER

) RETURN PLS_INTEGER;

FUNCTION get_merge_insert_count RETURN PLS_INTEGER;

FUNCTION get_merge_insert_count (

merge_count_in in PLS_INTEGER

) RETURN PLS_INTEGER;

PROCEDURE reset_counters;

END etl;
/

CREATE OR REPLACE PACKAGE BODY etl AS

g_update_counter PLS_INTEGER NOT NULL := 0;

g_insert_counter PLS_INTEGER NOT NULL := 0;

/*----------- FUNCTION merge_counter -----------*/

FUNCTION merge_counter (

action_in IN PLS_INTEGER DEFAULT c_inserting

) RETURN PLS_INTEGER IS

BEGIN

CASE action_in

WHEN c_updating

THEN g_update_counter := g_update_counter + 1;

WHEN c_inserting

THEN g_insert_counter := g_insert_counter + 1;

ELSE
RAISE PROGRAM_ERROR;

END CASE;

RETURN 0;

END merge_counter;

/*----------- FUNCTION get_merge_update_count V1 -----------*/

FUNCTION get_merge_update_count

RETURN PLS_INTEGER is

BEGIN

RETURN g_update_counter;

END get_merge_update_count;

/*----------- FUNCTION get_merge_update_count V2 -----------*/

FUNCTION get_merge_update_count (

merge_count_in IN PLS_INTEGER

) RETURN PLS_INTEGER IS

BEGIN

RETURN NVL( merge_count_in - g_insert_counter, 0 );

END get_merge_update_count;

/*----------- FUNCTION get_merge_insert_count V1 -----------*/

FUNCTION get_merge_insert_count

RETURN PLS_INTEGER IS

BEGIN

RETURN g_insert_counter;

END get_merge_insert_count;

/*----------- FUNCTION get_merge_insert_count V2 -----------*/

FUNCTION get_merge_insert_count (

merge_count_in IN PLS_INTEGER

) RETURN PLS_INTEGER IS
BEGIN

RETURN NVL( merge_count_in - g_update_counter, 0 );

END get_merge_insert_count;

/*----------- PROCEDURE reset_counters -----------*/

PROCEDURE reset_counters IS

BEGIN

g_update_counter := 0;

g_insert_counter := 0;

END reset_counters;

END etl;
/

Note that there is one function to set either the UPDATE or INSERT counter and two functions to retrieve INSERT or
UPDATE counts (each with an overload - totalling four "get" functions). Finally, there is a small procedure to reset the
counters.

decomposing sql%rowcount using the etl package

The following is an example of how we might "piggy-back" the earlier MERGE statement to decompose the SQL%ROWCOUNT.

SQL> ROLLBACK;

Rollback complete.

SQL> BEGIN

2 MERGE INTO emp_target et

3 USING ( SELECT * FROM emp_source ) es

4 ON ( et.empno = es.empno )

5 WHEN MATCHED THEN

6 UPDATE

7 SET et.ename = es.ename


8 , et.sal = es.sal

9 , et.mgr = es.mgr

10 , et.deptno = es.deptno

11 WHEN NOT MATCHED THEN

12 INSERT

13 ( et.empno

14 , et.ename

15 , et.sal

16 , et.mgr

17 , et.deptno

18 )

19 VALUES

20 ( CASE etl.merge_counter(etl.c_inserting)

21 WHEN 0 THEN es.empno

22 END

23 , es.ename

24 , es.sal

25 , es.mgr

26 , es.deptno

27 );

28

29 DBMS_OUTPUT.PUT_LINE(

30 TO_CHAR(SQL%ROWCOUNT) || ' rows merged.'

31 );

32 DBMS_OUTPUT.PUT_LINE(

33 TO_CHAR(etl.get_merge_insert_count) || ' rows inserted.'

34 );

35 DBMS_OUTPUT.PUT_LINE(

36 TO_CHAR(etl.get_merge_update_count( SQL%ROWCOUNT ))

37 || ' rows updated.'

38 );
39 etl.reset_counters;

40 END;

41 /

14 rows merged.

6 rows inserted.

8 rows updated.

Some notes on the above PL/SQL block:

• Line 20. I have wrapped the INSERT of es.empno in a simple CASE expression. This CASE expression calls the
ETL function MERGE_COUNTER, telling it that I require an INSERT counter to be incremented. This function
ALWAYS returns 0, so my first test is for 0, which of course guarantees that the es.empno value will be preserved
in the overall INSERT statement.
• Line 33. A simple call to the non-parameter overload of the ETL.GET_MERGE_INSERT_COUNT function returns
me the value of the INSERT counter.
• Line 36. I chose not to maintain an UPDATE counter in this example (I know my data and always get more
updates than inserts, so given that I am already forcing extra function executions, skipping the UPDATE counter
keeps the overhead down). Instead, I supplied the overloaded ETL.GET_MERGE_UPDATE_COUNT function with
the SQL%ROWCOUNT and this returned me the total MERGE count minus the retained INSERT count. You have
the option of course to keep both counters going in one statement (see the sketch below), although the overloads
in the ETL package make this unnecessary.
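
As a sketch of the "both counters" option, the same piggy-backing technique can be applied to one of the UPDATE SET columns at the same time as the INSERT VALUES column; the column choices below are simply illustrative.

MERGE INTO emp_target et
USING ( SELECT * FROM emp_source ) es
ON ( et.empno = es.empno )
WHEN MATCHED THEN
   UPDATE
      SET et.ename  = CASE etl.merge_counter(etl.c_updating)
                         WHEN 0 THEN es.ename
                      END
        , et.sal    = es.sal
        , et.mgr    = es.mgr
        , et.deptno = es.deptno
WHEN NOT MATCHED THEN
   INSERT ( et.empno, et.ename, et.sal, et.mgr, et.deptno )
   VALUES ( CASE etl.merge_counter(etl.c_inserting)
               WHEN 0 THEN es.empno
            END
          , es.ename, es.sal, es.mgr, es.deptno );

With both counters maintained, the no-parameter overloads of ETL.GET_MERGE_UPDATE_COUNT and ETL.GET_MERGE_INSERT_COUNT would each return their count directly.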

conclusions

And that is my basic implementation of a MERGE INSERT and a MERGE UPDATE counter. There is a performance cost,
naturally, but Oracle maintains that the cost of executing PL/SQL functions from SQL is continually falling. My tests in
merging 146,000 rows, 99% of which were inserts, showed very few identifiable resource costs other than CPU time, which
was marginally increased as is to be expected. No doubt some of you will consider it a cost too much, but I would argue that
in many cases, the extra cost of keeping a package variable updated will be marginal when set against a large, database-
intensive MERGE statement. I will be happily using it in my ETL processing, until Oracle includes its own of course!

flashback query in oracle 9i

This short article introduces flashback query, a new feature of Oracle 9i. Flashback query enables us to query our data as it
existed in a previous state. In other words, we can query our data from a point in time before we or any other users made
permanent changes to it.

Flashback query works in tandem with another new feature of Oracle 9i: automatic undo management, whereby Oracle
manages our rollback segments for us. In fact, the term "rollback segments" is confined to history (or applies only to manual
undo management at least). From 9i, we know these segments as "undo segments" and under automatic management,
Oracle will create as many or as few as are required to satisfy our transaction workloads. This is, of course, within system
limits, namely the size of the undo tablespace and a new parameter named undo_retention. This parameter specifies
the number of seconds that Oracle should retain undo data for us.

Flashback query enables us to query the undo segments directly, either by SCN (System Change Number) or by timestamp.
The means and ease of doing this changed dramatically between Oracle 9.0 and 9.2 and we shall examine both of them in
this article. It is unlikely, however, that any users of 9i Release 2 (9.2) will wish to use the 9.0 method and we shall see why
below.

requirements

To be able to use flashback query, we require the following system elements:

• undo_management=auto (set in pfile/spfile);


• undo_retention=n (set in pfile/spfile, where n is a positive number of seconds);
• undo_tablespace=[undo tablespace name] (set in pfile/spfile);
• FLASHBACK or FLASHBACK ANY system privilege; and
• EXECUTE on DBMS_FLASHBACK (see the sketch after this list).
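
The following is a sketch of how these requirements might be checked and granted from a suitably-privileged account; the grantee SCOTT is purely illustrative.

-- confirm the undo configuration (SQL*Plus)
SHOW PARAMETER undo

-- grant flashback access to an application user
GRANT FLASHBACK ANY TABLE TO scott;
GRANT EXECUTE ON dbms_flashback TO scott;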

sample data

For our sample data we will create a test table and copy a few rows and columns of ALL_TABLES as follows.

SQL> CREATE TABLE t

2 AS

3 SELECT owner, table_name, tablespace_name

4 FROM all_tables

5 WHERE ROWNUM <= 5;

Table created.

SQL> SELECT * FROM t;

OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ SYSTEM

SYS CLU$ SYSTEM

SYS OBJ$ SYSTEM

SYS FILE$ SYSTEM

SYS COL$ SYSTEM


5 rows selected.

A small note on new tables, which applies to both releases of 9i, is that it might not be possible to begin flashback queries
against them immediately. Oracle recommends that we wait for approximately five minutes, which corresponds to the interval at
which the SCN is mapped to a timestamp. The SCN itself is incremented with every commit. Attempting to run a flashback query
against a new table before then is likely to result in ORA-01466: unable to read data - table definition has changed.

9i release 1: dbms_flashback

Flashback queries in Oracle 9.0 use calls to DBMS_FLASHBACK to enable and disable the feature either side of a SQL
statement or the opening of a PL/SQL cursor. We'll move straight onto an example to show how this works. In the following
example, we will update our data and commit the changes. We will then enable flashback query to a point in time before this
change and then run a query against the sample table. Finally we will disable flashback query to enable us to resume
"normal" query mode.

Note that when we enable flashback query, we provide either a timestamp or SCN. We will use the SCN for the 9.0
examples and timestamps (another new feature of 9i) for the 9.2 examples later on. Because a timestamp is mapped to an
SCN number every five minutes, the SCN offers a much finer level of precision for flashback. We will begin, therefore, by
capturing the current SCN and permanently updating our sample data.

SQL> SELECT DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER AS scn

2 FROM dual;

SCN

----------

592967

1 row selected.

SQL> UPDATE t SET tablespace_name = LOWER(tablespace_name);

5 rows updated.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM t;


OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ system

SYS CLU$ system

SYS OBJ$ system

SYS FILE$ system

SYS COL$ system

5 rows selected.

Now we can enable flashback query to the point in time before we changed our data (i.e. using the earlier SCN). Once in
"flashback query mode", any queries we run will return data consistent with the SCN we enabled. We can see this below.

SQL> exec DBMS_FLASHBACK.ENABLE_AT_SYSTEM_CHANGE_NUMBER(592967);

PL/SQL procedure successfully completed.

SQL> SELECT * FROM t;

OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ SYSTEM

SYS CLU$ SYSTEM

SYS OBJ$ SYSTEM

SYS FILE$ SYSTEM

SYS COL$ SYSTEM

5 rows selected.

We can now see our data in its original state (i.e. before we changed the TABLESPACE_NAME to lower-case). To leave
"flashback query mode", we must disable it using DBMS_FLASHBACK as follows. We can now see the data in its current,
post-update form.

SQL> exec DBMS_FLASHBACK.DISABLE;


PL/SQL procedure successfully completed.

SQL> SELECT * FROM t;

OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ system

SYS CLU$ system

SYS OBJ$ system

SYS FILE$ system

SYS COL$ system

5 rows selected.

There are a number of uses for this feature, such as data recovery, ad hoc change-tracking or point-in-time queries, but of
course, this is restricted to the period specified by undo_retention (with some variance depending on transaction loads). In
fact, Oracle suggests that it is feasible to build flashback query capabilities into our applications. We might view this as a
somewhat ambitious claim for the technology, but as a short-term recovery mechanism, it is very useful.
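
As a sketch of how a 9.0-style recovery might look in PL/SQL (the table T_RECOVERED is a hypothetical copy of T), we can open a cursor while flashback is enabled, disable flashback, and then use the fetched historical rows for DML.

DECLARE
   CURSOR c_hist IS
      SELECT * FROM t;            -- will be opened in "flashback query mode"
   r_hist t%ROWTYPE;
BEGIN
   DBMS_FLASHBACK.ENABLE_AT_SYSTEM_CHANGE_NUMBER(592967);
   OPEN c_hist;                   -- the cursor retains the flashback view
   DBMS_FLASHBACK.DISABLE;        -- DML is not permitted while flashback is enabled
   LOOP
      FETCH c_hist INTO r_hist;
      EXIT WHEN c_hist%NOTFOUND;
      INSERT INTO t_recovered     -- hypothetical recovery table with T's structure
      VALUES (r_hist.owner, r_hist.table_name, r_hist.tablespace_name);
   END LOOP;
   CLOSE c_hist;
   COMMIT;
END;
/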

9i release 2: as of [scn|timestamp]

As we saw in the previous section, flashback query in 9i Release 1 is somewhat involved. Oracle 9i Release 2 makes
flashback query significantly easier by building the capability into the SQL FROM clause itself. In 9.2, we can query our
data AS OF TIMESTAMP or AS OF SCN directly. Flashback query does not need to be explicitly enabled and disabled via
package calls; it is invoked directly by this syntax.

We will repeat the earlier example but this time using 9.2 flashback query. Furthermore, we'll use a timestamp (another new
feature of Oracle 9i) instead of an SCN. Note that the initial ALTER SESSION is simply a convenience to enable a
consistent timestamp format mask without having to supply it explicitly.

SQL> ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'DD-MON-YYYY HH24:MI:SS.FF3';

Session altered.

SQL> SELECT LOCALTIMESTAMP

2 FROM dual;
LOCALTIMESTAMP

------------------------

22-NOV-2002 21:31:01.750

1 row selected.

SQL> UPDATE t SET table_name = LOWER(table_name);

5 rows updated.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM t;

OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS seg$ system

SYS clu$ system

SYS obj$ system

SYS file$ system

SYS col$ system

5 rows selected.

Now we can invoke flashback query as of the timestamp prior to our update.

SQL> SELECT *

2 FROM t AS OF TIMESTAMP TO_TIMESTAMP('22-NOV-2002 21:31:01.750');


OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ SYSTEM

SYS CLU$ SYSTEM

SYS OBJ$ SYSTEM

SYS FILE$ SYSTEM

SYS COL$ SYSTEM

5 rows selected.

This is much easier! The extended FROM clause is simple and intuitive to use and is more likely to encourage developers to
use flashback query. A particularly good use for this is for resetting test data during development and unit-testing. It is also a
good short-term recovery tool for rectifying minor mistakes.

a note on flashback precision

As noted earlier, flashback query timestamps are mapped to SCNs, but only once every five minutes. This makes flashback
queries with timestamps subject to precision errors. We can see the effect of this in our 9.2 flashback query results above.
Our flashback query correctly returns the original upper-case TABLE_NAME data, but it also returns the original upper-case
TABLESPACE_NAME data. This tells us that the SCN that Oracle mapped to our timestamp is from a time before we
ran the 9.0 example.

Using an SCN, however, we can be more precise with our flashback query. We will take a guess that the SCN after our 9.0
example update will be a few greater than before we began (there is no other activity on this test system). Using this, we will
try a 9.2 flashback query AS OF SCN.

SQL> SELECT * FROM t AS OF SCN 592969;

OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ system

SYS CLU$ system

SYS OBJ$ system

SYS FILE$ system

SYS COL$ system

5 rows selected.
We can now see the data as it existed after the first update (9.0 example) but before the second update (9.2). Using the
SCN enabled us to be far more precise with our flashback query.
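
Guessing the SCN is obviously unreliable. A more dependable sketch is to capture the SCN at the point of interest (as we did in the 9.0 example) and flash back to that value later.

VARIABLE scn NUMBER

EXECUTE :scn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER

-- ...changes are made and committed here...

SELECT * FROM t AS OF SCN :scn;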

recovering data

Finally for this article, the 9i Release 2 flashback query syntax makes it much easier to recover data. Using the AS OF
syntax, we can either update the table from the flashback query source or we can delete the current data and insert the
flashback data. In the following example, we'll remove the current data and replace it with the data as it existed after our 9.0
examples (i.e. lower-case TABLESPACE_NAME).

SQL> DELETE FROM t;

5 rows deleted.

SQL> INSERT INTO t

2 SELECT * FROM t AS OF SCN 592969;

5 rows created.

SQL> COMMIT;

Commit complete.

SQL> SELECT * FROM t;

OWNER TABLE_NAME TABLESPACE_NAME

--------------- --------------- --------------------

SYS SEG$ system

SYS CLU$ system

SYS OBJ$ system

SYS FILE$ system

SYS COL$ system

5 rows selected.
Data changes between current and flashback data can be identified quite simply by set queries using MINUS/UNION. If we
need to search for changes to specific columns, we can join the current and flashback datasets together (we can also use
the 9i FULL OUTER JOIN for this purpose). In the 9i Release 1 variant of flashback query, this comparison would only be possible in
PL/SQL. The sequence would be: enable flashback -> open flashback cursor -> disable flashback -> open current cursor ->
fetch and compare data. As stated throughout this article, the simplicity of flashback query in 9i Release 2 makes this
complex variant largely redundant.
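
As a sketch of the set-query comparison, the following returns the rows that differ between the current data and the flashback data (one query in each direction), using the SCN from the examples above.

-- rows present now but not at the flashback point
SELECT * FROM t
MINUS
SELECT * FROM t AS OF SCN 592969;

-- rows present at the flashback point but not now
SELECT * FROM t AS OF SCN 592969
MINUS
SELECT * FROM t;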

case expressions and statements in oracle 9i

The CASE expression was introduced by Oracle in version 8i. It was a SQL-only expression that provided much greater
flexibility than the functionally-similar DECODE function. The PL/SQL parser didn't understand CASE in 8i, however, which
was a major frustration for developers (the workaround was to use views, dynamic SQL or DECODE).

Oracle 9i Release 1 (9.0) extends CASE capabilities with the following enhancements:

• a new simple CASE expression (8i CASE was a "searched" or "switched" expression);
• a new CASE statement; a PL/SQL construct equivalent to IF-THEN-ELSE; and
• full PL/SQL support for both types of CASE expression; in SQL and in PL/SQL constructs (in 9i, the SQL and
PL/SQL parsers are the same).

In this article, we will work through each of the new features and show a range of possibilities for the new syntax.

simple case expression

The simple CASE expression is new in 9i. In SQL, it is functionally equivalent to DECODE in that it tests a single value or
expression for equality only. This is supposedly optimised for simple equality tests where the cost of repeating the test
expression is high (although in most cases it is extremely difficult to show a performance difference over DECODE or the
older searched CASE expression).

A simple CASE expression takes the following format. As with all CASE expression and statement formats in this article, it
will evaluate from top to bottom and "exit" on the first TRUE condition.

CASE {value or expression}

WHEN {value}

THEN {something}

[WHEN...]

[THEN...]

[ELSE...] --<-- NULL if not specified and no WHEN tests satisfied

END

The following is a contrived example of a simple CASE expression against the EMP table.

SQL> SELECT ename


2 , job

3 , CASE deptno

4 WHEN 10

5 THEN 'ACCOUNTS'

6 WHEN 20

7 THEN 'SALES'

8 WHEN 30

9 THEN 'RESEARCH'

10 WHEN 40

11 THEN 'OPERATIONS'

12 ELSE 'UNKNOWN'

13 END AS department

14 FROM emp;

ENAME JOB DEPARTMENT

---------- --------- ----------

SMITH CLERK SALES

ALLEN SALESMAN RESEARCH

WARD SALESMAN RESEARCH

JONES MANAGER SALES

MARTIN SALESMAN RESEARCH

BLAKE MANAGER RESEARCH

CLARK MANAGER ACCOUNTS

SCOTT ANALYST SALES

KING PRESIDENT ACCOUNTS

TURNER SALESMAN RESEARCH

ADAMS CLERK SALES

JAMES CLERK RESEARCH

FORD ANALYST SALES

MILLER CLERK ACCOUNTS


14 rows selected.
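
For comparison, the same department lookup written with the DECODE function would be as follows (a sketch, not run above).

SELECT ename
     , job
     , DECODE( deptno
             , 10, 'ACCOUNTS'
             , 20, 'SALES'
             , 30, 'RESEARCH'
             , 40, 'OPERATIONS'
             , 'UNKNOWN' ) AS department
FROM   emp;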

searched case expression

The searched CASE expression is the 8i variant. This is much more flexible than a simple CASE expression or DECODE
function. It can conduct multiple tests involving a range of different columns, expressions and operators. Each WHEN clause
can include a number of AND/OR tests. It takes the following format (note that the expressions to evaluate are included
within each WHEN clause).

CASE

WHEN {test or tests}

THEN {something}

[WHEN {test or tests}]

[THEN...]

[ELSE...]

END

For example:

CASE

WHEN column IN (val1, val2)

AND another_column > 0

THEN something

WHEN yet_another_column != 'not this value'

THEN something_else

END

The following query against EMP shows how we might use searched CASE to evaluate the current pay status of each
employee.

SQL> SELECT ename

2 , job

3 , CASE

4 WHEN sal < 1000

5 THEN 'Low paid'

6 WHEN sal BETWEEN 1001 AND 2000

7 THEN 'Reasonably well paid'


8 WHEN sal BETWEEN 2001 AND 3001

9 THEN 'Well paid'

10 ELSE 'Overpaid'

11 END AS pay_status

12 FROM emp;

ENAME JOB PAY_STATUS

---------- --------- --------------------

SMITH CLERK Low paid

ALLEN SALESMAN Reasonably well paid

WARD SALESMAN Reasonably well paid

JONES MANAGER Well paid

MARTIN SALESMAN Reasonably well paid

BLAKE MANAGER Well paid

CLARK MANAGER Well paid

SCOTT ANALYST Well paid

KING PRESIDENT Overpaid

TURNER SALESMAN Reasonably well paid

ADAMS CLERK Reasonably well paid

JAMES CLERK Low paid

FORD ANALYST Well paid

MILLER CLERK Reasonably well paid

14 rows selected.

case expressions in pl/sql

As stated earlier, the SQL and PL/SQL parsers are the same from 9i onwards. This means that CASE expressions can be
used in static implicit and explicit SQL cursors within PL/SQL. In addition to this, the CASE expression can also be used as
an assignment mechanism, which provides an extremely elegant method for IF-THEN-ELSE-type constructs. For example,
the following construct...

IF something = something THEN

variable := value;
ELSE

variable := alternative_value;

END IF;

...can now be written as a CASE expression as follows.

variable := CASE something

WHEN something

THEN value

ELSE alternative_value

END;

This flexibility is something that DECODE doesn't provide as it is a SQL-only function. Needless to say, both simple and
searched CASE expressions can be used as above. The following example shows a simple CASE expression being used to
assign a variable.

SQL> DECLARE

3 v_dummy VARCHAR2(10) := 'DUMMY';

4 v_assign VARCHAR2(10);

6 BEGIN

8 v_assign := CASE v_dummy

9 --

10 WHEN 'Dummy'

11 THEN 'INITCAP'

12 --

13 WHEN 'dummy'

14 THEN 'LOWER'

15 --

16 WHEN 'DUMMY'

17 THEN 'UPPER'

18 --
19 ELSE 'MIXED'

20 --

21 END;

22

23 DBMS_OUTPUT.PUT_LINE(

24 'Variable v_dummy is in '||v_assign||' type case.'

25 );

26

27 END;

28 /

Variable v_dummy is in UPPER type case.

PL/SQL procedure successfully completed.

We can take this example a stage further and use the CASE expression directly inside the call to DBMS_OUTPUT as
follows.

SQL> DECLARE

2 v_dummy VARCHAR2(10) := 'DUMMY';

3 BEGIN

4 DBMS_OUTPUT.PUT_LINE(

5 'Variable v_dummy is in ' || CASE v_dummy

6 WHEN 'Dummy'

7 THEN 'INITCAP'

8 WHEN 'dummy'

9 THEN 'LOWER'

10 WHEN 'DUMMY'

11 THEN 'UPPER'

12 ELSE 'MIXED'

13 END || ' type case.' );

14 END;

15 /
Variable v_dummy is in UPPER type case.

PL/SQL procedure successfully completed.

Here we have removed the need for an intermediate variable. Similarly, CASE expressions can be used directly in function
RETURN statements. In the following example, we will create a function that returns each employee's pay status using the
CASE expression from our earlier examples.

SQL> CREATE FUNCTION pay_status (

2 sal_in IN NUMBER

3 ) RETURN VARCHAR2 IS

4 BEGIN

5 RETURN CASE

6 WHEN sal_in < 1000

7 THEN 'Low paid'

8 WHEN sal_in BETWEEN 1001 AND 2000

9 THEN 'Reasonably well paid'

10 WHEN sal_in BETWEEN 2001 AND 3001

11 THEN 'Well paid'

12 ELSE 'Overpaid'

13 END;

14 END;

15 /

Function created.

SQL> SELECT ename

2 , pay_status(sal) AS pay_status

3 FROM emp;

ENAME PAY_STATUS

---------- --------------------

SMITH Low paid


ALLEN Reasonably well paid

WARD Reasonably well paid

JONES Well paid

MARTIN Reasonably well paid

BLAKE Well paid

CLARK Well paid

SCOTT Well paid

KING Overpaid

TURNER Reasonably well paid

ADAMS Reasonably well paid

JAMES Low paid

FORD Well paid

MILLER Reasonably well paid

14 rows selected.

Of course, we need to balance the good practice of rules encapsulation with our performance requirements. If the CASE
expression is only used in one SQL statement in our application, then in performance terms we will benefit greatly from "in-
lining" the expression directly. If the business rule is used in numerous SQL statements across the application, we might be
more prepared to pay the context-switch penalty and wrap it in a function as above.

Note that in some earlier versions of 9i, we might need to wrap the CASE expression inside TRIM to be able to return it
directly from a function (i.e. RETURN TRIM(CASE...)). There is a "NULL-terminator" bug similar to a quite well-known
variant in 8i Native Dynamic SQL (which would sometimes appear when attempting to EXECUTE IMMEDIATE a SQL
statement fetched directly from a table). A sketch of the workaround follows.
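
A sketch of the workaround applied to the earlier PAY_STATUS function (only relevant on releases affected by the bug) would be as follows.

CREATE OR REPLACE FUNCTION pay_status (
   sal_in IN NUMBER
) RETURN VARCHAR2 IS
BEGIN
   RETURN TRIM(CASE
                  WHEN sal_in < 1000
                  THEN 'Low paid'
                  WHEN sal_in BETWEEN 1001 AND 2000
                  THEN 'Reasonably well paid'
                  WHEN sal_in BETWEEN 2001 AND 3001
                  THEN 'Well paid'
                  ELSE 'Overpaid'
               END);   -- TRIM works around the "NULL-terminator" issue
END;
/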

ordering data with case expressions

We have already seen that CASE expressions provide great flexibility within both SQL and PL/SQL. CASE expressions can
also be used in ORDER BY clauses to dynamically order data. This is especially useful in two ways:

• when we need to order data with no inherent order properties; and


• when we need to support user-defined ordering from a front-end application.

In the following example, we will order the EMP data according to the JOB column but not alphabetically.

SQL> SELECT ename

2 , job
3 FROM emp

4 ORDER BY CASE job

5 WHEN 'PRESIDENT'

6 THEN 1

7 WHEN 'MANAGER'

8 THEN 2

9 WHEN 'ANALYST'

10 THEN 3

11 WHEN 'SALESMAN'

12 THEN 4

13 ELSE 5

14 END;

ENAME JOB

---------- ---------

KING PRESIDENT

JONES MANAGER

BLAKE MANAGER

CLARK MANAGER

SCOTT ANALYST

FORD ANALYST

ALLEN SALESMAN

WARD SALESMAN

MARTIN SALESMAN

TURNER SALESMAN

SMITH CLERK

MILLER CLERK

ADAMS CLERK

JAMES CLERK

14 rows selected.
As stated earlier, the second possibility is for user-defined ordering. This is most common on search screens where users
can specify how they want their results ordered. It is quite common for developers to code complicated dynamic SQL
solutions to support such requirements. With CASE expressions, however, we can avoid such complexity, especially when
the number of ordering columns is low. In the following example, we will create a dummy procedure to output EMP data
according to a user's preference for ordering.

SQL> CREATE FUNCTION order_emps( p_column IN VARCHAR2 )

2 RETURN SYS_REFCURSOR AS

4 v_rc SYS_REFCURSOR;

6 BEGIN

8 DBMS_OUTPUT.PUT_LINE('Ordering by ' || p_column || '...');

10 OPEN v_rc FOR SELECT ename, job, hiredate, sal

11 FROM emp

12 ORDER BY

13 CASE UPPER(p_column)

14 WHEN 'ENAME'

15 THEN ename

16 WHEN 'SAL'

17 THEN TO_CHAR(sal,'fm0000')

18 WHEN 'JOB'

19 THEN job

20 WHEN 'HIREDATE'

21 THEN TO_CHAR(hiredate,'YYYYMMDD')

22 END;

23

24 RETURN v_rc;

25

26 END order_emps;

27 /
Function created.

CASE expressions can only return a single datatype, so we need to cast NUMBER and DATE columns to VARCHAR2 as
above. This can change their ordering behaviour, so we ensure that the format masks we use enable them to sort correctly.
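
An alternative sketch that avoids the character conversions altogether is to use one CASE expression per datatype in the ORDER BY, each returning NULL when its column is not the chosen sort key.

ORDER BY
   CASE WHEN UPPER(p_column) = 'ENAME' THEN ename
        WHEN UPPER(p_column) = 'JOB'   THEN job
   END
 , CASE WHEN UPPER(p_column) = 'SAL'      THEN sal      END
 , CASE WHEN UPPER(p_column) = 'HIREDATE' THEN hiredate END;

Each expression returns a single datatype, so the NUMBER and DATE columns keep their natural sort order without format masks.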

Now we have the function in place, we can simulate a front-end application by setting up a refcursor variable in sqlplus and
calling the function with different inputs as follows.

SQL> var rc refcursor;

SQL> set autoprint on

SQL> exec :rc := order_emps('job');

Ordering by job...

PL/SQL procedure successfully completed.

ENAME JOB HIREDATE SAL

---------- --------- --------- ----------

SCOTT ANALYST 19-APR-87 3000

FORD ANALYST 03-DEC-81 3000

SMITH CLERK 17-DEC-80 800

ADAMS CLERK 23-MAY-87 1100

MILLER CLERK 23-JAN-82 1300

JAMES CLERK 03-DEC-81 950

JONES MANAGER 02-APR-81 2975

CLARK MANAGER 09-JUN-81 2450

BLAKE MANAGER 01-MAY-81 2850

KING PRESIDENT 17-NOV-81 5000

ALLEN SALESMAN 20-FEB-81 1600

MARTIN SALESMAN 28-SEP-81 1250

TURNER SALESMAN 08-SEP-81 1500

WARD SALESMAN 22-FEB-81 1250


14 rows selected.

SQL> exec :rc := order_emps('hiredate');

Ordering by hiredate...

PL/SQL procedure successfully completed.

ENAME JOB HIREDATE SAL

---------- --------- --------- ----------

SMITH CLERK 17-DEC-80 800

ALLEN SALESMAN 20-FEB-81 1600

WARD SALESMAN 22-FEB-81 1250

JONES MANAGER 02-APR-81 2975

BLAKE MANAGER 01-MAY-81 2850

CLARK MANAGER 09-JUN-81 2450

TURNER SALESMAN 08-SEP-81 1500

MARTIN SALESMAN 28-SEP-81 1250

KING PRESIDENT 17-NOV-81 5000

JAMES CLERK 03-DEC-81 950

FORD ANALYST 03-DEC-81 3000

MILLER CLERK 23-JAN-82 1300

SCOTT ANALYST 19-APR-87 3000

ADAMS CLERK 23-MAY-87 1100

14 rows selected.

The overall benefits of this method are derived from having a single, static cursor compiled into our application code. With
this, we do not need to resort to dynamic SQL solutions, which are more difficult to maintain and debug and can also be
slower to fetch due to additional soft parsing.

filtering data with case expressions


In addition to flexible ordering, CASE expressions can also be used to conditionally filter data or join datasets. In filters,
CASE expressions can replace complex AND/OR filters, but this can sometimes have an impact on CBO arithmetic and
resulting query plans, so care will need to be taken. We can see this as follows. First we will write a fairly complex set of
predicates against an EMP-DEPT query.

SQL> SELECT e.ename

2 , e.empno

3 , e.job

4 , e.sal

5 , e.hiredate

6 , d.deptno

7 FROM dept d

8 , emp e

9 WHERE d.deptno = e.deptno

10 AND NOT ( e.deptno = 10

11 AND e.sal >= 1000 )

12 AND e.hiredate <= DATE '1990-01-01'

13 AND d.loc != 'CHICAGO';

ENAME EMPNO JOB SAL HIREDATE DEPTNO

---------- ---------- --------- ---------- --------- ----------

SMITH 7369 CLERK 800 17-DEC-80 20

JONES 7566 MANAGER 2975 02-APR-81 20

SCOTT 7788 ANALYST 3000 19-APR-87 20

ADAMS 7876 CLERK 1100 23-MAY-87 20

FORD 7902 ANALYST 3000 03-DEC-81 20

5 rows selected.

We can re-write this using a CASE expression. It can be much easier as a "multi-filter" in certain scenarios, as we can work
through our predicates in a much more logical fashion. We can see this below. All filters evaluating as true will give a
value of 0 and we only return rows where the CASE expression evaluates to 1.

SQL> SELECT e.ename

2 , e.empno
3 , e.job

4 , e.sal

5 , e.hiredate

6 , d.deptno

7 FROM dept d

8 , emp e

9 WHERE d.deptno = e.deptno

10 AND CASE

11 WHEN e.deptno = 10

12 AND e.sal >= 1000

13 THEN 0

14 WHEN e.hiredate > DATE '1990-01-01'

15 THEN 0

16 WHEN d.loc = 'CHICAGO'

17 THEN 0

18 ELSE 1

19 END = 1;

ENAME EMPNO JOB SAL HIREDATE DEPTNO

---------- ---------- --------- ---------- --------- ----------

SMITH 7369 CLERK 800 17-DEC-80 20

JONES 7566 MANAGER 2975 02-APR-81 20

SCOTT 7788 ANALYST 3000 19-APR-87 20

ADAMS 7876 CLERK 1100 23-MAY-87 20

FORD 7902 ANALYST 3000 03-DEC-81 20

5 rows selected.

As stated, care needs to be taken with this as it can change the CBO's decision paths. As we are only dealing with EMP and
DEPT here, the following example ends up with the same join mechanism, but note the different filter predicates reported by
DBMS_XPLAN (this is a 9i Release 2 feature). When costing the predicates, Oracle treats the entire CASE expression as a
single filter, rather than each filter separately. With histograms or even the most basic column statistics, Oracle is able to
cost the filters when we write them the "AND/OR way". With CASE, Oracle has no such knowledge to draw on.
SQL> EXPLAIN PLAN SET STATEMENT_ID = 'FILTERS'

2 FOR

3 SELECT e.ename

4 , e.empno

5 , e.job

6 , e.sal

7 , e.hiredate

8 , d.deptno

9 FROM dept d

10 , emp e

11 WHERE d.deptno = e.deptno

12 AND NOT ( e.deptno = 10

13 AND e.sal >= 1000 )

14 AND e.hiredate <= DATE '1990-01-01'

15 AND d.loc != 'CHICAGO';

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE','FILTERS'));

PLAN_TABLE_OUTPUT

---------------------------------------------------------------------------

--------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

--------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 10 | 360 | 5 |

|* 1 | HASH JOIN | | 10 | 360 | 5 |

|* 2 | TABLE ACCESS FULL | DEPT | 3 | 27 | 2 |

|* 3 | TABLE ACCESS FULL | EMP | 10 | 270 | 2 |

--------------------------------------------------------------------
Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("D"."DEPTNO"="E"."DEPTNO")

2 - filter("D"."LOC"<>'CHICAGO')

3 - filter(("E"."DEPTNO"<>10 OR "E"."SAL"<1000) AND

"E"."HIREDATE"<=TO_DATE(' 1990-01-01 00:00:00', 'syyyy-mm-dd

hh24:mi:ss'))

Note: cpu costing is off

20 rows selected.

SQL> EXPLAIN PLAN SET STATEMENT_ID = 'CASE'

2 FOR

3 SELECT e.ename

4 , e.empno

5 , e.job

6 , e.sal

7 , e.hiredate

8 , d.deptno

9 FROM dept d

10 , emp e

11 WHERE d.deptno = e.deptno

12 AND CASE

13 WHEN e.deptno = 10

14 AND e.sal >= 1000

15 THEN 0

16 WHEN e.hiredate > DATE '1990-01-01'

17 THEN 0
18 WHEN d.loc = 'CHICAGO'

19 THEN 0

20 ELSE 1

21 END = 1;

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE','CASE'));

PLAN_TABLE_OUTPUT

---------------------------------------------------------------------------

--------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost |

--------------------------------------------------------------------

| 0 | SELECT STATEMENT | | 1 | 36 | 5 |

|* 1 | HASH JOIN | | 1 | 36 | 5 |

| 2 | TABLE ACCESS FULL | DEPT | 4 | 36 | 2 |

| 3 | TABLE ACCESS FULL | EMP | 14 | 378 | 2 |

--------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("D"."DEPTNO"="E"."DEPTNO")

filter(CASE WHEN ("E"."DEPTNO"=10 AND "E"."SAL">=1000) THEN

0 WHEN "E"."HIREDATE">TO_DATE(' 1990-01-01 00:00:00', 'syyyy-mm-dd

hh24:mi:ss') THEN 0 WHEN "D"."LOC"='CHICAGO' THEN 0 ELSE 1 END =1)

Note: cpu costing is off


19 rows selected.

case statements (pl/sql only)

We have spent a lot of time looking at CASE expressions in this article. We will finish with a look at the new CASE
statement. Most developers seem to use this term when they are in fact describing CASE expressions. The
CASE statement is a PL/SQL-only construct that is similar to IF-THEN-ELSE. Its simple and searched formats are as
follows.

CASE {variable or expression}

WHEN {value}

THEN {one or more operations};

[WHEN..THEN]

ELSE {default operation};

END CASE;

CASE

WHEN {expression test or tests}

THEN {one or more operations};

[WHEN..THEN]

ELSE {default operation};

END CASE;

Note the semi-colons. CASE statements do not return values like CASE expressions. CASE statements are IF tests that are
used to decide which action(s) or operation(s) to execute. Note also the END CASE syntax. This is mandatory. In the
following example, we will return to our dummy test but call a procedure within each evaluation.

SQL> DECLARE

3 v_dummy VARCHAR2(10) := 'DUMMY';

5 PROCEDURE output (input VARCHAR2) IS

6 BEGIN

7 DBMS_OUTPUT.PUT_LINE(

8 'Variable v_dummy is in '||input||' type case.');

9 END output;
10

11 BEGIN

12

13 CASE v_dummy

14

15 WHEN 'Dummy'

16 THEN output('INITCAP');

17

18 WHEN 'dummy'

19 THEN output('LOWER');

20

21 WHEN 'DUMMY'

22 THEN output('UPPER');

23

24 ELSE output('MIXED');

25

26 END CASE;

27

28 END;

29 /

Variable v_dummy is in UPPER type case.

PL/SQL procedure successfully completed.

CASE statements can be useful for very simple, compact and repeated tests (such as testing a variable for a range of
values). Other than this, it is unlikely to draw many developers away from IF-THEN-ELSE. The main difference between
CASE and IF is that a CASE statement must find a true WHEN (or ELSE) branch to execute. Oracle has provided a built-in
exception for when it cannot: CASE_NOT_FOUND. The following example shows what happens if the CASE statement cannot
find a true test. We will trap the CASE_NOT_FOUND and re-raise the exception to demonstrate the error message.

SQL> DECLARE

3 v_dummy VARCHAR2(10) := 'dUmMy';

4
5 PROCEDURE output (input VARCHAR2) IS

6 BEGIN

7 DBMS_OUTPUT.PUT_LINE(

8 'Variable v_dummy is in '||input||' type case.');

9 END output;

10

11 BEGIN

12

13 CASE v_dummy

14

15 WHEN 'Dummy'

16 THEN output('INITCAP');

17

18 WHEN 'dummy'

19 THEN output('LOWER');

20

21 WHEN 'DUMMY'

22 THEN output('UPPER');

23

24 END CASE;

25

26 EXCEPTION

27 WHEN CASE_NOT_FOUND THEN

28 DBMS_OUTPUT.PUT_LINE('Ooops!');

29 RAISE;

30 END;

31 /

Ooops!

DECLARE

ERROR at line 1:
ORA-06592: CASE not found while executing CASE statement

ORA-06512: at line 29

The workaround to this is simple: add an "ELSE NULL" to the CASE statement.
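
A sketch of the corrected statement from the example above, with a null statement in the ELSE branch, would be:

CASE v_dummy
   WHEN 'Dummy' THEN output('INITCAP');
   WHEN 'dummy' THEN output('LOWER');
   WHEN 'DUMMY' THEN output('UPPER');
   ELSE NULL;   -- the null statement satisfies the CASE and avoids ORA-06592
END CASE;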
