You are on page 1of 7

Faster, Safer, and Smaller LOBs

By Jonathan Gennick

Oracle SecureFiles provides increased performance, better security, and reduced disk usage.

Many organizations today face the challenge of an explosion in the amount of unstructured, often file-based data
that must be managed in conjunction with the more-structured data typically associated with relational databases.
Banking applications store check images for cleared checks, healthcare applications store digital images from X-
rays and CAT scans, geospatial systems store satellite imagery, enterprise resource planning systems store invoice
images, and the list goes on.

Oracle Database 11g makes significant strides in addressing the problem of unstructured data, by introducing a
completely new storage infrastructure for data that today is often left in file systems. This new infrastructure is
called Oracle SecureFiles, and it is a complete, drop-in replacement of Oracle's previous large-object (LOB)
infrastructure (now called BasicFiles). Every facet of the Oracle SecureFiles LOB architecture, from disk format
to network protocol to redo and undo algorithms, has been rethought and reimplemented to

• Improve performance
• Improve security
• Reduce disk usage

Suppose you're migrating digital assets from databases and file systems and developing a more-complete digital
asset management system such as one a magazine publisher might use. Your system must manage article drafts
and the workflow involved in writing, editing, revising, and publishing that content. This article describes how to
set up and configure Oracle SecureFiles and how to migrate sample content for this digital asset management
system.

Configuring Your Instance

After installing Oracle Database 11g and before using Oracle SecureFiles, check the setting of the new Oracle
Database 11g db_securefile initialization parameter. The default setting is

db_securefile = permitted

This setting makes BasicFiles, not SecureFiles, the default storage approach for any new LOB columns you create
while enabling you to explicitly specify SecureFiles when needed. (BasicFiles is also the LOB architecture in
previous database releases.) Other settings give you various options for forcing the use of one or the other storage
type. For example, you can set db_securefile = force to ensure that all LOBs are created as SecureFiles even
when the creating user explicitly specifies otherwise. You should be aware of which db_securefile setting is in
effect for your database.

Creating Tablespaces

Any tablespaces you plan to use with Oracle SecureFiles need to be configured for automatic segment space
management (ASSM). The key here is to specify SEGMENT SPACE MANAGEMENT AUTO when creating
your LOB tablespaces. The following creates the ARTICLE_LOBS tablespace for this article:

CREATE TABLESPACE
article_lobs
DATAFILE 'H:\APP\JONATHAN\ORADATA\ORCL\ARTICLE_LOBS.DBF'
SIZE 5M
EXTENT MANAGEMENT LOCAL
SEGMENT SPACE MANAGEMENT AUTO;

If you are following along with this article's example, be sure to give yourself adequate quota on the
ARTICLE_LOBS tablespace. You'll also need to adjust the datafile path to something appropriate for your
system.

Setting Up for Encryption

Oracle SecureFiles provides optional transparent data encryption of LOBs. Transparent data encryption protects
LOB data from unauthorized users who might somehow gain access to a datafile. This encryption is completely
transparent to end users and applications.

If you plan to use encryption with Oracle SecureFiles, you'll need to create a transparent data encryption wallet to
hold the encryption key. First, create a directory to contain the wallet. You can place that directory underneath
$ORACLE_HOME. For example

mkdir H:\app\Jonathan\product\11.1.0\db_1\wallet

Next, add an ENCRYPTION_WALLET_LOCATION setting to your sqlnet.ora file. You'll find sqlnet.ora in
$ORACLE_HOME/network/admin. Following is the setting corresponding to the directory just created:

ENCRYPTION_WALLET_LOCATION=
(SOURCE=(METHOD=FILE)(METHOD_DATA=
(DIRECTORY= H:\app\Jonathan\product\
11.1.0\db_1\wallet)))

Be certain to include the ENCRYPTION_WALLET_LOCATION setting in sqlnet.ora as one long line. And
again, if you are following along with this article's example, you'll need to adjust the wallet directory to
something appropriate for your own system.

Finally, set an encryption key:

ALTER SYSTEM SET ENCRYPTION KEY


AUTHENTICATED BY "My secret key";

With this setting, your Oracle Database instance will generate an encryption key and place it in the wallet. The
wallet is then protected by the password "My secret key."

Creating SecureFile LOBs

Creating an Oracle SecureFiles LOB can be as simple as specifying STORE AS SECUREFILE in the LOB storage
clause of a CREATE TABLE or ALTER TABLE statement. (Be sure to assign your SecureFiles LOBs to
appropriate, ASSM-enabled tablespaces.) Listing 1 creates the article_draft_redef table, with a single, SecureFiles
LOB column named article_content. Data for that LOB is assigned to the ARTICLE_LOBS tablespace (created
earlier in this article). Each row in the table contains a magazine article at a specific stage in the
writing/editing/revising cycle.

Code Listing 1: Creating an Oracle SecureFiles LOB

CREATE TABLE article_draft_redef (


article_id NUMBER,
article_stage VARCHAR(10),
article_content BLOB,

CONSTRAINT article_stage_check_b
CHECK (article_stage IN (
'1st Draft', 'Edit Pass', '2nd Draft',
'Copyedit', 'Review', 'Final')),

CONSTRAINT article_draft_pk_b
PRIMARY KEY (article_id)
)
LOB (article_content) STORE AS SECUREFILE
article_content (
TABLESPACE article_lobs
RETENTION MIN 3600
KEEP_DUPLICATES NOCOMPRESS DECRYPT
CACHE READS);

Listing 1 shows some of the options available when creating a new LOB.

RETENTION MIN 3600 ensures that at least one hour (60 minutes x 60 seconds/minute = 3,600) of undo data is
retained for the LOB column. This ability to specify a minimum retention time is helpful for running your
database in flashback mode. The default is RETENTION AUTO, which retains only enough undo to satisfy
consistent read requests while a transaction is in progress. You can also specify a maximum retention in bytes or
no retention at all. See the Oracle SQL Reference for details on the syntax for retention.

KEEP_DUPLICATES, NOCOMPRESS, and DECRYPT explicitly disable deduplication, compression, and


encryption. I'll come back to these options later in the article. In real life, you would likely enable one or more of
them from the beginning.

CACHE READS causes LOB data to be placed into the buffer cache during read operations but not during write
operations. You gain improved read performance here at the price of potentially crowding other data out of the
buffer cache. NOCACHE specifies the default behavior, which is never to bring a LOB's data into the buffer
cache.

I've created the article_draft_redef table in Listing 1 to migrate a table that uses the older BasicFiles to the newer
SecureFiles storage method. To follow along with the migration, download the o57securefiles.zip file, unzip that
file, and follow the instructions in readme.txt to create the original table named article_draft that I will be
migrating. The logical column structure of article_draft is exactly the same as that of article_draft_redef.

The article_draft table contains existing LOB data stored with BasicFiles. I'll assume that that data is in use by
running applications and that I want to migrate that data from BasicFiles to SecureFiles. The article_draft_redef
table, with its SecureFiles LOB column, represents the destination of the article_draft table data. (Bonus! There is
even a real article for you to read in the example data.)

Planning a Migration

Because Oracle SecureFiles represents a completely new way of writing LOB data to the database, the only way
to migrate LOB data from BasicFiles to SecureFiles is essentially to rewrite the data by recreating or redefining
the tables containing BasicFiles columns. If you can afford to take your data offline, you can simply select data
from your old table and insert it into the new one. Begin with a statement such as
INSERT INTO article_draft_redef
SELECT article_id,
article_stage,
article_content
FROM article_draft;

Follow this INSERT with a DROP and a RENAME to make the new table take the place of the old one:

DROP TABLE article_draft;

RENAME article_draft_redef
TO article_draft;

There's really no need to take your data offline, though. Instead, consider using the online table redefinition
feature introduced in Oracle9i Database Release 1. Listing 2 shows a PL/SQL block that migrates the article_draft
table to the structure indicated by the article_draft_redef table. Online table definition is very easy to do, and it
makes the migration transparent to users and applications, because the table being migrated remains available the
entire time.

Code Listing 2: Migrating by online redefinition

DECLARE
error_counter PLS_INTEGER;
BEGIN

--Begin the redefinition process


DBMS_REDEFINITION.START_REDEF_TABLE (
'gennick', 'article_draft', 'article_draft_redef',
'article_id, article_stage, article_content');

--Finish the redefinition process


DBMS_REDEFINITION.FINISH_REDEF_TABLE (
'gennick', 'article_draft', 'article_draft_redef');

END;

If your LOB table happens to be partitioned, another approach to consider is partition exchange. You can specify
LOB storage on a per- partition basis, giving you the option of migrating one partition at a time from BasicFiles
to SecureFiles. Use the URL under "Next Steps" to go to Arup Nanda's article on partitioning features, "Partition
Decisions," where you will find an example showing how partition exchange is done.

Be sure you have enough tablespace and disk space set aside to support whatever migration method you choose.
For the online redefinition approach, you'll need enough space for two complete copies of your data to coexist.
Partition exchange requires only enough disk space to hold two copies of whichever partition you are exchanging.
Bear in mind that one downside of partition exchange is that the partition being swapped needs to be briefly taken
offline. In the end, you'll have to weigh the different approaches and their trade-offs and choose the approach that
works best in your own situation.

Confirming Migration

You can confirm the SecureFiles status of a given LOB column by checking that column's segment subtype. For
example, execute the following to confirm that article_content is now a SecureFiles LOB:

SELECT segment_subtype
FROM user_segments
WHERE segment_name='ARTICLE_CONTENT';

The result you want to see is SECUREFILE, which indicates SecureFiles storage. If you see ASSM, you're still
set for BasicFiles storage.

Enabling Deduplication

Having migrated to Oracle SecureFiles, you can choose to enable deduplication, an Oracle SecureFiles feature
whereby the database server stores only one copy of a given LOB in a given column within the same partition.
Two or more users can independently store the same data in a LOB column—storing the same article draft twice,
for example—and the database keeps track of that duplication, storing only one copy of the data. Deduplication is
transparent. Users each perceive that they have their own copy of deduplicated data, even though that is not really
the case.

Code Listing 3: Checking space used by a LOB

DECLARE
seg_blocks NUMBER;
seg_bytes NUMBER;

used_blocks NUMBER;
used_bytes NUMBER;

expired_blocks NUMBER;
expired_bytes NUMBER;

unexpired_blocks NUMBER;
unexpired_bytes NUMBER;

BEGIN
DBMS_SPACE.SPACE_USAGE (
'GENNICK', 'ARTICLE_CONTENT', 'LOB'
, seg_blocks, seg_bytes, used_blocks, used_bytes
, expired_blocks, expired_bytes, unexpired_blocks, unexpired_bytes);

DBMS_OUTPUT.PUT_LINE ('Bytes used = ' || to_char(used_bytes));


END;

In a real-life migration, it would likely make sense to deduplicate during, not after, the migration process. For
example purposes, though, it's nice to be able to see that removal of duplicates is actually occurring. Run the code
in Listing 3 to see how many bytes are in use by the LOB segment. (Be sure to change the first parameter of the
procedure call to a valid schema name on your system.) On my system, the result is

Bytes used = 450560

To deduplicate the data in the article_content column in the article_draft table, issue the following ALTER
TABLE statement:

ALTER TABLE article_draft


MODIFY LOB(article_content)
(DEDUPLICATE LOB);
Existing LOBs in the article_content column will be scanned, and any current duplicates will be detected and
eliminated. Any new LOB values written to the column will be checked against existing values to prevent future
duplication.

Now, run the code in Listing 3 again and you can see the effects of the deduplication. The number of bytes used
by the segment should be reduced. On my system, I see the following, lower usage:

Bytes used = 376832

The number of bytes used in the LOB segment has dropped because document #6 is a duplicate of an article
stored in a different row. Because of deduplication, that article is not stored twice. Instead, a pointer is stored to
the LOB in that other row. When you read the contents of the LOB for document #6, that pointer is dereferenced
for you in a completely transparent manner.

Duplicate detection is based on checksums computed with the SHA1 cryptographic hash algorithm. When you
write a new LOB into a LOB segment, a checksum is computed over the first n bytes of that new LOB. If that
checksum doesn't match that for any existing LOB in the segment, the new LOB is written to the database.
Checksums are saved for each LOB, for future deduplication.

When the checksum for a newly incoming LOB matches an existing checksum, there is the strong possibility that
the two LOBs are identical. The database instance will begin doing a byte-by-byte comparison of the incoming
LOB data with the possible duplicate that is already stored. That possible duplicate is referred to as the primary
LOB. The comparison process reads the primary LOB, but no new data is written, so long as the LOB data
continues to match. If the LOBs compare as equal, only a pointer to the primary LOB is written. If the
comparison fails, data from the primary LOB up to that point is used to construct the first part of the new LOB,
and remaining incoming data for the new LOB is written to disk.

Enabling Compression

Compression encodes the data in a LOB to reduce the number of bytes required to store that LOB. Compression
is transparent and reduces the amount of storage needed for your LOB data, giving you more benefit from each
dollar your organization spends on disk storage systems. Compression requires additional CPU cycles, so be sure
that compression is worthwhile before enabling it on a LOB column. For example, image data, such as PNG files
and JPEG files, is often compressed as part of the encoding scheme used. Further compression of image files may
be a waste of CPU cycles. But text-based data such as XML and word-processing documents tends to compress a
great deal, making the trade-off of CPU time for more disk space very worthwhile.

To enable compression on the sample SecureFiles LOB column, issue the following ALTER TABLE statement:

ALTER TABLE article_draft


MODIFY LOB(article_content)
(COMPRESS HIGH);

This statement activates a high level of compression on the column data. You can specify MEDIUM to trade away
some of that compression for reduced CPU cycles. There is currently no LOW compression option, but that
keyword is reserved for future implementation.

If you rerun the code in Listing 3, you should see that even less space is now used by the LOB segment. I get the
following result:

Bytes used = 90112


Remember that compression has multiple benefits. Disk space requirements are reduced. Less data needs to be
transferred back and forth, leading to better performance. Redo generation is reduced. Less buffer cache data gets
used, leaving more room for other objects to be in the cache.

Enabling Encryption

Finally, we come to transparent data encryption. LOB data that is encrypted is protected even when datafiles or
backup files fall into the wrong hands. Encrypted LOB data is protected even while being sent across the network.

To encrypt LOBs in Oracle SecureFiles, you can choose from four encryption algorithms: 3DES168, AES128,
AES192, and AES256. For example, to choose AES256 encryption, which is a 256-bit algorithm, issue the
following ALTER TABLE statement:

ALTER TABLE article_draft


MODIFY LOB(article_content)
(ENCRYPT USING 'AES256');

Remember that transparent data encryption protects your data from malevolent users who might happen to come
into possession of a database datafile or backup piece. You still must manage access to database objects within
your database, though, so that only authorized database users are allowed access to query the LOB data you are
protecting.

Reap the Benefits!

Oracle SecureFiles offers three compelling features you can take advantage of immediately without making any
changes to your LOB-using applications:

• Deduplication
• Compression
• Encryption

Preliminary testing also shows significant increases in performance through use of Oracle SecureFiles. Oracle's
in-house testing shows a 200 to 900 percent increase in write performance, depending on LOB size and whether
existing segment space is being reused. Early-adopter clients have reported 300 to 700 percent increases in
performance of key application functions.

Because Oracle SecureFiles is completely backward-compatible with BasicFiles, even to the point of using the
same datatype names, you can begin to take advantage of everything Oracle SecureFiles has to offer, almost from
the moment you upgrade to Oracle Database 11g. You do need to rewrite your LOB data in the database, but you
can do that while keeping the data online, and the change is completely transparent to both users and applications.

Oracle SecureFiles is a gift. Drop it in. Take immediate advantage of improved performance, save money on disk
storage, and secure your data from those who would do you harm.

You might also like