
Database LOB Sizing and Performance Optimization PegaRULES Process Commander v 5.

Copyright 2009 Pegasystems Inc., Cambridge, MA


All rights reserved.

This document describes products and services of Pegasystems Inc. It may contain trade secrets and proprietary information. The document and product are protected by copyright and distributed under licenses restricting their use, copying distribution, or transmittal in any form without prior written authorization of Pegasystems Inc. This document is current as of the date of publication only. Changes in the document may be made from time to time at the discretion of Pegasystems. This document remains the property of Pegasystems and must be returned to it upon request. This document does not imply any commitment to offer or deliver the products or services described. This document may include references to Pegasystems product features that have not been licensed by your company. If you have questions about whether a particular capability is included in your installation, please consult your Pegasystems service consultant. For Pegasystems trademarks and registered trademarks, all rights reserved. Other brand or product names are trademarks of their respective holders. Although Pegasystems Inc. strives for accuracy in its publications, any publication may contain inaccuracies or typographical errors. This document could contain technical inaccuracies or typographical errors. Changes are periodically added to the information herein. Pegasystems Inc. may make improvements and/or changes in the information described herein at any time.

This document is the property of:
Pegasystems Inc.
101 Main Street
Cambridge, MA 02142-1590
Phone: (617) 374-9600
Fax: (617) 374-9620
www.pega.com

PegaRULES Process Commander
Document: Database LOB Sizing and Performance Optimization
Software Version 5.4
Updated: February 4, 2009

Database LOB Sizing and Performance Optimization

CONTENTS
1 OVERVIEW
2 SUGGESTED APPROACH
3 THE STRUCTURE OF PEGARULES DATA
4 DATABASE SIZING SCRIPTS AND SCRIPT OUTPUT
5 ORACLE BLOB SIZING
    5.1 Using the Oracle Sizing Script
    5.2 Effects of Row Storage, Chunk Size, and BLOB Caching
    5.3 Performance and Sizing Guidelines for PegaRULES BLOBs in Oracle
    5.4 Exposed Columns and Table Sizing in Oracle Databases
    5.5 Performance and Sizing Guidelines for Oracle Tables
6 DB2 BLOB SIZING
    6.1 Using the Sizing Scripts for DB2 Versions Earlier Than V8.2
    6.2 Using the Sizing Scripts for DB2 Versions 8.2 and Later
    6.3 BLOB Storage in DB2 UDB
    6.4 Performance and Sizing Guidelines for PegaRULES BLOBs in DB2
    6.5 Exposed Column and Table Sizing in DB2 Databases
    6.6 Performance and Sizing Guidelines for DB2 Tables
7 SQL SERVER BLOB SIZING
    7.1 Using the Sizing Script with SQL Server 2000
    7.2 Using the Sizing Script with SQL Server 2005/2008
    7.3 BLOB Storage in SQL Server
    7.4 Performance and Sizing Guidelines for PegaRULES BLOBs in SQL Server
    7.5 Exposed Column and Table Sizing in SQL Server Databases


1 Overview
This document explains how to estimate disk requirements for the PegaRULES database by running sizing scripts, calculating an estimated size, and tuning performance for BLOB (Binary Large Object) data. Database system types described in this document are Oracle, DB2, and SQL Server. This document is intended for database and system administrators. Readers of this document should be familiar with the management of their database system and the PRPC database.

2 Suggested Approach
Run the BLOB sizing scripts during new application development, at the point when the developers have some knowledge of the volume of data being generated and need to estimate the size of the database and its BLOBs. Developers of more mature applications should run the sizing procedures to estimate disk space whenever new work objects or work types are introduced that will put a new strain on the database. Important factors in both cases include the size of the application's work objects and whether BLOB compression is enabled. Finally, the procedures can be run on a schedule, as part of regular database administration, to help the DBA estimate the growth of each table in the database.

Typically, database management systems store BLOB data in an area separate from the data in exposed columns. In addition, default caching and access methods for this binary data may differ from those used for regular data. Therefore, sizing and managing the BLOB data for space, performance, and scale requires careful consideration. Sizing information generated by the script helps a DBA decide on the appropriate placement and sizing of the BLOB data for each table in PRPC.

Sizing entails a three-part procedure:
1. Run the sizing script appropriate for the database type to collect storage allocation information on both the BLOB column and all exposed columns for each table in the database.
2. Use the script data as input to sizing calculations. These yield the sizing estimates for the tables.
3. Adjust database settings accordingly. Use the given performance guidelines to re-organize the storage of the database tables.

Compression, which can reduce BLOB size by a third or more, is enabled by a property setting in the PRPC pegarules.xml file (versions 4.x and earlier) or the prconfig.xml file (versions 5.x and later). Compression is turned on by default. Refer to PRKB-9850, How to Compress the BLOB Values in the PegaRULES Database, for further information. Scripts for each supported database system are in the zip file Blob_sizing.zip. Download this zip from: http://pdn.pega.com/DevNet/PRPCv4/TechnologyPapers/TechPapers.asp

3 The Structure of PegaRULES Data


PRPC stores data in the PegaRULES database in two formats. Binary data, including multimedia content, is stored as a storage stream, a column that contains property data in a compressed format. In Oracle, SQL Server, and DB2, the storage stream is stored as a binary large object, or BLOB, data type. Most PegaRULES tables contain a storage stream column named pzPVStream.

Text data, used by the application, is stored in relational, or exposed, columns in database tables. A developer might choose to add to the exposed data by adding text columns for data maintained in the BLOBs.

4 Database Sizing Scripts and Script Output


For each of the supported database systems, this table lists script names, DB versions supported by the scripts, and whether the script requires editing before it is executed.
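DB System  | Sizing Script Name                         | DB Versions Supported                                                        | Edit?
-----------+--------------------------------------------+------------------------------------------------------------------------------+------
Oracle     | oracle_sizing.sql                          | All versions                                                                 | No
DB2 UDB    | db2_sizing_proc.sql and db2_sizing_run.sql | Version 8.2 and later (Version 8.2 and Version 8.1 Fix Pack 7 are the same.) | Yes
DB2 UDB    | db2_pre8.2_sizing.db2                      | All versions earlier than 8.2                                                | Yes
SQL Server | sql_server_sizing.sql                      | All versions                                                                 | Yes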

Script Output

For each table in the PegaRULES schema, the script output lists:
    Name of Table
    Row Count
    Average Row Length: The average row size in bytes, including the exposed columns and the table's LOB locator value.
    Average BLOB Length: The average size in bytes of the table's BLOB column.
    Max BLOB Length: The size in bytes of the table's largest BLOB.

A PegaRULES table can have either zero or one BLOB column. A table that meets both of the following conditions does not have a BLOB column:
    Average BLOB length and Max BLOB length are each reported as 0, -, or NULL.
    Number of rows is greater than 0.

5 Oracle BLOB Sizing


This section provides performance and sizing guidelines for PRPC systems using Oracle.

5.1 Using the Oracle Sizing Script

The download location for the sizing script is: http://pdn.pega.com/DevNet/PRPCv4/TechnologyPapers/TechPapers.asp

1. Download and open Blob_Sizing.zip.
2. Extract oracle_sizing.sql to a local directory on your system and navigate to that directory.
3. Log on to the Oracle database using SQL*Plus with DBA credentials. For example:
    C:\>sqlplus sys/manager as sysdba
4. Spool the script output to a text file called pega_sizes.txt.
    sql>spool pega_sizes.txt
5. Run the script using the name of the PegaRULES schema as the argument.
    sql>@oracle_sizing.sql PEGA
   where PEGA is the schema name for the PegaRULES application. The schema name must be in upper case letters.
6. Turn off spooling and exit.
    sql>spool off;
    sql>exit

The resulting file (pega_sizes.txt) contains a tabular report that shows row and column size statistics for each table in the PegaRULES schema.
SQL> @oracle_sizing.sql PEGA
Table Name           | NUM ROWS | AVG ROW SZ | MAX LOB LEN | AVG LOB LEN
PC_ASSIGN_WORKBASKET |        0 |          0 |             |
PC_ASSIGN_WORKLIST   |        1 |       2237 |        1867 |        1867
PC_DATA_WORKATTACH   |        0 |          0 |             |
PC_HISTORY_WORK      |       13 |       1598 |        1625 |        1117
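
The report can be post-processed before doing the sizing calculations. The following Python sketch is illustrative only and is not part of Blob_sizing.zip: it assumes the spooled rows are pipe-delimited as in the sample above, and the names parse_sizing_report and lacks_blob_column are hypothetical. It reads each row into a dictionary and applies the two conditions from Section 4 for recognizing a table without a BLOB column.

```python
SAMPLE_REPORT = """\
Table Name           | NUM ROWS | AVG ROW SZ | MAX LOB LEN | AVG LOB LEN
PC_ASSIGN_WORKBASKET |        0 |          0 |             |
PC_ASSIGN_WORKLIST   |        1 |       2237 |        1867 |        1867
PC_DATA_WORKATTACH   |        0 |          0 |             |
PC_HISTORY_WORK      |       13 |       1598 |        1625 |        1117
"""

COLUMNS = ["table", "num_rows", "avg_row_sz", "max_lob_len", "avg_lob_len"]


def parse_sizing_report(text):
    """Turn pipe-delimited report rows into a list of dictionaries."""
    records = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the header and any line that is not a five-column data row.
        if len(cells) != len(COLUMNS) or not cells[1].isdigit():
            continue
        record = {"table": cells[0]}
        for name, cell in zip(COLUMNS[1:], cells[1:]):
            # Blank, "-" and NULL cells are all treated as "no value reported".
            record[name] = int(cell) if cell.isdigit() else None
        records.append(record)
    return records


def lacks_blob_column(record):
    """True when the table has rows but reports no BLOB lengths (Section 4)."""
    no_lob_stats = not record["max_lob_len"] and not record["avg_lob_len"]
    return record["num_rows"] > 0 and no_lob_stats


for record in parse_sizing_report(SAMPLE_REPORT):
    print(record["table"], record["num_rows"], record["avg_lob_len"])
```

An empty table (zero rows) is deliberately not classified, because the report cannot distinguish a missing BLOB column from a column that simply has no data yet.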

5.2 Effects of Row Storage, Chunk Size, and BLOB Caching

Row Storage. Oracle uses a complex strategy for storing and accessing BLOB data. BLOBs may be stored either in the row (along with other data) or in a separate storage area. If the clause ENABLE STORAGE IN ROW is used during table creation, and if the size of the BLOB is less than 3964 bytes, it will be stored together with other row data. When BLOBs are stored in a row, multiple BLOBs may reside in a single data block. ENABLE STORAGE IN ROW is the default setting for all tables. If the DISABLE STORAGE IN ROW clause is used during table creation, then regardless of the BLOB's size, it is stored in a separate area.
Chunk Size. A chunk is one or more Oracle blocks, and the default chunk size is one block. When data is stored in an area outside of the row, the CHUNK SIZE parameter sets the number of bytes Oracle will read in and out of the BLOB at a time. Specify the chunk size for the LOB when creating the table that contains the LOB. Chunk size corresponds to the data size used in Oracle when accessing or modifying the LOB value. Set CHUNK SIZE to a multiple of the tablespace block size for that LOB column. For example, if the block size is 8192 bytes, using a chunk size of 16384 bytes will require two blocks for each row with BLOB data (16384 bytes / 8192 bytes). If the CHUNK SIZE value is less than the block size, or not a multiple of it, Oracle sets the CHUNK SIZE to the next closest multiple of the block size. In the above example, if the CHUNK SIZE is set to 1024, Oracle resets it to 8192 bytes.
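
The rounding rule can be checked with a few lines of arithmetic. This Python sketch is an illustration, not Oracle code; the function name effective_chunk_size is hypothetical, and it assumes that "next closest multiple" means rounding up to the next whole multiple of the block size, which matches the 1024-to-8192 example above.

```python
def effective_chunk_size(requested_bytes, block_size_bytes):
    """Round a requested CHUNK value up to a whole multiple of the block size."""
    if requested_bytes <= 0 or block_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    blocks = -(-requested_bytes // block_size_bytes)  # ceiling division
    return blocks * block_size_bytes


# With an 8192-byte block, a request of 1024 bytes becomes one full block,
# and a request of 16384 bytes is already two whole blocks.
for requested in (1024, 8192, 10000, 16384):
    print(requested, "->", effective_chunk_size(requested, 8192))
```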
BLOB Caching. BLOBs that are stored in table rows use the same buffer caching rules as the tabular row, as they reside in the same data block as the rows. By default, BLOBs stored out of the row use direct path reads and writes to store and retrieve data without caching. To cache out-of-row BLOBs, use the CACHE parameter when defining the BLOB storage clause. This parameter causes reads and writes to BLOB data to use the buffer cache. Since CACHE is intended for frequently accessed data, set the CACHE parameter for BLOBs that are small enough to stay in memory without excessive thrashing. An alternative to the CACHE parameter is CACHE READ. This causes BLOB data to be brought into the buffer cache during read operations only. During write operations, the BLOB will use direct path writes and bypass the buffer cache.


5.3 Performance and Sizing Guidelines for PegaRULES BLOBs in Oracle

1. Retain the ENABLE STORAGE IN ROW setting for all tables where the maximum BLOB size is less than 3964 bytes. This allows the BLOBs to be stored in the same row as other data. Therefore, you can have multiple BLOBs in a single data block.
2. After getting the sizing information from the oracle_sizing.sql script, you may want to include extra space for growth in BLOB columns. For example, if you have tables with BLOB sizes between 7KB and 8KB, you might want to increase the default buffer size to 16KB to accommodate growth.
3. For tables with large BLOBs, set the CHUNK SIZE of the table to be close to the average BLOB size. This allows BLOB columns to be written and read in the increments of the CHUNK SIZE, while allowing the BLOBs to reside in the same tablespace as the table data.
4. Review the wait statistics of a Statspack or Automatic Workload Repository report to see if there are excessive direct path read/write events. If the size of the BLOBs on the tables with the excessive reads/writes is less than 2MB, use the CACHE parameter. Using the CACHE parameter for tables with large BLOBs may cause excessive thrashing of the buffer cache due to the large objects competing for RAM with other database objects.
   NOTE: Discussion on how to run Statspack reports is out of the scope of this document. For more information, see the Performance Tuning guide in the Oracle documentation set.
5. Place large BLOBs into their own tablespace with a larger block size. You can then allocate a separate buffer cache of the appropriate block size for these BLOBs. This feature is applicable for versions 9i and higher. This allows you to separate the larger BLOBs, while still taking advantage of the buffer cache.
6. Using the NOLOGGING parameter for PegaRULES BLOBs is strongly discouraged. This could potentially leave the application in an unstable state, due to the lack of logging information available if the database fails.
7. Consider creating the tablespace with Automatic Storage Management using large fixed extents, if there are excessive High Watermark enqueues in a Statspack report. This optimizes performance and disk utilization.
8. If it is not already enabled, consider enabling compression through the pegarules.xml file (in versions 4.2 and earlier) or prconfig.xml file (versions 5.1 and later). BLOB compression can greatly improve utilization of disk storage. Refer to PRKB-9850, How to Compress the BLOB Values in the PegaRULES Database, for further information.
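
Guidelines 1 and 4 reduce to two threshold checks against the figures in the sizing report. The Python sketch below is only an illustration of those two checks; the function and key names are hypothetical, and the flag for excessive direct path I/O is assumed to come from the DBA's own reading of a Statspack or AWR report.

```python
IN_ROW_LIMIT_BYTES = 3964            # in-row threshold quoted in guideline 1
CACHE_LIMIT_BYTES = 2 * 1024 * 1024  # 2MB ceiling for CACHE in guideline 4


def suggest_lob_settings(max_lob_bytes, excessive_direct_path_io=False):
    """Apply the thresholds of guidelines 1 and 4 to one table's report figures."""
    return {
        # Guideline 1: every BLOB in the table is small enough to stay in the row.
        "all_blobs_fit_in_row": max_lob_bytes < IN_ROW_LIMIT_BYTES,
        # Guideline 4: CACHE is suggested only when direct path waits are
        # excessive and the BLOBs are smaller than 2MB.
        "use_cache": excessive_direct_path_io and max_lob_bytes < CACHE_LIMIT_BYTES,
    }


print(suggest_lob_settings(1867))
print(suggest_lob_settings(40000, excessive_direct_path_io=True))
```

The remaining guidelines (tablespace placement, logging, compression) depend on site-specific judgment and are not modeled here.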

5.4 Exposed Columns and Table Sizing in Oracle Databases

The tables that are most likely to grow over time are:
    pc_history_work
    pc_assign_workbasket
    pc_assign_worklist
    pc_work
    pr_log_usage or pr4_log_usage (PRPC versions 5.1 and earlier), or pr_perf_stats (all later PRPC versions)

Depending on the flows and the processes developed, additional tables (such as pc_other) may grow as well; discuss this issue with your PRPC System Architect.

Use the data from pega_sizes.txt to calculate the size required for the exposed columns for each table.

1. Estimate the number of rows per DB block by dividing the block size by the average row size from pega_sizes.txt.
    DB Block Size / Average row size = Rows per block
2. Calculate the number of blocks required for the table by dividing the number of rows by the number of rows per block.
    Number of rows in table / Number of rows per block = Number of blocks needed per table
3. Calculate the space required for the exposed columns by multiplying the blocks needed by the block size.
    Number of blocks needed per table * block size = Space required

For example, consider this space calculation for a database created with a block size of 8K, in which the table pc_work contains 2000 rows with an average row length of 1.5K (from pega_sizes.txt).
    Block size / Ave row size = Rows per block
    8K / 1.5K = 5
    Rows in table / Rows per block = Blocks required
    2000 / 5 = 400
    Blocks required * Block size = Space required
    400 * 8K = 3200K (3.2MB)

In this example, the pc_work table requires approximately 400 8K blocks for storage, or 3.2 MB of space.

Notes
Keep in mind that this figure is an estimate, because it does not account for the per-block header of approximately 80 to 150 bytes, nor for any percent free space value that might have been specified when the table was allocated. Both values must be subtracted from the DB block size to report a more accurate value.
Discard any remainder portion of the rows per block value. It is necessary to round down this number, because the system cannot split a row across blocks (8 / 1.5 = 5.33, rounded down to 5).
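
The three steps can be expressed as a small calculation. This Python sketch is illustrative: the function name estimate_table_space is hypothetical, sizes are given in bytes (8K as 8192, 1.5K as 1536), and the block count is rounded up when the rows do not divide evenly, which is an assumption the text does not state. Like the worked example, it ignores the block header and percent free space.

```python
def estimate_table_space(block_size, avg_row_size, num_rows):
    """Return (rows_per_block, blocks_needed, bytes_needed) for the exposed columns."""
    rows_per_block = block_size // avg_row_size  # discard the remainder
    if rows_per_block < 1:
        raise ValueError("average row does not fit in one block")
    blocks_needed = -(-num_rows // rows_per_block)  # round up to whole blocks
    return rows_per_block, blocks_needed, blocks_needed * block_size


# The pc_work example: 8K blocks, 1.5K average row length, 2000 rows.
rows_per_block, blocks, space = estimate_table_space(8192, 1536, 2000)
print(rows_per_block, blocks, space // 1024, "K")
```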

5.5 Performance and Sizing Guidelines for Oracle Tables

1. After determining the size of the BLOB and the exposed columns for each table, add the two to find the total estimated size requirement for the PegaRULES database.
2. Performance may be gained by setting the tables PR_SYS_LOCKS and PC_UNIQUE_ID to use the KEEP buffer pool. These are small and highly interactive tables that can be permanently maintained in the buffer cache.
3. Analyze all database tables and indexes regularly, or set up an Automatic Workload Repository report in version 10g, so as to provide the optimizer with the appropriate information to choose the best explain plans for SQL queries.
4. The parameter db_file_multiblock_read_count specifies the number of blocks that are read in a single I/O during a full table scan. You can use this parameter to set up data retrieval efficiently. If the db_file_multiblock_read_count parameter is not set, then only one block is returned per read, no matter how much data the system can return.

For example, if a user has requested a large amount of data (256K) through a Windows system, the system can return this data in 64K increments. If the database block size is 8K, and db_file_multiblock_read_count is not set, then each read will return only 8K. It would require 32 reads to return all the data. However, if db_file_multiblock_read_count is set to 8, the system will return 8 blocks of 8K with one read. Thus only four reads are necessary, greatly increasing the efficiency of the database access.
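
The read counts in this example follow from simple division. The Python sketch below reproduces them; the function name reads_required and the optional max_io_bytes cap (standing in for the 64K increment the operating system can return) are hypothetical and are used only for illustration.

```python
def reads_required(request_bytes, block_size_bytes, multiblock_read_count=1,
                   max_io_bytes=None):
    """Number of reads needed to return request_bytes during a full table scan."""
    bytes_per_read = block_size_bytes * multiblock_read_count
    if max_io_bytes is not None:
        # A single read cannot return more than the system's I/O increment.
        bytes_per_read = min(bytes_per_read, max_io_bytes)
    return -(-request_bytes // bytes_per_read)  # ceiling division


KB = 1024
print(reads_required(256 * KB, 8 * KB))                              # one block per read
print(reads_required(256 * KB, 8 * KB, multiblock_read_count=8))     # eight blocks per read
print(reads_required(256 * KB, 8 * KB, 16, max_io_bytes=64 * KB))    # capped at 64K per read
```

Under these assumptions, raising the parameter beyond the point where one read fills the 64K increment brings no further reduction in the number of reads.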

6 DB2 BLOB Sizing


This section provides performance and sizing guidelines for PRPC systems running on DB2.

6.1 Using the Sizing Scripts for DB2 Versions Earlier Than V8.2

Use the sizing script db2_pre8.2_sizing.db2 if you are running DB2 versions earlier than 8.2. The download location for the sizing script is: http://pdn.pega.com/DevNet/PRPCv4/TechnologyPapers/TechPapers.asp

1. Open Blob_Sizing.zip and extract the script db2_pre8.2_sizing.db2 to a local directory.
2. From DB2's Command Center Utility, run the RUNSTATS command for all tables in the PegaRULES schema.
3. Edit the db2_pre8.2_sizing.db2 script: Substitute your database name, logon user and user password for the placeholders in the script's first line:
    Connect to <DATABASE> user <USERNAME> using <PASSWORD>;
   Enter an output directory location for the files exported by the script.
4. Run the script:
    $>db2 -td@ -f db2_pre8.2_sizing.db2
   The output file, avg_row_length.txt, contains a list of tables, along with the avg_col_size and number of pages used for each table.
5. This script creates another SQL script file (by default in C:\) called SIZESCRIPT.DB2. Open this script for editing. Substitute your database name, logon user and user password for the placeholders in the script's first line:
    Connect to <DATABASE> user <USERNAME> using <PASSWORD>;
   Find all instances of the double-quote character (") and replace with a blank character (space).
6. Run the updated script:
    $>db2 -t -f SIZESCRIPT.DB2 -r lob_sizes.txt

The output file, lob_sizes.txt, will contain a list of table names in the PegaRULES schema, along with a count of the number of rows and the average and maximum size of the LOB field.
    Database Connection Information

    Database server        = DB2/NT 8.1.0
    SQL authorization ID   = PRPCV51
    Local database alias   = PEGA

    TABLENAME            NUM_ROWS    AVG_LOB_LEN MAX_LOB_LEN
    -------------------- ----------- ----------- -----------
    PC_ASSIGN_WORKBASKET

      1 record(s) selected.

    TABLENAME          NUM_ROWS    AVG_LOB_LEN MAX_LOB_LEN
    ------------------ ----------- ----------- -----------
    PC_ASSIGN_WORKLIST           0

      1 record(s) selected.

    TABLENAME        NUM_ROWS
    ---------------- -----------
    PC_DATA_UNIQUEID           1

      1 record(s) selected.
    ...

6.2 Using the Sizing Scripts for DB2 Versions 8.2 and Later

The later DB2 versions use the SQL scripts db2_sizing_proc.sql and db2_sizing_run.sql to determine the size of the LOB columns. The download location for the sizing script is: http://pdn.pega.com/DevNet/PRPCv4/TechnologyPapers/TechPapers.asp

Note: You must have a compiler installed on your DB2 server in order to install and run stored procedures.

1. Open Blob_Sizing.zip and extract the two sizing scripts for DB2 8.2 and later to a local directory.
2. From DB2's Command Center Utility, run the RUNSTATS command for all tables in the PegaRULES schema.
3. Substitute your database name, logon user and user password for the placeholders in the script's first line:
    Connect to <DATABASE> user <USERNAME> using <PASSWORD>;
4. Run the db2_sizing_proc.db2 script to set up the stored procedures required to return the results.
    $>db2 -td@ -f db2_sizing_proc.db2
5. Run the db2_sizing_run.db2 script to create the output file table_sizes.txt.
    $>db2 -td@ -f db2_sizing_run.db2 -r table_sizes.txt

This file lists the average row length, number of rows, and average LOB size for tables in the PegaRULES schema. Run this script on a regular schedule to determine the growth rate of the PegaRULES database.

6.3 BLOB Storage in DB2 UDB

In DB2, LOB columns are stored in a different physical format, separate from other data in a row. LOB data is stored in 64 MB areas broken up into segments starting at 1024 bytes and doubling in size with each successive segment (that is, the first segment is 1024 bytes, the next 2048, the next 4096 bytes, etc., up to 64MB). You can store BLOB columns and indexes in separate tablespaces from the tabular rows using the LONG IN clause of the CREATE TABLE statement. Note that the tablespace specified for the table with a BLOB must be a DMS tablespace, and the tablespace specified for the BLOB column must be a LARGE DMS tablespace.
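
The doubling sequence of segment sizes described above can be listed directly. This Python sketch only enumerates that sequence and, as a rough illustration, finds the smallest segment size that would hold a LOB of a given length; it makes no claim about how DB2 actually assigns segments internally, and the function names are hypothetical.

```python
FIRST_SEGMENT_BYTES = 1024
LARGEST_SEGMENT_BYTES = 64 * 1024 * 1024  # 64MB


def lob_segment_sizes():
    """Segment sizes starting at 1024 bytes and doubling up to 64MB."""
    sizes = []
    size = FIRST_SEGMENT_BYTES
    while size <= LARGEST_SEGMENT_BYTES:
        sizes.append(size)
        size *= 2
    return sizes


def smallest_segment_for(lob_bytes):
    """Smallest listed segment size that is at least lob_bytes, else None."""
    for size in lob_segment_sizes():
        if size >= lob_bytes:
            return size
    return None


print(lob_segment_sizes()[:4])
print(smallest_segment_for(1867))
```

Comparing a table's average LOB length against these sizes gives a first impression of how much space each LOB may leave unused.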

Note: The separation of BLOB columns can only be done as part of the table creation, and not with the ALTER TABLE statement.

LOB fields are never placed in the buffer cache and can only take advantage of O/S level file system caching, if available.

6.4 Performance and Sizing Guidelines for PegaRULES BLOBs in DB2

1. Some PegaRULES tables have very small BLOBs. Use the scripts to identify them, and for tables with an average BLOB size less than 3K, use the COMPACT option to optimize the storage without affecting performance. Using the COMPACT option fixes the segment size at 1024 bytes. This allows for minimum waste of disk space when storing the LOB data since the block will not grow.
2. A PegaRULES implementation may have some tables with large BLOBs. For these tables, consider using a separate tablespace using the LONG IN clause during table creation. This clause allows the BLOB column data to be stored in a separate tablespace.
3. Since BLOB data is not stored or retrieved from the buffer caches, perform base tuning of the buffer caches on the relational columns.
4. In DB2, BLOBs larger than 1GB are not LOGGED. If a BLOB is smaller than 1GB, you should retain the DEFAULT logging characteristics.
5. If it is not already enabled, consider enabling compression through the pegarules.xml file (in versions 4.2 and earlier) or prconfig.xml file (versions 5.1 and later). BLOB compression can greatly improve utilization of disk storage. Refer to PRKB-9850, How to Compress the BLOB Values in the PegaRULES Database, for further information.

6.5 Exposed Column and Table Sizing in DB2 Databases

The tables that are most likely to grow over time in PRPC are:
    pc_history_work
    pc_assign_workbasket
    pc_assign_worklist
    pc_work
    pr_log_usage or pr4_log_usage (PRPC versions 5.1 and earlier), or pr_perf_stats (all later PRPC versions)

Depending on the flows and the processes developed, additional tables (such as pc_other) may grow as well; discuss this issue with your PRPC System Architect.

Use the data from the sizing scripts (refer to Section 6.1 or Section 6.2) to calculate the size required for the exposed columns for each table.
1. Estimate the number of rows per DB page by dividing the page size by the average row size from the sizing script output.
    DB page size / Average row size = Rows per page
2. Calculate the number of pages required for the table by dividing the number of rows by the number of rows per page.
    Number of rows in table / Rows per page = Number of pages needed for table
3. Calculate the space required for the exposed columns by multiplying the pages needed by the page size.
    Number of pages needed for table * page size = Space required

For example, consider this space calculation for a database created with a page size of 4K, in which the table pc_work contains 1,500 rows with an average row length of 1.2K (from the sizing script output).
    Page size / Ave row size = Rows per page
    4K / 1.2K = 3
    Rows in table / Rows per page = Pages required
    1500 / 3 = 500
    Pages required * Page size = Space required
    500 * 4K = 2000K (2.0MB)

Notes
Keep in mind that this figure is an estimate, because it does not account for the page header of 68 bytes, nor for any percent free space value that might have been specified when the table was allocated. Both values must be subtracted from the page size to attain a more accurate value.
Discard any remainder portion of the rows per page value. It is necessary to round down this number, because the system cannot split a row across pages (4 / 1.2 = 3.33, rounded down to 3).
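
The adjustment described in the Notes can be folded into the calculation. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, sizes are in bytes (1.2K taken as 1229), the 68-byte header is subtracted first and the percent free value is then taken from the remainder, and the page count is rounded up when rows do not divide evenly. The source does not specify the order of the two subtractions.

```python
DB2_PAGE_HEADER_BYTES = 68  # page header size quoted in the Notes


def estimate_db2_table_space(page_size, avg_row_size, num_rows, pct_free=0):
    """Estimate pages and bytes for the exposed columns, allowing for overhead."""
    usable = (page_size - DB2_PAGE_HEADER_BYTES) * (100 - pct_free) // 100
    rows_per_page = usable // avg_row_size  # discard the remainder
    if rows_per_page < 1:
        raise ValueError("average row does not fit in the usable page space")
    pages = -(-num_rows // rows_per_page)  # round up to whole pages
    return {"rows_per_page": rows_per_page, "pages": pages,
            "bytes": pages * page_size}


# The pc_work example: 4K pages, 1.2K average row length, 1500 rows.
print(estimate_db2_table_space(4096, 1229, 1500))
print(estimate_db2_table_space(4096, 1229, 1500, pct_free=10))
```

With no free space reserved, the header alone does not change the example's result of 3 rows per page; reserving 10 percent free space reduces it to 2 rows per page and raises the estimate accordingly.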

6.6 Performance and Sizing Guidelines for DB2 Tables

1. The PegaRULES application includes a number of tables that require page sizes larger than 4K. It is recommended that the DBA create and use an additional 16K tablespace for these tables and for large BLOB storage. In addition to allocating a 16K tablespace, allocate a 16K page size buffer pool for these tables. Note that BLOB storage in this tablespace will not use the BUFFERPOOL.
2. Since PegaRULES inserts and updates large volumes of data, use the db2empfa tool to enable multi-page allocation of space. This tool improves insert performance because disk space is allocated one extent at a time rather than one page at a time.
3. Perform RUNSTATS on all tables and indexes on a regular basis. This allows the optimizer to choose the most efficient query plans.
4. Column compression may be used to improve the efficiency of disk space used for relational columns. Setting compression on columns with a large number of NULLs can improve the storage efficiency of the database. Tables with compressed columns also materialize the compressed columns into the buffer cache, hence increasing the number of pages that fit in the buffer cache and potentially increasing performance.

7 SQL Server BLOB Sizing


This section provides performance and sizing guidelines for PRPC systems running on various releases of SQL Server. All releases use the same sql_server_sizing.sql script to determine the size of the LOB columns.

7.1 Using the Sizing Script with SQL Server 2000

The download location for the sizing script is: http://pdn.pega.com/DevNet/PRPCv4/TechnologyPapers/TechPapers.asp

1. Download Blob_Sizing.zip and extract the sql_server_sizing.sql script to a local directory on your system.
2. Open MS Query Analyzer and log on to the PegaRULES database.
3. Select File > Open. Navigate to the location where you saved sql_server_sizing.sql, and open the file. The file opens in the Editor pane.
4. Search for the statement containing TABLE_CATALOG:
    select distinct TABLE_NAME, COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_CATALOG = 'DB_NAME'
    and table_name <> 'dtproperties'
    and table_name not like 'sy%'
   Change the TABLE_CATALOG value to the name of your PegaRULES database. Save the script.
5. Press F5 to execute the script.
6. Open a new Query Analyzer window and run:
    exec dbo.listTableRowCounts

The results display in the Results pane and can be exported, or you can copy and paste the information into another application.

7.2 Using the Sizing Script with SQL Server 2005/2008

The download location for the sizing script is: http://pdn.pega.com/DevNet/PRPCv4/TechnologyPapers/TechPapers.asp

1. Download Blob_Sizing.zip and extract the sql_server_sizing.sql script to a local directory on your system.
2. Open SQL Server Management Studio and connect to your server.
3. In the Object Explorer, navigate to the PegaRULES database.
4. Select File > Open > File... Navigate to the location where you saved sql_server_sizing.sql, and open the file. The file opens in the Editor pane.
5. Search for the statement containing DB_NAME():
    SET @SQL = 'DBCC UPDATEUSAGE (' + DB_NAME() + ')'
   Enclose the entire variable string in double-quotes, for example:
    SET @SQL = 'DBCC UPDATEUSAGE ("' + DB_NAME() + '")'
6. Search for the statement containing TABLE_CATALOG:
    select distinct TABLE_NAME, COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_CATALOG = 'DATABASENAME'
    and table_name <> 'dtproperties'
    and table_name not like 'sy%'
   Change the TABLE_CATALOG value to the name of your PegaRULES database. Save the script.
7. Click Execute. Ignore the message reporting an error caused by dropping an unknown procedure.
8. Click the New Query button. Enter and execute the stored procedure:
    exec dbo.listTableRowCounts;
    GO
   Your results will resemble the table in this screen shot:
   [Screen shot of the query results]
9. Save the results to a file with Query > Results To > Results to File, or copy and paste the information into another application.

7.3
Note

BLOB Storage in SQL Server


Microsoft has announced plans to deprecate the IMAGE data type in a future release of SQL Server, replacing it with the VARBINARY(MAX) data type. Pegasystems will migrate its schema prior to that time, but IMAGE is still supported in V5.4.

SQL Server 2000 stores BLOB data using the IMAGE data type. Image data is stored as a collection of 8K pages that may not be located together on disk. The BLOB data pages are logically organized in a B-tree structure to facilitate access to sub-structures within the BLOB. The text in row option of sp_tableoption (written text_in_row in this document) controls whether image data is stored in the data row:
sp_tableoption [ @TableNamePattern = ] '<tablename>' , [ @OptionName = ] 'text in row' , [ @OptionValue = ] 'ON'

If this option is set to ON, SQL Server stores image data in the row up to a default limit of 256 bytes. You can instead specify a value between 24 and 7,000 to store image data in the row up to that number of bytes. If text_in_row is ON and the BLOB is larger than either the space available in the row or the limit specified for text_in_row, the database inserts a 72-byte root structure in the row, as well as pointers to the pages that contain the BLOB data.

If text_in_row is set to OFF, the image data is stored in separate pages as a logical B-tree. SQL Server allows more than one BLOB per image data page. If there is less than 32 KB of data, the database stores a 16-byte pointer with the row data that points to a root structure stored on an image page. If there is less than 64 bytes of data for the image, all the data is stored with the root structure on an image page. If the BLOB size is greater than 32 KB, the database builds intermediate node structures between the data blocks and the root node. These intermediate node structures are stored in separate pages, which are not shared with image data pages.
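The placement rules above can be summarized as a small decision function. This is a Python sketch for illustration only, not SQL Server code: the function name image_placement, its parameters, and the returned labels are hypothetical, and the handling of the exact 64-byte and 32 KB boundaries is an assumption, because the text does not specify it.

```python
KB = 1024


def image_placement(blob_bytes, text_in_row=None, free_row_bytes=8060):
    """Describe where image data is stored, following the rules in this section.

    blob_bytes:     size of the BLOB in bytes.
    text_in_row:    None when the option is OFF, otherwise the in-row limit
                    in bytes (256 is the default when the option is ON).
    free_row_bytes: space still available in the data row (hypothetical input).
    """
    if text_in_row is not None:
        if not 24 <= text_in_row <= 7000:
            raise ValueError("text in row limit must be between 24 and 7000")
        # The BLOB stays in the row only if it fits both the limit and the row.
        if blob_bytes <= text_in_row and blob_bytes <= free_row_bytes:
            return "stored in row"
        return "72-byte root structure in row, data on image pages"

    # Option OFF: data lives on separate image pages.
    if blob_bytes < 64:
        return "data stored with root structure on image page"
    if blob_bytes <= 32 * KB:  # boundary handling is an assumption
        return "16-byte pointer in row to root structure on image page"
    return "root structure plus intermediate node pages"


if __name__ == "__main__":
    for size, limit in [(100, 256), (500, 256), (40, None), (50 * KB, None)]:
        print(size, limit, "->", image_placement(size, limit))
```

The sketch is useful mainly for checking which case a given table falls into when you compare the LOB sizes reported by the sizing script against a candidate in-row limit.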


BLOB columns may be stored in a separate file group from relational data if the TEXTIMAGE_ON clause is specified during table creation.

7.4 Performance and Sizing Guidelines for PegaRULES BLOBs in SQL Server

1. The PegaRULES application has a number of tables that have very small BLOBs. Tables where avg_row_size + max(lob_size) is less than 8060 bytes may benefit from turning the text_in_row option ON. For each of these tables, set the text_in_row size to the lesser of the maximum observed LOB size or 7000.
2. Moving the BLOB columns for large BLOBs to their own file group may enable better control over the I/O rates of the database, if the file groups exist on different disks and the I/O read/write sizes can be controlled.
3. If it is not already enabled, consider enabling compression through the pegarules.xml file (in versions 4.2 and earlier) or the prconfig.xml file (versions 5.1 and later).

BLOB compression can greatly improve utilization of disk storage. Refer to PRKB-9850, How to Compress the BLOB Values in the PegaRULES Database, for further information.
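Guideline 1 above can be expressed as a short calculation. The following Python sketch is illustrative only: the function names and the sample figures are hypothetical, the lower clamp of 24 reflects the valid range for the option described in section 7.3, and the generated statement assumes the option is passed to sp_tableoption under its SQL Server name, 'text in row'. Review any generated statement before running it against a database.

```python
def text_in_row_setting(avg_row_size, max_lob_size,
                        row_limit=8060, max_setting=7000, min_setting=24):
    """Return the suggested in-row limit in bytes, or None if the table
    does not meet the avg_row_size + max(lob_size) < 8060 test."""
    if avg_row_size + max_lob_size >= row_limit:
        return None
    # Lesser of the maximum observed LOB size or 7000, kept within 24..7000.
    return max(min_setting, min(max_lob_size, max_setting))


def sp_tableoption_statement(table_name, setting):
    """Build the T-SQL statement that applies the setting (not executed here)."""
    return f"EXEC sp_tableoption '{table_name}', 'text in row', '{setting}'"


if __name__ == "__main__":
    # Hypothetical sizing figures: (table, average row size, maximum LOB size).
    samples = [("pc_assign_worklist", 1200, 3000), ("pc_work", 5000, 40000)]
    for table, avg_row, max_lob in samples:
        setting = text_in_row_setting(avg_row, max_lob)
        if setting is None:
            print(f"-- {table}: leave text in row OFF")
        else:
            print(sp_tableoption_statement(table, setting))
```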

7.5 Exposed Column and Table Sizing in SQL Server Databases

The tables that are most likely to grow over time in a PRPC application are:

pc_history_work
pc_assign_workbasket
pc_assign_worklist
pc_work
pr_log_usage or pr4_log_usage (PRPC versions 5.1 and earlier), or pr_perf_stats (all later PRPC versions)

Depending on the flows and the processes developed, additional tables (such as pc_other) may grow as well; discuss this with your PRPC System Architect. Use the data obtained from running sql_server_sizing.sql to calculate the size required for the exposed columns:
1. Estimate the number of rows per DB page by dividing the page size by the average row size from pega_sizes.txt.
DB page size / Average row size = Rows per page

2. Calculate the number of pages required for the table by dividing the number of rows by the number of rows per page.
Number of rows in table / Rows per page = Number of pages needed for table

3. Calculate the space required for the exposed columns by multiplying pages needed by page size.
Number of pages needed for table * Page size = Space required
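The three steps above can be sketched as one small function. This is a minimal Python illustration, not part of the sizing scripts: the function name and the sample figures are hypothetical, and rounding the page count up to a whole page is an assumption that the steps do not state explicitly.

```python
import math


def exposed_column_space(row_count, avg_row_bytes, page_bytes):
    """Apply the three sizing steps and return
    (rows per page, pages needed, space required in bytes)."""
    # Step 1: whole rows per page (a row cannot be split across pages).
    rows_per_page = int(page_bytes // avg_row_bytes)
    if rows_per_page < 1:
        raise ValueError("average row size exceeds the page size")
    # Step 2: pages needed, rounded up to a whole page (assumption).
    pages = math.ceil(row_count / rows_per_page)
    # Step 3: space required.
    return rows_per_page, pages, pages * page_bytes


if __name__ == "__main__":
    # Hypothetical table: 10,000 rows averaging 900 bytes on 8K pages.
    rows_per_page, pages, space = exposed_column_space(10_000, 900, 8192)
    print(f"{rows_per_page} rows/page, {pages} pages, {space} bytes")
```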

For example, consider a database created with a fixed page size of 8K (8060 usable bytes), in which the table pc_work contains 1500 rows with an average row length of 1.2K (from pega_sizes.txt).
Page size / Average row size = Rows per page
8K / 1.2K = 6
Rows in table / Rows per page = Pages required
1500 / 6 = 250
Pages required * Page size = Space required
250 * 8K = 2000K (approximately 2.0MB)

Notes

This figure is an estimate, because it does not account for the page header and footer of 132 bytes, nor for any percent free space value that might have been specified when the table was allocated. Both values must be subtracted from the page size to obtain a more accurate value.
Discard any remainder portion of the rows per page value. This number must be rounded down, because the system cannot split a row across pages (8 / 1.2 = 6.67, rounded down to 6).
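The adjustment described in these notes can be sketched as follows. This Python example is illustrative only: the function name and the pct_free parameter are hypothetical, and treating the percent free space as a percentage of the full page size is an assumption about how that value is applied.

```python
import math


def refined_space_estimate(row_count, avg_row_bytes, page_bytes=8192,
                           overhead_bytes=132, pct_free=0.0):
    """Estimate space after subtracting page overhead and reserved free space.

    Returns (rows per page, pages needed, space required in bytes).
    """
    # Subtract the 132-byte header/footer and the reserved free space.
    usable = page_bytes - overhead_bytes - page_bytes * pct_free / 100.0
    # Round down: a row cannot be split across pages.
    rows_per_page = int(usable // avg_row_bytes)
    if rows_per_page < 1:
        raise ValueError("average row size exceeds the usable page space")
    pages = math.ceil(row_count / rows_per_page)
    return rows_per_page, pages, pages * page_bytes


if __name__ == "__main__":
    # The pc_work example: 1500 rows averaging 1.2K, with 10 percent free space.
    rows_per_page, pages, space = refined_space_estimate(
        1500, 1.2 * 1024, pct_free=10)
    print(rows_per_page, pages, space // 1024, "KB")
```

With no free space reserved, the pc_work example still yields 6 rows per page and 250 pages. Reserving 10 percent free space lowers it to 5 rows per page, which raises the requirement to 300 pages (2400K).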
