
Designing High Performance BIRT Reports

Mica J. Block, Director, Actuate Corporate Engineers, Actuate Corporation

Topics

- Understanding generation performance
  - External factors
  - Overhead
  - Estimated component times
- Performance tips
  - Generation
  - Rendering

NOTE: This presentation deals with the performance of the reports themselves regardless of the server technology being used.

Understanding Performance

Pages-per-Second Myth

- Assumes all reports are equal
- Ignores:
  - Number of report items per page
  - Complexity of the query
  - Pages are defined at render time
  - Impact of aggregates, and more
- Reality: report items-per-second is a better metric
- Pages-per-second applies only to the same report on different runs

External Factors

- System: CPU, load
  - Raw CPU power
  - RAM
  - Overall load in server environment
  - JVM
- User expectations
- DB performance
  - Database design
  - Query design & optimization
  - Performance of vendor's query features
- Network overhead

Actuate Architecture
[Diagram: Actuate architecture in five tiers: Development Tier; Storage, Data Access & Integration Tier; Content Production Tier; Presentation (Web/Portal) Tier; Client Tier (web browsers: IE, Firefox). Labeled components include iServer, iPortal, Mgmt Console, and Perf Mgmt.]

- Development Tier: IT builds reports, blueprints, metadata, & templates for different reporting styles
- Storage & Data Tier: Dedicated, secure storage locations for accessing data and storing project & report content
- Production Tier: Single, scalable cluster for generating content for different reporting styles
- Presentation Tier: Dedicated tier for accessing & presenting report & dashboard content to users
- Client Tier: Users consume content according to their analytic objective

Estimated Component Times

- Estimated from a simple listing using a single table (10,000 rows) in SQL Server
- Generation only; does not include rendering
- Not a scientific methodology (done on a laptop)
- Your mileage will vary
  - Use your own data
  - Try in your own environment
  - Focus on specific reports with problems

Estimated Component Times

- Pages: little to no effect
  - Changed page break from 200 to 100, doubling the number of pages
  - Adds < 2% to report run
- Formatting: little to no effect
  - Added numeric and date formatting
  - Was slightly faster
- Groups: moderate to significant
  - Added two group levels to simple listing
  - Adds ~5-20% to report run per group
  - Depends on the number of group breaks
  - Depends on how the data is sorted

Estimated Component Times

- One-pass aggregates: moderate
  - Added two aggregates
  - Adds ~4% per aggregate to report run
  - Depends on number of groups
- Look-ahead aggregates: significant
  - Total for group as percent of overall total
  - Adds ~2-8% per aggregate to report run
  - Depends on number of groups and number of data items
- Charts: very significant
  - One chart added ~33% to report run
  - One chart per group adds ~30-150% to report run
  - Depends on number of groups (i.e., charts)

Estimated Component Times

| Report Name                                 | Size    | Average (in milliseconds) | Difference | Compared To               |
|---------------------------------------------|---------|---------------------------|------------|---------------------------|
| Single Table                                | 4.30 MB | 1,954.40                  |            |                           |
| Single Table Formatted                      | 4.30 MB | 1,931.80                  | -1.16%     | Single Table              |
| Single Table Double Pages                   | 4.35 MB | 1,996.00                  | 2.13%      | Single Table              |
| Group By City (4 instances)                 | 4.51 MB | 2,201.20                  | 10.28%     | Single Table Double Pages |
| Group By Customer (400 instances)           | 4.69 MB | 2,353.20                  | 17.90%     | Single Table Double Pages |
| Group By Customer Sorted                    | 4.69 MB | 2,207.60                  | 10.60%     | Single Table Double Pages |
| Group By City Aggregates (2 per group)      | 4.71 MB | 2,377.60                  | 8.01%      | Group By City             |
| Group By Customer Aggregates (2 per group)  | 4.89 MB | 2,542.60                  | 8.05%      | Group By Customer         |
| Group By City Two Pass                      | 4.81 MB | 2,366.40                  | 7.50%      | Group By City             |
| Group By Customer Two Pass                  | 5.00 MB | 2,397.60                  | 1.89%      | Group By Customer         |
| Single Chart                                | 5.92 MB | 2,607.40                  | 33.41%     | Single Table              |
| Group By City Chart                         | 6.25 MB | 2,888.80                  | 31.24%     | Group By City             |
| Group By Customer Chart                     | 21.6 MB | 5,704.80                  | 142.43%    | Group By Customer         |
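The Difference column appears to be each report's average run time relative to the report it is compared against. A minimal sketch that recomputes a few of the percentages from the published averages (the dictionary keys are shortened report names and `percent_difference` is a hypothetical helper, not part of BIRT):

```python
# Average generation times in milliseconds, copied from the measurements above.
avg_ms = {
    "Single Table": 1954.40,
    "Single Table Formatted": 1931.80,
    "Single Table Double Pages": 1996.00,
    "Group By City": 2201.20,
    "Group By Customer": 2353.20,
    "Single Chart": 2607.40,
    "Group By Customer Chart": 5704.80,
}


def percent_difference(report, baseline):
    """Extra run time of `report` relative to `baseline`, in percent."""
    return (avg_ms[report] / avg_ms[baseline] - 1.0) * 100.0


comparisons = [
    ("Single Table Formatted", "Single Table"),
    ("Group By Customer", "Single Table Double Pages"),
    ("Single Chart", "Single Table"),
    ("Group By Customer Chart", "Group By Customer"),
]
for report, baseline in comparisons:
    diff = percent_difference(report, baseline)
    print(f"{report} vs {baseline}: {diff:+.2f}%")
```

The same calculation can be pointed at timings collected in your own environment.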

Implications

- Report generation depends on:
  - Number of report items
  - Presence of aggregates
  - Number of groups
  - Sorting of data
  - Presence of charts
- Time per page depends on output format
- Pages per second depends on layout
  - Halving the page break interval doubles the page count, and so roughly doubles pages-per-second "performance", while the run time changes by less than 2%!
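To see why pages-per-second misleads, the two single-table measurements can be compared directly. This sketch assumes the page break values (200 and 100) are rows per page and uses rows-per-second as a stand-in for report items-per-second; both are assumptions, not statements from the measurements.

```python
ROWS = 10_000  # rows in the test listing


def pages_per_second(rows_per_page, run_ms):
    """Pages produced per second for a listing of ROWS rows."""
    pages = ROWS / rows_per_page
    return pages / (run_ms / 1000.0)


def rows_per_second(run_ms):
    """Rows processed per second; a rough proxy for report items-per-second."""
    return ROWS / (run_ms / 1000.0)


# Averages for "Single Table" (page break 200) and
# "Single Table Double Pages" (page break 100).
base_pps = pages_per_second(200, 1954.40)
doubled_pps = pages_per_second(100, 1996.00)

print(f"pages/s: {base_pps:.1f} -> {doubled_pps:.1f}")
print(f"rows/s:  {rows_per_second(1954.40):.0f} -> {rows_per_second(1996.00):.0f}")
```

Pages-per-second nearly doubles even though the report does slightly more work, while rows-per-second stays almost flat.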

Performance Strategies

- Use report items-per-second as a guide
  - Relatively fixed for a platform
- Determine a time budget
  - How many report items can the report afford?
- Performance strategies
  - Remove application-specific bottlenecks
  - Make report items work harder
  - Reduce impact of aggregates
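The time-budget question reduces to simple arithmetic. A rough sketch, where the function name and every number are hypothetical placeholders to be replaced with rates measured on your own platform:

```python
def affordable_report_items(items_per_second, budget_seconds, fixed_overhead_seconds=0.0):
    """Number of report items that fit in the time budget after fixed overhead."""
    usable_seconds = max(budget_seconds - fixed_overhead_seconds, 0.0)
    return int(items_per_second * usable_seconds)


# Hypothetical figures: 25,000 items/s measured on the platform, a 5-second
# budget, and 1 second of fixed overhead (query, start-up, and so on).
print(affordable_report_items(25_000, 5.0, fixed_overhead_seconds=1.0))
```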

How to Analyze Performance

- Test functionality separately
- Write timers to a log file in key areas
- Collect run times
- Remove all content from the report
- Collect run times again
- Difference is the cost of processing report items
- Remainder is per-row cost
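The subtraction described above can be sketched as follows. The run times are invented for illustration, and `split_costs` is a hypothetical helper; in practice the timings would come from the log-file timers.

```python
from statistics import mean


def split_costs(full_runs_ms, empty_runs_ms, row_count):
    """Split average run time into report-item cost and per-row cost."""
    full = mean(full_runs_ms)    # report with all content
    empty = mean(empty_runs_ms)  # same report with all content removed
    return {
        "report_items_ms": full - empty,      # cost of processing report items
        "remainder_ms": empty,                # what is left: row handling
        "per_row_ms": empty / row_count,
    }


# Hypothetical timings for a 10,000-row report.
costs = split_costs([2000, 2100, 1900], [800, 850, 750], row_count=10_000)
print(costs)
```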

Performance Tips

General Observations

- Report optimization is a trial-and-error effort
- Some report optimization techniques require additional development time
- These techniques are not necessary when reports already perform within user requirements
- Use them only to optimize reports that do not

Use Latest Version

- Use the latest version of BIRT
  - Has many performance improvements
- Do not use Total functions
  - These functions are deprecated in BIRT 2.2.2
  - They have some performance issues, especially with filters

Optimize Database Access

- Extra time comes from queries, DB overhead, computation, etc.
- Minimize query time
  - Make sure the query is optimized
  - Reduce the number of columns and rows returned
  - Reduce the number of queries needed
  - Use stored procedures
  - Use materialized views

Optimize XML Access

- XML is versatile and powerful: it can describe metadata and actual data in one file
- BIRT has a generic XML ODA, which uses an extremely efficient XPath algorithm to parse the results
- A generic connector solves a multitude of needs, but does not solve any single need especially well
- If the XML schema will not change and high user loads are required, specialized connectors should be built to improve overall system performance

Optimize XML Access

- Java Architecture for XML Binding (JAXB) is a specialized Java API for parsing XML data files with a fixed schema quickly and efficiently
- Upside: may be 10x faster than the generic XML ODA
- Downside: if the XML schema changes, the JAXB classes will need to be recompiled
- Downside: no UI exists to create data sets; JAXB classes must be used with a scripted data source
- The same also applies to the Web Services ODA

Filtering

- BIRT enables filtering at different layers, such as in the table
- Push filtering to the database (if possible)
  - Reduces the size of the result set
  - Extremely important with two-pass aggregates
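The effect of pushing a filter into the query can be illustrated outside BIRT. This is a sketch using an in-memory SQLite database; the `orders` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (city, amount) VALUES (?, ?)",
    [("Boston" if i % 4 == 0 else "Other", float(i)) for i in range(10_000)],
)

# Report-side filtering: every row is fetched, then most are thrown away.
all_rows = conn.execute("SELECT city, amount FROM orders").fetchall()
filtered_in_report = [row for row in all_rows if row[0] == "Boston"]

# Database-side filtering: only the matching rows are ever returned.
filtered_in_db = conn.execute(
    "SELECT city, amount FROM orders WHERE city = ?", ("Boston",)
).fetchall()

print(len(all_rows), len(filtered_in_report), len(filtered_in_db))
```

Both approaches yield the same rows, but the second transfers a quarter of the data in this example.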

Sorting

- When you add a group section, BIRT automatically sorts the data set in memory
- There is no setting to tell BIRT that the data is already sorted
- It is always better to push the sort to the database
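Pushing the sort to the database means adding an ORDER BY on the group key, so rows arrive already arranged by group. A minimal sketch with an in-memory SQLite database (the `orders` table and `customer` column are hypothetical, and `itertools.groupby` only stands in for the report's group breaks):

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("cust%03d" % (i % 400), float(i)) for i in range(10_000)],
)

# The database sorts on the group key, so each group's rows are contiguous.
rows = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY customer"
).fetchall()

# With sorted input, each group break is found in a single pass.
group_sizes = {key: sum(1 for _ in grp) for key, grp in groupby(rows, key=lambda r: r[0])}
print(len(group_sizes))
```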

Getting caught in a (Data) Bind

- The behavior described here applies as of BIRT 2.1.3; it will change in a future release with data set caching
- Each report item with a specified data binding forces that data set to re-execute for each binding
- Bindings cascade down to contained report items (data bindings on a table cascade down to items inside the table)
- In nearly all reports, a data set should have only one binding specified
- Only extremely complex reports with interwoven data set requirements will require multiple bindings per data set
- Joint data sets can be used in some cases to avoid multiple bindings on a single data set
- Do not bind data sets on the master page

Aggregates

- Aggregates: Sum( ), Count( ), Min( ), etc.
- Two types
  - Running: done while creating the table
  - Look-ahead: requires two passes over the data
- For performance, review look-ahead aggregates
  - Create a stored procedure to do the calculation
  - Use a separate query
  - Use a data filter to merge totals into each row
  - Compare to the out-of-box solution
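The difference between the two aggregate types, and one way to move a look-ahead total into the query, can be sketched as follows. It uses an in-memory SQLite database with a hypothetical `sales` table; the scalar subquery stands in for the "separate query" option and is only one of several possible approaches.

```python
import sqlite3
from itertools import accumulate

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Boston", 100.0), ("Chicago", 300.0), ("Denver", 600.0)],
)

# Running aggregate: each value depends only on rows already seen (one pass).
amounts = [a for (a,) in conn.execute("SELECT amount FROM sales ORDER BY city")]
running_totals = list(accumulate(amounts))

# Look-ahead aggregate: percent of total needs the grand total before the
# first row can be output. Here the database computes the total and merges
# it into each row, so the report only needs a single pass.
percent_rows = conn.execute(
    "SELECT city, 100.0 * amount / (SELECT SUM(amount) FROM sales) "
    "FROM sales ORDER BY city"
).fetchall()

print(running_totals)
print(percent_rows)
```

Whether this beats the built-in look-ahead aggregate should be measured against the out-of-box solution.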

Charts

- Good news: most time is spent in rendering (using drawing primitives in Swing)
  - Actual code is optimized
  - Size and resolution will impact performance
- All points are loaded in memory; avoid charts with many points
  - There is little more you can discern in a chart with 10,000 points than in a chart with 500 points
  - More points also take longer to render, as there is more to draw
- Make sure you use the table binding, not the data set binding
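One way to keep the point count down is to reduce the series before it reaches the chart. The sketch below averages consecutive buckets; this is an illustrative choice, not something the chart engine does for you, and `downsample` is a hypothetical helper. The same reduction could equally be done in the query.

```python
def downsample(points, max_points):
    """Average consecutive buckets so that at most max_points values remain."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    n = len(points)
    if n <= max_points:
        return list(points)
    reduced = []
    for b in range(max_points):
        # Integer bucket boundaries spread the n points evenly over the buckets.
        start = b * n // max_points
        end = (b + 1) * n // max_points
        bucket = points[start:end]
        reduced.append(sum(bucket) / len(bucket))
    return reduced


series = [float(i) for i in range(10_000)]
chart_points = downsample(series, 500)
print(len(chart_points))
```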

Charts

- 3D charts might take more time, as they use a real 3D algorithm to sort surfaces
  - 2D charts with depth have no significant performance impact
- Grouping inside charts is the number one cause of slowdowns
  - The chart engine uses a different grouping algorithm
  - Group the data in the data set
  - BIRT 2.3 will use the DTE grouping capabilities
- Avoid extra markers, labels, shadows, gradients, etc.; they impact performance because they mean more shapes and fills to draw

General Tips

- Reduce the number of report items
  - Concatenate values where it makes sense
  - Example: First Name + Last Name
- Avoid table data bindings when not used
- Use the new Crosstab report item when appropriate, as it is tuned for such operations
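Concatenating in the query is one way to turn two report items into one. A small sketch with an in-memory SQLite database (the `customers` table and its columns are hypothetical; the `||` operator is SQLite's string concatenation and other databases may use a different syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ada", "Lovelace"), ("Alan", "Turing")],
)

# One column, and so one report item per row, instead of two.
full_names = [
    row[0]
    for row in conn.execute(
        "SELECT first_name || ' ' || last_name AS full_name "
        "FROM customers ORDER BY last_name"
    )
]
print(full_names)
```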

Rendering Tips

- PDF
  - Set an appropriate page size in the master page
  - Will significantly decrease dynamic geometry
- HTML
  - Avoid group sections with many items
  - They cause a long TOC list and impact viewing performance

Q&A
