
1/21/13

ClearPeaks - Leaders in Business Intelligence Solutions CLEARPEAKS.COM

Boost Performance of Informatica Lookups ClearPeaks Blog


Boost Performance of Informatica Lookups


February 16th, 2012. Jordi G.

Introduction
Lookups are expensive in terms of resources and time. A few tips on how to set up Lookup transformations can significantly reduce both run time and resource usage. In this article you will learn about the following topics:
- Lookup cache
- Persistent lookup cache
- Unconnected lookup
- Order by clause within SQL

Lookup Cache
Problem: For non-cached lookups, Informatica queries the database once for each record coming from the source, which costs both time and resources. If 2 million rows come from the Source Qualifier, Informatica hits the database 2 million times with the same query.

Solution: When a lookup is cached, Informatica queries the database once, brings the whole set of lookup rows to the Informatica server and stores it in a cache file. When the lookup is called again, Informatica uses the cache file instead, saving the time and resources needed to hit the database again.

When to cache a lookup? As a general rule, use a lookup cache when the following condition is satisfied: N >> M, where
- N is the number of records from the source
- M is the number of records retrieved from the lookup

Note: Remember to create a database index on the columns used in the lookup condition to improve performance in non-cached lookups.
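The difference between the two modes can be illustrated outside Informatica. The following is a minimal Python sketch, not Informatica code: it assumes an in-memory SQLite table (country_dim, a hypothetical name) as the lookup source and simply counts how many queries each approach sends to the database.

```python
import sqlite3


def build_db():
    """Create a tiny, hypothetical lookup table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE country_dim (employee_id INTEGER PRIMARY KEY, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO country_dim VALUES (?, ?)",
        [(1, "ES"), (2, "UK"), (3, "AE")],
    )
    return conn


def lookup_uncached(conn, source_ids):
    """Non-cached style: one database query per source record."""
    queries = 0
    results = []
    for emp_id in source_ids:
        row = conn.execute(
            "SELECT country FROM country_dim WHERE employee_id = ?", (emp_id,)
        ).fetchone()
        queries += 1
        results.append(row[0] if row else None)
    return results, queries


def lookup_cached(conn, source_ids):
    """Cached style: read the lookup rows once, then resolve in memory."""
    cache = dict(conn.execute("SELECT employee_id, country FROM country_dim"))
    queries = 1
    return [cache.get(emp_id) for emp_id in source_ids], queries


if __name__ == "__main__":
    conn = build_db()
    source_ids = [1, 2, 3, 4] * 500  # N = 2000 source records, M = 3 lookup rows
    _, uncached_queries = lookup_uncached(conn, source_ids)
    _, cached_queries = lookup_cached(conn, source_ids)
    print("uncached queries:", uncached_queries)
    print("cached queries:", cached_queries)
```

With N much larger than M, the cached variant trades one full read of the lookup table for N individual round trips, which is the N >> M rule in miniature.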

Persistent Lookup Cache


Problem: Informatica caches lookups by default. Let's consider the following scenario: a lookup table is used many times in different mappings. In each Lookup transformation, Informatica builds the same lookup cache over and over again. Do we need to build the lookup cache every time for each lookup?

Solution: It is possible to build the cache file once instead of creating the same cache file N times. Enabling the persistent cache option lets Informatica reuse a cache file it has already built, saving resources and time.

www.clearpeaks.com/blog/etl/boost-performance-of-informatica-lookups


Check the following parameters in the transformation to use a persistent lookup cache:
- Lookup caching enabled
- Lookup cache persistent

Figure 1: Cache Persistent Enabled

From then on, the same cache file is used in all subsequent runs, which saves the time needed to build it. However, the lookup data might change, in which case the cache must be refreshed by either deleting the cache file or checking the option Re-cache from lookup source.

Figure 2: Re-cache from Lookup Source Enabled

When a reusable lookup is used in multiple mappings, one mapping has the Re-cache option enabled while the others keep it disabled. Whenever the cache needs to be refreshed, we just need to run the first mapping.

Note: In long-running ETL processes where the underlying tables change frequently, make sure the persistent cache does not compromise data integrity. Furthermore, Informatica PowerCenter cannot create cache files larger than 2 GB. If a cache exceeds 2 GB, Informatica creates multiple cache files, which decreases performance. In that case, consider joining the lookup source table in the database instead.
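The build-once, reuse, and re-cache behaviour described in this section can be sketched in a few lines of Python. This is only an analogy under stated assumptions: the function name load_lookup_cache, the recache flag and the pickle file format are all hypothetical and do not reflect how Informatica stores its cache files.

```python
import os
import pickle
import tempfile


def load_lookup_cache(cache_path, fetch_rows, recache=False):
    """Return the lookup dict, rebuilding the cache file only when needed.

    cache_path -- where the (hypothetical) persistent cache file lives
    fetch_rows -- callable that reads (key, value) pairs from the lookup source
    recache    -- plays the role of the "Re-cache from lookup source" option
    """
    if recache or not os.path.exists(cache_path):
        cache = dict(fetch_rows())  # the expensive read of the lookup source
        with open(cache_path, "wb") as fh:
            pickle.dump(cache, fh)
        return cache
    with open(cache_path, "rb") as fh:  # cheap reuse of the existing file
        return pickle.load(fh)


if __name__ == "__main__":
    source_reads = []

    def fetch_rows():
        source_reads.append(1)  # count how often the source is actually read
        return [(1, "ES"), (2, "UK")]

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "lkp_country.cache")
        load_lookup_cache(path, fetch_rows)                # first run builds the file
        load_lookup_cache(path, fetch_rows)                # later runs reuse it
        load_lookup_cache(path, fetch_rows, recache=True)  # forced refresh
        print("reads of the lookup source:", len(source_reads))
```

In this sketch, the one mapping with Re-cache enabled corresponds to the single caller that passes recache=True, while every other caller only ever reads the existing file.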

Unconnected lookup
Problem: Imagine the following mapping with 1,000,000 records retrieved from the Source Qualifier:

Figure 3: Connected Lookup Transformation

Suppose that, out of a million records, only 10% satisfy the condition. With a connected lookup, the lookup is still called for every record, including the 900,000 for which the condition is not satisfied.

Solution: It is possible to call the Lookup transformation only when the condition is satisfied. In our scenario, the transformation is then called and executed only 100,000 times out of 1M. The solution is to use an Expression transformation that calls a Lookup transformation that is not connected to the data flow:


Figure 4: Unconnected Lookup Transformation

For instance, an Expression transformation will contain a port with the following expression:

    IIF(ISNULL(COUNTRY), :LKP.LKP_COUNTRY(EMPLOYEE_ID), COUNTRY)

If COUNTRY is null, the lookup named LKP_COUNTRY is called with EMPLOYEE_ID as its parameter; otherwise the existing COUNTRY value is kept. The ports in the Lookup transformation are COUNTRY and EMPLOYEE_ID, as well as the input port.
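To make the saving concrete, here is a small Python analogy of the expression above. It is a sketch, not Informatica code: CountingLookup and resolve_country are hypothetical names, and the dictionary stands in for the lookup source. The point is only that the lookup is invoked for the rows that need it and skipped for the rest.

```python
class CountingLookup:
    """Stand-in for an unconnected lookup that records how often it is called."""

    def __init__(self, table):
        self.table = table
        self.calls = 0

    def __call__(self, employee_id):
        self.calls += 1
        return self.table.get(employee_id)


def resolve_country(row, lkp_country):
    """Mirror of IIF(ISNULL(COUNTRY), :LKP.LKP_COUNTRY(EMPLOYEE_ID), COUNTRY)."""
    if row["COUNTRY"] is None:
        return lkp_country(row["EMPLOYEE_ID"])  # lookup called only when needed
    return row["COUNTRY"]


if __name__ == "__main__":
    lkp_country = CountingLookup({7: "ES"})
    # Ten source rows, only one of which has a null COUNTRY (10%).
    rows = [{"EMPLOYEE_ID": i, "COUNTRY": "UK"} for i in range(9)]
    rows.append({"EMPLOYEE_ID": 7, "COUNTRY": None})
    resolved = [resolve_country(row, lkp_country) for row in rows]
    print("rows:", len(rows), "lookup calls:", lkp_country.calls)
```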

Order by clause within SQL


Informatica takes the time (and the effort) to bring all the data for each port within the Lookup transformation. It is therefore recommended to remove any ports that are not used, to avoid additional processing.

It is also a best practice to apply an ORDER BY clause on the columns used in the lookup condition. Informatica adds an ORDER BY clause by default, which saves it the time and space needed to create its own index. However, the generated clause sorts on every column in the SELECT statement, so redundant or unnecessary columns should not be there. To avoid the generated sort, add a comment marker at the end of the SQL override:

Figure 5: To Avoid ORDER BY in SQL Override
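The mechanism behind this trick can be reproduced with plain SQL. The Python sketch below is a simulation under assumptions: it supposes that the tool appends its generated ORDER BY clause directly after the override text, and that the comment marker is two dashes (--), which is our understanding of the PowerCenter convention. The table and column names are hypothetical, and SQLite is used only because it treats -- as a comment in the same way.

```python
import sqlite3

# Stand-in for the clause that the tool is assumed to append automatically.
GENERATED_ORDER_BY = " ORDER BY employee_id, country"

# Override that sorts only on the lookup condition column and ends with "--".
OVERRIDE_WITH_COMMENT = (
    "SELECT employee_id, country FROM country_dim ORDER BY employee_id --"
)
# Same override without the trailing comment marker.
OVERRIDE_PLAIN = "SELECT employee_id, country FROM country_dim ORDER BY employee_id"


def make_conn():
    """Create a small, hypothetical lookup table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country_dim (employee_id INTEGER, country TEXT)")
    conn.executemany(
        "INSERT INTO country_dim VALUES (?, ?)", [(2, "UK"), (1, "ES")]
    )
    return conn


def run_lookup_query(conn, sql_override):
    """Simulate a tool that blindly appends its own ORDER BY to the override."""
    return conn.execute(sql_override + GENERATED_ORDER_BY).fetchall()


if __name__ == "__main__":
    conn = make_conn()
    # The trailing "--" turns the appended clause into a comment.
    print(run_lookup_query(conn, OVERRIDE_WITH_COMMENT))
    try:
        run_lookup_query(conn, OVERRIDE_PLAIN)
    except sqlite3.OperationalError as exc:
        # Two ORDER BY clauses in one statement are rejected by the database.
        print("without the comment marker:", exc)
```

In this simulation the override keeps its own sort on the lookup condition column only, while the appended clause on all columns is neutralised by the comment.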

To sum up, Informatica lookups can be tuned through several configuration options to increase performance and save resources and time. However, before applying any of the features mentioned, the tables and SQL queries involved need to be analyzed.

