You are on page 1of 5

2/20/2011 Data Warehouse Architecture Compon…

Infonitive
Data Manage me nt Consulting

Home
About Infonitive
Business Intelligence
Data Warehousing
Metadata Management
Contact

Evaluation Criteria for Data Quality Platforms and Vendors Criteria for True Business Intelligence

Data Warehouse Architecture Components and Implementation


Options
August 12, 2010 Categories: Data Warehousing | Data Warehousing Basics | Information Architecture |
Operational Data Store | Infonitive Research

At a high level a generic data warehousing architecture can be said to comprise of the following components:

Data Staging Layer (DSL)


Operational Data Store (ODS)
Data Warehouse (DW, EDW for an Enterprise)
Data Marts (DM)

Each components functions are follows:

Data Staging Layer

Decouples extraction processes from data cleansing processes


Houses the transformation of formats of data from the incoming disparate systems to the single target data
warehousing technology platform
It reduces impact of loads on transaction system
Maintains a ‘Snapshot’ that allows repeatable / re-startable ETL processes
Is an area to store data using different load schedules to the DW
Is not typically available to end users for data consumption
Acts as an area to store 3rd party data
Employs data manipulation not appropriate for the DW
The data in Data Staging Area is: a copy of operational data, deleted after a short period so non-historical,
in flat tables atomic in granularity

Operational Data Store (ODS)


ODS is a hybrid of data warehouse and operational systems. Functions of an ODS are:

Allows OLTP response times, update capabilities and DSS capabilities


http://infonitive.com/?p=205 1/5
2/20/2011 Data Warehouse Architecture Compon…
Can be used for operational reporting because data from multiple source systems are integrated at this
point
Operational reporting is typically more structured so dimensional models are not required
Data in the ODS is: kept for 6 months so non-historical, in flat tables or 3NF, often a copy of operational
data, and is atomic in granularity

For architectural purposes, ODS’ are classified as follows:

Class I – Real-time
Class II – Two-hour to four-hour
Class III – Daily
Class IV -Aggregated data from the data warehouse

Enterprise Data Warehouse (EDW)

An EDW is the central hub of the BI infrastructure in some architectural approaches. The function of the EDW is
to be the single point of truth for enterprise. Data in the EDW is:

cleansed and harmonized during the ETL process


Historical
Distributed to various other systems
Often 3NF but can be dimensional
Usually atomic but sometimes aggregate

The degree of normalization in the EDW data model is a tradeoff between multiple properties or characteristics
of the data warehouse. The following matrix captures the trade-offs:

Data Marts
A Data mart is a departmental, or subject oriented, subset of the EDW. Data in the data mart comes from the
EDW if one exists Relationship between EDW and data marts is often called a hub-and-spoke architecture. Data
in Data Mart is often aggregated but may be atomic, dimensional, may or may not be historical, and may be
volatile.

Data Warehousing Architectural Options

There are three architectural approaches for setting up a flexible, future-oriented data warehouse. This
architectural decision is critical since the selection will play a large role in many aspects of the data warehouse
such as data modeling methodology uses and physical infrastructure characteristics. The data warehouse
architecture will determine the locations of the data warehouses and data marts themselves, and where the
control resides. For example, the data can reside in a central location that is managed centrally. Alternatively, the
data can reside in distributed local and/or remote locations that are either managed centrally or independently.

At a higher level, the three architectural approaches are:

Enterprise data warehouse (EDW)


Dependent data marts
Independent data marts

http://infonitive.com/?p=205 2/5
2/20/2011 Data Warehouse Architecture Compon…
These architectural choices do not have to be used exclusively by themselves, thy can be used in combinations as
suited to the organizational needs. For example an often used option is to combine the EDW and dependent data
marts in the same instance. The data marts are chained to follow in lock step with EDW, rather than directly from
the staging area. The grain of the marts are restricted by the lowest grain of the EDW, as is the data latency and
recency.

Enterprise Data Warehouse:


An enterprise data warehouse supports a large part of business requirements for a more fully integrated data
warehousing environment that has a high degree of data access and usage across departments or lines of
business. The data warehouse is designed and constructed based on the needs of the business as a whole. It is a
common repository for decision-support data that is available across the entire organization. The term
“Enterprise” reflects the scope of data access and usage, not the physical structure.

This type of data warehouse is characterized as having all the data under central management. However,
centralization does not necessarily imply that all the data is in one location or in one common systems
environment. While the data is centralized, it is logically centralized rather than physically co-located. When this is
the case, by design, it then may also be referred to as a hub and spoke implementation. The key point is that the
environment is managed as a single integrated entity.

Independent data mart architecture:


An independent data mart architecture, as the name implies, is comprised of standalone data marts that are
controlled by particular workgroups, departments, or lines of business. There typically is no connectivity with
data marts in other workgroups, departments, or lines of business. Therefore, these data marts do not share any
conformed dimensions and conformed facts between them. This is one of the concerns when using an
independent data mart. The data in each may be at a different level of currency, and the data definitions may not
be consistent – even for data elements with the same name. Independant data marts are primary candidates for
data mart consolidation for companies around the world today. The proliferation of such independent data marts
has resulted in issues such as:

Increased hardware and software costs for the numerous data martsIncreased resource requirements for
support and maintenance
Many redundant and inconsistent implementations of the same data
Development of many extract, transform, and load (ETL) processes
Lack of a common data model, and common data definitions, leading to inconsistent and inaccurate
analyses and reports
No data integration or consistency across the data marts
Time spent, and delays encountered, while deciding what data can be used, and for what purpose
Concern and risk of making decisions based on data that may not be accurate, consistent, or current
Many heterogeneous hardware platforms and software environments that were implemented, because of
cost, available applications, or personal preference, resulting in even more inconsistency and lack of
integration.
Inconsistent reports due to the different levels of data currency stemming from differing update cycles; and
worse yet, data from differing data sources

Dependent data mart architecture


An interconnected data mart architecture is basically a distributed implementation. Although separate data marts
are implemented in a particular workgroup, department, or line of business, they are integrated, or
http://infonitive.com/?p=205 3/5
2/20/2011 Data Warehouse Architecture Compon…
interconnected, to provide a more global view of the data. These data marts are connected to each other using
conformed dimensions and conformed facts. Each of the dependant data marts typically has a common staging
area. At the highest level of integration, the combination of all dependent data marts could be thought of as a
distributed enterprise data warehouse. In an implementation where the EDW and dependent data mart
architectures are combined, the Staging Area is basically replaced by the EDW.

Search

Recent Articles

Design & Architectural Approaches for Data Warehouses: Inmon vs Kimball, Federated, Hybrid
Business Intelligence Implementation: Think Big, Start Small
Business Intelligence Vision & Strategy
Market Trends – Business Intelligence in Core Banking Systems
Criteria for True Business Intelligence
Data Warehouse Architecture Components and Implementation Options
Evaluation Criteria for Data Quality Platforms and Vendors
In-Memory Analytic Platforms – The Next Frontier in Business Intelligence
Data Warehouse – Analysis & Design
Data Cleansing in Data Warehousing: A Structured Approach
Challenges of Real Time Data Integration in Data Warehousing
Data Warehousing: Implementation Strategies for Externally Serviced Data Warehouses
Concept of Global Data Codes as an integral part of an MDM solution
Advantages of an Effectively Managing Master Data (MDM)
Outsourcing as an option for a Data Warehousing Program

Infonitive Site Map


About Infonitive
Business Intelligence
Data Warehousing
Metadata Management
Contact

Go To Top »

http://infonitive.com/?p=205 4/5
2/20/2011 Data Warehouse Architecture Compon…
Infonitive

Pages

Home
About Infonitive
Business Intelligence
Data Warehousing
Metadata Management
Contact

Stay In Touch

Site RSS Feed

More

Stay updated by subscribing to the RSS feed.


© 2010 Infonitive

Content Protected Using Blog Protector By: PcDrome.

http://infonitive.com/?p=205 5/5

You might also like