
DATA WAREHOUSE Overview

E-BIZ Practice
Tata Consultancy Services, India
Agenda

➭ DW Concepts
➭ Components
➭ Architecture
➭ Representative Tools

10/18/08 TCS Confidential 2


Need for Data Warehousing

➭ Difficulty in obtaining integrated information


➭ Information structures unable to provide ‘full and
dynamic’ analysis of the information available
➭ Inconsistent results obtained from queries and
reports arising from heterogeneous data sources
➭ Increased difficulty in delivering consistent
comprehensive information in a timely fashion



Data Warehousing - Defined

A Data Warehouse is a
• Subject-Oriented
• Integrated
• Time-Variant
• Non-volatile
collection of data in support of management’s
decision-making process



What are Data Warehouses?

➭ Data warehouses store large volumes of data which
are frequently used by DSS
➭ They are maintained separately from the organization’s
operational databases
➭ Data warehouses are relatively static with only
infrequent updates
➭ A data warehouse is a stand-alone repository of
information, integrated from several, possibly
heterogeneous operational databases
Data Warehousing

➭ Is the enabling technology that facilitates improved
business decision-making
➭ It’s a process, not a product
➭ A technique for assembling and managing a wide
variety of data from multiple operational systems
for decision support and analytical processing
It’s a journey not a destination...



Contrasting DW with OLTP



DW Components
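
[Figure: end-to-end data flow. Feed systems (FS1, FS2, ..., FSn; legacy systems) -> transmission network -> staging area -> ODS -> DW -> data marts (DM1, DM2, ..., DMn) -> OLAP analysis and knowledge discovery. A metadata layer spans the whole flow. Processes labeled along the flow: extraction, cleansing, transformation, aggregation, summarization, data mart population.]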
Operational Process

➭ Data extraction
➭ Data Cleansing and Transformation
➭ Data Load and refresh
➭ Build derived data and views
➭ Service queries
➭ Administer the warehouse



Extraction Process
(Data Capturing)

[Figure: Business Transactions -> Feed System Application -> Data Capturing Process -> Incremental Data. Control Metadata is attached to the Data Capturing Process.]

•Extract the incremental data from feed system


•Store the extracted data into a temporary area
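
The two steps above can be sketched as follows. This is a minimal illustration, not the method used by any particular tool: it assumes the feed system exposes an `updated_at` timestamp on each row and that the control metadata keeps a `last_extracted_at` watermark. Both names, and the sample rows, are hypothetical.

```python
from datetime import datetime


def extract_incremental(feed_rows, control_metadata):
    """Return rows changed since the last extraction and advance the watermark."""
    last_run = control_metadata["last_extracted_at"]
    delta = [row for row in feed_rows if row["updated_at"] > last_run]
    if delta:
        # Record the newest timestamp seen so the next run skips these rows.
        control_metadata["last_extracted_at"] = max(
            row["updated_at"] for row in delta
        )
    return delta


# Hypothetical feed system contents.
feed_rows = [
    {"txn_id": 1, "amount": 120.0, "updated_at": datetime(2008, 10, 16, 9, 0)},
    {"txn_id": 2, "amount": 75.5, "updated_at": datetime(2008, 10, 17, 14, 30)},
    {"txn_id": 3, "amount": 42.0, "updated_at": datetime(2008, 10, 18, 8, 15)},
]

# Hypothetical control metadata: the time of the previous extraction.
control_metadata = {"last_extracted_at": datetime(2008, 10, 17, 0, 0)}

# The temporary area holds only the rows changed since the previous run.
temporary_area = extract_incremental(feed_rows, control_metadata)
```

Running the extraction a second time without new feed data returns an empty list, because the watermark has moved past every existing row.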



Extraction Process
(Data Transmission)

[Figure: Incremental Data on the Feed System side -> FTP across the Network Cloud -> Incremental Data in the Staging area.]

•Transmit the extracted data from Feed system to Staging area


• Periodicity of transmission ( daily / weekly ) depends upon the feed system



Cleansing Process

[Figure: Raw data (Staging Area) -> Cleansing Process -> Clean data, marked Good or Bad, plus Cleansing Reports. The Cleansing Process uses Process Metadata (Cleansing Rules) and Control Metadata.]

•Clean the Raw Data
•Mark it Good/Bad
•Generate the cleansing Reports and mail to
the DWA and Feed System representatives
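
A rule-driven cleansing pass might look like the sketch below. It assumes the cleansing rules are stored as named checks (standing in for the process metadata) and that a row is Bad if any rule fails. The rule names, field names, and sample rows are hypothetical, and mailing the report is left out.

```python
def not_blank(field):
    """Build a rule that passes when the field holds non-whitespace text."""
    return lambda row: bool(str(row.get(field, "")).strip())


def non_negative(field):
    """Build a rule that passes when the field parses as a number >= 0."""
    def check(row):
        try:
            return float(row[field]) >= 0
        except (KeyError, TypeError, ValueError):
            return False
    return check


# Hypothetical cleansing rules, keyed by the description used in the report.
CLEANSING_RULES = {
    "customer_id present": not_blank("customer_id"),
    "amount is a non-negative number": non_negative("amount"),
}


def cleanse(raw_rows, rules):
    """Split raw rows into good and bad and build one report line per bad row."""
    good, bad, report = [], [], []
    for index, row in enumerate(raw_rows):
        failed = [name for name, rule in rules.items() if not rule(row)]
        if failed:
            bad.append(row)
            report.append(f"row {index}: failed {', '.join(failed)}")
        else:
            good.append(row)
    return good, bad, report


# Hypothetical raw data as it might arrive in the staging area.
raw_rows = [
    {"customer_id": "C001", "amount": "120.50"},
    {"customer_id": " ", "amount": "75"},
    {"customer_id": "C003", "amount": "-5"},
    {"customer_id": "C004", "amount": "abc"},
]

good_rows, bad_rows, cleansing_report = cleanse(raw_rows, CLEANSING_RULES)
```

Keeping the rules in a table rather than in the loop body means a new rule can be added without touching the cleansing process itself.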
Transformation Process

[Figure: Clean Operational Data -> Transformation Process -> Operational Data Store. The Transformation Process uses Process Metadata (Mapping Detail, Transformation Rule) and Control Metadata.]

•Transform the cleaned Operational Data into DSS Data


•Load the DSS data into ODS
•ODS contains the current DSS data at the lowest level of granularity
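
As an illustration of mapping details and transformation rules, the sketch below keeps both in one table: each source field maps to a target field name and a conversion function. The field names, code values, and date format are assumptions made for the example, and the ODS is modeled as a plain list.

```python
from datetime import datetime

# Hypothetical process metadata: source field -> (target field, rule).
MAPPING = {
    "CUST_NO": ("customer_id", str.strip),
    "TXN_DT": (
        "transaction_date",
        lambda value: datetime.strptime(value, "%d/%m/%Y").date(),
    ),
    "AMT": ("amount", float),
    "GENDER_CD": (
        "gender",
        lambda value: {"M": "Male", "F": "Female"}.get(value.upper(), "Unknown"),
    ),
}


def transform(clean_row, mapping):
    """Apply each mapping entry to one cleaned operational row."""
    return {
        target: rule(clean_row[source])
        for source, (target, rule) in mapping.items()
    }


def load_into_ods(clean_rows, mapping, ods):
    """Transform cleaned rows and append them to the ODS at row-level granularity."""
    for row in clean_rows:
        ods.append(transform(row, mapping))
    return ods


# Hypothetical cleaned operational data.
clean_rows = [
    {"CUST_NO": " C001 ", "TXN_DT": "18/10/2008", "AMT": "120.50", "GENDER_CD": "f"},
]

ods = load_into_ods(clean_rows, MAPPING, [])
```

Each ODS row keeps one entry per source transaction, which matches the lowest level of granularity described above.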



Summarization Process

[Figure: ODS -> Summarization Process -> DW, which holds Weekly, Monthly, and Yearly summaries. The Summarization Process uses Control Metadata.]

• Summarize and aggregate ODS data and Populate to the Warehouse


• Periodicity of Summarization Process depends upon the level of
summarization at Warehouse ( weekly, monthly, daily )
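
A small sketch of summarization at different levels follows. It assumes ODS rows carry a `transaction_date` and an `amount`, and that the warehouse stores one total per period. The field names and the period key formats are choices made for this example only.

```python
from collections import defaultdict
from datetime import date


def period_key(day, level):
    """Return the label of the period that contains the given date."""
    if level == "weekly":
        iso_year, iso_week, _ = day.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if level == "monthly":
        return f"{day.year}-{day.month:02d}"
    if level == "yearly":
        return str(day.year)
    raise ValueError(f"unknown summarization level: {level}")


def summarize(ods_rows, level):
    """Aggregate ODS amounts into one total per period at the given level."""
    totals = defaultdict(float)
    for row in ods_rows:
        totals[period_key(row["transaction_date"], level)] += row["amount"]
    return dict(totals)


# Hypothetical ODS contents at transaction-level granularity.
ods_rows = [
    {"transaction_date": date(2008, 10, 13), "amount": 100.0},
    {"transaction_date": date(2008, 10, 18), "amount": 50.0},
    {"transaction_date": date(2008, 11, 3), "amount": 25.0},
    {"transaction_date": date(2009, 1, 5), "amount": 10.0},
]

weekly = summarize(ods_rows, "weekly")
monthly = summarize(ods_rows, "monthly")
yearly = summarize(ods_rows, "yearly")
```

The same ODS rows feed all three levels, so how often the process runs can follow the finest level the warehouse needs.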



DW Components/Tools
➭ Extraction/transformation/load tool (family of
tools including data modeling tool, extraction
tool, metadata repository, and DW
administration tools)
➭ Metadata exchange architecture (API used to
integrate all components of the DW with the central
metadata)
➭ Target databases (relational, multidimensional,
hybrid)
➭ Data access and analysis tools for end users
➭ Database servers, operating systems, networks
DW Options and Architectures

➭ Virtual Data Warehouse


➭ Enterprise Data Warehouse
➭ Data Marts
➭ Distributed Data Marts
➭ Multi-tiered warehouse



Virtual Data Warehouse

[Figure: Operational Systems (Legacy, Client/Server, OLTP Application, External) -> APIs -> Users. The only data store shown is the operational systems' own data.]
Enterprise Data Warehouse

[Figure: Operational Systems (Legacy, Client/Server, OLTP, External) -> Data Preparation (Select, Extract, Transform, Integrate, Maintain) -> Data Warehouse holding enterprise-wide data, with a Metadata Repository -> APIs -> Users.]
Data Marts

[Figure: Operational Systems (Legacy, Client/Server, OLTP, External) -> Data Preparation (Select, Extract, Transform, Integrate, Maintain) -> Data Mart, with a Metadata Repository -> APIs -> Users.]
Distributed Data Marts

[Figure: Operational Systems (Legacy, Client/Server, OLTP, External) -> Data Preparation (Select, Extract, Transform, Integrate, Maintain) -> several separate Data Marts -> APIs -> Users.]
Multi-tiered Data Warehouse

[Figure: components shown are Operational Systems (Legacy, Client/Server, OLTP, External); the steps Select, Extract, Transform, Integrate, Maintain; a Data Warehouse holding enterprise-wide data; a Metadata Repository; several Data Marts; APIs; and Users.]
Multi-tiered Data Warehouse

[Figure: components shown are Operational Systems (Legacy, Client/Server, OLTP, External); Data Preparation (Select, Extract, Transform, Integrate, Maintain); several Data Marts; a Data Warehouse; a Metadata Repository; APIs; and Users.]
Steps in Building a Warehouse
➭ Identify key business drivers
➭ Survey information needs, identify desired
functionality, and define functional requirements for
the initial subject area
➭ Architect a long-term data warehousing architecture
➭ Evaluate and finalize DW tools and technology
➭ Conduct Proof-of-Concept



Steps in Building a Warehouse (cont.)

➭ Design the target database schema
➭ Build data mapping, extract, transformation,
cleansing and aggregation/summarization rules
➭ Build the initial data mart using an exact subset of the
enterprise data warehousing architecture, then expand
to the enterprise architecture over subsequent phases
➭ Maintain and administer data warehouse



Representative DSS Tools:

Tool Category            Products
-----------------------  ------------------------------------------------------
ETL Tools                ETI Extract, Informatica, IBM Visual Warehouse,
                         Oracle Warehouse Builder
OLAP Server              Oracle Express Server, Hyperion Essbase, IBM DB2
                         OLAP Server, Microsoft SQL Server OLAP Services,
                         Seagate HOLOS, SAS/MDDB
OLAP Tools               Oracle Express Suite, Business Objects, Web Intelligence,
                         SAS, Cognos PowerPlay/Impromptu, KALIDO,
                         MicroStrategy, Brio Query, MetaCube
Data Warehouse           Oracle, Informix, Teradata, DB2/UDB, Sybase, Microsoft
                         SQL Server, Red Brick
Data Mining & Analysis   SAS Enterprise Miner, IBM Intelligent Miner,
                         SPSS/Clementine, TCS Tools



QUESTIONS ???

