
Contents

• What is a data warehouse?
• Difference between OLTP & DW
• Different types of data
• Data warehouse Architecture
• Dimensional Modeling
• CDC (Change Data Capture)
• ETL (Extract Transform Load) & OLAP
What is a data warehouse?
A data warehouse is built on a relational database that is designed for
query and analysis rather than for transaction processing. It usually
contains historical data derived from transaction data, but it can include
data from other sources. It separates the analysis workload from the
transaction workload and enables an organization to consolidate data from
several sources.
In addition to a relational database, a data warehouse environment
includes an extraction, transportation, transformation, and loading
(ETL) solution and an online analytical processing (OLAP) engine.
Characteristics of a data warehouse

• Subject Oriented

• Integrated
• Nonvolatile
• Time Variant
Difference between OLTP & DW

Parameter: Workload
Data warehouse: Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations.
OLTP: OLTP systems support only predefined operations. Your applications might be specifically tuned or designed to support only these operations.

Parameter: Data modifications
Data warehouse: A data warehouse is updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse.
OLTP: In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.

Parameter: Schema design
Data warehouse: Data warehouses often use denormalized or partially denormalized schemas (such as a star schema) to optimize query performance.
OLTP: OLTP systems often use fully normalized schemas to optimize update/insert/delete performance, and to guarantee data consistency.

Parameter: Typical operations
Data warehouse: A typical data warehouse query scans thousands or millions of rows. For example, "Find the total sales for all customers last month."
OLTP: A typical OLTP operation accesses only a handful of records. For example, "Retrieve the current order for this customer."
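The two example operations above can be contrasted in a minimal sketch. It assumes a hypothetical `orders` table held in an in-memory SQLite database; the table, column names, and values are illustrative only, and in practice the two queries would run against separate warehouse and OLTP systems.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the two query shapes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2024-01-05", 50.0),
        (2, 102, "2024-01-17", 75.0),
        (3, 101, "2024-01-28", 25.0),
        (4, 101, "2024-02-02", 40.0),
    ],
)

# Warehouse-style query: scans every row in the period and aggregates it
# ("Find the total sales for all customers last month").
total_sales = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE order_date BETWEEN ? AND ?",
    ("2024-01-01", "2024-01-31"),
).fetchone()[0]

# OLTP-style query: touches a handful of records for one customer
# ("Retrieve the current order for this customer").
current_order = conn.execute(
    "SELECT order_id, amount FROM orders WHERE customer_id = ? "
    "ORDER BY order_date DESC LIMIT 1",
    (101,),
).fetchone()
```

The aggregate reads all January rows, while the lookup returns a single row for customer 101.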
Different Types of Data

• Metadata
• Dimension data
• Fact Data
• Aggregated Data
Data warehouse Architectures
Dimensional Modeling

• Identify a Business Process
• Identify Grain
• Identify elements which describe the process (Dimensions)
• Identify elements which measure the process (Facts)
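The four steps above can be sketched for a hypothetical "sales" business process. The table names, columns, and the chosen grain (one row per product, per channel, per day) are assumptions made for illustration, not part of the original slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: elements that describe the process (hypothetical names).
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_channel (channel_key INTEGER PRIMARY KEY, channel_name TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, calendar_date TEXT);

-- Fact table for the business process "sales".
-- Grain: one row per product, per channel, per day.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    channel_key INTEGER REFERENCES dim_channel(channel_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    units_sold  INTEGER,  -- fact (measure)
    revenue     REAL,     -- fact (measure)
    PRIMARY KEY (product_key, channel_key, date_key)
);

INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_channel VALUES (1, 'Online'), (2, 'Store');
INSERT INTO dim_date    VALUES (20240101, '2024-01-01');
INSERT INTO fact_sales  VALUES
    (1, 1, 20240101, 3, 30.0),
    (2, 1, 20240101, 1, 20.0),
    (1, 2, 20240101, 2, 20.0);
""")

# Facts are summed, dimensions are used to group and label the result.
revenue_by_channel = conn.execute("""
    SELECT c.channel_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_channel c ON c.channel_key = f.channel_key
    GROUP BY c.channel_name
    ORDER BY c.channel_name
""").fetchall()
```

Declaring the grain as the fact table's primary key is one way to make the chosen level of detail explicit; other designs use a surrogate key instead.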
Schemas

• Star schema
• Snowflake schema

[Figure: star schema and snowflake schema diagrams; legible labels: Product details, Channel detail]
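As a rough illustration of how the two schema styles differ, the sketch below models the same product dimension twice: once denormalized (star) and once normalized into product and category tables (snowflake). All table names, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: one denormalized product dimension; the category repeats per row.
CREATE TABLE star_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT);

-- Snowflake: the same dimension normalized into two tables.
CREATE TABLE snow_dim_category (
    category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE snow_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT,
    category_key INTEGER REFERENCES snow_dim_category(category_key));

CREATE TABLE fact_sales (product_key INTEGER, revenue REAL);

INSERT INTO star_dim_product VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO snow_dim_category VALUES (10, 'Hardware'), (20, 'Books');
INSERT INTO snow_dim_product VALUES (1, 'Widget', 10), (2, 'Gadget', 10), (3, 'Manual', 20);
INSERT INTO fact_sales VALUES (1, 30.0), (2, 20.0), (3, 5.0);
""")

# Star: one join from the fact table to the dimension.
star_result = conn.execute("""
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN star_dim_product p ON p.product_key = f.product_key
    GROUP BY p.category_name ORDER BY p.category_name
""").fetchall()

# Snowflake: the same question needs an extra join through the category table.
snow_result = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN snow_dim_product p ON p.product_key = f.product_key
    JOIN snow_dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name ORDER BY c.category_name
""").fetchall()
```

Both queries return the same totals; the trade-off is fewer joins in the star form against less repeated dimension data in the snowflake form.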
Indexing on Data Warehouse
• Bitmap Indexing
• B-tree Indexing
Why Bitmap Indexing?

• Reduced response time for large classes of ad hoc queries
• Reduced storage requirements compared to other indexing techniques
• Efficient maintenance during parallel DML and loads
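To show why bitmap indexes suit ad hoc filters on low-cardinality columns, here is a toy sketch that stores one bitmap per distinct value as a Python integer. It is an illustration of the idea only, not how any particular database implements it; the column data and helper names are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)


def rows_from_bitmap(bitmap):
    """Return the row ids whose bits are set in the bitmap."""
    rows = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_id)
        bitmap >>= 1
        row_id += 1
    return rows


# Hypothetical low-cardinality columns of a six-row table.
region = ["east", "west", "east", "north", "west", "east"]
channel = ["web", "web", "store", "web", "store", "web"]

region_idx = build_bitmap_index(region)
channel_idx = build_bitmap_index(channel)

# WHERE region = 'east' AND channel = 'web' becomes a single bitwise AND.
east_and_web = rows_from_bitmap(region_idx["east"] & channel_idx["web"])

# WHERE region = 'east' OR region = 'west' becomes a bitwise OR.
east_or_west = rows_from_bitmap(region_idx["east"] | region_idx["west"])
```

Because each predicate is a bitmap, arbitrary combinations of conditions can be answered with cheap bitwise operations before any table row is read.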


Change Data Capture

• Simple pass through
• Slowly growing target
• Slowly changing dimensions: Type-1, Type-2, Type-3
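The slides only name the three slowly changing dimension types. The sketch below follows their commonly used definitions (Type-1 overwrites, Type-2 adds a new versioned row, Type-3 keeps the previous value in an extra column). The row layout, column names, and function names are assumptions chosen for illustration.

```python
from datetime import date


def scd_type1(row, new_city):
    """Type-1: overwrite the attribute; no history is kept."""
    return {**row, "city": new_city}


def scd_type2(rows, customer_id, new_city, change_date):
    """Type-2: close the current row and append a new version; history is kept."""
    next_key = max(r["customer_key"] for r in rows) + 1
    result = []
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            # Expire the old version.
            result.append({**row, "end_date": change_date, "is_current": False})
            # Add the new version under a new surrogate key.
            result.append({**row, "customer_key": next_key, "city": new_city,
                           "start_date": change_date, "end_date": None,
                           "is_current": True})
        else:
            result.append(row)
    return result


def scd_type3(row, new_city):
    """Type-3: keep only the previous value in an extra column."""
    return {**row, "previous_city": row["city"], "city": new_city}


# Hypothetical customer dimension with one current row.
dim_customer = [{
    "customer_key": 1, "customer_id": "C1", "city": "Pune",
    "start_date": date(2020, 1, 1), "end_date": None, "is_current": True,
}]

type1_row = scd_type1({"customer_id": "C1", "city": "Pune"}, "Delhi")
type2_rows = scd_type2(dim_customer, "C1", "Delhi", date(2024, 6, 1))
type3_row = scd_type3({"customer_id": "C1", "city": "Pune"}, "Delhi")
```

Type-2 is the only variant here that preserves the full change history, at the cost of more rows in the dimension.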
ETL (Extract Transform Load)

• Extract data from source systems
• Apply Transformation functions (cleaning, filtering, checking for consistency, etc.)
• Load data into warehouse
• Schedule warehouse
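The extract, transform, and load steps above can be sketched as three small functions. The inline CSV source, the `sales` table, and the specific cleaning rules are assumptions for illustration; scheduling is left out and would typically be handled by an external scheduler.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; a real source would be a file or a database.
SOURCE_CSV = """order_id,customer,amount
1, alice ,50.0
2,BOB,not_a_number
3,carol,-5
4,dave,20.5
"""


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, filter bad rows, check amounts for consistency."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # filtering: drop rows whose amount is not numeric
        if amount < 0:
            continue  # consistency check: negative amounts are rejected here
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: bulk-insert the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute("SELECT * FROM sales ORDER BY order_id").fetchall()
```

Only the two rows that pass the cleaning and consistency rules reach the `sales` table; rejected rows would normally be logged for review rather than silently dropped.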
