Data Warehouse & Business Intelligence

Introducing Dimensional Model

Parmonangan Togatorop: mona.togatorop@del.ac.id


Del Institute of Technology

Overview: Architecture of a Data Warehouse with a Staging Area and Data Marts

DBI/PAT/2017 2
Kimball Life Cycle

Normalized vs. Dimensional
• There are two leading approaches to storing data in a data
warehouse:
1. The normalized approach (Inmon)
2. The dimensional approach (Kimball)
• Other approaches also exist.

Normalized
• The data in the data warehouse are stored following, to a
degree, database normalization rules.
• Tables are grouped together by subject areas that reflect
general data categories (e.g., data on customers, products,
finance, etc.)
• The main advantage of this approach is that it is
straightforward to add information into the database.
• Disadvantage: because of the number of tables involved, it
can be difficult for users to (1) join data from different sources
into meaningful information and (2) access the information
without a precise understanding of the sources of data and
of the data structure of the data warehouse.

Dimensional Modeling
• Dimensional modeling is a logical design technique for
structuring data so that it’s intuitive to business users and
delivers fast query performance.
• Transaction data are partitioned into facts (numeric
transaction data) and dimensions (reference information that
gives context to the facts).
– For example, a sales transaction can be broken up into:
• facts such as the number of products ordered and the price paid
for the products,
• dimensions such as order date, customer name, product number,
order ship-to and bill-to locations, and salesperson responsible for
receiving the order.
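
The sales example above can be sketched as a small star schema. This is a minimal illustration using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made for this sketch and do not come from the slides.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, order_date TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, product_number TEXT);

-- Fact rows hold numeric measurements plus foreign keys to the dimensions.
CREATE TABLE fact_sales (
    date_key         INTEGER REFERENCES dim_date(date_key),
    customer_key     INTEGER REFERENCES dim_customer(customer_key),
    product_key      INTEGER REFERENCES dim_product(product_key),
    quantity_ordered INTEGER,
    price_paid       REAL
);

INSERT INTO dim_date     VALUES (1, '2017-03-01'), (2, '2017-03-02');
INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO dim_product  VALUES (1, 'P-100'), (2, 'P-200');
INSERT INTO fact_sales   VALUES (1, 1, 1, 2, 20.0), (1, 2, 2, 1, 15.0), (2, 1, 2, 3, 45.0);
""")

# The customer dimension gives context to the facts: total price paid per customer.
sales_by_customer = conn.execute("""
    SELECT c.customer_name, SUM(f.price_paid)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.customer_name
    ORDER BY c.customer_name
""").fetchall()
print(sales_by_customer)
```

Each business question becomes a join from the fact table to one or more dimensions, followed by an aggregation of the numeric facts.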

Benefits of Dimensional Modeling
• Understandability: a dimensional data warehouse is easier for
users to understand and use.
• Query performance: retrieval of data from the data
warehouse tends to be very fast.

Fact and Dimensions
• Dimensional modeling implies two distinct types of data:
1. Facts
2. Dimensions
• These data are stored in two types of tables:
1. Fact tables
2. Dimension tables

Fact Table

Q1. What is a factless fact table?

Dimension Table

Fact-Dimension Data Model

Advantages:
- Easy to understand
- Better performance
- Extensible

Granularity
• Granularity refers to the level of detail stored in a table
• When identifying the grain, we must specify exactly what a fact
table record means.
• Each fact and dimension table is said to have its own grain or
granularity. In other words, each table (either fact or dimension)
will have some level of detail associated with it.
• The more detail there is, the lower the level of granularity. The
less detail there is, the higher the level of granularity.
• The grain of the dimensional model is the finest level of detail
implied by the joining of the fact and dimension tables. For
example, the granularity of a dimensional model consisting of
the dimensions date (year, quarter, month, and day), store (region,
district, and store), and product (category name, brand, and
product) is “product sold in store by day”.
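
To make the idea of grain concrete, the sketch below rolls fact rows from the grain "product sold in store by day" up to the coarser grain "product sold in store by month". The rows and names are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical fact rows at the grain "product sold in store by day".
daily_sales = [
    # (day, store, product, units_sold)
    ("2017-03-01", "Store A", "Milk", 10),
    ("2017-03-02", "Store A", "Milk", 7),
    ("2017-03-01", "Store B", "Milk", 4),
    ("2017-04-01", "Store A", "Milk", 5),
]

# Roll up to the coarser grain "product sold in store by month".
monthly_sales = defaultdict(int)
for day, store, product, units in daily_sales:
    month = day[:7]  # keep only "YYYY-MM"
    monthly_sales[(month, store, product)] += units

print(dict(monthly_sales))
```

Detailed rows can always be summarized to a coarser grain, but the daily figures cannot be recovered from the monthly totals, which is one reason the finest useful grain is usually preferred.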

Conformed Dimensions
• Common, standardized, master dimensions
• A conformed dimension is a dimension that has the same
meaning and content when referenced from different fact
tables. A conformed dimension can be referenced by multiple tables in
multiple data marts within the same organization. For example,
Time is a common conformed dimension because its
attributes (day, week, month, quarter, year, etc.) have the
same meaning when joined to any fact table.
• Conformed dimensions deliver consistent descriptive attributes
across dimensional models. They support the ability to drill
across and integrate data from multiple business processes.
• Reusing conformed dimensions shortens the time-to-market by
eliminating redundant design and development efforts.

Conformed Facts

• Fact conformation means that if two facts exist in two separate
locations, then they must have the same name and definition.
• For example, revenue and profit are facts that must be
conformed. Once a fact is conformed, all business processes
agree on one common definition for it. Revenue and profit,
even when taken from separate fact tables, can then be
mathematically combined.
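
As a rough illustration of combining a conformed fact across fact tables, the sketch below merges two per-month revenue result sets that share the same month key. The two result sets, their values, and the drill_across helper are hypothetical; in practice each would come from a separate query against its own fact table.

```python
# Hypothetical query results from two separate fact tables. Both use the same
# conformed month attribute and the same definition of revenue.
store_revenue = {"2017-01": 1000.0, "2017-02": 1200.0}
online_revenue = {"2017-01": 300.0, "2017-03": 450.0}

def drill_across(*result_sets):
    """Add up per-month amounts from result sets that share a conformed month key."""
    combined = {}
    for result in result_sets:
        for month, amount in result.items():
            combined[month] = combined.get(month, 0.0) + amount
    return dict(sorted(combined.items()))

total_revenue = drill_across(store_revenue, online_revenue)
print(total_revenue)
```

The addition is only meaningful because both sources agree on what a month and revenue mean; without conformed dimensions and facts the keys would not line up and the sums would mix different definitions.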

DBI/PAT/2017 16
Slowly Changing Dimensions

Q2. What is a hybrid SCD?

DBI/PAT/2017 17
Slowly Changing Dimensions (2)
[Figure: SCD Type 1, Type 2, and Type 3]

DBI/PAT/2017 18
Enterprise Data Warehouse Bus Architecture
• Planning the construction of the overall DW/BI environment is a
critical activity.
• Building it all at once is too daunting.
• Building it as isolated pieces defeats the overall goals.
What is to be done?
“start with a quick and sufficient effort that defines the
overall enterprise DW/BI data architecture”
-> the enterprise data warehouse bus matrix

DBI/PAT/2017 19
Enterprise Data Warehouse Bus Architecture (2)
The Enterprise Data Warehouse Bus Matrix
• The matrix delivers the big picture perspective, regardless of
database or technology preferences, while also identifying
reasonably manageable development efforts.
• Each business process implementation incrementally builds
out the overall architecture.
• Multiple development teams can work on components of the
matrix fairly independently and asynchronously.

DBI/PAT/2017 20
Enterprise Data Warehouse Bus Architecture (3)
• Enterprise Data Warehouse Bus allows for incremental data
warehouse and business intelligence (DW/BI) development. It
decomposes the DW/BI planning process into manageable
pieces by focusing on the organization’s core business
processes, along with the associated conformed dimensions.
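
A bus matrix can be pictured as a grid of business processes against conformed dimensions. The sketch below is one possible representation, not a prescribed format; the processes and dimensions listed, including the "warehouse" dimension, are assumptions chosen for illustration.

```python
# Hypothetical bus matrix: each business process (row) maps to the set of
# conformed dimensions (columns) that it uses.
dimensions = ["date", "product", "customer", "warehouse"]
bus_matrix = {
    "orders":    {"date", "product", "customer"},
    "shipments": {"date", "product", "customer", "warehouse"},
    "inventory": {"date", "product", "warehouse"},
}

# Print the matrix: "X" marks a dimension used by the process.
print(f"{'process':<10} " + " ".join(f"{d:<9}" for d in dimensions))
for process, used in bus_matrix.items():
    marks = " ".join(f"{'X' if d in used else '.':<9}" for d in dimensions)
    print(f"{process:<10} {marks}")

# Dimensions used by every process are the first candidates to conform.
shared_by_all = set.intersection(*bus_matrix.values())
print(sorted(shared_by_all))
```

Each row can be implemented as a separate project, while the shared columns show which dimensions must be agreed on first so that the separately built pieces still fit together.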

DBI/PAT/2017 21
Four-Step Dimensional Design Process

1. Select the subject area/business process to model.
2. Declare the grain of the business process.
3. Choose the dimensions that apply to each fact table
row and define their attributes.
4. Identify the numeric facts that will populate each fact
table row.
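
One way to keep the four decisions together is to record them in a single structure. The sketch below is only an illustration; the DimensionalDesign class and the retail sales values are hypothetical and not part of the method itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DimensionalDesign:
    """Records the outcome of the four design steps for one fact table."""
    business_process: str                                  # step 1
    grain: str                                             # step 2
    dimensions: List[str] = field(default_factory=list)    # step 3
    facts: List[str] = field(default_factory=list)         # step 4

# Hypothetical design for a retail sales process.
retail_sales = DimensionalDesign(
    business_process="retail sales",
    grain="one row per line item on a customer's sales ticket",
    dimensions=["date", "product", "store"],
    facts=["quantity_sold", "sales_amount"],
)
print(retail_sales)
```

Writing the grain down as an explicit sentence before listing dimensions and facts makes it easier to check later that every dimension and fact is consistent with it.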

DBI/PAT/2017 22
1. Select the business process
• A process is a natural business activity performed in the
organization.
• It is typically supported by a source data collection system.
• Example business processes include:
– raw materials purchasing,
– orders,
– shipments,
– invoicing,
– inventory,
– general ledger.
• A business process is not an organizational department or function. For
example, we build a single dimensional model to handle orders
data rather than building separate models for the sales and
marketing departments, which both want to access orders data.

2. Declare the grain
• Declaring the grain means specifying exactly what an individual
fact table row represents.
• The grain conveys the level of detail associated with the fact
table measurements.
• It provides the answer to the question, “How do you describe a
single row in the fact table?”
• Example grain declarations include:
– An individual line item on a customer’s retail sales ticket as measured by a scanner device
– A line item on a bill received from a doctor
– An individual boarding pass to get on a flight
– A daily snapshot of the inventory levels for each product in a warehouse
– A monthly snapshot for each bank account

2. Declare the grain (2)
• An inappropriate grain declaration will haunt a data warehouse
implementation.
• Declaring the grain is a critical step that can’t be taken lightly.
• If, in steps 3 or 4, we see that the grain statement is wrong, we
must return to step 2, redeclare the grain correctly, and revisit
steps 3 and 4.

DBI/PAT/2017 25
3. Choose the dimensions
• If we are clear about the grain, then the dimensions typically can
be identified quite easily.
• Dimensions represent all possible descriptions that take on single
values in the context of each measurement.
• Examples of common dimensions include:
– date,
– product,
– customer,
– transaction type,
– status.

DBI/PAT/2017 26
4. Identify the facts
• Facts are determined by answering the question, “What are we
measuring?”
• All candidate facts in a design must be true to the grain defined
in step 2.
• Facts that clearly belong to a different grain must be in a separate
fact table.
• Typical facts are numeric additive figures such as quantity
ordered or dollar cost amount.
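
The sketch below illustrates what "additive" means in practice: a fact such as quantity ordered or dollar cost can be summed along any dimension. The order-line rows and the total_by helper are hypothetical and exist only for this example.

```python
# Hypothetical facts at the grain "one row per order line".
order_lines = [
    {"product": "Pen",  "customer": "Alice", "quantity_ordered": 3, "dollar_cost": 6.0},
    {"product": "Pen",  "customer": "Bob",   "quantity_ordered": 1, "dollar_cost": 2.0},
    {"product": "Book", "customer": "Alice", "quantity_ordered": 2, "dollar_cost": 30.0},
]

def total_by(rows, dimension, fact):
    """Sum an additive fact, grouped by the values of one dimension."""
    totals = {}
    for row in rows:
        totals[row[dimension]] = totals.get(row[dimension], 0) + row[fact]
    return totals

quantity_by_product = total_by(order_lines, "product", "quantity_ordered")
cost_by_customer = total_by(order_lines, "customer", "dollar_cost")
print(quantity_by_product)
print(cost_by_customer)
```

Because every row is at the same grain, the same facts can be totalled by product, by customer, or by any other dimension without double counting.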

Bibliography
1. Ralph Kimball, Margy Ross - The Data Warehouse
Toolkit, Second Edition, Wiley & Sons, 2007
