Class Activity
List down some of your daily activities
Analyse what kind of data you are generating for
companies
Visualize the quantum of data you are generating
Understand how this data is used by companies for CRM
Learning Objectives
Understand the different types of Customer related data
- Corporate customer data
- Structured and unstructured data
Know good quality or clean data
Get introduced to the concept of Data Warehousing
ABSA case study for understanding the application of
data warehouse in CRM
Relational database
Concept of a primary key: in a sales database, for example, each
customer is assigned a unique number, which is the primary key
Relational databases share a common structure of files, records and
fields (tables, rows and columns)
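A minimal sketch of the primary-key idea, using Python's built-in sqlite3 module. The table name, column names and customer records are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the schema and the sample customers are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
conn.executemany(
    "INSERT INTO customers (customer_id, name, city) VALUES (?, ?, ?)",
    [(101, "Asha", "Pune"), (102, "Ravi", "Delhi")],
)

def try_insert_duplicate():
    """Return True if a second customer with id 101 is accepted."""
    try:
        conn.execute("INSERT INTO customers VALUES (101, 'Duplicate', 'Mumbai')")
    except sqlite3.IntegrityError:
        # The primary key guarantees one row per customer number.
        return False
    return True

duplicate_accepted = try_insert_duplicate()

# The primary key identifies exactly one record (row) in the table.
row = conn.execute(
    "SELECT name, city FROM customers WHERE customer_id = ?", (102,)
).fetchone()
customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The duplicate insert is rejected, so each customer number still maps to a single row.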
3/14/16
Amit Kumar
Data Integration
Integration of multiple databases in a standardized
manner is data integration
It helps to create a single view of the customer across
different departments
[Figure: Marketing, Sales, Customer service, Supply chain and Finance each contribute to a single view of the customer]
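One possible way to picture a single view of the customer is to merge departmental records on a shared customer number. The sketch below assumes three hypothetical departmental data sets keyed by the same customer id; the field names are invented for illustration.

```python
# Hypothetical departmental records, all keyed by the same customer id.
sales = {101: {"last_order_rs": 1200}, 102: {"last_order_rs": 450}}
marketing = {101: {"segment": "premium"}, 103: {"segment": "new"}}
service = {102: {"open_tickets": 1}}

def single_customer_view(**departments):
    """Merge per-department records into one profile per customer."""
    view = {}
    for dept, records in departments.items():
        for customer_id, fields in records.items():
            profile = view.setdefault(customer_id, {"customer_id": customer_id})
            for field, value in fields.items():
                # Prefix each field with its department to avoid name clashes.
                profile[f"{dept}.{field}"] = value
    return view

customer_view = single_customer_view(
    sales=sales, marketing=marketing, service=service
)
```

Customer 101 now carries both the sales and the marketing fields in one profile, which is the standardized, cross-department record that data integration aims for.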
Fun Quiz
1. What is a data warehouse?
a. Name of a store of a company
b. Collection of data
c. Warehouse of data
Answer is c
Fun Quiz
2. Data in the data warehouse can be helpful for better
decision making for a business.
a. True
b. False
Answer is a
Fun Quiz
3. ETL is an abbreviation for:
a. Elevation, Transfer and Loading
b. Extraction, Transformation and Loading
Answer is b
Fun Quiz
4. OLAP is an abbreviation for:
Answer is Online Analytical Processing
Fun Quiz
5. OLTP is an abbreviation for:
Answer is Online Transaction Processing
Fun Quiz
6. ERP is an abbreviation for:
Answer is Enterprise Resource Planning
Data Warehouse
Subject-oriented: data is organized around the essential
subjects of the business (customers and products)
rather than around applications such as inventory
management or order processing
Integrated: data from several sources is extracted and
transformed so that it is consistent
Time-variant: data is organized by various time
periods
Non-volatile: the warehouse data is not updated in real
time; transactional and other data is bulk-uploaded
periodically
Data Marts
Scaled down version or
subset of the data warehouse
Customized for use in
particular department
Less complex and less expensive
Volume of data is less
Data Warehouse vs. Data Mart
Data Warehouse: bigger size, expensive
Data Mart: relatively simpler, easier to manage, less expensive
Knowledge management
Practice of consciously gathering, organizing, storing,
interpreting, distributing and judiciously applying
knowledge to fulfill the customer management goals
and objectives of the organization
The STARTS attributes apply to knowledge as well:
knowledge needs to be shareable, transportable,
accurate, relevant, timely (updated) and secure
Setting up a DW
Identify the sources of data
Where are the data stored?
Extract the data from these systems
Transform the data into a standardized and clean format
Upload the data from these systems
Update/refresh the data in the warehouse
ETL (Extract, Transform & Load)
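The extract, transform and load steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their field names and their date formats are hypothetical, and a plain list stands in for the warehouse.

```python
from datetime import datetime

# Hypothetical source systems with inconsistent formats.
BILLING = [
    {"cust": " 101", "name": "asha rao ", "billed_on": "14/03/2016", "amount": "1200.50"},
]
SALES = [
    {"customer_id": 102, "customer_name": "RAVI KUMAR", "order_date": "2016-03-12", "value": 450},
]

def extract():
    """Extract: pull raw records from each source system."""
    for record in BILLING:
        yield "billing", record
    for record in SALES:
        yield "sales", record

def transform(source, record):
    """Transform: bring every record into one standardized, clean format."""
    if source == "billing":
        return {
            "customer_id": int(record["cust"].strip()),
            "name": record["name"].strip().title(),
            "date": datetime.strptime(record["billed_on"], "%d/%m/%Y").date().isoformat(),
            "amount_rs": float(record["amount"]),
        }
    return {
        "customer_id": int(record["customer_id"]),
        "name": record["customer_name"].strip().title(),
        "date": record["order_date"],
        "amount_rs": float(record["value"]),
    }

def load(warehouse, rows):
    """Load: append the cleaned rows to the warehouse in one bulk step."""
    warehouse.extend(rows)

warehouse = []
load(warehouse, [transform(source, record) for source, record in extract()])
```

Running the same three functions again on new source records would correspond to the periodic update/refresh step.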
Simple DW Architecture
[Figure: Marketing, Sales, Billing and Supply Chain source systems feed an ETL / integration layer that loads the Data Warehouse; data marts (DM) draw on the warehouse and support CRM analytics: OLAP analysis and data mining]
Reporting - Example
Fact Table: Total sales revenue, Total quantity, Freight, Discount
Time Dimension: Order date, Year, Quarter, Month
Customer Dimension: Name, Address, Age, Income
Product Dimension: Name, Category, Price
Employee Dimension: Name, Supervisor, Department, Region, Territory
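A fact table surrounded by dimensions can be queried by joining the fact rows to the dimension rows and aggregating. The sketch below uses sqlite3 with only the Time and Customer dimensions; the table names, column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical star schema: one fact table and two of the dimensions.
conn.executescript("""
CREATE TABLE dim_time (
    time_id INTEGER PRIMARY KEY, order_date TEXT,
    year INTEGER, quarter INTEGER, month INTEGER);
CREATE TABLE dim_customer (
    customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT,
    age INTEGER, income REAL);
CREATE TABLE fact_sales (
    time_id INTEGER REFERENCES dim_time(time_id),
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    sales_revenue REAL, quantity INTEGER, freight REAL, discount REAL);

INSERT INTO dim_time VALUES
    (1, '2015-11-20', 2015, 4, 11),
    (2, '2016-02-03', 2016, 1, 2),
    (3, '2016-03-14', 2016, 1, 3);
INSERT INTO dim_customer VALUES
    (101, 'Asha', 'Pune', 34, 900000),
    (102, 'Ravi', 'Delhi', 45, 600000);
INSERT INTO fact_sales VALUES
    (1, 101, 500.0, 2, 20.0, 0.0),
    (2, 101, 300.0, 1, 10.0, 5.0),
    (3, 102, 700.0, 3, 30.0, 0.0);
""")

# Report: total sales revenue and total quantity per year and customer.
report = conn.execute("""
SELECT t.year, c.name, SUM(f.sales_revenue), SUM(f.quantity)
FROM fact_sales AS f
JOIN dim_time AS t ON t.time_id = f.time_id
JOIN dim_customer AS c ON c.customer_id = f.customer_id
GROUP BY t.year, c.name
ORDER BY t.year, c.name
""").fetchall()
```

The measures (revenue, quantity) live in the fact table, while the dimensions supply the attributes used for grouping.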
Estimation
Prediction
Affinity Grouping
Clustering
Description
Characteristics
Data Collection: "What is my average total revenue over the last 3 years?"; static data
Data Access: dynamic data delivery at record level
Data Navigation: OLAP (Online Analytical Processing), multidimensional databases; dynamic data at multiple levels
Data Mining: advanced algorithms, multiprocessor computers, massive databases; prospective, proactive information delivery
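Viewing data "at multiple levels" can be illustrated by rolling the same sales records up to quarter level and to year level, which also answers the average-revenue question from the data collection stage. The records below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (year, quarter, revenue).
SALES = [
    (2014, 1, 100.0), (2014, 3, 140.0),
    (2015, 2, 200.0), (2015, 4, 160.0),
    (2016, 1, 300.0),
]

def roll_up(records, level):
    """Total revenue at the requested level: 'year' or 'quarter'."""
    totals = defaultdict(float)
    for year, quarter, revenue in records:
        key = (year,) if level == "year" else (year, quarter)
        totals[key] += revenue
    return dict(totals)

by_quarter = roll_up(SALES, "quarter")   # detailed level
by_year = roll_up(SALES, "year")         # summarized level

# "What is my average total revenue over the last 3 years?"
average_yearly_revenue = sum(by_year.values()) / len(by_year)
```

An OLAP tool performs this kind of roll-up and drill-down interactively across many dimensions; the function here only mimics the idea for one.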
MS Excel
Relevance
Database Marketing
Sales Forecasting
Clustering
Basket Analysis
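As a rough illustration of basket analysis, the sketch below counts how often two products are bought together and derives the usual support and confidence measures. The baskets are hypothetical telecom purchases, not data from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: products bought together by four customers.
BASKETS = [
    {"talk time", "3G data"},
    {"talk time", "SMS pack"},
    {"talk time", "3G data", "SMS pack"},
    {"3G data"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in BASKETS:
    item_counts.update(basket)
    # Sort so that each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))

def support(item_a, item_b):
    """Share of all baskets that contain both items."""
    return pair_counts[tuple(sorted((item_a, item_b)))] / len(BASKETS)

def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    together = pair_counts[tuple(sorted((antecedent, consequent)))]
    return together / item_counts[antecedent]

support_talk_data = support("talk time", "3G data")
confidence_talk_to_data = confidence("talk time", "3G data")
```

Pairs with high support and confidence are candidates for cross-selling offers.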
Class Exercise - 1
List 3 data mining applications and their relevance to
CRM for each of the following industries:
- Telecommunication
- Banking
How it works?
Decision trees recursively split data into smaller and
smaller cells which are increasingly "pure" in the sense
of having similar values of the target
A decision tree uses the target variable to determine how
each input should be partitioned
It breaks the data into segments, defined by splitting
rules at each step.
Taken together, the rules for all the segments form the
decision tree model
Talk time recharge value (Rs.)   3G data recharge value   Class
50                               200                      A
50                               250                      B
100                              200                      A
150                              200                      B
150                              200                      B
50                               200                      A
150                              250                      B
50                               250                      B
[Figure: decision tree that splits on talk time recharge value (50, 100, 150) and 3G data recharge value (200, 250) to reach classes A and B]
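The recursive splitting described earlier can be sketched on the eight recharge records from this section. This is a minimal illustration, not the method used in the slides: it assumes Gini impurity as the purity measure and midpoints between observed values as candidate thresholds, and the feature names are my own.

```python
from collections import Counter

# Rows from the recharge table: (talk time Rs., 3G data value, class).
ROWS = [
    (50, 200, "A"), (50, 250, "B"), (100, 200, "A"), (150, 200, "B"),
    (150, 200, "B"), (50, 200, "A"), (150, 250, "B"), (50, 250, "B"),
]
FEATURES = {"talk_time": 0, "data_3g": 1}  # hypothetical feature names

def gini(rows):
    """Gini impurity of the class labels: 0.0 means the cell is pure."""
    n = len(rows)
    if n == 0:
        return 0.0
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows):
    """Find the (feature, threshold) with the lowest weighted impurity."""
    best = None
    for name, idx in FEATURES.items():
        values = sorted({row[idx] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r for r in rows if r[idx] <= threshold]
            right = [r for r in rows if r[idx] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, name, threshold, left, right)
    return best

def build_tree(rows):
    """Split recursively until every cell holds a single class."""
    if gini(rows) == 0.0:
        return rows[0][-1]
    split = best_split(rows)
    if split is None:  # no feature can separate the rows any further
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    _, name, threshold, left, right = split
    return {"feature": name, "threshold": threshold,
            "left": build_tree(left), "right": build_tree(right)}

def predict(tree, talk_time, data_3g):
    """Follow the splitting rules from the root down to a leaf."""
    record = {"talk_time": talk_time, "data_3g": data_3g}
    while isinstance(tree, dict):
        branch = "left" if record[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

tree = build_tree(ROWS)
```

On this data the sketch first splits on talk time (customers recharging Rs. 150 are all class B) and then on the 3G data value, so two rules are enough to separate classes A and B.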
Actions: Will be retained, Will churn, Refer customers
[Table: yes/no entries for each action; the row and column layout is not recoverable]
Regression Analysis
Predictive modeling technique
Both input and target variables should be numeric
Purpose of regression
Quantify the relationship among two or more variables.
Explain a dependent variable, from a set of predictor
variables, called the independent variables
Uses a linear additive relation between the dependent and
independent variables
Concepts
Variable
Dependent variable
Independent variable
Correlation
Correlation Coefficient
Line of best fit
Regression can
Estimate the value of target variable
Describe the relationship between variables
Residuals (Actual - Predicted)
R² (Coefficient of Determination)
Demand Analysis
$Sales_t = a + b_1 Price_t + e_t$
Simple Regression
$Y_t = a + b_1 X_{1t} + e_t$
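A small worked sketch of simple regression for demand analysis, fitting $Sales_t = a + b_1 Price_t + e_t$ by ordinary least squares. The price and sales figures are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical observations (not from the source): price and units sold.
prices = [10, 12, 14, 16, 18]
sales = [151, 138, 132, 118, 111]

def fit_simple_regression(x, y):
    """Least-squares estimates of the intercept a and slope b1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    a = mean_y - b1 * mean_x
    return a, b1

a, b1 = fit_simple_regression(prices, sales)

# Residuals are actual minus predicted values.
predicted = [a + b1 * p for p in prices]
residuals = [actual - fitted for actual, fitted in zip(sales, predicted)]

# R squared: share of the variation in sales explained by price.
mean_sales = sum(sales) / len(sales)
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((s - mean_sales) ** 2 for s in sales)
r_squared = 1 - ss_res / ss_tot

# Estimate the target at a future price of 20.
forecast = a + b1 * 20
```

For this data the fitted line is approximately Sales = 200 - 5 × Price, so each unit increase in price is associated with about five fewer units sold.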
[Figure labels: Future Prices, Regression]
Multiple Regression
Multiple independent variables
$Y_i = a + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i$
Example
$Sales_t = a + b_1 Price_t + b_2 Adv_t + e_t$
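The multiple regression $Sales_t = a + b_1 Price_t + b_2 Adv_t + e_t$ can be estimated by solving the normal equations. The sketch below does this in plain Python; the data are hypothetical and were generated from Sales = 100 - 4 × Price + 3 × Adv with no error term, so the fit should recover those coefficients.

```python
# Hypothetical data generated from Sales = 100 - 4*Price + 3*Adv.
price = [10, 12, 14, 16, 18, 20]
adv = [5, 3, 6, 2, 7, 4]
sales = [75, 61, 62, 42, 49, 32]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_multiple_regression(predictors, target):
    """Return [a, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(target))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)

a, b1, b2 = fit_multiple_regression([price, adv], sales)
```

The leading 1.0 in each row is the intercept term; adding another independent variable only means passing one more list in `predictors`.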
Link Analysis
Based on a branch of mathematics called Graph Theory,
which represents relationships between different objects
as edges in a graph.
It can be used for both directed and undirected data
mining
Graph theory
Helps in visualizing relationships
Is not applicable to all types of data
Cannot solve all types of problems
Yields good results in
Analyzing link between web pages
Analyzing telephone call patterns to find influential customers
Understanding physician referral patterns
A Graph consists of
Nodes (vertices): things in the graph that have
relationships, e.g. people, organizations, objects
Edges: pairs of nodes connected by a relationship
LinkedIn
Facebook (predicting the home address)
Dating sites
Six degrees of Separation (Stanley Milgram, 1967)
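A graph of nodes and edges can be held as an adjacency structure. The sketch below assumes a hypothetical set of telephone calls between customers, counts each customer's connections as a simple stand-in for influence, and uses breadth-first search to measure the degrees of separation between two people.

```python
from collections import deque

# Hypothetical call records: each pair is an undirected edge.
CALLS = [
    ("Asha", "Ravi"), ("Asha", "Meena"), ("Asha", "John"),
    ("Ravi", "Meena"), ("John", "Sara"), ("Sara", "Vikram"),
]

def build_graph(edges):
    """Map every node to the set of nodes it is connected to."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degrees_of_separation(graph, start, goal):
    """Length of the shortest chain of edges between two nodes, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in graph[node] - seen:
            seen.add(neighbour)
            queue.append((neighbour, distance + 1))
    return None

graph = build_graph(CALLS)

# The customer with the most distinct contacts.
most_connected = max(graph, key=lambda name: len(graph[name]))
separation = degrees_of_separation(graph, "Ravi", "Vikram")
```

Here Asha has the most contacts, and Ravi reaches Vikram through a chain of four calls. Real link analysis would typically weight edges (for example by call frequency) rather than only counting them.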