
BUSINESS MATHEMATICS & STATISTICS
Module 6
Correlation (Lectures 28-29)
Line Fitting (Lectures 30-31)
Time Series and Exponential Smoothing (Lectures 32-33)
LECTURE 28
Review Lecture 27
Measures of Dispersion
Correlation
Part 1
Variance

Important Measure of Variation


Shows Variation About the Mean:

For the Population (use N in the denominator):
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$

For the Sample (use n - 1 in the denominator):
$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$
Standard Deviation

Most Important Measure of Variation


Shows Variation About the Mean
Same unit of measurement as the observations
For the Population (use N in the denominator):
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$

For the Sample (use n - 1 in the denominator):
$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$
Sample Standard Deviation

For the Sample: use n - 1 in the denominator.
$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

Data: X_i: 10 12 14 15 17 18 18 24

n = 8, Mean = 16

$$s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$

Comparing Standard Deviations

Data: X_i: 10 12 14 15 17 18 18 24
N = 8, Mean = 16

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} = 4.0311$$

Value for the Standard Deviation is larger for data considered as a Sample.
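
The comparison between the two denominators can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the lecture.

```python
import math
import statistics

# The eight observations listed on the slide.
data = [10, 12, 14, 15, 17, 18, 18, 24]

mean = statistics.mean(data)
sum_sq = sum((x - mean) ** 2 for x in data)  # sum of squared deviations

sample_sd = math.sqrt(sum_sq / (len(data) - 1))  # data treated as a sample: n - 1
population_sd = math.sqrt(sum_sq / len(data))    # data treated as a population: N

# Cross-check against the standard library implementations.
assert math.isclose(sample_sd, statistics.stdev(data))
assert math.isclose(population_sd, statistics.pstdev(data))

print(f"mean = {mean}")
print(f"sample standard deviation     = {sample_sd:.4f}")
print(f"population standard deviation = {population_sd:.4f}")
```

Because the sample formula divides the same sum of squares by a smaller number, its result is always the larger of the two.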
Comparing Standard Deviations

[Figure: dot plots of three data sets on a common scale from 11 to 21]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
Coefficient of Variation

Measure of relative variation


Always a %
Shows variation relative to mean
Used to compare 2 or more groups
Formula (for sample):

$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
Comparing Coefficient of Variation

• Stock A: average price last year = Rs. 50


• Standard deviation = Rs. 5
• Stock B: average price last year = Rs. 100
• Standard deviation = Rs. 5

Coefficient of Variation:
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
Stock A: CV = 10%
Stock B: CV = 5%
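
The stock comparison can be reproduced with a short sketch. The function name `coefficient_of_variation` is an illustrative choice, not something defined in the lecture.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean

# Figures from the slide: both stocks have a standard deviation of Rs. 5.
cv_a = coefficient_of_variation(5, 50)    # Stock A, average price Rs. 50
cv_b = coefficient_of_variation(5, 100)   # Stock B, average price Rs. 100

print(f"Stock A: CV = {cv_a:.0f}%")
print(f"Stock B: CV = {cv_b:.0f}%")
```

Although the two standard deviations are equal in rupees, Stock A varies twice as much relative to its average price.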
Shape

Describes how data are distributed


Measures of shape:
Symmetric or skewed

Left-Skewed: Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-Skewed: Mode < Median < Mean
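
As a rough illustration of these orderings, the sketch below compares only the mean and the median. The function name and the three small data sets are hypothetical, and the comparison is a rule of thumb rather than a formal test: an equal mean and median is consistent with symmetry but does not prove it.

```python
import statistics

def skew_direction(values):
    """Rule-of-thumb skew direction from the mean/median ordering."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean < median:
        return "left-skewed"
    if mean > median:
        return "right-skewed"
    return "symmetric"

# Hypothetical data sets chosen to show each case.
print(skew_direction([1, 8, 9, 9, 10]))   # a low outlier pulls the mean down
print(skew_direction([1, 2, 3, 4, 5]))    # mean and median coincide
print(skew_direction([1, 2, 2, 3, 10]))   # a high outlier pulls the mean up
```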
DISPERSION OF DATA
Mean Deviation About Mean
MD(mean) = Sum |xi - mean| / n

For Grouped data

MD(mean) = Sum fi |xi - mean| / Sum fi

Mean Deviation About Median

MD(median) = Sum |xi - median| / n

For Grouped data

MD(median) = Sum fi |xi - median| / Sum fi
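
A minimal sketch of these four calculations follows. Deviations are taken in absolute value, since signed deviations about the mean always sum to zero. The ungrouped data are the eight observations from the standard deviation example earlier in the lecture; the grouped midpoints and frequencies are hypothetical, as are the function names.

```python
import statistics

def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen center."""
    return sum(abs(x - center) for x in values) / len(values)

def grouped_mean_deviation(midpoints, freqs, center):
    """Frequency-weighted average absolute deviation from a chosen center."""
    weighted = sum(f * abs(x - center) for x, f in zip(midpoints, freqs))
    return weighted / sum(freqs)

# Ungrouped data.
data = [10, 12, 14, 15, 17, 18, 18, 24]
md_mean = mean_deviation(data, statistics.mean(data))
md_median = mean_deviation(data, statistics.median(data))

# Hypothetical grouped data: class midpoints and their frequencies.
midpoints = [5, 15, 25]
freqs = [2, 5, 3]
grouped_mean = sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)
md_grouped = grouped_mean_deviation(midpoints, freqs, grouped_mean)

print(f"MD(mean) = {md_mean}, MD(median) = {md_median}")
print(f"grouped mean = {grouped_mean}, grouped MD(mean) = {md_grouped:.2f}")
```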
CORRELATION

Types of regression models


Determining the simple linear regression equation
Measures of variation in regression and correlation
Assumptions of regression and correlation
Residual analysis
Inferences about the slope
Estimation of predicted values
Pitfalls in regression and ethical issues
Correlation - measuring the strength of the association
Purpose of Regression Analysis

Regression Analysis Is Used Primarily to Model Causality and Provide Prediction
Predict the values of a dependent (response) variable based on values of at least one independent (explanatory) variable
Explain the effect of the independent variables on the dependent variable
Purpose of Regression Analysis

Correlation Analysis Is Used to Measure Strength of Association Between Numerical Variables
Only the Strength of the Relationship Is of Concern
No Causal Effect Is Implied
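
The slides do not state a formula for the strength of association at this point. A common choice, assumed here, is the Pearson correlation coefficient r, which ranges from -1 (perfect negative linear association) to +1 (perfect positive linear association). The sketch below uses hypothetical data and an illustrative function name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one perfectly increasing and one perfectly decreasing line.
xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])
r_down = pearson_r(xs, [10, 8, 6, 4, 2])

print(f"r (increasing) = {r_up:.2f}")
print(f"r (decreasing) = {r_down:.2f}")
```

The coefficient describes only how closely the points follow a straight line; it says nothing about which variable, if either, drives the other.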
The Scatter Diagram

Plot of all (Xi, Yi) pairs

[Figure: scatter diagram with X on the horizontal axis (0 to 60) and Y on the vertical axis (0 to 80)]
Types of Regression Models

[Figure: two scatter plots, one showing a Positive Linear Relationship and one a Negative Linear Relationship]


Types of Regression Models

[Figure: two scatter plots, one showing a Relationship That Is NOT Linear and one showing No Relationship]


Simple Linear Regression Model

Relationship Between Variables Is Described by a Linear Function
The Change of One Variable Causes the Other Variable to Change
A Dependency of One Variable on the Other
Population Linear Regression
• Population Regression Line Is a Straight Line that Describes the Dependence of the Average Value of One Variable on the Other

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

Y_i: Dependent (Response) Variable
X_i: Independent (Explanatory) Variable
\beta_0: Population Y Intercept
\beta_1: Population Slope Coefficient
\varepsilon_i: Random Error
Population Regression Line: $\mu_{Y|X} = \beta_0 + \beta_1 X_i$
Population Linear Regression
(continued)

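[Figure: population regression line on X-Y axes, with an observed value and its random error marked]

Observed Value: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Population Regression Line: $\mu_{Y|X} = \beta_0 + \beta_1 X_i$
\varepsilon_i = Random Error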
Sample Linear Regression

• Sample Regression Line Provides an Estimate of the Population Regression Line as well as a Predicted Value of Y

$$Y_i = b_0 + b_1 X_i + e_i$$

b_0: Sample Y Intercept
b_1: Sample Slope Coefficient
e_i: Residual
Sample Regression Line: $\hat{Y}_i = b_0 + b_1 X_i$

b_0 provides an estimate of \beta_0
b_1 provides an estimate of \beta_1
Sample Linear Regression
(continued)

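[Figure: sample and population regression lines on X-Y axes, marking the intercept b_0, the slope b_1, an observed value, its residual e_i and its random error \varepsilon_i]

Sample: $Y_i = b_0 + b_1 X_i + e_i$, with fitted line $\hat{Y}_i = b_0 + b_1 X_i$
Population: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, with regression line $\mu_{Y|X} = \beta_0 + \beta_1 X_i$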
Simple Linear Regression Equation: Example

You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.

Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Scatter Diagram Example

[Figure: scatter plot of Annual Sales ($000) against Square Feet for the 7 stores]
Excel Output
Equation for Sample Regression Line

$$\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i$$

From Excel Printout:

                Coefficients
Intercept       1636.414726
X Variable 1    1.486633657
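
The coefficients in the printout can be checked without Excel. The sketch below assumes the fit is ordinary least squares, using the standard formulas b1 = Sxy / Sxx and b0 = mean(Y) - b1 * mean(X), which the slides do not spell out. The function name `fit_line` is illustrative; the data are the seven stores from the example.

```python
# Store data from the example: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for the line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

b0, b1 = fit_line(square_feet, annual_sales)
print(f"Intercept (b0) = {b0:.3f}")
print(f"Slope (b1)     = {b1:.3f}")
```

The printed values can be compared directly with the Intercept and X Variable 1 rows of the printout.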
Graph of the Sample
Regression Line
[Figure: the 7 data points with the fitted sample regression line, Annual Sales ($000) against Square Feet]
Interpreting the Results

$$\hat{Y}_i = 1636.415 + 1.487 X_i$$

The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units.

The model estimates that for each increase of 1 square foot in the size of the store, the expected annual sales are predicted to increase by $1,487 (1.487 in units of $1000).
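
To make the interpretation concrete, the fitted equation can be used to predict sales for a given store size. In this sketch the 4,000 square foot store is a hypothetical example (it lies within the range of sizes in the sample), and the function name is illustrative; the coefficients are the rounded values from the slide.

```python
B0 = 1636.415  # intercept from the fitted equation, in $1000
B1 = 1.487     # slope: change in sales ($1000) per extra square foot

def predict_sales(square_feet):
    """Predicted annual sales in $1000 for a store of the given size."""
    return B0 + B1 * square_feet

size = 4000  # hypothetical store size in square feet
predicted = predict_sales(size)
increase = predict_sales(size + 1) - predict_sales(size)

print(f"Predicted sales for {size} sq ft: {predicted:.3f} ($1000)")
print(f"Extra sales for one more square foot: {increase:.3f} ($1000)")
```

The difference between the two predictions equals the slope, which is why one additional square foot corresponds to about $1,487 in predicted annual sales.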