
LINEAR REGRESSION & CORRELATION ANALYSIS
CHAPTER 6
CORRELATION ANALYSIS
• Correlation is a statistical method used to determine the strength of the relationship between two variables (x and y).

• The value used to estimate this strength is called the correlation coefficient.

• The symbol for the population correlation coefficient is ρ (rho).
• The symbol for the sample correlation coefficient is r.

• The range of the correlation coefficient is from -1 to +1.

• The positive (+) and negative (-) signs show the direction of the relationship between the two variables.
• Negative values indicate an inverse relationship and positive values indicate a direct relationship.
• Values of -1.00 or +1.00 indicate perfect and
strong correlation.

• Values close to 0.0 indicate weak correlation


Correlation Coefficient
• Refer to Page 254, Example 7.

Table columns: Representative, Sales Calls, Units Sold
\[ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} \]

\[ r = \frac{10(9661) - (199)(408)}{\sqrt{\left[10(4681) - (199)^2\right]\left[10(20510) - (408)^2\right]}} \]

\[ r = 0.924 \]
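The substitution can be checked numerically. This is a minimal Python sketch, not part of the textbook example: the function name `pearson_r_from_sums` is a hypothetical helper, and the sums are the ones read off the worked formula (n = 10, ΣX = 199, ΣY = 408, ΣXY = 9661, ΣX² = 4681, ΣY² = 20510).

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Sample correlation coefficient r computed from summary sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Sums taken from the worked example (assumed to be read correctly
# from the flattened formula).
r = pearson_r_from_sums(
    n=10, sum_x=199, sum_y=408, sum_xy=9661, sum_x2=4681, sum_y2=20510
)
print(round(r, 3))  # 0.924
```

Working from the sums rather than the raw data keeps the sketch independent of the data table, which is only referenced by page number here.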
What does this correlation mean?

• First, it is positive, so we see there is a direct relationship between number of sales calls and number of units sold.

• The value of 0.924 is really close to 1.00, so the relationship between number of sales calls and number of units sold is strong.

• So we conclude that there is a strong positive relationship between sales calls and number of units sold.
THE SIGNIFICANCE OF THE
CORRELATION COEFFICIENT
• A test on the value of the correlation coefficient needs to be done to see whether there is a significant relationship between the two variables.

• We will test H0 : ρ = 0
H1 : ρ ≠ 0
• Step 1 : Hypotheses
H0 : ρ = 0
H1 : ρ ≠ 0

• Step 2 : Test Statistic

\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.924\sqrt{10-2}}{\sqrt{1-(0.924)^2}} = 6.8345 \]
• Step 3 : Critical Value
  • df = n – 2 = 10 – 2 = 8
  • α/2 = 0.05/2 = 0.025 (two-tailed test)

\[ CV = t_{0.05/2,\,8} = 2.306 \]
• Step 4 : Decision
  • Reject H0, because the test statistic (6.8345) is greater than the critical value (2.306).

• Conclusion
  • There is a significant relationship between the number of sales calls and the number of units sold.
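The four steps can be traced in a few lines of Python. This is a sketch under stated assumptions rather than a general-purpose test: `t_statistic_for_r` is a hypothetical helper name, and the critical value 2.306 is copied from the t-table lookup in Step 3 instead of being computed.

```python
import math


def t_statistic_for_r(r, n):
    """Test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.924, 10            # values from the sales calls example
t_value = t_statistic_for_r(r, n)

critical_value = 2.306      # t(0.05/2, df = 8), taken from the table in Step 3

# Two-tailed decision rule: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t_value) > critical_value
print(round(t_value, 4), reject_h0)
```

With these inputs the statistic comes out at about 6.83, well beyond 2.306, which matches the decision to reject H0.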
Coefficient of Determination, r²
• The proportion of the total variation in the dependent variable, Y, that is explained by the variation in the independent variable, X.
• In the previous example, r = 0.924.
• So r² = (0.924)² = 0.854.
• 85.4% of the variation in the units sold is explained by the variation in the sales calls.
• A greater value of r² means that the model is a better predictor of the value of Y.
REGRESSION ANALYSIS
• Regression is a statistical method used to describe the nature of the relationship between variables – that is, positive or negative, linear or nonlinear.

• Regression analysis can also model the relationship between variables.

• The dependent variable is Y and the independent variable is X.
• Analysis of a linear relationship between Y and X is called a simple linear regression analysis.

• If the value of the correlation coefficient is significant, the next step is to determine the equation of the regression line, which is referred to as the “best-fitting” line.

• To get the equation, we will use the least squares method.
Refer to Example 1,
Page 241
[Scatter plot of Units Sold (vertical axis, 0 to 80) against Sales Calls (horizontal axis, 0 to 40)]
• From the plot, the data appear to be linear, and the appropriate model to construct is y = mx + c.
Computing the Slope of the
Line and the Y-intercept

\[ \text{Slope: } b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} \]

\[ \text{Y-intercept: } a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} \]
Refer to Self Example 2,
Page 244
\[ b = 2.1387, \qquad a = -1.7601 \]

\[ \hat{Y} = a + bX \quad\Rightarrow\quad \hat{Y} = -1.7601 + 2.1387X \]

LINEAR REGRESSION EQUATION


• a – The intercept with the y-axis is below the origin (0,0).

• b – An increase of one call made will result in an increase of 2.1387 (≈ 2) units sold.
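The slope and intercept can be reproduced from the same sums used for the correlation coefficient. The sketch below is illustrative only: `least_squares_from_sums` is a hypothetical helper name, and the sums (n = 10, ΣX = 199, ΣY = 408, ΣXY = 9661, ΣX² = 4681) are assumed to belong to the same sales calls data.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b): intercept and slope of the least squares line."""
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


# Sums assumed from the sales calls example.
a, b = least_squares_from_sums(
    n=10, sum_x=199, sum_y=408, sum_xy=9661, sum_x2=4681
)
print(round(a, 4), round(b, 4))
```

The slope agrees with 2.1387. The intercept computed with the unrounded slope is about -1.7604; the quoted -1.7601 is what results when b is first rounded to 2.1387, so the small difference is a rounding effect.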
Estimate the number of units sold when 20 calls were made in a month.

\[ \hat{Y} = -1.7601 + 2.1387X \]
\[ \hat{Y} = -1.7601 + 2.1387(20) \]
\[ \hat{Y} = 41.0139 \]
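The same estimate can be written as a small function, which makes it easy to try other numbers of calls. This is a sketch: `predict_units_sold` is a hypothetical name, and the default coefficients are the rounded values a = -1.7601 and b = 2.1387 from the regression equation.

```python
def predict_units_sold(calls, a=-1.7601, b=2.1387):
    """Estimated units sold from the fitted line Y-hat = a + b * X."""
    return a + b * calls


estimate = predict_units_sold(20)
print(round(estimate, 4))  # 41.0139
```

The estimate is most trustworthy for values of X inside the observed range of sales calls; the plot axis suggests that range is roughly 0 to 40.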
Linear Regression by SPSS
Testing the Significance of the Model @
Global Test – using output
Step 1 : H0 : β = 0
H1 : β ≠ 0
Step 2 : p-value = 0.000 (Refer to the ANOVA table)

Step 3 : Compare with α = 0.05

α > p-value, so reject H0

Step 4 : Conclusion
**There is a significant relationship between advertising expenses and sales revenue.
Testing the Significance of the
Model @ Global Test – manually
Step 1 : H0 : β = 0
H1 : β ≠ 0
Step 2 : Test Value, F = 46.597 (Refer to the ANOVA table)

Step 3 : Critical value

\[ F_{0.05,\,1,\,8} = 5.32 \]

Step 4 : Reject H0 because the test value is greater than the critical value (46.597 > 5.32)

Step 5 : Conclusion
**There is a significant relationship between advertising expenses and sales revenue.
