
Full-time MBA: Business Statistics (BST510)

Inferential Statistics:
Hypothesis Testing

Section 5
Dr. Paul Bottomley
Bottomleypa@cardiff.ac.uk (Room F03)
Silver, pp. 206-213.
Statistical Hypotheses
A statistical hypothesis is an assertion, claim or prediction
about a population parameter (e.g., μ, σ, ρ).
Statistical inference: does the sample statistic support such
a claim about the population parameter?
The truth is never known with certainty unless we examine
the whole population, hence the problem of sampling error.
Hypotheses are usually formulated in the hope of rejecting
them! But we need strong empirical evidence to do so.

Null hypothesis (H0); Alternative hypothesis (H1).


(H0) and (H1) are mutually exclusive & together exhaustive.
Setting Up Hypotheses for Testing
Three Views of the World | Competing Hypotheses
H(a): Customer satisfaction has not changed (is equal) between time 1 and time 2. | H(d): Customer satisfaction is not equal between t1 and t2.
H(b): Customer satisfaction has decreased between t1 and t2. | H(e): Customer satisfaction has increased or not changed between t1 and t2.
H(c): Customer satisfaction has increased between t1 and t2. | H(f): Customer satisfaction has decreased or remained constant between t1 and t2.

Each pair of hypotheses (row) covers all 3 possible outcomes.


Null hypo. always includes a statement of equality (no change).
One-Tailed and Two-Tailed Tests
A test of any hypothesis where the alternative hypothesis
(H1) is non-directional (two-sided) is called a two-tailed test.

H0: CS1 = CS2
H1: CS1 ≠ CS2
A test of any hypothesis where the alternative hypothesis
(H1) is directional (one-sided) is called a one-tailed test.

H0: CS1 ≤ CS2        H0: CS1 ≥ CS2
H1: CS1 > CS2        H1: CS1 < CS2
Now for an Old Multiple Choice Question
A motor vehicle breakdown company's records suggest that it takes
on average 45 minutes to reach customers (motorists). The firm
makes many changes in an effort to improve its response time and
conducts a survey to test the success of the changes. Identify the
appropriate pair of null & alternative hypotheses to test this claim.

Option A: H0: μ = 45; H1: μ < 45
Option B: H0: μ = 45; H1: μ ≠ 45
Option C: H0: μ ≥ 45; H1: μ < 45 (OK)
Option D: H0: μ ≤ 45; H1: μ > 45

Each pair of hypotheses must be mutually exclusive (have nothing
in common) and exhaustive (cover all three states of the world).
Statistical Significance (α):
Chance of Committing a Type I Error

                            H0 Retained              H0 Rejected
Null hypothesis is true     OK (no error) (1 - α)    Type I error (α)
Null hypothesis is false    Type II error (β)        OK (no error) (1 - β)

                            Verdict: Innocent        Verdict: Guilty
True situation: Innocent    Correct (1 - α)          Wrong (α)
True situation: Guilty      Wrong (β)                Correct (1 - β)

Type I error: reject the null hypothesis (H0) when it is
true. Denoted by alpha (α).
Type II error: retain the null hypothesis (H0) when it is
false. Denoted by beta (β).
Significance (α): f(level of risk, importance of decision).
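A small simulation can make the meaning of α concrete. The sketch below is not part of the original slides: it draws repeated samples from a population in which H0 really is true and counts how often a two-tailed Z test at α = 0.05 rejects it anyway. The population values (mean 100, standard deviation 15), the sample size and the number of trials are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def type_i_error_rate(mu0=100.0, sigma=15.0, n=50, alpha=0.05,
                      trials=4000, seed=1):
    """Share of samples that wrongly reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen α, which is the sense in which α is the chance of committing a Type I error.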
Sampling Distribution of the
Sample Means (SDSM)
The SDSM is a probability distribution. It consists of the means of all
random samples of size n that can be drawn from a population.

Central Limit Theorem: the sampling distribution of the sample
means (SDSM) is approximately Normally distributed when n is large (n ≥ 30).

The mean of all possible sample means = the population mean.
μ_x̄ = μ (x̄ is an unbiased estimate of μ)
The standard deviation of all the possible sample means (SD of
the sampling distribution) is known as the standard error.
σ_x̄ = σ / √n ≈ s / √n (estimate of true standard error)
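The two properties above can be checked numerically. The following sketch is an illustration rather than part of the slides: it draws many samples of size n = 36 from a deliberately skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here as an assumption) and compares the mean and standard deviation of the sample means with μ and σ / √n.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(42)
n = 36              # sample size, above the n >= 30 rule of thumb
num_samples = 5000  # number of repeated samples (illustrative)

# Exponential population with rate 1: mean = 1, standard deviation = 1.
sample_means = [
    mean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)   # should be close to mu = 1
sd_of_means = stdev(sample_means)    # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / math.sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"sigma / sqrt(n):      {theoretical_se:.3f}")
```

Even though the population is skewed, the simulated standard deviation of the sample means should sit near σ / √n, which is what the standard error formula predicts.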
Testing Means for Large Samples
Null hypothesis: population mean equals K, implies that the SDSM
has a mean equal to K (CLT).
Only 5% of all sample means lie beyond 1.96 standard errors
from the hypothesized value (K).
Test: find out how many standard errors our sample mean (X bar)
is from K.
Z = (X̄ - K) / (s / √n)
[Figure: Standard Normal (Z) curve under H0, centred on K. Reject H0 in the 2.5% tails below Z = -1.96 and above Z = +1.96; don't reject H0 in between.]
Decision Rule
If |Z| < 1.96 we retain (accept) H0 as being true.
If |Z| > 1.96 we reject H0 and H1 is supported.
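The 1.96 cut-off in the decision rule comes from placing α / 2 = 2.5% in each tail of the standard Normal distribution. A minimal sketch using Python's standard library is shown below; the function name and the two Z values passed to it are illustrative assumptions, not taken from the slides.

```python
from statistics import NormalDist

def two_tailed_decision(z, alpha=0.05):
    """Return the critical value and the decision for a two-tailed Z test."""
    # Put alpha / 2 in each tail of the standard Normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    decision = "reject H0" if abs(z) > z_crit else "retain H0"
    return z_crit, decision

z_crit, decision_small = two_tailed_decision(1.2)  # illustrative Z value
_, decision_large = two_tailed_decision(2.4)       # illustrative Z value
print(round(z_crit, 2), decision_small, decision_large)
```

Changing `alpha` moves the cut-off: a stricter significance level pushes the critical value further into the tails.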
Department Store Shopping
Average monthly spending on department store shopping
of people living in Bath is claimed to be 180. A random
sample of 300 households shows spending on average is
185 with a standard deviation of 55. Use a 5% level of
significance to check the validity of this claim.

H0: μ = 180; H1: μ ≠ 180


Determine significance level (α): α = 0.05 (two-tailed test).
Identify the appropriate test: n = 300 (CLT); Z test.
Critical region: Z table with 2.5% in each tail, Z = +/- 1.96.

Computations:
Department Store Shopping Cont.
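[Figure: Z(0,1) curve under H0 (μ = 180), with reject regions below Z = -1.96 and above Z = +1.96; the sample mean X̄ = 185 lies between them.]
Z = (X̄ - μ) / (s / √n) = (185 - 180) / (55 / √300) = 5 / 3.175 = 1.57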

Retain H0. Pop. mean expenditure on department store
shopping appears to be 180, at the 5% significance level.
What if Dept. Store Shopping was
found to be on average 188?
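[Figure: Z(0,1) curve under H0 (μ = 180), with reject regions below Z = -1.96 and above Z = +1.96; the sample mean X̄ = 188 falls in the upper reject region.]
Z = (X̄ - μ) / (s / √n) = (188 - 180) / (55 / √300) = 8 / 3.175 = 2.52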

Reject H0. Pop. mean expenditure on department store
shopping is different from 180, at the 5% signif. level.
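The two department store calculations differ only in the sample mean, so they can be reproduced with one small helper. This is a sketch using the figures from the example (hypothesized mean 180, s = 55, n = 300); the function name is an assumption made for illustration.

```python
import math

def z_statistic(sample_mean, mu0, s, n):
    """Number of estimated standard errors between sample_mean and mu0."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

z_185 = z_statistic(185, 180, 55, 300)
z_188 = z_statistic(188, 180, 55, 300)

for label, z in (("185", z_185), ("188", z_188)):
    # Two-tailed test at the 5% level: compare |Z| with 1.96.
    verdict = "reject H0" if abs(z) > 1.96 else "retain H0"
    print(f"sample mean {label}: Z = {z:.2f} -> {verdict}")
```

A difference of only 3 in the sample mean is enough to move Z across the 1.96 boundary, because the standard error (about 3.175) is small with n = 300.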
Park and Ride Scheme
Following reorganisation of the park-and-ride scheme, it is
believed journey times to the city centre have decreased.
Before reorganising, mean journey time was 40 minutes.
After reorganisation, a random sample of 100 passengers
were found to take on average 35 minutes with a standard
deviation of 15 minutes. Have services improved? Test
your claim at the 1% level of significance.

H0: μ ≥ 40; H1: μ < 40


Determine significance level: α = 0.01 (one-tailed test)
Identify the appropriate test: n = 100 (CLT); Z test.
Critical region: Z = - 2.33 (1% area in the lower tail).
Computations:
Park and Ride Scheme cont.
[Figure: Z(0,1) curve under H0 (μ = 40), with the reject region below Z = -2.33; the sample mean X̄ = 35 falls in it.]
Z = (X̄ - μ) / (s / √n) = (35 - 40) / (15 / √100) = -5 / 1.5 = -3.33
As the calculated Z value (absolute) is greater than the critical
value, we reject H0. At the 1% significance level, we can conclude
that average journey times to the city centre have decreased.
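For a one-tailed test all of α sits in a single tail, which is where the critical value of -2.33 comes from. The sketch below, offered as an illustration only, recovers that value from the standard Normal distribution and repeats the park-and-ride calculation with the figures given in the example.

```python
import math
from statistics import NormalDist

alpha = 0.01
# Lower-tail critical value: the Z score with 1% of the area below it.
z_crit = NormalDist().inv_cdf(alpha)

# Values taken from the park-and-ride example.
sample_mean, mu0, s, n = 35, 40, 15, 100
z = (sample_mean - mu0) / (s / math.sqrt(n))

reject = z < z_crit  # reject H0 only if Z falls in the lower tail
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Because the alternative is directional (μ < 40), only a sufficiently negative Z counts as evidence against H0; a large positive Z would not lead to rejection.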
When to Retain H0
5% Significance Level Rule
Decision Rule:
If the chance of the result occurring under H0 is < 5%, reject H0.
If the chance of our result occurring under H0 is > 5%, retain H0.

Be Careful:
Rejection of H0 never proves H1 is true, but it does signify
support for H1.
Retention of H0 never proves H0 is true: our tests are only
capable of disproving (but not confirming) a hypothesis.

In Practice:
Computers report the observed significance level (the p-value); YOU decide whether
to retain or reject H0 by comparing it with α.
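The "chance of the result occurring under H0" in the decision rule is what software usually reports as a p-value. As a hedged illustration, the sketch below computes two-tailed p-values for the two Z values from the department store example; the function name is an assumption.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Chance, under H0, of a Z value at least as extreme as the one observed."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_retain = two_tailed_p_value(1.57)  # department store example, mean 185
p_reject = two_tailed_p_value(2.52)  # department store example, mean 188
print(f"Z = 1.57: p = {p_retain:.3f}")
print(f"Z = 2.52: p = {p_reject:.3f}")
```

Comparing each p-value with 0.05 gives the same decisions as comparing |Z| with 1.96: the first is above 5% (retain H0), the second below it (reject H0).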
Recap: Is my correlation a real effect?
Q: Is there a correlation between room temp. and productivity? Take a small sample (n=7), r = -0.81
But, could a value of r as big as this have occurred by chance? How well could chance do?
The histogram shows values of r calculated from 8973 random 7-point scatter plots (dartboards).

[Figure: three example scatter plots. Left (strong negative pattern): "Very few will look like this". Middle (no pattern): "Most will look like this". Right (strong positive pattern): "Very few will look like this".]
[Figure: histogram of the 8973 values of r, from -1 to 1. 95% of values lie in the central region, with 2.5% in each tail.]

Value of r   Frequency   Cumulative frequency   Proportion of values below r
-1           0           0                      0
-0.99        0           0                      0
-0.97        2           2                      0.0002
-0.95        2           4                      0.0004
-0.93        5           9                      0.0010
-0.91        4           13                     0.0015
-0.89        11          24                     0.0027
-0.87        14          38                     0.0042
-0.85        10          48                     0.0053
-0.83        14          62                     0.0069
-0.81        17          79                     0.0088
-0.79        14          93                     0.0104
-0.77        32          125                    0.0139
-0.75        37          162                    0.0181
-0.73        38          200                    0.0223
-0.71        42          242                    0.0270
-0.69        38          280                    0.0312
-0.67        40          320                    0.0357
...          (rows between -0.67 and 0.67 not shown)
0.67         48          8721                   0.9719
0.69         38          8759                   0.9762
0.71         37          8796                   0.9803
0.73         32          8828                   0.9838
0.75         29          8857                   0.9871
0.77         27          8884                   0.9901
0.79         17          8901                   0.9910
0.81         19          8920                   0.9941
0.83         12          8932                   0.9954
0.85         17          8949                   0.9973
0.87         4           8953                   0.9978
0.89         8           8961                   0.9987
0.91         6           8967                   0.9993
0.93         1           8968                   0.9994
0.95         3           8971                   0.9998
0.97         2           8973                   1
0.99         0           8973                   1
1.00         0           8973                   1
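The dartboard idea can be imitated in a few lines of code. The sketch below is an illustration under stated assumptions, not a reproduction of the slide's data: it generates 20000 scatter plots of 7 points with x and y drawn independently from a uniform distribution (so any correlation is pure chance) and reports how often |r| is at least 0.81. The seed, the number of plots and the uniform distribution are all choices made here.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(7)
n_points, n_plots = 7, 20000

# Each "dartboard" is 7 points whose x and y are unrelated by construction.
rs = [
    pearson_r([rng.random() for _ in range(n_points)],
              [rng.random() for _ in range(n_points)])
    for _ in range(n_plots)
]

share_extreme = sum(abs(r) >= 0.81 for r in rs) / n_plots
print(f"Share of chance-only plots with |r| >= 0.81: {share_extreme:.3f}")
```

If that share is small, a value as extreme as r = -0.81 is unlikely to be produced by chance alone, which is the logic the table and histogram convey.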
Recap: Hypothesis Test for Correlations
Null: no correlation between room temperature and productivity (r = 0); Alt: r ≠ 0.
Step 1: Decide on a value for the significance level (α)
Usually 0.05 (5%) for social science data (industry standard)
Step 2: Is it a 1- or 2-tail test?
Could correl be either +ve or -ve (2-tail), only +ve (1-tail) or only -ve (1-tail)
Calculate r
Step 3: Work out degrees of freedom (df) ( n-2 for a correlation coefficient )
Two random points will always lie on a straight line (r = +1 or -1 every time)
Step 4: Calculate a critical value for r and identify the critical region

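Critical values of r (excerpt; other entries not shown):
df    2-tail α = 0.1     2-tail α = 0.05     2-tail α = 0.01
      1-tail α = 0.05    1-tail α = 0.025    1-tail α = 0.005
4
5                        0.754

[Figure: number line from r = -1 through r = 0 to r = +1. The non-critical region lies between r = -0.754 and r = +0.754; the observed r = -0.81 falls in the lower critical region.]
Seven pairs of data points (n = 7)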
If r is in critical region, reject null hypothesis
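Critical values of r such as 0.754 are normally read from a table, but they can also be derived from the t distribution through the standard relationship r = t / √(t² + df). The sketch below is an addition for illustration; the critical t value of 2.571 (two-tailed, 5%, df = 5) is taken from standard t tables rather than from the slides.

```python
import math

def critical_r(t_crit, df):
    """Convert a critical t value into the equivalent critical r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

n = 7
df = n - 2      # degrees of freedom for a correlation coefficient
t_crit = 2.571  # two-tailed 5% critical t for df = 5, from standard t tables
r_crit = critical_r(t_crit, df)

r_observed = -0.81
in_critical_region = abs(r_observed) > r_crit
print(f"critical r = {r_crit:.4f}, reject H0: {in_critical_region}")
```

The result agrees with the tabulated 0.754 to within rounding, and the observed r = -0.81 lies beyond it, so the null hypothesis of no correlation is rejected at the 5% level.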
A New Corporate Identity
A design agency develops a new logo for a client. A
random sample of 60 customers views the logo and their
thoughts and opinions are recorded. The logo receives an
average rating of 4.25 with a standard deviation of 0.75 on
a seven-point scale (1 = very bad, 7 = very good). Test
whether the logo scored above the scale midpoint (at the
5% level of significance).

H0: μ ≤ 4; H1: μ > 4.


Determine significance level (α): α = 0.05 (1-tailed test).
Identify the appropriate test: n = 60 CLT; Z test.
Critical region: Z table with 5% in the upper tail, Z = +1.64.
Computations:
Corporate Identity cont.
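[Figure: Z(0,1) curve under H0 (μ = 4), with the reject region above Z = 1.64; the sample mean X̄ = 4.25 falls in it.]
Z = (X̄ - μ) / (s / √n) = (4.25 - 4) / (0.75 / √60) = 0.25 / (0.75 / 7.75) = 2.58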
Since the calculated Z value is greater than the critical value,
we reject H0. The logo was favorably received, scoring above
the scale midpoint, at the 5% (and 1%) level of significance.
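The worked examples in this section differ mainly in the direction of H1. One way to capture that, sketched here as an illustration rather than a prescribed method, is a single helper that takes the direction as an argument. The function name and the `alternative` labels are assumptions; the numbers are those of the logo example.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, s, n, alternative="two-sided"):
    """Large-sample Z test; returns (z, p_value) for the chosen alternative."""
    z = (sample_mean - mu0) / (s / math.sqrt(n))
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # H1: mu > mu0
        p = 1 - cdf
    elif alternative == "less":       # H1: mu < mu0
        p = cdf
    elif alternative == "two-sided":  # H1: mu != mu0
        p = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z, p

z, p = z_test(4.25, 4, 0.75, 60, alternative="greater")
print(f"Z = {z:.2f}, one-tailed p = {p:.4f}")
```

A one-tailed p-value below 0.01 is consistent with the conclusion that the result is significant at both the 5% and the 1% level.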
Conclusions
Be careful selecting the significance level (α).
Less than 5% (1%) - difference was (highly) significant.
Is somewhat conservative - favors retaining H0.
But are these statistical differences practically meaningful?

Directional and non-directional hypotheses.


Two-tailed tests are conservative - at the same significance level, the critical
value for a 1-tailed test is always smaller (in absolute value) than for the
corresponding 2-tailed test.

Beware: statistically insignificant findings can be just
as illuminating but are often discriminated against.

What about Type II errors? (beyond this course!)
