
Biostatistics

Lecture 7
Lecture 6 Review
Comparing two groups
Population mean difference: $\mu_1 - \mu_0$ (unknown)
Sample of $n_1$ from group 1, with sample mean $\bar{x}_1$
Sample of $n_0$ from group 0, with sample mean $\bar{x}_0$
Sample mean difference = $\bar{x}_1 - \bar{x}_0$
Lecture 6: 95% Confidence Interval
Lecture 7: P-value

Example: Randomised controlled trial
of weight loss programmes in the UK
Estimate of difference in population mean weight loss
after 4 weeks between Atkins & Weight Watchers groups
= 4.40 - 2.86 = 1.54 kg


Group                      n    Sample mean weight loss   Sample standard   Sample standard
                                after 4 weeks (kg)        deviation         error
Atkins (group 1)           57   4.40                      2.45              0.32
Weight Watchers (group 0)  58   2.86                      2.23              0.29
Sampling distribution for mean
difference between two groups
Suppose we were to:
Collect many samples of 115 people:
Allocate 57 people to the Atkins diet
Allocate 58 people to the Weight Watchers group
Calculate the mean weight loss in each group after 4 weeks ($\bar{x}_1$ and $\bar{x}_0$)
Calculate the sample mean difference $\bar{x}_1 - \bar{x}_0$
We could then graph all the sample mean differences
(this is the sampling distribution)
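The thought experiment above can be sketched as a small simulation. This is an illustration only: it assumes that weight loss in each population is normally distributed and that the trial's sample means and standard deviations (4.40 and 2.45 kg for Atkins, 2.86 and 2.23 kg for Weight Watchers) are the true population values, neither of which the lecture states. The function name, number of repeats and seed are arbitrary choices.

```python
import random
import statistics

def simulate_mean_differences(n_repeats=5000, seed=1):
    """Repeat the trial many times and return the sample mean differences.

    Assumption (not from the lecture): population weight loss is normal,
    with the trial's sample means and SDs used as the population values.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_repeats):
        atkins = [rng.gauss(4.40, 2.45) for _ in range(57)]           # group 1
        weight_watchers = [rng.gauss(2.86, 2.23) for _ in range(58)]  # group 0
        differences.append(statistics.fmean(atkins) - statistics.fmean(weight_watchers))
    return differences

differences = simulate_mean_differences()
centre = statistics.fmean(differences)   # should sit close to 4.40 - 2.86
spread = statistics.stdev(differences)   # spread of the sampling distribution
print(f"centre = {centre:.2f} kg, spread = {spread:.2f} kg")
```

A histogram of `differences` would give the kind of graph described above, under the stated assumptions.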
Sampling distribution for difference
in means between two groups
[Figure: histogram of the sample difference in means from different samples; y-axis: Percent]
The sample differences in means approximately follow a normal
distribution centred around the population mean difference $\mu_1 - \mu_0$
Variability of sample differences
in means
The standard deviation of the sampling distribution of
the sample differences in means is called the standard
error of the differences in means
This measures the variability of the sample differences
in means that would arise from repetitions of the same
study
The standard error of the differences in means is:
$$\text{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{(\sigma_1/\sqrt{n_1})^2 + (\sigma_0/\sqrt{n_0})^2}$$
which we can estimate by
$$\text{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{(s_1/\sqrt{n_1})^2 + (s_0/\sqrt{n_0})^2}$$
where $s_1/\sqrt{n_1}$ is the standard error of the mean in group 1
and $s_0/\sqrt{n_0}$ is the standard error of the mean in group 0
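As a rough check of the estimate above, the formula can be applied to the weight loss trial. The function name is illustrative. With unrounded inputs the result is a little larger than the one obtained by combining the rounded standard errors 0.32 and 0.29 from the table; the second calculation reproduces the 0.43 used in the lecture.

```python
import math

def se_difference(sd1, n1, sd0, n0):
    """Estimated standard error of the difference between two sample means."""
    se1 = sd1 / math.sqrt(n1)   # standard error of the mean in group 1
    se0 = sd0 / math.sqrt(n0)   # standard error of the mean in group 0
    return math.sqrt(se1 ** 2 + se0 ** 2)

# Atkins (group 1): s = 2.45, n = 57; Weight Watchers (group 0): s = 2.23, n = 58
se = se_difference(2.45, 57, 2.23, 58)
print(round(se, 3))

# Combining the rounded standard errors from the table instead:
se_from_rounded = math.sqrt(0.32 ** 2 + 0.29 ** 2)
print(round(se_from_rounded, 2))
```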
Sampling distribution for difference
in means between two groups
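[Figure: histogram of the sample difference in means from different samples; y-axis: Percent; x-axis marked at $\mu_1 - \mu_0 - 1.96 \times \text{s.e.}(\bar{x}_1 - \bar{x}_0)$, $\mu_1 - \mu_0$ and $\mu_1 - \mu_0 + 1.96 \times \text{s.e.}(\bar{x}_1 - \bar{x}_0)$]
95% of the sample differences in means lie within a distance
of $1.96 \times \text{s.e.}(\bar{x}_1 - \bar{x}_0)$ of the population
mean difference $\mu_1 - \mu_0$.
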
Calculating a 95% confidence
interval for the differences in means
So in 95% of the possible samples that we could have
collected, the interval
from $(\bar{x}_1 - \bar{x}_0) - 1.96 \times \text{s.e.}(\bar{x}_1 - \bar{x}_0)$
to $(\bar{x}_1 - \bar{x}_0) + 1.96 \times \text{s.e.}(\bar{x}_1 - \bar{x}_0)$
contains the (unknown) difference in population means $\mu_1 - \mu_0$.

Interpreting a 95% confidence
interval for the differences in means
The estimated difference in population mean weight loss after 4
weeks between Atkins & Weight Watchers groups is 1.54
kg (95% CI: 0.70 kg to 2.38 kg).
95% confidence interval for the difference in means:


A range of values that we are 95% confident
contains the true population difference in means
A range of plausible values for the population
difference in means
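The interval quoted above can be reproduced with a few lines of arithmetic. This sketch assumes the normal multiplier 1.96 and the rounded standard error of 0.43 used in the lecture; the function name is illustrative.

```python
def confidence_interval_95(estimate, se):
    """95% confidence interval using the normal multiplier 1.96."""
    margin = 1.96 * se
    return estimate - margin, estimate + margin

difference = 4.40 - 2.86        # sample mean difference in kg
lower, upper = confidence_interval_95(difference, 0.43)
print(f"{difference:.2f} kg (95% CI: {lower:.2f} kg to {upper:.2f} kg)")
```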


Difference between groups:
process of analysis
We follow a standard three-step process
Estimate
Estimate the size of the difference in outcome between
exposure groups
Confidence interval
Calculate a confidence interval for
this difference
P-value
Derive a p-value to test the hypothesis that there is no
association between the exposure and the outcome in
the population
Lecture 7
Using confidence intervals and P-values
to interpret the results of statistical analyses
Hypothesis tests (P-values)





Explanation of null hypothesis
What is a P-value?
Interpreting a P-value
Calculating a P-value
Type I & II error
Interpreting results of statistical analyses using
P-values and confidence intervals

Disproving hypotheses
Hypothesis: All swans are white
We could:
Prove the hypothesis by finding
every single swan in the world and
checking that they are all white.
Disprove the hypothesis by
finding just one black swan.

It is easier to find evidence against a
hypothesis than to prove it to be correct.
Null hypothesis
A null hypothesis is one that proposes there is
no difference in outcome between exposed and
unexposed in the population
Examples of null hypotheses:
The new treatment has no effect on mean rotation of the neck
(for people with restricted neck movement)
MMR vaccination does not affect a child's subsequent risk of
autism
Birthweight is not associated with subsequent IQ

We commonly design research to disprove a null hypothesis
Weight loss example
In the weight loss trial, people using the Atkins diet lost,
on average, 1.54 kg more weight than people in a Weight
Watchers group.

Could this be a chance finding?
i.e. Could it be consistent with the null hypothesis of no
true (population) difference in weight loss?
Or does it constitute evidence against the null hypothesis?

Null hypothesis:
The population mean weight loss under the Atkins diet
is the same as the population mean weight loss under
the Weight Watchers program.
Weight loss example: P-value
We address this question by calculating a P-value:
probability of getting a sample difference at
least this big
if
the mean weight loss in the two populations
is really the same
This P-value quantifies the evidence against the null
hypothesis.
P-value: Comparing two groups
What is the probability (P-value) of finding the observed difference
IF the null hypothesis is true?
How likely is it we would see a difference this big
IF there was NO real difference between the populations?

Weight loss example: The P-value
[Figure: histogram of the sample difference in means from different samples under the null hypothesis; y-axis: Percent; x-axis marked at -0.84, 0 and 0.84]
Population mean difference is 0 if the null hypothesis is true
Standard error of the difference = 0.43
95% of sample differences in means lie within ±1.96 x 0.43 = ±0.84 of 0
[Figure: the same distribution with our sample mean difference = 1.54 marked on the x-axis and the tail areas shaded in red]
P-value = Proportion of samples where we would observe a
difference at least as large as our one
0.03% of the area is shaded in red
i.e. P-value = 0.0003
Interpretation of P-values
The smaller the P-value,
the lower the chance of
getting a difference as
big as the one observed
if the null hypothesis is true.
Therefore:
the smaller the P-value,
the stronger the evidence
against the null hypothesis.
[Figure: P-value scale running from 1 through 0.1, 0.01 and 0.001 down to 0.0001, labelled "Weak evidence against the null hypothesis" near 1, "Increasing evidence against the null hypothesis with decreasing P-value" in between, and "Strong evidence against the null hypothesis" near 0.0001]

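How do we calculate
this P-value?
Test statistic: Comparing two groups
To calculate the required tail
area, we transform our observed
sample mean difference into a
standard normal deviate:
$$z = \frac{\bar{x}_1 - \bar{x}_0}{\text{s.e.}(\bar{x}_1 - \bar{x}_0)} = \frac{1.54}{0.43} = 3.58$$
[Figure: histogram of the sample difference in means from different samples; x-axis in kg marked at -0.84, 0, 0.84 and 1.54, with the matching standard normal scale marked at -1.96, 0, 1.96 and 3.58; y-axis: Percent]
This is called a test statistic
We can look up the tail area in standard tables.
For z=3.58 we get a (two-sided) tail area of 0.0003.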
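Instead of standard tables, the tail area can be computed directly. The sketch below uses the identity that the two-sided standard normal tail area beyond |z| equals erfc(|z| / sqrt(2)); the function name is illustrative and the inputs are the lecture's rounded values.

```python
import math

def two_sided_p_value(z):
    """Two-sided tail area of the standard normal distribution beyond |z|."""
    return math.erfc(abs(z) / math.sqrt(2))

z = 1.54 / 0.43                 # observed difference divided by its standard error
p_value = two_sided_p_value(z)
print(f"z = {z:.2f}, P = {p_value:.4f}")
```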
The t-test
In small samples we need to use the t distribution
instead of the normal distribution (because our
estimated standard error may not be a good estimate
of the population standard error).
This test is valid in large and small samples (since in
large samples the t distribution is almost identical to the
normal distribution).
This gives rise to the name t-test.
The t-test tests the null hypothesis that two population
means are equal.
The test statistic is sometimes called the t-statistic.
Stata t-test

. ttesti 57 4.40 2.45 58 2.86 2.23

Two-sample t test with equal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      57         4.4    .3245104        2.45    3.749927    5.050073
       y |      58        2.86    .2928133        2.23    2.273651    3.446349
---------+--------------------------------------------------------------------
combined |     115    3.623304    .2290453    2.456237    3.169567    4.077041
---------+--------------------------------------------------------------------
    diff |                1.54    .4367293                .6747605     2.40524
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   3.5262
Ho: diff = 0                                     degrees of freedom =      113

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.9997         Pr(|T| > |t|) = 0.0006          Pr(T > t) = 0.0003
NB: Slight numerical differences are due to
rounding error in our lecture calculations!
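The numbers in the Stata output can be approximately reproduced from the summary statistics. The sketch below uses the textbook equal-variance (pooled) formula and integrates the t density numerically with Simpson's rule to get the two-sided P-value. It is not a description of how Stata computes its results; the function names and the number of integration steps are illustrative choices.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean0, sd0, n0):
    """Equal-variance two-sample t-test from summary statistics.

    Returns (t statistic, degrees of freedom, standard error of the difference).
    """
    df = n1 + n0 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n0 - 1) * sd0 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n0))
    return (mean1 - mean0) / se, df, se

def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value from Student's t density, integrated by Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def density(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps                  # steps must be even for Simpson's rule
    total = density(0.0) + density(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    central_area = 2 * total * h / 3    # area between -|t| and +|t|
    return 1 - central_area

t, df, se = pooled_t_test(4.40, 2.45, 57, 2.86, 2.23, 58)
p_value_t = t_two_sided_p(t, df)
print(f"t = {t:.4f}, df = {df}, s.e. = {se:.4f}, P = {p_value_t:.4f}")
```

The pooled standard error differs slightly from the unpooled formula given earlier in the lecture because it assumes both populations share one variance.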

Group                      n    Sample mean weight loss   Sample standard   Sample standard
                                after 4 weeks (kg)        deviation         error
Atkins (group 1)           57   4.40                      2.45              0.32
Weight Watchers (group 0)  58   2.86                      2.23              0.29
Interpretation of t-test P-value:
Weight loss trial
Null hypothesis that population mean weight loss under
the Atkins diet and Weight Watchers group are equal.
T-test gives: P=0.0006
The probability of observing a sample mean difference at least as
large as 1.54 kg, if there truly is no difference, is 0.06%.
Therefore, our data provides strong evidence against the
null hypothesis.
Confidence intervals and p-values
Information used to estimate a confidence interval is the
same as for a p-value, so we can show they are related to
each other.

For example, we see a 95% CI for the difference of
means for the weight loss groups of (0.67, 2.38).
As 0 is not contained in this interval we can reject the
null hypothesis at the 5% significance level.
In general, for example, if p=0.0006 (0.06%) then the largest
confidence interval that would not contain 0 (and thus we can
reject the null hypothesis) is 99.94%.
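The link between the two can be checked numerically. The sketch below assumes the normal approximation and the lecture's rounded values (difference 1.54 kg, standard error 0.43 kg), so its P-value and confidence level differ slightly from the t-test figures of 0.0006 and 99.94%. It shows that a confidence interval at level 1 - P has a limit that just touches 0.

```python
from statistics import NormalDist

std_normal = NormalDist()       # standard normal: mean 0, standard deviation 1

difference, se = 1.54, 0.43     # lecture values (kg)
z = difference / se
p_value = 2 * (1 - std_normal.cdf(z))             # two-sided P-value

# The widest confidence level whose interval still just excludes 0 is 1 - P.
level = 1 - p_value
multiplier = std_normal.inv_cdf(1 - p_value / 2)  # plays the role of 1.96
lower = difference - multiplier * se
print(f"P = {p_value:.4f}, level = {100 * level:.2f}%, lower limit = {lower:.4f}")
```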
Type I and II error
Statistical significance
You may have learned that when P<0.05 people
sometimes say that the difference is statistically
significant and reject the null hypothesis, while
if P>0.05 they accept the null hypothesis


We do NOT recommend this approach
Is P=0.051 very different from P=0.049?
Both contain similar amounts of evidence
against the null hypothesis.
So why reject one and accept the other?
Type I error
Investigator concludes from sample:
The new treatment improves neck rotation
(i.e. reject the null hypothesis)
WHEN
There was NO REAL difference in neck rotation
between the new treatment & placebo
(i.e. Null Hyp. is true)
Type II error
Investigator concludes from sample:
The new treatment does not improve neck rotation
(i.e. do not reject the null hypothesis)
WHEN
There is a REAL difference in neck rotation
between the new treatment & placebo
(i.e. Null Hyp. is not true)
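The two kinds of error can be illustrated with a simulation. Everything in this sketch is an assumption made for illustration rather than part of the lecture: outcomes are normal with standard deviation 1, there are 50 people per group, the "real" difference is 0.5, and the null hypothesis is rejected when |z| > 1.96 (a two-sided rule, used here only to make the error rates concrete).

```python
import math
import random
import statistics

def rejection_rate(true_difference, n_per_group=50, n_trials=2000, seed=7):
    """Share of simulated trials in which |z| > 1.96, i.e. the null is rejected.

    Assumption: outcomes are normal with standard deviation 1 in both groups.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        treated = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        placebo = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(placebo)
        se = math.sqrt(statistics.variance(treated) / n_per_group
                       + statistics.variance(placebo) / n_per_group)
        if abs(diff / se) > 1.96:
            rejections += 1
    return rejections / n_trials

type_1_rate = rejection_rate(true_difference=0.0)       # null true, yet rejected
type_2_rate = 1 - rejection_rate(true_difference=0.5)   # null false, yet not rejected
print(f"Type I rate = {type_1_rate:.3f}, Type II rate = {type_2_rate:.3f}")
```

Under these assumptions the Type I rate should land near 5%, while the Type II rate depends on the sample size and on the size of the real difference.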
Interpretation of results
from statistical analyses:
COMPARISON OF TWO MEANS
Five trials of drugs to reduce serum cholesterol
Trial  Drug  Cost       Patients   Difference in mean    s.e. of difference  95% CI for population difference  P-value
                        per group  cholesterol (mmol/L)  (mmol/L)            in mean cholesterol
1      A     Cheap      30         -1.00                 1.00                -2.96 to 0.96                     0.32
2      A     Cheap      3000       -1.00                 0.10                -1.20 to -0.80                    <0.001
3      B     Cheap      40         -0.50                 0.83                -2.13 to 1.13                     0.55
4      B     Cheap      4000       -0.05                 0.083               -0.21 to 0.11                     0.55
5      C     Expensive  5000       -0.125                0.05                -0.22 to -0.03                    0.012
A reduction of 0.5 mmol/L or more corresponds to a
clinically important effect of the drug
Kirkwood & Sterne, 2003, pg 77
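The confidence intervals and P-values in the table can be recomputed from each observed difference and its standard error. This sketch assumes the normal approximation (multiplier 1.96 and a two-sided normal tail area); the variable and function names are illustrative. It also flags whether each interval includes a reduction of 0.5 mmol/L or more.

```python
import math

# (trial, observed difference in mean cholesterol, s.e. of difference), mmol/L
trials = [(1, -1.00, 1.00), (2, -1.00, 0.10), (3, -0.50, 0.83),
          (4, -0.05, 0.083), (5, -0.125, 0.05)]

def summarise(difference, se):
    """Return (lower, upper, two-sided P) using the normal approximation."""
    lower = difference - 1.96 * se
    upper = difference + 1.96 * se
    p_value = math.erfc(abs(difference / se) / math.sqrt(2))
    return lower, upper, p_value

results = {trial: summarise(diff, se) for trial, diff, se in trials}
for trial, (lower, upper, p_value) in results.items():
    # A reduction of 0.5 mmol/L or more (difference <= -0.5) is clinically important.
    could_be_important = lower <= -0.5
    print(f"Trial {trial}: {lower:.2f} to {upper:.2f}, P = {p_value:.3f}, "
          f"CI includes an important reduction: {could_be_important}")
```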

Drug A: Trials 1 and 2
Trial  Drug  Cost       Patients   Difference in mean    s.e. of difference  95% CI for population difference  P-value
                        per group  cholesterol (mmol/L)  (mmol/L)            in mean cholesterol
1      A     Cheap      30         -1.00                 1.00                -2.96 to 0.96                     0.32
2      A     Cheap      3000       -1.00                 0.10                -1.20 to -0.80                    <0.001

The estimated effect of Drug A was the same in Trials 1 and 2
Trial 1: Small sample size. Wide 95% CI.
No evidence against the null hypothesis.
Trial 2: Large sample size. Narrow 95% CI.
Strong evidence against the null hypothesis.
In small studies, a large P value does not mean the null hypothesis is true
(may be a Type II error)


Drug B: Trials 3 and 4
Trial  Drug  Cost       Patients   Difference in mean    s.e. of difference  95% CI for population difference  P-value
                        per group  cholesterol (mmol/L)  (mmol/L)            in mean cholesterol
3      B     Cheap      40         -0.50                 0.83                -2.13 to 1.13                     0.55
4      B     Cheap      4000       -0.05                 0.083               -0.21 to 0.11                     0.55

Trial 3: Plausible values (95% CI) consistent with either
substantial benefit or substantial harm from Drug B
Trial 4: 95% CI specifically excludes any substantial effect
The p-value doesn't tell us about clinical importance;
we always need a 95% CI as well
Drug C: Trial 5
Trial  Drug  Cost       Patients   Difference in mean    s.e. of difference  95% CI for population difference  P-value
                        per group  cholesterol (mmol/L)  (mmol/L)            in mean cholesterol
5      C     Expensive  5000       -0.125                0.05                -0.22 to -0.03                    0.012

Trial 5:
Large sample size
Moderate evidence against the null hypothesis
BUT 95% CI doesn't include any clinically important differences
Expensive drug
Unlikely to be used routinely
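Five trials of drugs to reduce serum cholesterol
[Figure: estimated effect and 95% CI for each of Trials 1 to 5 (Drug A: Trials 1 and 2; Drug B: Trials 3 and 4; Drug C: Trial 5); y-axis: Reduction in serum cholesterol (mmol/L), from 1 to -3; reference lines for no effect and for the clinically important effect; P-values: Trial 1 P=0.32, Trial 2 P<0.001, Trial 3 P=0.55, Trial 4 P=0.55, Trial 5 P=0.012]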
Lecture 7 Objectives
Understand the null hypothesis
Explanation and interpretation of confidence
intervals and p-values
CIs give us a plausible range of values for the population
parameter

P-values index the strength of evidence against the null
hypothesis
The smaller the p-value, the stronger the evidence against the
null hypothesis
In interpreting the results of statistical analyses, we need to
examine the observed effect size, CI & P-value


Thank You
www.HelpWithAssignment.com
