
CH 6 INFERENTIAL STATISTICS
Inferential statistics
 type of statistics used to draw conclusions about a
population based on data collected from a
sample
 most important step taken by an investigator to
generalize results found in the study to the
population under consideration
 makes the link between the results obtained from the
sample and the population which is the target of the
research question
2. Probability distributions
 Inferential statistics is highly related to probability
distributions
 probability distribution: statistical
function describing the probability of all
possible values of a continuous variable
 Normal distribution
 most frequently used distribution
 Unique mainly in being symmetrical and having its mean,
median and mode equal to each other.
 completely described by its mean and SD (the standard
normal distribution has mean 0 and SD 1)
68-95-99.7 rule
 major characteristic of any normal distribution
 area (which corresponds to the probability) between -1
SD and +1 SD is about 68%
 about 95% between -2 SD and +2 SD
 about 99.7% between -3 SD and +3 SD
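The areas in this rule can be checked numerically. The sketch below is a minimal illustration, not part of the original slides; the helper name area_within is made up for this example. It relies on the fact that the probability of a normal variable falling within k SDs of its mean equals erf(k / √2).

```python
import math

def area_within(k: float) -> float:
    """Probability that a normal variable lies within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints the area between -k SD and +k SD as a proportion
    print(f"within +/-{k} SD: {area_within(k):.4f}")
```

The three printed proportions correspond to the 68%, 95% and 99.7% figures after rounding.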


Types of inferential statistics
 confidence interval
 hypothesis testing (p-value)
a. Confidence interval (CI)
 consists of 2 numerical values defining a range of
values that with a specified degree of confidence
includes the parameter being estimated
 In practice, a researcher interested in estimating the
average systolic blood pressure (SBP) in a specified population
cannot pin that value down with a single number
 best way to estimate: provide a range of values
attached to a specific level of confidence
a. Confidence interval
 way to measure how well your sample
represents the population you are
studying
 The smaller the sample size, the more
variable the responses will likely be
 You cannot measure the whole population
of Metro Vigan (100,000 people), so how
can you use the responses from the 30
people you sampled in order to
hypothesize what all 100,000 might
say?
 And if you sample 30 other people in
a second study, how likely is it
that their responses will be the same as
the first 30 people you sampled?
a. Confidence interval (CI)
 Example: Average Height
 Heights of 40 randomly chosen men
 mean height of 175 cm
 standard deviation of 20 cm
 The 95% confidence interval (we show how to
calculate it later) is 175 cm ± 6.2 cm, i.e. about
168.8 cm to 181.2 cm (using z = 1.96)
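As a rough check of the height example, the interval can be computed directly from the three numbers given (n = 40, mean 175 cm, SD 20 cm). This is a minimal sketch assuming a z-based interval, which fits a sample larger than 30; the variable names are chosen for this illustration only.

```python
import math
from statistics import NormalDist

n, mean, sd = 40, 175.0, 20.0        # values from the height example
z = NormalDist().inv_cdf(0.975)      # two-sided 95% multiplier, about 1.96
margin = z * sd / math.sqrt(n)       # margin of error = z * SD / sqrt(n)
lower, upper = mean - margin, mean + margin

print(f"{mean:.0f} cm +/- {margin:.1f} cm ({lower:.1f} to {upper:.1f} cm)")
```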
 the smaller the sample, the more variable the responses and the bigger the
margin of error
 If you talk to only 2 people, their answers could be complete
opposites
 the more people you talk to, the more representative your data will be
 the margin of error is the range by which your estimate can vary;
every time you’ve seen a study report that “the result was
X, plus or minus 8%,” that is the margin of error
 confidence intervals can be calculated at varying degrees of
confidence
 95% confidence interval is the standard,
 but depending on the scenario you might want more or
less confidence
 For example, if you’re designing a new test for detecting
cancer, you want to be as certain as possible it will work for
99% of the population you’re testing it for.
 For research about pattern baldness, you might
need to be only 80% confident that it fits with most of the
population’s needs.
 Your confidence interval will get wider in the
following situations:
 Higher variability in the sample
 Higher confidence required
 Smaller sample size
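The effect of each of these three factors can be seen by changing one input at a time. The sketch below assumes a z-based interval for a mean; the helper half_width and all the numbers fed to it are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist

def half_width(sd: float, n: int, confidence: float) -> float:
    """Half-width (margin of error) of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / math.sqrt(n)

# Hypothetical baseline: SD 10, n = 100, 95% confidence
base = half_width(sd=10, n=100, confidence=0.95)
more_variable = half_width(sd=20, n=100, confidence=0.95)   # higher variability
more_confident = half_width(sd=10, n=100, confidence=0.99)  # higher confidence
smaller_sample = half_width(sd=10, n=25, confidence=0.95)   # smaller sample

for label, value in [("baseline", base), ("higher SD", more_variable),
                     ("99% confidence", more_confident), ("n = 25", smaller_sample)]:
    print(f"{label:15s} margin of error = {value:.2f}")
```

Each of the three changed scenarios gives a larger margin of error than the baseline, which is the same as a wider interval.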
 Continuous data: calculate the confidence interval
using the average (mean)
 Discrete binary data (1 or 0, yes or no, pass or fail,
etc.): calculate the confidence interval using the
proportion
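For binary data the same idea applies to a proportion instead of a mean. The sketch below uses the simple normal-approximation interval, p ± z × √(p(1 − p)/n), which is one common choice among several. The sample of 30 echoes the survey example earlier, but the count of 18 "yes" answers is a hypothetical value.

```python
import math
from statistics import NormalDist

n = 30            # people sampled, as in the survey example
yes_count = 18    # HYPOTHETICAL number of "yes" answers

p_hat = yes_count / n                       # sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)     # standard error of a proportion
z = NormalDist().inv_cdf(0.975)             # two-sided 95% multiplier
margin = z * se
lower, upper = p_hat - margin, p_hat + margin

print(f"proportion = {p_hat:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```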
A. Confidence interval (CI)
 narrower CI: better estimation
 wider CI: worse estimation
 Applying the normal distribution's characteristics, there is a
95% probability that the sample average will fall within about
2 standard errors of the population mean. The equation used for this calculation
is:
CI = mean ± (t or z) × SD / √n
 use the t table for n < 30
 use z for n > 30
1. Average (mean)/ proportion
2. Standard deviation
3. Standard error of the mean
4. Confidence interval
5. Upper limit
6. Lower limit
 survey of the number of children per family in
Barangay 4
 need to test how representative the results from
your 11-person sample are of the greater
population
1. Average

(2+6+5+5+6+3+7+4+1+1+7) / 11 = 47 / 11 = 4.27
2. Standard Deviation
3. Standard error of the mean
 standard error of the mean estimates how much
variability there could be between samples, while
the standard deviation measures how much each
piece of data within your sample might vary
3. Standard error of the mean

 FORMULA: SE = SD / √n
 the bigger the sample size, the smaller the SE
 the smaller the sample size, the bigger the SE
 the SD is divided by the square root of the sample size to
account for the sample size
 the bigger the sample size, the better the estimate: the CI
is narrower (because the SE is small)
 the smaller the sample size, the worse the estimate: the
CI is wider (because the SE is big)
3. Standard error of the mean
4. Calculate the confidence interval

 Margin of error = standard error × (t or z)
 Confidence interval = mean ± margin of error

 you can say with 95% confidence that the average number
of children per family falls between 2.92 and
5.739
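The steps listed for the children-per-family survey can be sketched end to end with the 11 responses given. This is an illustration under stated assumptions, not the slide author's own working: it uses the sample SD (n − 1 denominator) and the two-sided 95% t multiplier for 10 degrees of freedom (2.228, from a standard t table). The result depends on those choices and on rounding, so it need not match the figures quoted above exactly.

```python
import math
import statistics

children = [2, 6, 5, 5, 6, 3, 7, 4, 1, 1, 7]   # the 11 survey responses
n = len(children)

mean = statistics.mean(children)    # step 1: average (47 / 11)
sd = statistics.stdev(children)     # step 2: sample SD (n - 1 denominator)
se = sd / math.sqrt(n)              # step 3: standard error of the mean

# Step 4: ASSUMPTION - t multiplier for 95% confidence, 10 degrees of freedom,
# taken from a t table because n < 30.
t_crit = 2.228
margin = t_crit * se
lower, upper = mean - margin, mean + margin     # steps 5 and 6: limits

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
print(f"95% CI = {lower:.2f} to {upper:.2f}")
```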
 Descriptive statistics on variables measured in a sample of
n = 3,539 participants attending the 7th examination of the
offspring in the Framingham Heart Study are shown below
 Estimate the true mean systolic blood pressure in the
population
 A point estimate for the true mean systolic blood
pressure in the population is 127.3, and we are
95% confident that the true mean is between 126.7
and 127.9. The margin of error is very small here
because of the large sample size
B. Hypothesis test (p-value)
 hypothesis testing related to p-value
 P value, or calculated probability
 researcher evaluates a hypothesis about a
population rather than simply estimating it
 Hypothesis - statement that is not proven, which the
investigator postulates, and according to data
obtained will either accept it or reject
 types of hypotheses:
 null hypothesis (Ho): no association
 hypothesis of "no difference"
 e.g. no difference between blood pressures in group A and group B
 Define a null hypothesis for each study question clearly before the
start of your study
 alternative hypothesis (Ha): there is an association or a difference
 For example, if the question is "is there a significant (not due to chance) difference in
blood pressures between groups A and B if we give group A the test
drug and group B a sugar pill?", the alternative hypothesis is "there
is a difference in blood pressures between groups A and B if we
give group A the test drug and group B a sugar pill".
 significance level (alpha)
 used to refer to a pre-chosen probability

 "P value"
 used to indicate a probability that you calculate after a given
study
 if your P value is less than the chosen significance level then you
reject the null hypothesis i.e. accept that your sample gives
reasonable evidence to support the alternative hypothesis. It does
NOT imply a "meaningful" or "important" difference; that is for
you to decide when considering the real-world relevance of your
result.
 significance level at which you reject H0 is arbitrary
 5% (P < 0.05: less than 1 in 20 chance of wrongly rejecting a true H0)
 1% (P < 0.01: less than 1 in 100 chance of wrongly rejecting a true H0)
 0.1% (P < 0.001: less than 1 in 1000 chance of wrongly rejecting a true H0)
 Most authors refer to statistically significant as P < 0.05
and statistically highly significant as P < 0.001
 In the previous SBP example, researchers were interested in comparing
the specific population’s SBP to the normal value (taken as 120
mmHg).
 Thus, the hypotheses will be:
 Ho: µ = 120
 Ha: µ ≠ 120
 A sample from the population gives an average SBP of 140 mmHg
 next step: decide whether the sample supports the null or the alternative hypothesis
 If the sample supports the null hypothesis: the conclusion is that the population average
is not significantly different from the normal value (120 mmHg)
 If the sample supports the alternative hypothesis: the conclusion is that the
population average is significantly different from the normal value
 the decision on whether the sample supports the null or the
alternative hypothesis is made on statistical grounds
 difference between the observed value and the
normal value is 20 mmHg
 Could this observed difference be due to chance?
 If shown that difference is due to chance then conclusion:
 there is no significant difference
 if shown that difference is too big to be due to chance
conclusion:
 there is a significant difference.
 p-value: the probability of observing a difference at least as large as the
one found if only chance (sampling error) were at work, i.e. if the null hypothesis were true
 Small p-value (conventionally taken as < 0.05): the difference is
unlikely to be due to chance, so the alternative hypothesis is accepted
 conclusion: there is a significant difference.
 large p-value (conventionally taken as ≥ 0.05): the difference
could plausibly be due to chance, so the null hypothesis is not rejected
 Conclusion: there is no significant difference
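The decision rule can be sketched for the SBP example (Ho: µ = 120, observed average 140 mmHg). The text does not give the sample's SD or size, so the values used below are HYPOTHETICAL and chosen only to make the arithmetic concrete. A z-based test is assumed because the hypothetical sample is larger than 30.

```python
import math
from statistics import NormalDist

mu0 = 120.0          # reference value under Ho (from the example)
sample_mean = 140.0  # observed sample average (from the example)
sd = 25.0            # HYPOTHETICAL sample SD, not given in the text
n = 50               # HYPOTHETICAL sample size, not given in the text
alpha = 0.05         # pre-chosen significance level

se = sd / math.sqrt(n)                          # standard error of the mean
z = (sample_mean - mu0) / se                    # how many SEs from the reference value
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided p-value
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p = {p_value:.2g}, reject Ho: {reject_h0}")
```

With these assumed inputs the 20 mmHg difference is many standard errors from 120, so the p-value is far below 0.05. A smaller sample or a larger SD would weaken that conclusion.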
Type I error (α level)
 α level: cut-off point at which significance is indicated; usually taken as 5% (0.05)
 error taking place when rejecting a true null hypothesis
 not affected by sample size, as it is set in advance
 increases with the number of tests or end points (e.g. carry out 20 tests of H0 at
alpha = 0.05 and 1 is likely to be wrongly significant)
Type II error (β)
 error when accepting a false null hypothesis
 beta depends upon sample size and alpha
 can't be estimated except as a function of the true population effect
 beta gets smaller as the sample size gets larger
 beta gets smaller as the number of tests or end points increases
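The remark about 20 tests at alpha = 0.05 can be made concrete. The sketch below assumes the 20 tests are independent and that every null hypothesis is actually true; both are simplifying assumptions made for this illustration.

```python
alpha = 0.05   # significance level for each individual test
tests = 20     # number of tests or end points

# Expected number of wrongly significant results when every Ho is true
expected_false_positives = alpha * tests

# Probability that at least one of the tests is wrongly significant
prob_at_least_one = 1 - (1 - alpha) ** tests

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one false positive): {prob_at_least_one:.2f}")
```

So one wrongly significant result is expected on average, and the chance of at least one is well above the 5% that applies to a single test.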

 At the design stage of an investigation, you should
aim to minimise the probability of failing to detect
a real effect (type II error, false negative).
 the probability of a type II error is equal to one minus the
power of a study (the probability of detecting a true
effect)
 sample size estimation gives you the minimum
number of experimental subjects needed to detect
a true difference
Sample size calculation
 calculate the number of subjects needed to be included in the
research project.
 Different research questions require use of different equations for
the calculation of the sample size
 basic information needed for sample size calculation:
 Level of statistical significance (𝛂), usually considered at 0.05 (5%)
 value of the power desired (1-𝛃), usually considered at 0.8 (80%).
 estimate of expected prevalence or incidence rate in the control group
(or unexposed), as well as the expected difference in response rates to
be detected between two groups
 this estimate can be obtained by considering the difference that would be clinically
important in the management of the specific patients, or from previous work
carried out on the same topic
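As an illustration of how these inputs combine, the sketch below uses one common approximation for comparing two proportions: n per group = (z for α/2 + z for power)² × [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)². Other research questions need other equations. The function name n_per_group and the response rates of 30% and 15% are hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects needed per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # about 0.84 for 80% power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return math.ceil(n)                             # round up to whole subjects

# HYPOTHETICAL rates: 30% in the control group, 15% expected in the other group
n_needed = n_per_group(0.30, 0.15)
n_needed_90 = n_per_group(0.30, 0.15, power=0.90)
print(f"per group at 80% power: {n_needed}, at 90% power: {n_needed_90}")
```

Asking for more power, or trying to detect a smaller difference, increases the required number of subjects.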
4. General rule for normal distributions

 Not all continuous variables are normally distributed, which
would seem to make the rules discussed earlier inapplicable
 BUT mathematicians have shown that the distribution used
to calculate the confidence interval and p-value
(the sampling distribution of the mean) is approximately
normal, provided that the sample size is larger than 30
 this is why the above calculations remain relevant even for
non-normal data
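This behaviour of the sampling distribution of the mean can be seen in a small simulation. The sketch below draws repeated samples of size 40 from a strongly skewed (exponential) distribution with mean 1 and SD 1. The distribution, the sample size and the number of repetitions are all arbitrary choices made for this illustration.

```python
import math
import random
import statistics

random.seed(0)        # fixed seed so the sketch is repeatable
n = 40                # sample size larger than 30
num_samples = 2000    # number of repeated samples

# Each entry is the mean of one sample of n skewed (exponential) observations
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = 1.0 / math.sqrt(n)    # SD / sqrt(n), with population SD = 1

# Share of sample means within 1.96 standard errors of the true mean (1.0)
coverage = sum(abs(m - 1.0) <= 1.96 * expected_se for m in sample_means) / num_samples

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means: {sd_of_means:.3f} (SD / sqrt(n) = {expected_se:.3f})")
print(f"share within 1.96 SE: {coverage:.3f}")
```

Even though the raw data are far from normal, the sample means cluster around the true mean with a spread close to SD / √n, and roughly 95% of them fall within 1.96 standard errors.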
Statistical tests
 Chi-square test: used for assessing association between
two categorical variables, such as association between
physical activity (yes versus no) and hypertension (yes
versus no).
 One sample t-test: assesses the difference between the mean of a
continuous variable and a fixed reference value, such
as the difference between the observed SBP and the
normal value (of 120 as an example)
 Paired t-test: assesses the difference between two
continuous measurements which are related to each
other, such as heart rate before and after taking a
certain medication
 Independent t-test: assesses the difference in a continuous
variable between the two levels of a categorical variable, where the
measurements in the two groups are
independent, such as the SBP of males and
females
 Analysis of variance (ANOVA): assesses the difference
in a continuous variable across a categorical variable with more than two
levels, such as the difference in
age between patients of different severity of illness
(mild, moderate, and severe)
 Correlation: used to assess the association
between two continuous variables, such as SBP and age.
Example 1
 Jackson et al. (2013) wanted to know whether it is
better to give the diphtheria, tetanus and pertussis
(DTaP) vaccine in the thigh or the arm, so they
collected data on severe reactions to this vaccine in
children aged 3 to 6 years. One categorical
variable is severe reaction vs. no severe reaction; the
other variable is thigh vs. arm
 Null hypothesis: no difference in the rate of severe
reaction between injecting the vaccine in the thigh and in the arm
 Alternative hypothesis: there is a difference
between injecting the vaccine in the thigh and in the arm
 chi-square = 2.04 with 1 degree of freedom, P value is 0.15
 P value = 0.15: large (≥ 0.05)
 Accept (do not reject) the null
 No significant difference in severity of reaction
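The P value quoted for this chi-square statistic can be reproduced without a table. With 1 degree of freedom, a chi-square variable is the square of a standard normal variable, so the upper-tail probability equals erfc(√(χ²/2)). The sketch below applies that identity, which holds only for 1 degree of freedom.

```python
import math

chi_square = 2.04   # test statistic reported above, 1 degree of freedom
alpha = 0.05

# Upper-tail probability of a chi-square distribution with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi_square / 2))
reject_h0 = p_value < alpha

print(f"P value = {p_value:.2f}, reject the null: {reject_h0}")
```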


 Determining Which Mean(s) Is/Are Different
 If you fail to reject the null hypothesis in an ANOVA
then you are done: the data give no evidence that
the treatment means differ. However, if you reject the null
then you must conduct a separate (post hoc) test to determine
which mean(s) is/are different.
Case 1: Weight Loss
 Study of weight loss for Diet vs Exercise
 Diet Only:
 sample mean = 5.9 kg
 sample standard deviation = 4.1 kg
 sample size = n = 42
 standard error = SEM1 = 4.1/ √42 = 0.633
 Exercise Only:
 sample mean = 4.1 kg
 sample standard deviation = 3.7 kg
 sample size = n = 47
 standard error = SEM2 = 3.7/ √47 = 0.540
Step 1. Determine the null and
alternative hypotheses
 Null hypothesis: No difference in average fat lost in
population for two methods. Population mean
difference is zero.
 Alternative hypothesis: There is a difference in
average fat lost in population for two methods.
Population mean difference is not zero.
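Step 2. Collect and summarize data
into a test statistic.
 difference in sample means = 5.9 − 4.1 = 1.8 kg
 standard error of the difference = √(0.633² + 0.540²) ≈ 0.83
 test statistic = 1.8 / 0.83 ≈ 2.17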
Step 3. Determine the p-value
 p-value of 0.03
Step 4. Make a decision.
 p-value of 0.03 is less than 0.05
 Reject the null hypothesis
 We conclude that there is a statistically significant
difference between average fat loss for the two
methods.
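The four steps of this case can be sketched from the summary statistics given. This is a minimal illustration that assumes the normal curve is an adequate approximation for the p-value, which is reasonable with samples of 42 and 47. Unrounded intermediate values are used, so the test statistic may differ in the second decimal from a hand calculation with rounded standard errors.

```python
import math
from statistics import NormalDist

mean1, sd1, n1 = 5.9, 4.1, 42    # diet only
mean2, sd2, n2 = 4.1, 3.7, 47    # exercise only

sem1 = sd1 / math.sqrt(n1)                   # standard error, diet group
sem2 = sd2 / math.sqrt(n2)                   # standard error, exercise group
se_diff = math.sqrt(sem1**2 + sem2**2)       # standard error of the difference
test_stat = (mean1 - mean2) / se_diff        # step 2: test statistic

# Step 3: two-sided p-value (ASSUMPTION: normal approximation)
p_value = 2 * (1 - NormalDist().cdf(abs(test_stat)))
reject_h0 = p_value < 0.05                   # step 4: decision

print(f"SEM1 = {sem1:.3f}, SEM2 = {sem2:.3f}")
print(f"test statistic = {test_stat:.2f}, p = {p_value:.2f}, reject Ho: {reject_h0}")
```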
Case 2 Mozart, Relaxation, and
Performance on Spatial Tasks
 Three listening conditions— Mozart, a relaxation
tape, and silence—and all subjects participated in
all three conditions.
 Null hypothesis: No differences in population mean
spatial reasoning IQ scores after each of three
listening conditions.
 Alternative hypothesis: Population mean spatial
reasoning IQ scores do differ for at least one of
the conditions compared with the others.
 A one-factor (listening condition) repeated measures
analysis of variance ... revealed that subjects
performed better on the abstract/spatial reasoning
tests after listening to Mozart than after listening to
either the relaxation tape or to nothing (F[2,35] =
7.08, p = 0.002).
-(Rauscher et al.)

 Conclusion: At least one of the means differs
from the others. If there were no population
differences, sample mean results would vary as
much as the ones in this sample did, or more,
only 2 times in 1000 (0.002).
 The music condition differed significantly from both
the relaxation and silence conditions (Scheffé’s t =
3.41, p = 0.002; t = 3.67, p = 0.0008, two-tailed,
respectively). The relaxation and silence conditions
did not differ (t = 0.795, p = 0.432, two-tailed).
-(Rauscher et al)
 Conclusion: Significant differences were found between
the music and relaxation conditions (p-value = 0.002)
and between the music and silence conditions (p-value
= 0.0008). The difference between the relaxation and
silence conditions, however, was not statistically
significant (p-value = 0.432).
Case 3 Quitting Smoking with
Nicotine Patches
 Compared the smoking cessation rates for smokers
randomly assigned to use a nicotine patch versus a
placebo patch.
 Null hypothesis: The proportions of smokers in the
population who would quit smoking using a nicotine
patch and using a placebo patch are the same.
 Alternative hypothesis: The proportion of smokers in
the population who would quit smoking using a
nicotine patch is higher than the proportion who
would quit using a placebo patch.
 Higher smoking cessation rates were observed in the
active nicotine patch group at 8 weeks (46.7% vs
20%) (P < .001) and at 1 year (27.5% vs 14.2%)
(P = .011)
- (Hurt et al.)
 Conclusion: p-values are quite small: less than 0.001
for the difference after 8 weeks and equal to 0.011 for the
difference after a year. Therefore, rates of quitting are
significantly higher using a nicotine patch than using a
placebo patch after 8 weeks and after 1 year.
