Statistical inference

Ian Jolliffe University of Aberdeen

CLIPS module 3.4b

Probability and Statistics

Probability starts with a population, often described by a probability distribution, and predicts what will happen in a sample from that population. Statistics starts with a sample of data, and describes the data informatively, or makes inferences about the population from which the sample was drawn. This module concentrates on the inferential, rather than descriptive, side of Statistics.

Probability vs. Statistics: an example

Probability: suppose we know that the maximum temperature at a particular station has a Gaussian distribution with mean 27°C and standard deviation 3°C. We can use this (population) probability distribution to make predictions about the maximum temperature on one or more days.
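As a minimal sketch of such a prediction, the following uses Python's standard-library `statistics.NormalDist` with the mean and standard deviation quoted above. The 30°C threshold and the 24 to 30°C range are illustrative choices, not values from the slides.

```python
from statistics import NormalDist

# Population distribution from the slide: mean 27 degrees C, standard deviation 3 degrees C.
max_temp = NormalDist(mu=27.0, sigma=3.0)

# Probability that the maximum on a given day exceeds 30 degrees C
# (the threshold is an illustrative assumption).
p_above_30 = 1.0 - max_temp.cdf(30.0)

# Probability that the maximum lies within one standard deviation of the mean.
p_24_to_30 = max_temp.cdf(30.0) - max_temp.cdf(24.0)

print(f"P(max > 30)      = {p_above_30:.4f}")
print(f"P(24 < max < 30) = {p_24_to_30:.4f}")
```

Here the population is known and the calculation runs forwards, from distribution to prediction.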

Probability vs. Statistics: an example II

Statistics: for a new station we may wish to estimate the long-term (population) mean maximum temperature, and its standard deviation, based on the small amount of relevant data collected so far.
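The reverse direction can be sketched in the same way. The eight temperatures below are hypothetical values invented for illustration; the point is only that the population mean and standard deviation are now estimated from data rather than assumed known.

```python
from statistics import mean, stdev

# Hypothetical maximum temperatures (degrees C) from the first few days
# at a new station; these numbers are made up for illustration.
sample = [26.1, 28.4, 24.9, 27.3, 29.0, 25.6, 27.8, 26.5]

mean_hat = mean(sample)  # point estimate of the population mean
sd_hat = stdev(sample)   # sample standard deviation (divisor n - 1)

print(f"estimated mean = {mean_hat:.2f}, estimated sd = {sd_hat:.2f}")
```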

Three types of statistical inference

Point estimation: given a sample of relevant data, find an estimate of the (population) mean maximum temperature in June at a station, or an estimate of the (population) probability of precipitation in October at a station.

Interval estimation: given a sample of relevant data, find a range of values within which we have a high degree of confidence that the mean maximum temperature (or the probability of precipitation) lies.

Three types of statistical inference II

Hypothesis testing: given a sample of relevant data, test one or more specific hypotheses about the population from which the sample is drawn. For example, is the mean maximum temperature in June higher at Station A than at Station B? Is the daily probability of precipitation at Station A in October larger or smaller than 0.5?

Representative samples

In the previous two slides we have used the phrase "sample of relevant data". It is crucial that the sample of data you use to make inferences about a population is representative of that population; otherwise the inferences will be biased. Designing the best way of taking a sample is a third aspect of Statistics (alongside description and inference). It is important, but it will not be discussed further in this module.

Estimates and errors

The need for probability and statistics arises because nearly all measurements are subject to random variation. One part of that random variation may be measurement error. By taking repeated measurements of the same quantity we can quantify the measurement error, build a probability distribution for it, and then use the distribution to find a confidence interval for the true value of the measurement.

Confidence intervals

A point estimate on its own is of little use. For example, if I tell you the probability of precipitation tomorrow is 0.4, you will not know how to react. Your reaction will be different if I then say

the probability is 0.4 +/- 0.01
the probability is 0.4 +/- 0.3

We need some indication of the precision of an estimate, which leads naturally into confidence intervals.

Confidence intervals II

The two statements on the previous slide were of the form estimate +/- error. They could also be written as intervals:

0.4 +/- 0.01 gives (0.39, 0.41)
0.4 +/- 0.3 gives (0.10, 0.70)

For these intervals to be confidence intervals we need to associate a level of confidence with them. For example, we might say that we are 95% confident that the interval includes the true probability.

Confidence intervals for what?

Confidence intervals can be found for any parameter or parameters (a quantity describing some aspect of a population or probability distribution). Examples include

probability of success in a binomial experiment
mean of a normal distribution
mean of a Poisson distribution

Confidence intervals for what? II

More parameters

Differences between normal means
Differences between binomial probabilities
Ratios of normal variances
Parameters describing Weibull, gamma or lognormal distributions

Calculation of confidence intervals

No algebraic details are given, just the general principles behind most formulae for confidence intervals:

Find an estimate of the parameter of interest.
The estimate is a function of the data, so it is a random variable, and hence has a probability distribution.
Use this distribution to construct probability statements about the estimate.
Manipulate the probability statements to turn them into statements about confidence intervals and their coverage.

Calculation of confidence intervals: an outline example

Suppose we have n independent observations on a Gaussian random variable (for example maximum daily temperature in June). We write $U_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, where $\mu$ is the mean of the distribution and $\sigma^2$ is its variance. The sample mean, or average, of the n observations is an obvious estimator for $\mu$. We denote this average by $\bar{U}$, and it can be shown that $\bar{U} \sim N(\mu, \sigma^2/n)$.
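The claim that the sample mean has variance equal to the population variance divided by n can be checked informally by simulation. This is a rough sketch, not a proof: the function name `simulate_sample_means`, the seed, and the population values (mean 27, standard deviation 3, n = 20) are illustrative assumptions.

```python
import random
from statistics import mean, pvariance

def simulate_sample_means(mu, sigma, n, replicates, seed=0):
    """Draw `replicates` samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)
    return [sum(rng.gauss(mu, sigma) for _ in range(n)) / n
            for _ in range(replicates)]

# Illustrative population: mean 27, standard deviation 3, samples of size 20.
means = simulate_sample_means(mu=27.0, sigma=3.0, n=20, replicates=5000)

print(f"average of sample means  = {mean(means):.3f}  (theory: 27)")
print(f"variance of sample means = {pvariance(means):.3f}  (theory: 9/20 = 0.45)")
```

The simulated values will differ slightly from the theoretical ones because only a finite number of samples is drawn.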

Confidence interval calculations II

Let $Z = \sqrt{n}\,(\bar{U} - \mu)/\sigma$. Then $Z \sim N(0,1)$. We have tables of probabilities for N(0,1) and can use these to make statements such as

$P(-1.96 < Z < 1.96) = 0.95$
$P(-2.58 < Z < 2.58) = 0.99$
$P(-1.65 < Z < 1.65) = 0.90$
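In place of printed tables, the same probabilities can be computed directly. The sketch below uses the standard-library `statistics.NormalDist`; the helper name `central_probability` is an assumption introduced here.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1): mean 0, standard deviation 1

def central_probability(z):
    """Return P(-z < Z < z) for Z ~ N(0, 1)."""
    return std_normal.cdf(z) - std_normal.cdf(-z)

for z in (1.65, 1.96, 2.58):
    print(f"P(-{z} < Z < {z}) = {central_probability(z):.4f}")

# Going the other way: the z value that leaves 0.025 in the upper tail.
z_95 = std_normal.inv_cdf(0.975)
print(f"z for a central 95% region = {z_95:.3f}")
```

The tabulated cut-offs 1.65, 1.96 and 2.58 are rounded, so the computed probabilities are close to, rather than exactly, 0.90, 0.95 and 0.99.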

Probabilities for N(0,1): 90% interval

[Figure: density f(z) of N(0,1) plotted against z, with area 0.90 between -1.645 and 1.645 and area 0.05 in each tail.]
Numbers indicate areas under curve, left of -1.65, right of 1.65, and between.

Probabilities for N(0,1): 95% interval

[Figure: density f(z) of N(0,1) plotted against z, with area 0.95 between -1.96 and 1.96 and area 0.025 in each tail.]
Numbers indicate areas under curve, left of -1.96, right of 1.96, and between.

Confidence interval calculations III

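Substituting the expression above for Z, in terms of $\bar{U}$, $\mu$, $\sigma$ and n, into the probability statements, and manipulating those statements, gives

$P(\bar{U} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{U} + 1.96\,\sigma/\sqrt{n}) = 0.95$
$P(\bar{U} - 2.58\,\sigma/\sqrt{n} < \mu < \bar{U} + 2.58\,\sigma/\sqrt{n}) = 0.99$
$P(\bar{U} - 1.65\,\sigma/\sqrt{n} < \mu < \bar{U} + 1.65\,\sigma/\sqrt{n}) = 0.90$

The intervals defined by these 3 expressions are 95%, 99% and 90% confidence intervals respectively for $\mu$.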
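The manipulation for the 95% case can be written out step by step. This is a sketch of the standard argument. It writes $\bar{U}$ for the sample mean, $\mu$ for the population mean and $\sigma$ for the known population standard deviation; the symbol for the sample mean is a reconstruction.

```latex
\begin{align*}
0.95 &= P\left(-1.96 < \frac{\sqrt{n}\,(\bar{U}-\mu)}{\sigma} < 1.96\right) \\
     &= P\left(-1.96\,\frac{\sigma}{\sqrt{n}} < \bar{U}-\mu < 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\left(\bar{U} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{U} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)
\end{align*}
```

The second line multiplies through by $\sigma/\sqrt{n}$, which is positive. The third subtracts $\bar{U}$ and multiplies by $-1$, which reverses the inequalities. The 99% and 90% cases follow in the same way with 2.58 and 1.65 in place of 1.96.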

Confidence interval for $\mu$ - example

Measurements are taken of maximum temperature at a station for a sample of 20 June days. Find a confidence interval for $\mu$, the (population) mean of maximum daily temperatures in June at that station. The data give $\bar{U} = 25.625$, and we assume that $\sigma = 1.5$. Substituting these values in the expressions on the previous slide gives

90% interval (25.07, 26.18)
95% interval (24.97, 26.28)
99% interval (24.76, 26.49)
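The three intervals can be reproduced with a few lines of Python. The function name `z_interval` is an assumption introduced here. The inputs (sample mean 25.625, standard deviation 1.5, n = 20) and the rounded cut-offs 1.65, 1.96 and 2.58 are those used in the slides.

```python
from math import sqrt

def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for a Gaussian mean when sigma is assumed known."""
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Values from the example: 20 June days, sample mean 25.625, sigma 1.5.
for level, z in ((90, 1.65), (95, 1.96), (99, 2.58)):
    lower, upper = z_interval(25.625, 1.5, 20, z)
    print(f"{level}% interval: ({lower:.2f}, {upper:.2f})")
```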

Comments on confidence interval example

Note how we pay for a greater degree of confidence with a wider interval. We have deliberately chosen the simplest possible example of a confidence interval; though intervals for other parameters (see Slides 11, 12) have similar constructions, most are a bit more complicated in detail. One immediate complication is that we rarely know $\sigma$; it is replaced by an estimate, s, the sample standard deviation. This leads to an interval which is based on a so-called t-distribution rather than N(0,1), and which is usually wider.
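To illustrate the widening, the sketch below compares the 95% half-widths from N(0,1) and from the t-distribution with 19 degrees of freedom. It assumes, purely for comparison, that the sample standard deviation s happens to equal 1.5; the slides do not give s. The critical value 2.093 is the standard tabulated figure for the upper 2.5% point of t with 19 degrees of freedom.

```python
from math import sqrt

n = 20
sample_mean = 25.625
s = 1.5         # hypothetical sample standard deviation (assumed, not from the data)
z_crit = 1.96   # N(0,1) cut-off for 95%
t_crit = 2.093  # t cut-off for 95% with n - 1 = 19 degrees of freedom (from tables)

z_half = z_crit * s / sqrt(n)
t_half = t_crit * s / sqrt(n)

print(f"N(0,1) interval: ({sample_mean - z_half:.2f}, {sample_mean + z_half:.2f})")
print(f"t interval:      ({sample_mean - t_half:.2f}, {sample_mean + t_half:.2f})")
```

With the same standard deviation plugged in, the t-based interval is wider because the t cut-off exceeds 1.96. In practice s will differ from the true standard deviation, so the t interval is usually, not always, the wider one.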

Confidence interval example - more comments

As well as assuming $\sigma$ known, the interval has also assumed normality (it needs to be checked whether this assumption is OK), and independence of observations, which won't be true if the data are recorded on consecutive days in the same month.

Interpretation of confidence intervals

Interpretation is subtle and is often found difficult or misunderstood. A confidence interval is not a probability statement about the chance of a (random) parameter falling in a (fixed) interval. It is a statement about the chance that a random interval covers a fixed, but unknown, parameter.
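One way to make this interpretation concrete is a small simulation. The true mean is held fixed, many samples are drawn, and we count how often the interval computed from each sample covers it. This is a sketch under assumed values (true mean 25, standard deviation 1.5, n = 20, an arbitrary seed); the function name `coverage` is introduced here.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, z, replicates, seed=1):
    """Fraction of simulated intervals (sample mean +/- z*sigma/sqrt(n)) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(replicates):
        # The sample mean changes from sample to sample; mu stays fixed.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - half_width < mu < sample_mean + half_width:
            hits += 1
    return hits / replicates

rate = coverage(mu=25.0, sigma=1.5, n=20, z=1.96, replicates=10000)
print(f"proportion of intervals covering the true mean: {rate:.3f}")
```

The proportion should come out close to 0.95. The randomness lies in the intervals, not in the parameter.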

Hypothesis testing - introduction

For any parameter(s) where a confidence interval can be constructed, it may also be of interest to test a hypothesis. For simplicity, we develop the ideas of hypothesis testing for the same scenario as our confidence interval example, namely inference for $\mu$, a single Gaussian mean, when $\sigma$ is known, though hypothesis testing is often more relevant when comparing two or more parameters.

Hypothesis testing - example

Suppose that the instrument or exposure for measuring temperature has changed. Over a long period with the old instrument/exposure, the mean value of daily June maximum temperature was 25°C, with standard deviation 1.5°C. The data set described on Slide 19 comprises 20 daily measurements with the new instrument/exposure. Is there any evidence of a change in mean?

The steps in hypothesis testing


1. Formulate a null hypothesis. The null hypothesis is often denoted as H0. In our example we have H0: $\mu = 25$.

2. Define a test statistic, a quantity which can be computed from the data, and which will tend to take different values when H0 is true/false. $\bar{U}$ is an obvious estimator of $\mu$, which should vary as $\mu$ varies, and so is suitable as a test statistic. Z (see Slide 15) is equivalent to $\bar{U}$, and is more convenient, as its distribution is tabulated. Thus we use Z as our test statistic.

Hypothesis testing steps II


3. Calculate the value of Z for the data. Here $Z = \sqrt{n}\,(\bar{U} - \mu)/\sigma$, and $Z \sim N(0,1)$. Assuming $\sigma = 1.5$, and the null value $\mu = 25$, we have $Z = \sqrt{20}\,(25.625 - 25)/1.5 = 1.86$.

4. Calculate the probability of obtaining a value of the test statistic at least as extreme as that observed, if the null hypothesis is true. This probability is called a p-value. Here we calculate P(Z > 1.86) if we believe before seeing the data that the change in mean temperature could only be an increase, or P(Z > 1.86) + P(Z < -1.86) if we believe the change could be in either direction.

Hypothesis testing steps III

The p-values are calculated from tables of the cumulative distribution of N(0,1). From such tables, P(Z > 1.86) = 0.03, and because of the symmetry of N(0,1) about zero, P(Z < -1.86) = 0.03, so the two-sided p-value is 0.06. Whether we calculate a one- or two-sided p-value depends on the context. If we felt certain, before collecting the data, that the change in instrument/exposure could only increase mean temperature, a one-sided p-value is appropriate. Otherwise we should use the two-sided (two-tailed) value.
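The test statistic and both p-values can be computed without tables. The sketch below uses the sample mean 25.625 from the confidence interval example, the null value 25 and the assumed standard deviation 1.5. The variable names are introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

n = 20
sample_mean = 25.625  # from the confidence interval example
mu_0 = 25.0           # null value of the mean
sigma = 1.5           # assumed known standard deviation

z = sqrt(n) * (sample_mean - mu_0) / sigma

std_normal = NormalDist()  # N(0, 1)
p_one_sided = 1.0 - std_normal.cdf(z)                # P(Z > z)
p_two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))   # P(Z > |z|) + P(Z < -|z|)

print(f"Z = {z:.2f}")
print(f"one-sided p-value = {p_one_sided:.3f}")
print(f"two-sided p-value = {p_two_sided:.3f}")
```

The computed values agree with the tabulated 0.03 and 0.06 to two decimal places.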

Confidence intervals and hypothesis testing

There is often, though not always, an equivalence between confidence intervals and hypothesis testing. A null hypothesis is rejected at the 5% (1%) significance level if and only if the null value of the parameter lies outside a 95% (99%) confidence interval.
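For the Gaussian-mean example with known standard deviation, the equivalence can be checked numerically. The sketch below assumes a two-sided test. The function names `reject_by_interval` and `reject_by_p_value` are introduced here, and exact N(0,1) cut-offs are used in place of the rounded table values.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def reject_by_interval(sample_mean, mu_0, sigma, n, alpha):
    """Reject H0 when mu_0 lies outside the (1 - alpha) confidence interval."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    half_width = z_crit * sigma / sqrt(n)
    return not (sample_mean - half_width <= mu_0 <= sample_mean + half_width)

def reject_by_p_value(sample_mean, mu_0, sigma, n, alpha):
    """Reject H0 when the two-sided p-value is below alpha."""
    z = sqrt(n) * (sample_mean - mu_0) / sigma
    p_two_sided = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    return p_two_sided < alpha

for alpha in (0.10, 0.05, 0.01):
    by_ci = reject_by_interval(25.625, 25.0, 1.5, 20, alpha)
    by_p = reject_by_p_value(25.625, 25.0, 1.5, 20, alpha)
    print(f"alpha = {alpha:.2f}: interval rejects = {by_ci}, p-value rejects = {by_p}")
```

For these data the null value 25 falls outside the 90% interval but inside the 95% and 99% intervals, which matches a two-sided p-value of about 0.06.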

Hypothesis testing: p-values

We haven't yet said what to do with our calculated p-value. A small p-value casts doubt on the plausibility of the null hypothesis, H0, but it is NOT equal to P(H0|data). How small is small? Frequently a single threshold is set, often 0.05, 0.01 or 0.10 (5%, 1%, 10%), and H0 is rejected when the p-value falls below the threshold (rejected at the 5%, 1%, 10% significance level). It is more informative to quote a p-value than to conduct the test at a single threshold level.

