Probability starts with a population, often described by a probability distribution, and predicts what will happen in a sample from that population. Statistics starts with a sample of data, and describes the data informatively, or makes inferences about the population from which the sample was drawn. This module concentrates on the inferential, rather than descriptive, side of Statistics.
Probability: suppose we know that the maximum temperature at a particular station has a Gaussian distribution with mean 27 °C and standard deviation 3 °C. We can use this (population) probability distribution to make predictions about the maximum temperature on one or more days.
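As a small illustration of using a known population distribution to make predictions, the sketch below uses the standard-library `statistics.NormalDist` with the mean (27 °C) and standard deviation (3 °C) quoted above. The two questions asked (a threshold of 30 °C and the range 24 to 30 °C) are illustrative assumptions, not taken from the slides.

```python
from statistics import NormalDist

# Population distribution from the slide: mean 27 °C, standard deviation 3 °C.
max_temp = NormalDist(mu=27.0, sigma=3.0)

# Hypothetical question 1: probability that a day's maximum exceeds 30 °C.
p_above_30 = 1.0 - max_temp.cdf(30.0)

# Hypothetical question 2: probability that the maximum lies between
# 24 °C and 30 °C, i.e. within one standard deviation of the mean.
p_24_to_30 = max_temp.cdf(30.0) - max_temp.cdf(24.0)

print(f"P(max > 30)      = {p_above_30:.3f}")
print(f"P(24 < max < 30) = {p_24_to_30:.3f}")
```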
Statistics: for a new station we may wish to estimate the long-term (population) mean maximum temperature, and its standard deviation, based on the small amount of relevant data collected so far.
Point estimation: given a sample of relevant data, find an estimate of the (population) mean maximum temperature in June at a station, or an estimate of the (population) probability of precipitation in October at a station.
Interval estimation: given a sample of relevant data, find a range of values within which we have a high degree of confidence that the mean maximum temperature (or the probability of precipitation) lies.
Hypothesis testing: given a sample of relevant data, test one or more specific hypotheses about the population from which the sample is drawn. For example, is the mean maximum temperature in June higher at Station A than at Station B? Is the daily probability of precipitation at Station A in October larger or smaller than 0.5?
Representative samples
In the previous two Slides we have used the phrase 'sample of relevant data'. It is crucial that the sample of data you use to make inferences about a population is representative of that population; otherwise the inferences will be biased. Designing the best way of taking a sample is a third aspect of Statistics (alongside description and inference). It is important, but it will not be discussed further in this module.
The need for probability and statistics arises because nearly all measurements are subject to random variation. One part of that random variation may be measurement error. By taking repeated measurements of the same quantity we can quantify the measurement error, build a probability distribution for it, then use the distribution to find a confidence interval for the true value of the measurement.
Confidence intervals
A point estimate on its own is of little use. For example, if I tell you the probability of precipitation tomorrow is 0.4, you will not know how to react. Your reaction will be different if I then say
the probability is 0.4 +/- 0.01
the probability is 0.4 +/- 0.3
We need some indication of the precision of an estimate, which leads naturally into confidence intervals.
Confidence intervals II
The two statements on the previous slide were of the form estimate +/- error. They could also be written as intervals:
0.4 +/- 0.01 gives (0.39, 0.41)
0.4 +/- 0.3 gives (0.10, 0.70)
For these intervals to be confidence intervals we need to associate a level of confidence with them. For example, we might say that we are 95% confident that the interval includes the true probability.
Confidence intervals can be found for any parameter or parameters (a parameter is a quantity describing some aspect of a population or probability distribution). Examples include:
probability of success in a binomial experiment
mean of a normal distribution
mean of a Poisson distribution
More parameters
Differences between normal means
Differences between binomial probabilities
Ratios of normal variances
Parameters describing Weibull, gamma or lognormal distributions
No algebraic details are given, just the general principles behind most formulae for confidence intervals:
1. Find an estimate of the parameter of interest.
2. The estimate is a function of the data, so is a random variable, and hence has a probability distribution.
3. Use this distribution to construct probability statements about the estimate.
4. Manipulate the probability statements to turn them into statements about confidence intervals and their coverage.
Suppose we have n independent observations on a Gaussian random variable (for example maximum daily temperature in June). We write $U_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, where $\mu$ is the mean of the distribution and $\sigma^2$ is its variance. The sample mean, or average, of the n observations is an obvious estimator for $\mu$. We denote this average by $\bar{U}$, and it can be shown that $\bar{U} \sim N(\mu, \sigma^2/n)$.
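The claim that the average of n independent Gaussian observations has variance equal to the population variance divided by n can be checked by simulation. The sketch below is only an illustration: the population mean of 25, standard deviation of 1.5, sample size of 20, number of replications and random seed are all assumed values chosen for the demonstration.

```python
import random
from statistics import fmean, pvariance

def simulate_sample_means(mu, sigma, n, replications, seed=0):
    """Draw `replications` samples of size n from N(mu, sigma^2)
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(replications)
    ]

# Assumed illustrative values (not data from the slides).
MU, SIGMA, N = 25.0, 1.5, 20

means = simulate_sample_means(MU, SIGMA, N, replications=20_000)
mean_of_means = fmean(means)
var_of_means = pvariance(means)
theoretical_var = SIGMA ** 2 / N  # sigma^2 / n

print(f"mean of sample means:     {mean_of_means:.3f} (theory {MU})")
print(f"variance of sample means: {var_of_means:.4f} (theory {theoretical_var:.4f})")
```

The simulated variance should be close to, but not exactly equal to, the theoretical value, because it is itself an estimate based on a finite number of replications.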
Let $Z = \sqrt{n}\,(\bar{U} - \mu)/\sigma$. Then $Z \sim N(0,1)$. We have tables of probabilities for N(0,1) and can use these to make statements such as
$P(-1.96 < Z < 1.96) = 0.95$
$P(-2.58 < Z < 2.58) = 0.99$
$P(-1.65 < Z < 1.65) = 0.90$
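The tabulated values can also be obtained numerically. The sketch below uses the standard-library `statistics.NormalDist`; the helper name `two_sided_critical_value` is introduced here purely for illustration. The slides quote the values rounded to two decimal places.

```python
from statistics import NormalDist

std_normal = NormalDist()  # the standard normal distribution N(0, 1)

def two_sided_critical_value(confidence):
    """Return z such that P(-z < Z < z) = confidence for Z ~ N(0, 1)."""
    tail = (1.0 - confidence) / 2.0  # probability left in each tail
    return std_normal.inv_cdf(1.0 - tail)

critical_values = {level: two_sided_critical_value(level)
                   for level in (0.90, 0.95, 0.99)}

for level, z in critical_values.items():
    print(f"P(-{z:.3f} < Z < {z:.3f}) = {level:.2f}")
```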
[Figure: standard normal density f(z) plotted against z. Numbers indicate areas under the curve: 0.05 to the left of -1.645, 0.05 to the right of 1.645, and 0.90 between.]
[Figure: standard normal density f(z) plotted against z. Numbers indicate areas under the curve: 0.025 to the left of -1.96, 0.025 to the right of 1.96, and 0.95 between.]
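Substituting the expression above for Z, in terms of $\bar{U}$, $\mu$, $\sigma$ and n, into the probability statements, and manipulating those statements gives
$P(\bar{U} - 1.96\,\sigma/\sqrt{n} < \mu < \bar{U} + 1.96\,\sigma/\sqrt{n}) = 0.95$
$P(\bar{U} - 2.58\,\sigma/\sqrt{n} < \mu < \bar{U} + 2.58\,\sigma/\sqrt{n}) = 0.99$
$P(\bar{U} - 1.65\,\sigma/\sqrt{n} < \mu < \bar{U} + 1.65\,\sigma/\sqrt{n}) = 0.90$
The intervals defined by these 3 expressions are 95%, 99%, 90% confidence intervals respectively for $\mu$.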
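The manipulation can be written out step by step for the 95% case; the other two levels follow by replacing 1.96 with 2.58 or 1.65. In this sketch $\bar{U}$ denotes the sample mean, $\mu$ the population mean, $\sigma$ the (known) population standard deviation and n the sample size.

```latex
\begin{align*}
0.95 &= P\left(-1.96 < \frac{\sqrt{n}\,(\bar{U}-\mu)}{\sigma} < 1.96\right) \\
     &= P\left(-1.96\,\frac{\sigma}{\sqrt{n}} < \bar{U}-\mu < 1.96\,\frac{\sigma}{\sqrt{n}}\right)
        && \text{multiply through by } \sigma/\sqrt{n} > 0 \\
     &= P\left(-\bar{U}-1.96\,\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{U}+1.96\,\frac{\sigma}{\sqrt{n}}\right)
        && \text{subtract } \bar{U} \\
     &= P\left(\bar{U}-1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{U}+1.96\,\frac{\sigma}{\sqrt{n}}\right)
        && \text{multiply by } -1 \text{, reversing the inequalities}
\end{align*}
```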
Measurements are taken of maximum temperature at a station for a sample of 20 June days. Find a confidence interval for $\mu$, the (population) mean of maximum daily temperatures in June at that station. The data give $\bar{U} = 25.625$, and we assume that $\sigma = 1.5$. Substituting these values in the expressions on the previous Slide gives
90% interval (25.07, 26.18)
95% interval (24.97, 26.28)
99% interval (24.76, 26.49)
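The three intervals can be reproduced with a few lines of arithmetic. This is a minimal sketch using the sample mean (25.625), the assumed known standard deviation (1.5), the sample size (20) and the rounded critical values used in the slides; the function name `z_interval` is introduced here for illustration.

```python
from math import sqrt

def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for a Gaussian mean when sigma is known:
    sample_mean +/- z * sigma / sqrt(n)."""
    half_width = z * sigma / sqrt(n)
    return (sample_mean - half_width, sample_mean + half_width)

# Rounded critical values from N(0, 1) tables, as used in the slides.
critical_values = {"90%": 1.65, "95%": 1.96, "99%": 2.58}

intervals = {
    label: z_interval(sample_mean=25.625, sigma=1.5, n=20, z=z)
    for label, z in critical_values.items()
}

for label, (lower, upper) in intervals.items():
    print(f"{label} interval: ({lower:.2f}, {upper:.2f})")
```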
Note how we pay for a greater degree of confidence with a wider interval. We have deliberately chosen the simplest possible example of a confidence interval. Though intervals for other parameters (see Slides 11, 12) have similar constructions, most are a bit more complicated in detail. One immediate complication is that we rarely know $\sigma$; it is replaced by an estimate, s, the sample standard deviation. This leads to an interval which is based on a so-called t-distribution rather than N(0,1), and which is usually wider.
As well as assuming $\sigma$ known, the interval has also assumed normality (it needs to be checked whether this assumption is OK), and independence of observations, which won't be true if the data are recorded on consecutive days in the same month.
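To see the effect of replacing the known standard deviation by an estimate, the sketch below compares a 95% interval based on N(0,1) with one based on the t-distribution with 19 degrees of freedom. Two assumptions are made purely for illustration: the sample standard deviation s is taken to be 1.5 (the slides do not report s), and the critical value 2.093 is copied from standard t tables rather than computed.

```python
from math import sqrt

# Two-sided 95% critical values.
Z_CRIT = 1.96          # from N(0, 1) tables
T_CRIT_19_DF = 2.093   # from t tables with n - 1 = 19 degrees of freedom

def interval(sample_mean, sd, n, critical_value):
    """Return sample_mean +/- critical_value * sd / sqrt(n)."""
    half_width = critical_value * sd / sqrt(n)
    return (sample_mean - half_width, sample_mean + half_width)

SAMPLE_MEAN, N = 25.625, 20
S_HYPOTHETICAL = 1.5   # assumed sample standard deviation, for illustration only

z_based = interval(SAMPLE_MEAN, 1.5, N, Z_CRIT)
t_based = interval(SAMPLE_MEAN, S_HYPOTHETICAL, N, T_CRIT_19_DF)

print(f"N(0,1)-based 95% interval: ({z_based[0]:.2f}, {z_based[1]:.2f})")
print(f"t-based 95% interval:      ({t_based[0]:.2f}, {t_based[1]:.2f})")
```

With the same standard deviation in both formulas, the t-based interval is wider only because its critical value is larger; in practice s will also differ from the true standard deviation.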
Interpretation is subtle and is often found difficult or misunderstood. A confidence interval is not a probability statement about the chance of a (random) parameter falling in a (fixed) interval. It is a statement about the chance that a random interval covers a fixed, but unknown, parameter.
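The "random interval, fixed parameter" interpretation can be illustrated by simulation: hold the population mean fixed, draw many samples, build a 95% interval from each, and count how often the interval covers the fixed mean. The population mean of 25, standard deviation of 1.5, sample size of 20, number of replications and seed below are assumptions chosen for the demonstration.

```python
import random
from math import sqrt
from statistics import fmean

def coverage(mu, sigma, n, replications, z=1.96, seed=1):
    """Fraction of simulated intervals (sample mean +/- z*sigma/sqrt(n))
    that contain the fixed population mean mu."""
    rng = random.Random(seed)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(replications):
        # The sample, and hence the interval, changes on every replication;
        # mu stays fixed throughout.
        sample_mean = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if sample_mean - half_width < mu < sample_mean + half_width:
            hits += 1
    return hits / replications

estimated_coverage = coverage(mu=25.0, sigma=1.5, n=20, replications=10_000)
print(f"Estimated coverage of the 95% interval: {estimated_coverage:.3f}")
```

The estimated coverage should be close to 0.95, with small differences due to simulation noise.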
For any parameter(s) where a confidence interval can be constructed, it may also be of interest to test a hypothesis. For simplicity, we develop the ideas of hypothesis testing for the same scenario as our confidence interval example, namely inference for $\mu$, a single Gaussian mean, when $\sigma$ is known, though hypothesis testing is often more relevant when comparing two or more parameters.
Suppose that the instrument or exposure for measuring temperature has changed. Over a long period with the old instrument/exposure, the mean value of daily June maximum temperature was 25 °C, with standard deviation 1.5 °C. The data set described on Slide 19 comprises 20 daily measurements with the new instrument/exposure. Is there any evidence of a change in mean?
1. Formulate a null hypothesis. The null hypothesis is often denoted as H0. In our example we have H0: $\mu = 25$.
2. Define a test statistic, a quantity which can be computed from the data, and which will tend to take different values when H0 is true/false. $\bar{U}$ is an obvious estimator of $\mu$, which should vary as $\mu$ varies, and so is suitable as a test statistic. Z (see Slide 15) is equivalent to $\bar{U}$, and is more convenient, as its distribution is tabulated. Thus use Z as our test statistic.
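For the temperature example, the test statistic can be evaluated directly. This is a minimal sketch using the sample mean of 25.625 from the 20 June days, the null value of 25 and the known standard deviation of 1.5; the function name `z_statistic` is introduced here for illustration.

```python
from math import sqrt

def z_statistic(sample_mean, null_mean, sigma, n):
    """Z = sqrt(n) * (sample_mean - null_mean) / sigma,
    which is N(0, 1) when the null hypothesis is true."""
    return sqrt(n) * (sample_mean - null_mean) / sigma

z_observed = z_statistic(sample_mean=25.625, null_mean=25.0, sigma=1.5, n=20)
print(f"Observed test statistic: z = {z_observed:.2f}")
```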
For our data the observed value of the test statistic is $z = \sqrt{20}\,(25.625 - 25)/1.5 \approx 1.86$. The p-values are calculated from tables of the cumulative distribution of N(0,1). From such tables, $P(Z > 1.86) = 0.03$ and, because of the symmetry of N(0,1) about zero, $P(Z < -1.86) = 0.03$, so the one-sided p-value is 0.03 and the two-sided p-value is 0.06. Whether we calculate a one- or two-sided p-value depends on the context. If we felt certain, before collecting the data, that the change in instrument/exposure could only increase mean temperature, a one-sided p-value is appropriate. Otherwise we should use the two-sided (two-tailed) value.
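Instead of tables, the same p-values can be computed from the N(0,1) cumulative distribution function. The sketch below uses the standard-library `statistics.NormalDist` and the rounded test statistic 1.86 quoted above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)
z_observed = 1.86          # observed test statistic, rounded as in the slides

# One-sided p-value: probability of a value at least as large as z_observed.
p_one_sided = 1.0 - std_normal.cdf(z_observed)

# Two-sided p-value: by symmetry, double the one-sided tail probability.
p_two_sided = 2.0 * p_one_sided

print(f"one-sided p-value: {p_one_sided:.3f}")
print(f"two-sided p-value: {p_two_sided:.3f}")
```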
There is often, though not always, an equivalence between confidence intervals and hypothesis testing. A null hypothesis is rejected at the 5% (1%) level if and only if the null value of the parameter lies outside a 95% (99%) confidence interval.
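For the single Gaussian mean with known standard deviation, this equivalence can be checked numerically. The sketch below uses the example values (sample mean 25.625, null value 25, standard deviation 1.5, n = 20) and compares, at each significance level, the two-sided test decision with whether the null value falls outside the matching confidence interval. The function name `compare_test_and_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def compare_test_and_interval(sample_mean, null_mean, sigma, n, alpha):
    """Return (rejected, outside) for a two-sided z-test at level alpha and
    the matching 100*(1 - alpha)% confidence interval."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    p_two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    rejected = p_two_sided < alpha

    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    lower = sample_mean - z_crit * standard_error
    upper = sample_mean + z_crit * standard_error
    outside = not (lower <= null_mean <= upper)
    return rejected, outside

results = {
    alpha: compare_test_and_interval(25.625, 25.0, 1.5, 20, alpha)
    for alpha in (0.10, 0.05, 0.01)
}

for alpha, (rejected, outside) in results.items():
    print(f"level {alpha:.2f}: H0 rejected = {rejected}, "
          f"null value outside interval = {outside}")
```

In this example the null value 25 lies outside the 90% interval but inside the 95% and 99% intervals, which matches a two-sided p-value between 0.05 and 0.10.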
We haven't yet said what to do with our calculated p-value. A small p-value casts doubt on the plausibility of the null hypothesis, H0, but it is NOT equal to P(H0|data). How small is small? Frequently a single threshold is set, often 0.05, 0.01 or 0.10 (5%, 1%, 10%), and H0 is rejected when the p-value falls below the threshold (rejected at the 5%, 1%, 10% significance level). It is more informative to quote a p-value than to conduct the test at a single threshold level.