
Estimation

CHAPTER

Presented By
Asst. Prof. Dr. Mohammad
Alzubi
1. Point and confidence interval estimates for the proportion.
2. Sample size determination for the proportion.
3. Sample size determination for finite populations.

Point and Interval Estimates
Suppose we want to estimate a parameter, such as p or µ, based
on a finite sample of data. There are two main methods:
1. Point estimate: Summarize the sample by a single number
that is an estimate of the population parameter;
2. Interval estimate: A range of values within which, we believe,
the true parameter lies with high probability.

Example. Suppose I wanted to estimate the mean height of all
female students at UNC. I took a sample in this class and the
sample mean was $\bar{x} = 65.7$ (inches). So the obvious thing to
do is to take that as an estimate for the population mean. But
I didn’t have to use the sample mean. I could have taken the
sample median (66) or the sample mode (also 66). It makes
sense to ask which is better.

What properties make a good point estimator?
1. It’s desirable that the sampling distribution be centered around
the true population parameter. An estimator with this property
is called unbiased.
2. It’s desirable that our chosen estimator have a small standard
error in comparison with other estimators we might have
chosen.

The sample mean is exactly unbiased (whereas the sample median
may not be), and also, if the true population is normal,
the sample mean has a smaller standard error than the sample
median. Both of these would indicate that the sample mean is
preferable to the sample median as an estimator of the population
mean. However, there are other properties that could nevertheless
make the median preferable (e.g. it’s more resistant to
outliers).

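The claim that the sample mean has a smaller standard error than the sample median for a normal population can be checked with a small simulation. This is only a sketch: the population values (mean 65, standard deviation 3), the sample size, the number of replications and the function name `simulated_standard_errors` are all illustrative assumptions, not figures from the lecture.

```python
import random
import statistics


def simulated_standard_errors(n=25, reps=2000, mu=65.0, sigma=3.0, seed=0):
    """Estimate the standard errors of the sample mean and sample median.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        # Draw one sample of size n from a normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    # The standard error is the standard deviation of the estimator
    # across repeated samples.
    return statistics.stdev(means), statistics.stdev(medians)


se_mean, se_median = simulated_standard_errors()
print(f"SE of sample mean:   {se_mean:.3f}")
print(f"SE of sample median: {se_median:.3f}")
```

With these settings the simulated standard error of the mean should come out close to sigma / sqrt(n) = 0.6, and the standard error of the median should be noticeably larger.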
In the case of a binomial proportion, the obvious point estimator
is the sample proportion. For example, consider our example
about President Obama’s popularity rating (class 18).

In this example, 68% of respondents gave Obama a positive rating
after he had been in office for one month (the answer could
be different if we repeated the poll now). The most natural
interpretation of this is that 68% or 0.68 is a statistic which serves
as an estimator of the true but unknown proportion of people
who would have approved of Obama if the whole population had
been surveyed. It seems obvious that we would use the sample
proportion as an estimator of the population proportion, but we
don’t have to.

Now let’s turn to interval estimates. The simplest way to introduce
this is through an example.

Example. In a college of 25,000 students, the administration
would like to know for what proportion of students both parents
had completed college. A sample of 350 students was drawn at
random and in that sample, 276 of the students said that both
their parents had completed college.

The sample proportion is 276/350 = 0.789 (or 78.9%), so by the
same logic as in the last example, it makes sense to use that
number also as an estimator of the population proportion (in
this case the “population” is all 25,000 students at this college).
But how accurate is that?

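As a rough numerical companion to the college example, the sketch below computes the point estimate and the sampling fraction n/N. It also computes a plug-in standard error, where the sample proportion is substituted for the unknown p in sqrt(p(1 - p)/n). That substitution is an assumption made here for illustration, and the variable names are hypothetical.

```python
import math

n_sample = 350        # students sampled
n_both = 276          # students whose parents both completed college
population = 25_000   # all students at the college

# Point estimate of the population proportion.
p_hat = n_both / n_sample

# Fraction of the population that was sampled (compare with the 5% rule
# for the finite population correction).
sampling_fraction = n_sample / population

# Plug-in standard error: uses p_hat in place of the unknown p (assumption).
se_hat = math.sqrt(p_hat * (1 - p_hat) / n_sample)

print(f"point estimate     = {p_hat:.3f}")
print(f"sampling fraction  = {sampling_fraction:.3%}")
print(f"plug-in std. error = {se_hat:.4f}")
```

The sampling fraction here is well under 5%, so the finite population correction discussed later in the slides would change the standard error only slightly.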
Part I: Sampling Distribution
• Sample statistics are used to estimate population parameters
ex: $\bar{x}$ is an estimate of the population mean, $\mu$
• Problems:
– Different samples provide different estimates of the population parameter
– Sample results have potential variability, thus sampling error exists

Calculating Sampling Error
• Sampling Error:
The difference between a value (a statistic) computed
from a sample and the corresponding value (a parameter)
computed from a population

Example: (for the mean)

Sampling Error $= \bar{x} - \mu$
where:
$\bar{x}$ = sample mean
$\mu$ = population mean

Review

• Population mean: $\mu = \frac{\sum x_i}{N}$
• Sample mean: $\bar{x} = \frac{\sum x_i}{n}$

where:
$\mu$ = population mean
$\bar{x}$ = sample mean
$x_i$ = values in the population or sample
N = population size
n = sample size

Example

If the population mean is $\mu = 98.6$ degrees and a
sample of n = 5 temperatures yields a sample
mean of $\bar{x} = 99.2$ degrees, then the sampling
error is

$\bar{x} - \mu = 99.2 - 98.6 = 0.6$ degrees

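The temperature example can be reproduced in a few lines. The five readings below are hypothetical values, chosen only so that their mean is 99.2 degrees. The slide gives the sample mean but not the individual temperatures.

```python
import statistics

population_mean = 98.6  # mu, given in the example

# Hypothetical sample of n = 5 temperatures (not from the slides);
# chosen so that the sample mean equals 99.2.
temperatures = [98.9, 99.5, 99.0, 99.4, 99.2]

sample_mean = statistics.mean(temperatures)

# Sampling error = sample statistic minus population parameter.
error = sample_mean - population_mean

print(f"sample mean    = {sample_mean:.1f}")
print(f"sampling error = {error:.1f} degrees")
```

A different sample of five readings would generally give a different sample mean and therefore a different sampling error.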
Sampling Errors

• Different samples will yield different sampling errors
• The sampling error may be positive or negative
($\bar{x}$ may be greater than or less than $\mu$)
• The expected magnitude of the sampling error decreases as the
sample size increases

Sampling Distribution

• A sampling distribution is a distribution
of the possible values of a statistic for a
sample of a given size selected from a
population

Developing a
Sampling Distribution
• Assume there is a population …
• Population size N = 4 (individuals A, B, C, D)
• Random variable, x, is age of individuals
• Values of x: 18, 20, 22, 24 (years)

Developing a
Sampling Distribution
(continued)

Summary Measures for the Population Distribution:

$\mu = \frac{\sum x_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$

$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = 2.236$

[Figure: population distribution P(x) over x = 18 (A), 20 (B), 22 (C), 24 (D); Uniform Distribution]

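The population summary measures above can be checked with Python's standard library. This is a minimal sketch; note that `statistics.pstdev` divides by N (the population formula used on the slide), not by N - 1.

```python
import statistics

ages = [18, 20, 22, 24]  # the population: individuals A, B, C, D

mu = statistics.mean(ages)       # population mean
sigma = statistics.pstdev(ages)  # population standard deviation (divides by N)

print(f"mu    = {mu}")
print(f"sigma = {sigma:.3f}")
```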
Developing a
Sampling Distribution
(continued)
Now consider all possible samples of size n = 2

16 possible samples (sampling with replacement):

1st Obs    2nd Observation
           18      20      22      24
18         18,18   18,20   18,22   18,24
20         20,18   20,20   20,22   20,24
22         22,18   22,20   22,22   22,24
24         24,18   24,20   24,22   24,24

16 Sample Means:

1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

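The 16 samples and their means can be enumerated directly. The sketch below uses `itertools.product` to generate every ordered pair, which corresponds to sampling with replacement.

```python
from itertools import product

ages = [18, 20, 22, 24]

# All ordered samples of size n = 2, drawn with replacement.
samples = list(product(ages, repeat=2))

# The sample mean of each pair.
sample_means = [(first + second) / 2 for first, second in samples]

print(f"{len(samples)} possible samples")
for (first, second), mean in zip(samples, sample_means):
    print(f"({first}, {second}) -> mean {mean:g}")
```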
Developing a
Sampling Distribution
(continued)
Sampling Distribution of All Sample Means

16 Sample Means:

1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

[Figure: Sample Means Distribution, P($\bar{x}$) over $\bar{x}$ = 18, 19, 20, 21, 22, 23, 24 (no longer uniform)]

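One way to see why the distribution of sample means is no longer uniform is to count how often each mean occurs among the 16 equally likely samples. This sketch uses exact fractions so that the probabilities sum to exactly 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ages = [18, 20, 22, 24]
sample_means = [Fraction(a + b, 2) for a, b in product(ages, repeat=2)]

# Count how many of the 16 samples produce each mean.
counts = Counter(sample_means)

# Each sample is equally likely, so P(mean) = count / 16.
distribution = {mean: Fraction(count, len(sample_means))
                for mean, count in sorted(counts.items())}

for mean, prob in distribution.items():
    print(f"P(x-bar = {mean}) = {prob}")
```

The middle value 21 is the most likely mean because four different samples produce it, while 18 and 24 can each arise in only one way.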
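Developing a
Sampling Distribution
(continued)
Summary Measures of this Sampling Distribution:

$\mu_{\bar{x}} = \frac{\sum \bar{x}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21$

$\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x}_i - \mu_{\bar{x}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$

(Here N = 16 is the number of possible samples, not the population size.)
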
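These summary measures can be verified numerically, and the result can be compared with sigma / sqrt(n), the formula for the standard error of the mean given later in the slides. A minimal sketch:

```python
import math
import statistics
from itertools import product

ages = [18, 20, 22, 24]
n = 2

# Means of all 16 samples of size 2 drawn with replacement.
sample_means = [(a + b) / 2 for a, b in product(ages, repeat=n)]

mu_xbar = statistics.mean(sample_means)       # mean of the sample means
sigma_xbar = statistics.pstdev(sample_means)  # their standard deviation

# Population standard deviation, for comparison with sigma / sqrt(n).
sigma = statistics.pstdev(ages)

print(f"mu_xbar         = {mu_xbar:g}")
print(f"sigma_xbar      = {sigma_xbar:.2f}")
print(f"sigma / sqrt(n) = {sigma / math.sqrt(n):.2f}")
```

The last two printed values agree, which illustrates the relation between the population standard deviation and the standard deviation of the sample means when sampling with replacement.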
Comparing the Population with its
Sampling Distribution

Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 1.58$

[Figure: P(x) over x = 18 (A), 20 (B), 22 (C), 24 (D), beside P($\bar{x}$) over $\bar{x}$ = 18, 19, 20, 21, 22, 23, 24]

If the Population is Normal
(THEOREM 6-1)

If a population is normal with mean $\mu$ and standard
deviation $\sigma$, the sampling distribution
of $\bar{x}$ is also normally distributed with

$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

z-value for Sampling Distribution
of $\bar{x}$
• Z-value for the sampling distribution of $\bar{x}$:

$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

where: $\bar{x}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size

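The z-value formula translates directly into a small function. The name `z_value` is a hypothetical helper, and the numbers in the usage line are borrowed from the worked example later in the slides (mu = 8, sigma = 3, n = 36).

```python
import math


def z_value(x_bar, mu, sigma, n):
    """Standardize a sample mean using the standard error sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error


# Values taken from the later worked example in these slides.
z = z_value(x_bar=8.2, mu=8, sigma=3, n=36)
print(f"z = {z:.2f}")
```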
Finite Population Correction
• Apply the Finite Population Correction if:
– the sample is large relative to the population
(n is greater than 5% of N)
and…
– Sampling is without replacement

Then $Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$

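A sketch of the corrected z-value follows. The helper names `needs_fpc` and `z_value_fpc` are hypothetical, and the population size N = 200 in the usage example is an illustrative assumption. The code assumes sampling without replacement, which it cannot check for itself.

```python
import math


def needs_fpc(n, N):
    """5% rule from the slide: the sample is large relative to the population."""
    return n > 0.05 * N


def z_value_fpc(x_bar, mu, sigma, n, N):
    """z-value using the finite population correction factor.

    Assumes sampling without replacement from a population of size N.
    """
    correction = math.sqrt((N - n) / (N - 1))
    standard_error = sigma / math.sqrt(n) * correction
    return (x_bar - mu) / standard_error


# Hypothetical case: n = 36 drawn without replacement from N = 200.
print(needs_fpc(36, 200))                       # sample is 18% of N
z_corrected = z_value_fpc(8.2, 8, 3, 36, 200)
print(f"corrected z = {z_corrected:.4f}")
```

Because the correction factor is less than 1, it shrinks the standard error, so the corrected z-value is larger in magnitude than the uncorrected one.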
Sampling Distribution Properties

• $\mu_{\bar{x}} = \mu$
(i.e. $\bar{x}$ is unbiased)

[Figure: Normal Population Distribution (centered at $\mu$) above the Normal Sampling Distribution (centered at $\mu_{\bar{x}}$; has the same mean)]

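Sampling Distribution Properties
(continued)

• For sampling with replacement:
As n increases, $\sigma_{\bar{x}}$ decreases

[Figure: two sampling distributions of $\bar{x}$ centered at $\mu$, one labeled "Larger sample size" and one labeled "Smaller sample size"]
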
If the Population is not Normal
• We can apply the Central Limit Theorem:
– Even if the population is not normal,
– …sample means from the population will be
approximately normal as long as the sample size is
large enough
– …and the sampling distribution will have
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Central Limit Theorem

As the sample size gets large enough (n↑), the sampling
distribution becomes almost normal regardless of
shape of population.

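The Central Limit Theorem can be illustrated by simulation. The sketch below draws samples from a strongly skewed population. The exponential distribution with mean 1 and standard deviation 1 is an assumption chosen for illustration, as are n = 36, the number of replications and the helper name `simulate_sample_means`.

```python
import random
import statistics


def simulate_sample_means(n, reps=5000, seed=1):
    """Means of `reps` samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]


n = 36
means = simulate_sample_means(n)

mu_xbar = statistics.fmean(means)
sigma_xbar = statistics.pstdev(means)

# For a normal distribution, about 68% of values lie within one
# standard deviation of the mean.
theoretical_se = 1 / n ** 0.5  # sigma / sqrt(n) with sigma = 1
within_one_se = sum(abs(m - 1.0) <= theoretical_se for m in means) / len(means)

print(f"mean of sample means = {mu_xbar:.3f} (population mean 1)")
print(f"sd of sample means   = {sigma_xbar:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
print(f"share within one SE  = {within_one_se:.3f}")
```

Although the population is far from normal, the simulated sample means should cluster around 1 with spread close to sigma / sqrt(n), and roughly 68% of them should fall within one standard error.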
If the Population is not Normal
(continued)
Sampling distribution properties:

Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ (sampling with replacement)

[Figure: Population Distribution (mean $\mu$) above the Sampling Distribution (becomes normal as n increases), with curves labeled "Smaller sample size" and "Larger sample size"]

How Large is Large Enough?

• For most distributions, n > 30 will give a
sampling distribution that is nearly normal
• For fairly symmetric distributions, n > 15 is usually sufficient
• For normal population distributions, the
sampling distribution of the mean is always
normally distributed

Example

• Suppose a population has mean μ = 8 and
standard deviation σ = 3. Suppose a random
sample of size n = 36 is selected.

• What is the probability that the sample mean
is between 7.8 and 8.2?

Example
(continued)
Solution:

$P(7.8 \le \bar{x} \le 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 \le Z \le 0.4) = 0.3108$

From the standard normal table: $0.6554 - (1 - 0.6554) = 0.3108$

[Figure: Population Distribution ($\mu = 8$) → Sample → Sampling Distribution ($\mu_{\bar{x}} = 8$, interval 7.8 to 8.2) → Standardize → Standard Normal Distribution ($\mu_z = 0$, interval -0.4 to 0.4)]

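The same probability can be obtained without a table by using `statistics.NormalDist` from the Python standard library. This sketch models the sampling distribution of the mean directly as a normal distribution with mean 8 and standard deviation sigma / sqrt(n).

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Standard error of the mean: 3 / sqrt(36) = 0.5.
standard_error = sigma / math.sqrt(n)

# Sampling distribution of the sample mean.
sampling_dist = NormalDist(mu, standard_error)

# P(7.8 <= x-bar <= 8.2)
prob = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)
print(f"P(7.8 <= x-bar <= 8.2) = {prob:.4f}")
```

The result should agree with the table-based answer of 0.3108 to about four decimal places.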
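Population Proportions, p
p = the proportion of population having
some characteristic
• Sample proportion ($\bar{p}$) provides an estimate
of p:

$\bar{p} = \frac{x}{n} = \frac{\text{number of successes in the sample}}{\text{sample size}}$

• If there are only two outcomes, x has a binomial distribution,
so $\bar{p} = x/n$ follows a binomial distribution scaled by 1/n
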
Sampling Distribution of $\bar{p}$

• Approximated by a normal distribution if:
– $np \ge 5$
– $n(1 - p) \ge 5$

where $\mu_{\bar{p}} = p$ and $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}$
(where p = population proportion)

[Figure: Sampling Distribution, P($\bar{p}$) over $\bar{p}$ from 0 to 1]

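The two conditions and the mean and standard deviation of the sampling distribution can be sketched as small helpers. The function names are hypothetical. The values p = 0.4 and n = 200 are borrowed from the voter example later in the slides, while n = 10 and p = 0.1 are made up to show a case where the approximation is not appropriate.

```python
import math


def normal_approx_ok(n, p):
    """Check the rule of thumb: np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def proportion_sampling_dist(n, p):
    """Return (mean, standard deviation) of the sample proportion."""
    return p, math.sqrt(p * (1 - p) / n)


print(normal_approx_ok(200, 0.4))   # np = 80 and n(1 - p) = 120
print(normal_approx_ok(10, 0.1))    # np = 1, too small

mean_p, sd_p = proportion_sampling_dist(200, 0.4)
print(f"mean = {mean_p}, sd = {sd_p:.5f}")
```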
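z-Value for Proportions
Standardize $\bar{p}$ to a z value with the formula:

$Z = \frac{\bar{p} - p}{\sigma_{\bar{p}}} = \frac{\bar{p} - p}{\sqrt{\frac{p(1 - p)}{n}}}$

• If sampling is without replacement and
n is greater than 5% of the
population size, then $\sigma_{\bar{p}}$ must
use the finite population correction factor:

$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$
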
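A possible implementation of the z-value for a proportion, with the finite population correction applied only when a population size is supplied and the sample exceeds 5% of it. The function name `proportion_z` and the convention that passing `N` signals sampling without replacement are assumptions of this sketch, as is the population size N = 1000 in the second call.

```python
import math


def proportion_z(p_bar, p, n, N=None):
    """z-value for a sample proportion.

    If N is given (assumed to mean sampling without replacement) and
    n exceeds 5% of N, the finite population correction is applied.
    """
    sigma_p = math.sqrt(p * (1 - p) / n)
    if N is not None and n > 0.05 * N:
        sigma_p *= math.sqrt((N - n) / (N - 1))
    return (p_bar - p) / sigma_p


z_plain = proportion_z(0.45, 0.40, 200)
z_fpc = proportion_z(0.45, 0.40, 200, N=1000)  # hypothetical N

print(f"z without correction = {z_plain:.2f}")
print(f"z with correction    = {z_fpc:.2f}")
```

The corrected z-value is larger because the correction factor reduces the standard deviation of the sample proportion.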
Example

• If the true proportion of voters who support
Proposition A is p = .4, what is the
probability that a sample of size 200 yields a
sample proportion between .40 and .45?

i.e.: if p = .4 and n = 200, what is
$P(.40 \le \bar{p} \le .45)$ ?

Example
(continued)
• if p = .4 and n = 200, what is
$P(.40 \le \bar{p} \le .45)$ ?

Find $\sigma_{\bar{p}}$:

$\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464$

Convert to standard normal:

$P(.40 \le \bar{p} \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$

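The slides stop at the standardized form of the probability. As a sketch of how the remaining step could be evaluated, the code below uses `statistics.NormalDist` for the standard normal distribution. It keeps the unrounded z-value, so the result may differ slightly from a table lookup at z = 1.44.

```python
import math
from statistics import NormalDist

p, n = 0.4, 200

# Standard deviation of the sample proportion.
sigma_p = math.sqrt(p * (1 - p) / n)

# Standardize the two bounds .40 and .45.
z_lower = (0.40 - p) / sigma_p
z_upper = (0.45 - p) / sigma_p

standard_normal = NormalDist()
prob = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)

print(f"sigma_p = {sigma_p:.5f}")
print(f"z range = [{z_lower:.2f}, {z_upper:.2f}]")
print(f"P(.40 <= p-bar <= .45) = {prob:.4f}")
```

The probability should come out at about 0.425, since P(Z ≤ 1.44) is roughly 0.925 and P(Z ≤ 0) is 0.5.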
