
Economics
Dr. Sauer
Ch 6: Sampling Distributions for Means and Proportions

Chapter Overview:
I. Sampling Distributions: Means
II. The Central Limit Theorem
III. The Normal Distribution
IV. Sampling Distributions: Proportions
V. Desirable Properties of Estimators
_________________________________________________________________________________
We can use sample statistics to infer things about the population parameters.

Sometimes the sample statistic (e.g. mean) will be close to the population parameter, sometimes it will
not.

Recall: Greek letters are used for the population, English letters are used for the corresponding sample
characteristic.
________________________________________________________________________________
Review:
Suppose our population consists of five numbers.
3, 1, 5, 6, 2

Calculate the population mean, variance, and standard deviation.
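
As a check on the hand calculation, here is a minimal Python sketch using the standard library. It assumes the five numbers are the whole population, so it uses the population formulas that divide by N rather than N - 1. The variable names are illustrative.

```python
from statistics import fmean, pstdev, pvariance

population = [3, 1, 5, 6, 2]

pop_mean = fmean(population)          # mu
pop_variance = pvariance(population)  # sigma^2, divides by N (not N - 1)
pop_std = pstdev(population)          # sigma, square root of the variance

print(f"mean = {pop_mean:.2f}")
print(f"variance = {pop_variance:.2f}")
print(f"standard deviation = {pop_std:.4f}")
```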









____________________________________________________________________
I. Sampling Distributions: Means
Suppose we want a sample of size 2. In the table, list all possible combinations of samples of size 2.
Repeat for a sample of size 3.



Now, calculate each sample's mean.

Sample of 2 Sample of 3


Notice that the sample means vary from the population mean.
Depending on the sample chosen, the sample mean could be a good estimate of the population mean, or
not.

Calculate the mean of all the sample means and the standard deviation of the sample means.





                      For sample of size 2:    For sample of size 3:
mean
standard deviation
___________________________________________________________________
The mean of all the sample means is the same as the population mean.
The standard deviation of all the sample means decreases as the sample size increases.
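
The two statements above can be checked by listing every sample. This is a minimal sketch, assuming samples are drawn without replacement and order does not matter (which gives the 10 samples of each size used in this handout). The helper name `sample_means` is illustrative.

```python
from itertools import combinations
from statistics import fmean, pstdev

population = [3, 1, 5, 6, 2]

def sample_means(data, n):
    """Means of every possible sample of size n, drawn without replacement."""
    return [fmean(sample) for sample in combinations(data, n)]

results = {}
for n in (2, 3):
    means = sample_means(population, n)
    # Store (mean of the sample means, standard deviation of the sample means).
    results[n] = (fmean(means), pstdev(means))
    print(f"n = {n}: {len(means)} samples, "
          f"mean of means = {results[n][0]:.2f}, "
          f"std dev of means = {results[n][1]:.4f}")
```

The mean of the sample means matches the population mean for both sample sizes, and the standard deviation of the sample means is smaller for n = 3 than for n = 2.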

The standard deviation of all the sample means is called the standard error of the mean.

It can be calculated directly from the samples (as we just did) or by using the formula when the
population standard deviation is known:
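
One hedged way to see how a formula relates to the direct calculation: the sketch below compares the directly computed standard error with $\sigma/\sqrt{n}$ and with the finite-population-corrected version $\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$, which applies when sampling without replacement from a small population such as our five numbers. The function names are illustrative assumptions, not notation from this handout.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pstdev

population = [3, 1, 5, 6, 2]
N = len(population)
sigma = pstdev(population)  # population standard deviation

def direct_standard_error(data, n):
    """Standard deviation of the means of all samples of size n."""
    return pstdev([fmean(s) for s in combinations(data, n)])

def formula_standard_error(sigma, n, N=None):
    """sigma / sqrt(n), times the finite population correction when N is given."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

for n in (2, 3):
    print(f"n = {n}: direct = {direct_standard_error(population, n):.4f}, "
          f"sigma/sqrt(n) = {formula_standard_error(sigma, n):.4f}, "
          f"with correction = {formula_standard_error(sigma, n, N):.4f}")
```

For this tiny population the uncorrected $\sigma/\sqrt{n}$ overstates the directly computed value; the corrected version agrees with it. When the population is large relative to the sample, the correction factor is close to 1.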









The difference between the population mean and its point estimate is called the sampling error.

If point estimates are the same as the population parameters, there is no sampling error and the standard
error is zero.

A probability distribution is a list of every possible outcome with the corresponding probability.
______________________________________________________________________
For our example, there are 10 possible samples of size 2. The probability of each sample being selected
is 0.10.

Plot the probability distribution of our sample means.
Step 1: Construct a frequency distribution table.
- 3 intervals is probably appropriate
- 1.5 to less than 3, 3 to less than 4.5, 4.5 to less than 6


Interval          Frequency
1.5 ≤ x < 3
3 ≤ x < 4.5
4.5 ≤ x < 6


Step 2: Calculate the relative frequencies.
relative frequency = (probability of a particular sample) × frequency

Step 3: Plot the probability histogram.


Notice, even for this very small population and sample size, the probability distribution is tending
toward the bell shape of the Normal Distribution.
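
Steps 1 and 2 can be sketched in Python as a check on the hand-built table. This assumes the intervals are closed on the left and open on the right ("1.5 to less than 3") and that each of the 10 samples is equally likely. The variable names are illustrative.

```python
from itertools import combinations
from statistics import fmean

population = [3, 1, 5, 6, 2]
means = [fmean(s) for s in combinations(population, 2)]

# Each interval includes its lower limit and excludes its upper limit.
intervals = [(1.5, 3.0), (3.0, 4.5), (4.5, 6.0)]
prob_each_sample = 1 / len(means)  # 10 equally likely samples

# Step 1: count the sample means falling in each interval.
frequencies = [sum(lo <= m < hi for m in means) for lo, hi in intervals]

# Step 2: relative frequency = probability of a particular sample x frequency.
relative_frequencies = [f * prob_each_sample for f in frequencies]

for (lo, hi), f, rf in zip(intervals, frequencies, relative_frequencies):
    print(f"{lo} <= mean < {hi}: frequency = {f}, relative frequency = {rf:.1f}")
```

The relative frequencies are the bar heights for the probability histogram in Step 3, and they sum to 1.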

II. The Central Limit Theorem
The Central Limit Theorem says that the probability distribution of the sample means
- for samples of size 30 or greater
- selected from any population whose mean and variance are known
- approaches a Normal distribution
- with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.


$\bar{x}$ is approximately Normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$: the distribution of sample means for sample sizes of n > 30.

In addition, the Central Limit Theorem applies for small samples from Normal populations, when the
population variance is known.


$\bar{x}$ is Normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ for samples of any size from a Normal distribution with known variance.



The Central Limit Theorem allows us to calculate
- probabilities regarding sample means
- the limits that contain various percentages of sample means
(later it will also help us construct confidence intervals)


III. The Normal probability distribution
It has long been recognized that large numbers of measurements, when sorted and plotted in a
histogram, tend to look like a bell-shaped form.

This bell-shaped curve is the Normal probability distribution curve.

[Figure: Distribution of Sample Means. Vertical axis: relative frequency (0 to 0.6). Horizontal axis: sample mean, in the intervals 1.5 ≤ x < 3, 3 ≤ x < 4.5, 4.5 ≤ x < 6.]

Formula:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$


This formula traces out a bell curve, symmetrical around the mean, $\mu$.

The area under the curve sums to 1.
- true of any probability distribution



The probability that a random variable, X, has a value between x = a and x = b is given by the area
under the curve between x = a and x = b:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

where f(x) is the height of the Normal curve at x.

However, we don't actually need to do the integration because the Normal curve has some special
characteristics that let us find the area from a single table.

Special properties of the Normal distribution:
1. Total area under the curve is one. (true of any probability distribution)

2. The curve is symmetrical about the mean.
- the area to the left of the mean is 0.5
- the area to the right of the mean is 0.5

3. The area under the curve between the mean and any point depends on the number of standard
deviations between the point and the mean.
- the Z-score is the number of standard deviations between the point and the mean


The area between the mean and a point which is one standard deviation from the mean is 0.3413.
68.26% of the total area is within one standard deviation

The area between the mean and a point which is two standard deviations from the mean is 0.4772.
95.44% of the total area is within two standard deviations

The area between the mean and a point which is three standard deviations from the mean is 0.4987.
About 99.73% of the total area is within three standard deviations
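
These table areas can be reproduced numerically. The sketch below is a minimal check using Python's `statistics.NormalDist`; the helper name `area_between_mean_and` is illustrative. Expect small differences in the last decimal place, because printed tables are rounded to four places before being doubled.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_between_mean_and(z):
    """Area under the standard Normal curve between the mean (Z = 0) and Z = z."""
    return standard_normal.cdf(z) - 0.5

for z in (1, 2, 3):
    one_side = area_between_mean_and(z)
    # By symmetry, the area within z standard deviations is twice the one-sided area.
    print(f"Z = {z}: mean to Z = {one_side:.4f}, within +/- Z = {2 * one_side:.2%}")
```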






The Z-score is calculated as

$$Z = \frac{x - \mu}{\sigma}$$

Z is the number of standard deviations between the point (x) and the mean.

Calculate Z to two decimal places.

Once you have Z, use a Normal probability distribution table to find the area under the curve.
_________________________________________________________________________________
Example: Suppose the time it takes to process an email inquiry is normally distributed with a mean time
of 500 seconds and a standard deviation of 10 seconds. What is the probability that a selected email will
be processed in more than 505 seconds?

Step 1: Sketch the curve and indicate relevant information.







Step 2: Calculate Z.






Step 3: Look up in table.

Final answer:






_____________________________________________________________________
What if instead we wanted to know the probability that processing an email will take less than 485
seconds?
Step 1: Sketch the curve and indicate relevant information.









Step 2: Calculate Z.








Step 3: Look up in table.

Final answer:


_____________________________________________________________________
What if instead we wanted to know the probability that processing an email will take between 485 and
505 seconds?
Step 1: Sketch the curve and indicate relevant information.









Step 2: Calculate Z.






Step 3: Look up in table.

Final answer:
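
The three email questions above can be checked numerically once the table work is done. This is a minimal sketch, assuming processing time is exactly Normal with mean 500 and standard deviation 10 as stated; it uses the cumulative area directly instead of a printed table, so answers may differ from table answers in the fourth decimal place. The names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 500, 10
processing_time = NormalDist(mu=mu, sigma=sigma)

def z_score(x):
    """Number of standard deviations between x and the mean."""
    return (x - mu) / sigma

# cdf(x) is the area under the curve to the left of x.
p_more_than_505 = 1 - processing_time.cdf(505)
p_less_than_485 = processing_time.cdf(485)
p_between_485_and_505 = processing_time.cdf(505) - processing_time.cdf(485)

print(f"Z(505) = {z_score(505):.2f}, Z(485) = {z_score(485):.2f}")
print(f"P(X > 505) = {p_more_than_505:.4f}")
print(f"P(X < 485) = {p_less_than_485:.4f}")
print(f"P(485 < X < 505) = {p_between_485_and_505:.4f}")
```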


______________________________________________________________________
Example: An importer of Herbs and Spices claims that the average weight of packets of saffron is 20
grams. However, packets are actually filled to an average weight of 19.5 grams with a standard
deviation of 1.8 grams. A random sample of 36 packets is selected. Find the probability that the average
weight is 20 grams or more.

In this example we are dealing with a sample of size n > 30. We'll apply the CLT and calculate the
mean and standard error of the distribution of means.







Step 1: Sketch the curve and indicate relevant information.


Step 2: Calculate Z.




Step 3: Look up in table.

Final answer:
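
A minimal sketch for checking the saffron example, assuming the Central Limit Theorem applies (n = 36) so that the sample mean is approximately Normal with mean 19.5 and standard error $\sigma/\sqrt{n}$. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 19.5     # actual average fill weight (grams)
sigma = 1.8   # population standard deviation (grams)
n = 36        # sample size

standard_error = sigma / sqrt(n)  # standard deviation of the sample means
sampling_dist = NormalDist(mu=mu, sigma=standard_error)

z = (20 - mu) / standard_error
p_mean_at_least_20 = 1 - sampling_dist.cdf(20)  # area to the right of 20

print(f"standard error = {standard_error:.2f}")
print(f"Z = {z:.2f}")
print(f"P(sample mean >= 20) = {p_mean_at_least_20:.4f}")
```

Note that Z uses the standard error, not the population standard deviation, because the question is about a sample average.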

______________________________________________________________________
Instead, let's find the lower and upper limits within which the weights of 95% of all packets fall.
In this case, we are dealing with the population, not the sample. Use the population mean and standard
deviation.

Step 1: Sketch the curve and indicate relevant information.





Step 2: Look up the Z that corresponds to a tail area of:

Step 3: Find the upper and lower limits.



Final answer:

____________________________________________________________________
Instead, let's calculate the two limits within which 95% of all average weights fall.
Now we are dealing with the sample of n = 36.
The methodology is the same as when we use the entire population, except we'll use the standard error
of the means instead of the standard deviation for the population.

Step 1: Sketch the curve and indicate relevant information.



Step 2: Look up the Z that corresponds to a tail area of:

Step 3: Find the upper and lower limits.




Final answer:
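
Both sets of 95% limits, for individual packets and for sample averages, follow the same pattern: mean ± Z × spread. The sketch below is a hedged check that assumes individual packet weights are Normally distributed and looks up Z numerically rather than from a table. The helper name `limits` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 19.5, 1.8, 36

# The central 95% leaves 2.5% in each tail, so find the Z with 97.5% of the area below it.
z = NormalDist().inv_cdf(0.975)

def limits(center, spread, z_value):
    """Lower and upper limits: center -/+ z_value * spread."""
    return center - z_value * spread, center + z_value * spread

packet_limits = limits(mu, sigma, z)          # individual packets: population std dev
mean_limits = limits(mu, sigma / sqrt(n), z)  # sample means: standard error

print(f"Z = {z:.2f}")
print(f"95% of packets:      {packet_limits[0]:.2f} to {packet_limits[1]:.2f} grams")
print(f"95% of sample means: {mean_limits[0]:.2f} to {mean_limits[1]:.2f} grams")
```

The limits for sample means are much narrower because the standard error is the population standard deviation divided by $\sqrt{36} = 6$.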

____________________________________________________________________

IV. Sampling Distribution of Proportions

A proportion is the number of elements with a given characteristic divided by the total number of
elements in the group.
ex: The proportion of people who vote in an election is the number who vote divided by the
number eligible to vote.

X or x is the number of elements with a given characteristic.

Proportions are often quoted as percentages.

The sample proportion is a point estimate of the population proportion.

_____________________________________________________________________


Example: Suppose we have the following population of data.
3, 1, 5, 6, 2

Calculate the population proportion of even numbers.



Referring back to our samples of size 2 and 3, calculate the sample proportion of even numbers.



Calculate the mean of all sample proportions for each sample size.
sample of 2:

sample of 3:


The standard deviation of all the sample proportions decreases as the sample size increases.

The standard error of all sample proportions is given by





For our samples of size 2:


For our samples of size 3:


The list of every possible sample proportion with its probability is called the sampling distribution of
proportions.
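
The proportion calculations in this section can be checked by listing every sample, just as for the means. This is a minimal sketch, assuming samples are drawn without replacement from the five-number population; because the population is so small, the formula value shown includes the finite population correction $\sqrt{\frac{N-n}{N-1}}$. The names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pstdev

population = [3, 1, 5, 6, 2]
N = len(population)

def proportion_even(values):
    """Share of the values that are even numbers."""
    return sum(v % 2 == 0 for v in values) / len(values)

pop_proportion = proportion_even(population)

summary = {}
for n in (2, 3):
    props = [proportion_even(s) for s in combinations(population, n)]
    # Formula: sqrt(p(1 - p)/n) with the finite population correction.
    formula_se = sqrt(pop_proportion * (1 - pop_proportion) / n) * sqrt((N - n) / (N - 1))
    summary[n] = (fmean(props), pstdev(props), formula_se)
    print(f"n = {n}: mean of proportions = {summary[n][0]:.2f}, "
          f"std dev = {summary[n][1]:.2f}, formula = {summary[n][2]:.2f}")
```

The mean of the sample proportions equals the population proportion for both sample sizes, and the standard deviation of the sample proportions is smaller for n = 3.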





Sample of 2    Sample Proportion    Sample of 3    Sample Proportion
3,1                                 3,1,5
3,5                                 3,1,6
3,6                                 3,1,2
3,2                                 3,5,6
1,5                                 3,5,2
1,6                                 3,6,2
1,2                                 1,5,6
5,6                                 1,5,2
5,2                                 1,6,2
6,2                                 5,6,2


Plot the probability distribution of our proportions for the samples of size 2.

Step 1: Construct a frequency distribution table.
- we only have 3 values for p (0, 0.5, 1)

p      Frequency
0
0.5
1

Step 2: Calculate the relative frequency distribution. (probability distribution)
relative frequency = (probability of a particular sample) × frequency

Step 3: Plot the probability histogram.


Notice, even for this very small population and sample size, the probability distribution is tending
toward the bell shape of the Normal Distribution.


For samples of size 30 or greater the distribution of sample proportions is approximately Normal with
mean $\mu_p = \pi$ and standard deviation $\sigma_p = \sqrt{\pi(1-\pi)/n}$, where $\pi$ is the population proportion.



p is approximately Normal with mean $\mu_p$ and standard deviation $\sigma_p$: the distribution of sample proportions for sample sizes of n > 30.

______________________________________________________________________
Example: In a certain neighborhood, it is known that 12% of people age 16 to 24 are unemployed. If a
random sample of 150 people age 16 to 24 is selected, what is the probability that the sample contains at
most 10% unemployed?
Step 1: Calculate $\mu_p$ and $\sigma_p$.










Step 2: Sketch the curve and indicate relevant information.


Step 3: Calculate Z




Step 4: Look up Z in the table.

Final answer:


_______________________________________________________________________
Instead, let's calculate the probability that the sample contains at most 25 unemployed people.

Step 1: Convert the number into a proportion.



Step 2: Calculate $\mu_p$ and $\sigma_p$.



Step 3: Sketch the curve and indicate relevant information.

Step 4: Calculate Z




Step 5: Look up Z in the table.

Final answer:
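
Both unemployment questions can be checked with the same Normal approximation. This is a minimal sketch, assuming the sample proportion is approximately Normal with mean equal to the population proportion (0.12) and standard error $\sqrt{0.12(1-0.12)/150}$, with no continuity correction. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pop_proportion = 0.12  # known share unemployed in the population
n = 150                # sample size

mu_p = pop_proportion
sigma_p = sqrt(pop_proportion * (1 - pop_proportion) / n)  # standard error
sampling_dist = NormalDist(mu=mu_p, sigma=sigma_p)

# Question 1: at most 10% unemployed in the sample.
z_10_percent = (0.10 - mu_p) / sigma_p
p_at_most_10_percent = sampling_dist.cdf(0.10)

# Question 2: at most 25 unemployed people; first convert the count to a proportion.
proportion_25 = 25 / n
z_25_people = (proportion_25 - mu_p) / sigma_p
p_at_most_25_people = sampling_dist.cdf(proportion_25)

print(f"standard error = {sigma_p:.4f}")
print(f"Z = {z_10_percent:.2f}, P(p <= 0.10) = {p_at_most_10_percent:.4f}")
print(f"Z = {z_25_people:.2f}, P(at most 25 people) = {p_at_most_25_people:.4f}")
```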

_____________________________________________________________________


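Many times the value of the population proportion is unknown. We can then approximate the mean and
standard error of the proportions by using the sample proportion p in place of the population proportion:

$\mu_p \approx p$ and $\sigma_p \approx \sqrt{p(1-p)/n}$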




V. Some desirable properties of estimators
1. Estimators should be unbiased (accurate).

An estimator is unbiased if the average value of all the point estimates is equal to the population
parameter being estimated.

To prove that $\bar{x}$ is an unbiased estimator of $\mu$, we would need to show that the expected value of the
sample mean is equal to the population mean:
$E(\bar{x}) = \mu$.

2. The values of sample statistics vary around the population parameter. It is desirable to keep this
variance at a minimum (minimum variance, precise).

An estimator is precise when the values of the estimates are close to one another.


_____________________________________________________________________
Concepts:
- Central Limit Theorem
- Normal distribution
- desirable properties of estimators

Skills:
For both means and proportions:
- calculate the mean of all the sample means (proportions) and the standard
deviation of the sample means (proportions)

- construct a probability distribution table

- calculate the probability of an event
