Review - Lecture 6: - Normaldistribution - Normal Approximation To The Binomial and Poisson Distributions

Review Lecture 6
Normal Distribution
Normal Approximation to the Binomial and Poisson
Distributions
MTH410-S16- Lecture 07
1/103
Lecture 7
Data Collection and Sampling
The Central Limit Theorem
Introduction to Estimation
Point Estimation (For Mean)
Interval Estimation Confidence Intervals
(For mean, standard deviation known)
Determining Sample Sizes for given
confidence levels
2/103
Introduction & Recap

Statistics is a way to get information from data.
[Diagram: Data -> Statistics -> Information]
But where do data come from? How do we ensure they are
accurate? Are the data reliable? Are they representative of
the population from which they were drawn? This lecture
explores some of these issues.
3/103
Introduction & Recap(Contd)

Parameter
A descriptive measure of a population.

Statistic
A descriptive measure of a sample.
Because populations tend to be very large, most population
parameters are not only unknown but also unknowable.
We can only use statistical inference to obtain an estimate,
and only if we are willing to accept less than 100% accuracy.
Instead of investigating the entire population, we choose to
study a sample.
4/103
Methods of Collecting Data

There are many methods used to collect or obtain data for
statistical analysis. Three of the most popular methods are:
Direct Observation
Experiments
Surveys.
5/103
Observational and experimental studies

An experimental study is one in which measurements
representing a variable of interest are observed and
recorded, while controlling factors that might influence
their values.
An observational study is one in which measurements
representing a variable of interest are observed and
recorded, without controlling any factor that might
influence their values.
6/103
Surveys
A survey solicits information from people; e.g. Gallup polls;
pre-election polls; marketing surveys.
The Response Rate (i.e. the proportion of all people selected
who complete the survey) is a key survey parameter. A low
response rate can destroy the validity of any conclusion
resulting from this survey.
7/103
Surveys(Contd)
Surveys may be administered in a variety of ways, e.g.
Personal Interview: higher response rate, fewer incorrect
responses due to misunderstanding, but expensive;
Telephone Interview: less expensive, but less personal
and lower expected response rate;
Self-Administered Survey (which is usually mailed to a
sample of people): inexpensive, but with a lower response rate
and a relatively high rate of misunderstanding.
8/103
Questionnaire Design
Over the years, a lot of thought has been put into the science
of the design of survey questions. Key design principle:
KISS (Keep It Simple, Stupid)
E.g. Keep the questionnaire as short as possible,
Ask short, simple, and clearly worded questions,
Start with simple questions,
Use Yes-No or multiple choice questions,
Avoid leading questions,
Make it easy to analyze & present the collected data
9/103
Sampling
Recall: statistical inference permits us to draw conclusions
about a population based on a sample.
Rationale: a) cost (less expensive to sample 1,000 television
viewers than 100 million TV viewers), and
b) practicality (e.g. performing a crash test on every vehicle
produced is impractical).
The sampled population and the target population should

always be similar to each other.
10/103
Self-selected Samples
are almost always biased, because the individuals who
participate in them are most likely more interested in this
issue than other members of the population.
E.g. Radio and television stations always ask people to call
and give their opinion on an issue of interest.
However, only listeners who are concerned about this topic
and have enough patience to get through to the station will
be included in the sample.
That means the sampled population differs from the
target population, so the conclusions drawn from such
surveys are frequently wrong.
11/103
Sampling Plans
A sampling plan is just a method or procedure for
specifying how a sample will be taken from a population.
We will focus our attention on these three methods:
Simple Random Sampling,
Stratified Random Sampling, and
Cluster Sampling.
12/103
Simple Random Sampling

A simple random sample is a sample selected in such a way
that every possible sample of the same size is equally likely
to be chosen.
E.g. Drawing three names from a hat containing all the
names of the students in the class is an example of a simple
random sample: any group of three names is equally likely
to be picked.
Normally, we assign each element of the entire population a
unique number, and then select sample numbers at
random. We may use a random number table or Excel.
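The numbering-and-drawing procedure can also be sketched in Python instead of a random number table or Excel. This is only an illustration: the roster of 40 numbered students and the seed are assumptions, not part of the lecture.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical class roster: each student is assigned a unique number 1..40.
roster = list(range(1, 41))
chosen = simple_random_sample(roster, 3, seed=7)
print(chosen)
```

`random.sample` draws without replacement, which matches pulling names from a hat.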
13/103
Stratified Random Sampling(Contd)

A stratified random sample is obtained by separating the
population into mutually exclusive sets, or strata, and then
drawing simple random samples from each stratum.
Strata 1 (gender): male, female
Strata 2 (age): < 20, 20-30, 31-40, 41-50, 51-60, > 60
Strata 3 (occupation): professional, white collar, blue collar, other
Besides acquiring information about the total population, we can
also make inferences within a stratum, or make comparisons across strata.
14/103
Stratified Random Sampling(Contd)

After the population has been stratified, we can use simple
random sampling to generate the complete sample:
If we plan to sample 400 people in total, we'd draw 100 (= 400 × .25)
of them from the low income group;
if we are sampling 1000 people, we'd draw 50 (= 1000 × .05)
of them from the high income group.
15/103
Cluster Sampling
A cluster sample is a simple random sample of groups or
clusters of elements (vs. a simple random sample of
individual objects).
This method is useful when it is difficult or costly to
develop a complete list of the population members or when
the population elements are widely dispersed geographically.
Cluster sampling may increase sampling error due to

similarities among cluster members.
16/103
Sample Size
Numerical techniques for determining sample sizes will
be described later, but at least we can say that the larger
the sample size is, the more accurate we can expect the
sample estimates to be.
17/103
Sampling and Non-Sampling Errors

Two major types of errors can arise when a sample of
observations is taken from a population:
Sampling error refers to differences between the sample and
the population that exist only because of the observations
that happened to be selected for the sample.
Nonsampling errors are more serious and are due to
mistakes made in the acquisition of data, or non-response
error or due to the sample observations being selected
improperly.
18/103
Sampling Error
Sampling error refers to differences between the sample
and the population, because of the specific observations that
happen to be selected.
Sampling error is expected to occur when making a
statement about the population based on the sample taken.
Increasing the sample size will reduce this type of error.
19/103
Nonsampling Error
Nonsampling errors are more serious and are due to
mistakes made in the acquisition of data, or non-response
error, or due to the sample observations being selected
improperly.
Three types of nonsampling errors:

Errors in data acquisition,
Nonresponse errors, and
Selection bias.
Note: increasing the sample size will not reduce this type of
error.
20/103
Sampling Distributions
21/103
Agenda
Sampling Distribution of the Mean
Sampling Distribution of a Proportion
Sampling Distribution of the Difference Between
Two Means
22/103
Introduction
In real life, calculating parameters of populations is
prohibitive because populations are very large.
Rather than investigating the whole population, we take a
sample, calculate a statistic related to the parameter of
interest, and make an inference.
The sampling distribution of the statistic is the tool that
tells us how close the statistic is to the parameter.
23/103
Sampling Distributions
A sampling distribution is created by, as the name suggests,
sampling.
The method we will employ relies on the rules of probability and
the laws of expected value and variance to derive the
sampling distribution.
24/103

An Example
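A fair die is thrown infinitely many times,
with the random variable X = number of spots on any throw.
The probability distribution of X is:
x      1    2    3    4    5    6
P(x)   1/6  1/6  1/6  1/6  1/6  1/6
and the mean and variance are calculated as well:
\[ \mu = \sum x P(x) = 1(1/6) + 2(1/6) + \dots + 6(1/6) = 3.5 \]
\[ \sigma^2 = \sum (x - \mu)^2 P(x) = (1 - 3.5)^2 (1/6) + \dots + (6 - 3.5)^2 (1/6) = 2.92 \]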
25/103
Throwing a Die Twice: Sample Mean

Suppose we want to estimate \( \mu \) from the mean \( \bar{x} \) of a
sample of size n = 2.
What is the distribution of \( \bar{x} \)?
26/103
Sampling Distribution of Two Dice

A sampling distribution is created by looking at
all samples of size n = 2 (i.e. two dice) and their means.
While there are 36 possible samples of size 2, there are only
11 values for \( \bar{x} \), and some (e.g. \( \bar{x} = 3.5 \)) occur more
frequently than others (e.g. \( \bar{x} = 1 \)).
27/103
Sample       Mean    Sample       Mean    Sample       Mean
 1   1,1     1.0     13   3,1     2.0     25   5,1     3.0
 2   1,2     1.5     14   3,2     2.5     26   5,2     3.5
 3   1,3     2.0     15   3,3     3.0     27   5,3     4.0
 4   1,4     2.5     16   3,4     3.5     28   5,4     4.5
 5   1,5     3.0     17   3,5     4.0     29   5,5     5.0
 6   1,6     3.5     18   3,6     4.5     30   5,6     5.5
 7   2,1     1.5     19   4,1     2.5     31   6,1     3.5
 8   2,2     2.0     20   4,2     3.0     32   6,2     4.0
 9   2,3     2.5     21   4,3     3.5     33   6,3     4.5
10   2,4     3.0     22   4,4     4.0     34   6,4     5.0
11   2,5     3.5     23   4,5     4.5     35   6,5     5.5
12   2,6     4.0     24   4,6     5.0     36   6,6     6.0
The distribution of \( \bar{x} \) when n = 2

x-bar      1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
P(x-bar)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
\[ E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5 \]
\[ V(\bar{x}) = (1.0 - 3.5)^2 (1/36) + (1.5 - 3.5)^2 (2/36) + \dots = 1.46 \]
Note: \( \mu_{\bar{x}} = \mu_x \) and \( \sigma^2_{\bar{x}} = \sigma^2_x / 2 \)
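The two-dice sampling distribution can be checked by brute-force enumeration. The sketch below uses exact fractions; the variable names are illustrative choices, not notation from the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)
samples = list(product(faces, repeat=2))  # all 36 ordered samples of size 2

# Count how often each sample mean occurs, then convert counts to probabilities.
counts = Counter(Fraction(a + b, 2) for a, b in samples)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mean_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum((m - mean_xbar) ** 2 * p for m, p in dist.items())

# Population (single die) mean and variance for comparison.
pop_mean = Fraction(sum(faces), 6)
pop_var = sum((x - pop_mean) ** 2 for x in faces) / 6

print(float(mean_xbar), float(var_xbar), float(pop_var))
```

The enumeration gives a mean of 7/2 and a variance of 35/24 (about 1.46), which is exactly half the single-die variance of 35/12.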
28/103
Sampling Distribution of the Mean(Contd)
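n = 5:   \( \mu_{\bar{x}} = 3.5 \),  \( \sigma^2_{\bar{x}} = .5833 \; (= \sigma^2_x / 5) \)
n = 10:  \( \mu_{\bar{x}} = 3.5 \),  \( \sigma^2_{\bar{x}} = .2917 \; (= \sigma^2_x / 10) \)
n = 25:  \( \mu_{\bar{x}} = 3.5 \),  \( \sigma^2_{\bar{x}} = .1167 \; (= \sigma^2_x / 25) \)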
29/103

Notice that \( \sigma^2_{\bar{x}} \) is smaller than \( \sigma^2_x \); it is equal to \( \sigma^2_x / n \).
The larger the sample size, the smaller \( \sigma^2_{\bar{x}} \).
Therefore, \( \bar{x} \) tends to fall closer to \( \mu \) as the sample
size increases.
30/103

Demonstration: The variance of the sample mean is smaller
than the variance of the population.
Let us take samples of two observations.
[Figure: a population with the values 1, 2, and 3; samples of two observations give sample means of 1.5, 2, and 2.5.]
Compare the variability of the population to the variability of the sample mean.
31/103

Also, notice
Expected value of the population=(1+2+3)/3 = 2
Expected value of the sample mean=(1.5+2+2.5)/3 = 2
32/103
Sampling Distribution of the Mean(Sum)

1. \( \mu_{\bar{x}} = \mu_x \)
2. \( \sigma^2_{\bar{x}} = \sigma^2_x / n \)
3. If X is normal, \( \bar{x} \) is normal. If X is nonnormal, \( \bar{x} \) is
approximately normal for sufficiently large sample sizes.
Note: the definition of "sufficiently large" depends on the
extent of nonnormality of X (e.g. heavily skewed;
multimodal).
33/103
Opening Example
Example: Dean's claim: The average weekly income of
B.B.A. graduates one year after graduation is $800. Suppose
the distribution of weekly income has a standard
deviation of $100.
What is the probability that 25 randomly selected graduates
have an average weekly income of less than $750?
Solution
\[ P(\bar{x} < 750) = P\left( \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < \frac{750 - 800}{100/\sqrt{25}} \right) = P(z < -2.5) = 0.0062 \]
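The same calculation can be reproduced without a z table. The sketch below relies on `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 100, 25      # claimed mean, population sd, sample size
std_error = sigma / sqrt(n)      # standard deviation of the sample mean: 100 / 5 = 20
z = (750 - mu) / std_error       # standardized value of a sample mean of 750
p = NormalDist().cdf(z)          # P(Z < z) under the standard normal

print(f"z = {z:.2f}, P(x-bar < 750) = {p:.4f}")
```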
34/103
Opening Example(Contd)
Example continued
If a random sample of 25 graduates actually had an average weekly
income of $750, what would you conclude about the validity of the
claim that the average weekly income is 800?
Solution
With \( \mu = 800 \) the probability of observing a sample
mean as low as 750 is very small (0.0062). The claim
that the mean weekly income is $800 is probably
unjustified.
It will be more reasonable to assume that \( \mu \) is smaller
than $800, because then a sample mean of $750 becomes
more probable.
35/103
Standardizing the Sample Mean

The sampling distribution can be used to make inferences
about population parameters. In order to do so, the sample
mean can be standardized to the standard normal distribution
using the following formulation:
\[ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
36/103
Using Sampling Distributions for Inference

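To make inferences about population parameters we use sampling
distributions.
The symmetry of the normal distribution along with the sampling
distribution of the mean lead to (with \( z_{.025} = 1.96 \)):
\[ P(-1.96 \le z \le 1.96) = .95, \quad \text{or} \quad P\left( -1.96 \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le 1.96 \right) = .95 \]
This can be written as
\[ P\left( -1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le 1.96 \frac{\sigma}{\sqrt{n}} \right) = .95 \]
which becomes
\[ P\left( \mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}} \right) = .95 \]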
37/103

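[Figure: the standard normal distribution of Z, with area .95 between -1.96 and 1.96 and .025 in each tail, shown above the normal distribution of \( \bar{x} \) with \( \mu = 800 \), \( \sigma = 100 \), n = 25, where area .95 lies between \( 800 - 1.96 \cdot 100/\sqrt{25} \) and \( 800 + 1.96 \cdot 100/\sqrt{25} \).]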
38/103

\[ P\left( 800 - 1.96 \frac{100}{\sqrt{25}} \le \bar{x} \le 800 + 1.96 \frac{100}{\sqrt{25}} \right) = .95 \]
which reduces to \( P(760.8 \le \bar{x} \le 839.2) = .95 \)
Conclusion
There is a 95% chance that the sample mean falls within the
interval [760.8, 839.2] if the population mean is 800.
Since the sample mean was 750, the population mean is
probably not 800.
39/103
Generally
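\[ P\left( \mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \]
All are probability statements about \( \bar{X} \), which we'll use in
statistical inference.
In this formula, \( \alpha \) (Greek letter alpha) is the probability that
\( \bar{X} \) does not fall into the interval.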
40/103
Return to the Opening Example

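Replacing \( \mu = 800 \), \( \sigma = 100 \), n = 25, and \( \alpha = .05 \), we get
\[ P\left( \mu - z_{.025} \frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + z_{.025} \frac{\sigma}{\sqrt{n}} \right) = 1 - .05 \]
\[ P\left( 800 - 1.96 \frac{100}{\sqrt{25}} \le \bar{X} \le 800 + 1.96 \frac{100}{\sqrt{25}} \right) = .95 \]
\[ P(760.8 \le \bar{X} \le 839.2) = .95 \]
This is another way of checking the dean's claim. The probability that
\( \bar{X} \) falls between 760.8 and 839.2 is 95%. It is unlikely that we would
observe a sample mean as low as $750 when the population mean is
$800.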
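The general probability statement can be turned into a small helper that returns the interval in which the sample mean falls with probability \( 1 - \alpha \). The function name `xbar_interval` is a hypothetical choice for this sketch; the critical value comes from `NormalDist.inv_cdf` rather than a table, so it is 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def xbar_interval(mu, sigma, n, alpha):
    """Interval that contains the sample mean with probability 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width


low, high = xbar_interval(mu=800, sigma=100, n=25, alpha=0.05)
print(f"P({low:.1f} <= x-bar <= {high:.1f}) = 0.95")
```

A sample mean of 750 lies outside this interval, which is the same conclusion reached above.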
41/103
Using the Sampling Distribution for Inference

Changing the probability from .95 to .90 changes the probability
statement to
\[ P\left( \mu - 1.645 \frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.645 \frac{\sigma}{\sqrt{n}} \right) = .90 \]
42/103
Agenda
Sampling Distribution of a Proportion
43/103

The estimator of a population proportion of successes is the
sample proportion. That is, we count the number of
successes in a sample and compute:
\[ \hat{p} = \frac{X}{n} \]
(read this as p-hat).
X is the number of successes, n is the sample size.

Recall X is binomially distributed.
44/103
Mean, Variance & Standard Deviation of \( \hat{p} \)

By the laws of expected value and variance, we can
determine the mean, variance and standard deviation of \( \hat{p} \):
\[ E(\hat{p}) = p \]
\[ V(\hat{p}) = \sigma^2_{\hat{p}} = \frac{p(1 - p)}{n} \]
\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \]
Then the variable
\[ Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \]
is approximately standard normally distributed provided
that the sample size is large.
45/103
Example
In the last election an MP received 52% of the votes cast. One
year later, the MP organized a survey that asked a random
sample of 300 people whether they would vote for him in the
next election. If we assume his popularity has not changed,
what is the probability that more than half of the sample
would vote for him?
Solution: Here n = 300 and p = .52. We want to determine the
probability that the sample proportion is greater than 50%,
that is, we want to find \( P(\hat{p} > .50) \):
\[ P(\hat{p} > .50) = P\left( \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} > \frac{.50 - .52}{\sqrt{(.52)(.48)/300}} \right) = P(z > -.69) = .7549 \]
46/103
Example
Contd
The number of respondents who would vote for the representative
is a binomial random variable with n = 300 and p = .52.
We want to determine the probability that the sample proportion
is greater than 50%. That is, we want to find \( P(\hat{P} > .50) \).
We now know that the sample proportion \( \hat{P} \) is approximately
normally distributed with mean p = .52 and standard deviation
\[ \sqrt{p(1 - p)/n} = \sqrt{(.52)(1 - .52)/300} = .0288 \]
47/103
Example
Contd
Thus, we calculate
\[ P(\hat{P} > .50) = P\left( \frac{\hat{P} - p}{\sqrt{p(1 - p)/n}} > \frac{.50 - .52}{.0288} \right) = P(Z > -.69) = .7549 \]
If we assume that the level of support remains at 52%, the
probability that more than half the sample of 300 people
would vote for the representative is 75.49%.
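The normal approximation for the sample proportion can be sketched in a few lines. Because the code does not round z to two decimals before looking it up, it returns roughly 0.756 rather than the table value .7549; the difference is purely rounding.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.52
std_error = sqrt(p * (1 - p) / n)      # standard deviation of p-hat, about .0288
z = (0.50 - p) / std_error             # about -0.69
prob = 1 - NormalDist().cdf(z)         # P(p-hat > .50)

print(f"se = {std_error:.4f}, z = {z:.2f}, P(p-hat > .50) = {prob:.4f}")
```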
48/103
Agenda
Sampling Distribution of the Difference Between
Two Means
49/103
Difference Between Two Means

Independent samples are drawn from each of two normal
populations.
We're interested in the sampling distribution of the
difference between the two sample means, \( \bar{x}_1 - \bar{x}_2 \).
50/103
Difference Between Two Means(Contd)

The distribution of \( \bar{x}_1 - \bar{x}_2 \) is normal if
the two samples are independent, and
the parent populations are normally distributed.
If the two populations are not both normally distributed,
but the sample sizes are 30 or more, the distribution of
\( \bar{x}_1 - \bar{x}_2 \) is approximately normal.
51/103
Difference Between Two Means(Contd)
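Applying the laws of expected value and variance we have:
\[ E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 \]
\[ V(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
We can define:
\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]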
52/103
Example
Starting salaries for MBA grads at two universities are
normally distributed with the following means and standard
deviations. Samples from each school are taken.
               University 1    University 2
Mean           62,000 $/yr     60,000 $/yr
Std. Dev.      14,500 $/yr     18,300 $/yr
Sample size    50              60
What is the probability that the sample mean starting salary
of University #1 graduates will exceed that of the #2 grads?
53/103
Example
Contd
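What is the probability that the sample mean starting
salary of University #1 graduates will exceed that of the #2
grads?
We are interested in determining \( P(\bar{X}_1 > \bar{X}_2) \). Converting this
to a difference of means, what is \( P(\bar{X}_1 - \bar{X}_2 > 0) \)?
\[ P(\bar{X}_1 - \bar{X}_2 > 0) = P\left( Z > \frac{0 - (62{,}000 - 60{,}000)}{\sqrt{14{,}500^2/50 + 18{,}300^2/60}} \right) = P(Z > -.64) = .7389 \]
There is about a 74% chance that the sample mean
starting salary of U. #1 will exceed that of U. #2.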
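The 74% figure follows from standardizing the difference of the two sample means. A minimal sketch with the numbers from the table (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu1, sd1, n1 = 62_000, 14_500, 50   # University 1
mu2, sd2, n2 = 60_000, 18_300, 60   # University 2

# Standard deviation of the difference between the two sample means.
std_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
z = (0 - (mu1 - mu2)) / std_error
prob = 1 - NormalDist().cdf(z)      # P(x-bar1 - x-bar2 > 0)

print(f"se = {std_error:.1f}, z = {z:.2f}, probability = {prob:.4f}")
```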
54/103
The Central Limit Theorem(Contd)

If the population is normal, then \( \bar{X} \) is normally distributed
for ALL values of n.
If the population is non-normal, then \( \bar{X} \) is approximately
normal only for larger values of n.
In many practical situations, a sample size of 30 may be
sufficiently large to allow us to use the normal distribution
as an approximation for the sampling distribution of \( \bar{X} \).
63/103
The Central Limit Theorem

If the underlying distribution is close to a normal density curve,
then the approximation will be good even for a small n, whereas
if it is far from being normal, then a large n will be required.
Rule of Thumb
If n > 30, the Central Limit Theorem can be used.
There are population distributions for which even an n of 40 or
50 does not suffice, but such distributions are rarely encountered
in practice.
On the other hand, the rule of thumb is often conservative; for
many population distributions, an n much less than 30 would
suffice.
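A short simulation can illustrate the theorem for a clearly non-normal population. The choice of an exponential population with mean 1 (and variance 1), the seed, and the number of repetitions are all assumptions made for this sketch; they do not come from the lecture.

```python
import random
from statistics import fmean, pvariance


def sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from n independent draws."""
    rng = random.Random(seed)
    return [fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


def draw_exponential(rng):
    # Skewed population with mean 1 and variance 1 (assumed for illustration).
    return rng.expovariate(1.0)


means = sample_means(draw_exponential, n=30, reps=5000)
print(f"mean of sample means:     {fmean(means):.3f}  (population mean is 1)")
print(f"variance of sample means: {pvariance(means):.4f} (sigma^2 / n is {1 / 30:.4f})")
```

The sample means center on the population mean and their variance is close to \( \sigma^2 / n \), even though individual observations are strongly skewed.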
64/103
Example (a)
The amount of soda pop in each bottle is normally
distributed with a mean of 32.2 ounces and a standard
deviation of .3 ounces.
Find the probability that a bottle bought by a customer
will contain more than 32 ounces.
Solution
The random variable X is the amount of soda in a bottle.
\[ P(x > 32) = P\left( \frac{x - \mu}{\sigma_x} > \frac{32 - 32.2}{.3} \right) = P(z > -.67) = 0.7486 \]
[Figure: normal curve centered at \( \mu = 32.2 \); the area to the right of x = 32 is 0.7486.]
67/103
Example (b)
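Find the probability that a carton of four bottles will have
a mean of more than 32 ounces of soda per bottle.
Solution
Define the random variable as the mean amount of soda
per bottle.
\[ P(\bar{x} > 32) = P\left( \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.3/\sqrt{4}} \right) = P(z > -1.33) = 0.9082 \]
[Figure: the distribution of \( \bar{x} \) (area 0.9082 to the right of 32) compared with the distribution of x (area 0.7486 to the right of 32); both are centered at 32.2.]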
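Parts (a) and (b) differ only in the standard deviation used for standardizing. The sketch below computes both; since it does not round z to two decimals, the results (about 0.7475 and 0.9088) differ slightly from the table values 0.7486 and 0.9082.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3
std_normal = NormalDist()

# (a) one bottle: standardize with sigma itself.
p_bottle = 1 - std_normal.cdf((32 - mu) / sigma)

# (b) mean of a carton of four bottles: standardize with sigma / sqrt(n).
p_carton = 1 - std_normal.cdf((32 - mu) / (sigma / sqrt(4)))

print(f"single bottle: {p_bottle:.4f}, carton mean: {p_carton:.4f}")
```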
68/103
Example
The amount of a particular impurity in a batch of a certain
chemical product is a random variable with mean value 4.0 g and
standard deviation 1.5 g.
If 50 batches are independently prepared, what is the
(approximate) probability that the sample average amount of
impurity is between 3.5 and 3.8 g?
According to the rule of thumb to be stated shortly, n = 50 is

large enough for the CLT to be applicable.
69/103
Example
Contd
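\( \bar{X} \) then has approximately a normal distribution with mean
value \( \mu_{\bar{X}} = 4.0 \) and \( \sigma_{\bar{X}} = 1.5/\sqrt{50} = .2121 \),
so
\[ P(3.5 \le \bar{X} \le 3.8) \approx P\left( \frac{3.5 - 4.0}{.2121} \le Z \le \frac{3.8 - 4.0}{.2121} \right) = \Phi(-.94) - \Phi(-2.36) = .1645 \]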
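The impurity probability can be evaluated directly from the approximate normal distribution of the sample average. This is a sketch under the CLT approximation; without rounding z to two decimals the answer is about 0.164.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4.0, 1.5, 50
xbar_dist = NormalDist(mu, sigma / sqrt(n))   # approximate distribution of the sample average

prob = xbar_dist.cdf(3.8) - xbar_dist.cdf(3.5)
print(f"P(3.5 <= x-bar <= 3.8) is approximately {prob:.4f}")
```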
70/103
Statistical Inference
Statistical inference is the process by which we acquire information
and draw conclusions about populations from samples.
[Diagram: Data -> Statistics -> Information. A sample (statistic) is drawn from the population (parameter), and inference runs from the sample back to the population.]
In order to do inference, we require the skills and knowledge of

descriptive statistics, probability distributions, and sampling
distributions.
72/103
Estimation
There are two types of inference:
estimation and
hypothesis testing
estimation is introduced first.
The objective of estimation is to determine the approximate
value of a population parameter on the basis of a sample
statistic.
E.g., the sample mean (\( \bar{x} \)) is employed to estimate the
population mean (\( \mu \)).
We refer to the sample mean as the estimator of the population
mean. The computed value of the sample mean is called the estimate.
73/103
Estimation
The objective of estimation is to determine the
approximate value of a population parameter on the basis of
a sample statistic.
There are two types of estimators:
Point Estimator
Interval Estimator
74/103
Point Estimator
A point estimator draws inferences about a population by
estimating the value of an unknown parameter using a single
value or point.
We saw earlier that point probabilities in continuous
distributions were virtually zero. Likewise, we'd expect that
the point estimator gets closer to the parameter value with an
increased sample size, but point estimators don't reflect the
effects of larger sample sizes. Hence we will employ the
interval estimator to estimate population parameters.
75/103
Interval Estimator
An interval estimator draws inferences about a population
by estimating the value of an unknown parameter using an
interval.
That is we say (with some ___% certainty) that some

interval will include the population parameter of interest.
76/103
Point & Interval Estimation

For example, suppose we want to estimate the mean summer
income of a class of business students. For n = 25 students,
\( \bar{x} \) is calculated to be 400 $/week. This is a point estimate.
An alternative statement, an interval estimate, is:
The mean income is between 380 and 420 $/week.
77/103
Qualities of Estimators
Qualities desirable in estimators include unbiasedness,
consistency, and relative efficiency:
An unbiased estimator of a population parameter is an
estimator whose expected value is equal to that parameter.
An unbiased estimator is said to be consistent if the
difference between the estimator and the parameter grows
smaller as the sample size grows larger.
If there are two unbiased estimators of a parameter, the one
whose variance is smaller is said to be relatively efficient.
78/103
Unbiased Estimators
An unbiased estimator of a population parameter is an
estimator whose expected value is equal to that parameter.
E.g. the sample mean \( \bar{X} \) is an unbiased estimator of the
population mean \( \mu \), since:
\[ E(\bar{X}) = \mu \]
79/103
Consistency
An unbiased estimator is said to be consistent if the
difference between the estimator and the parameter grows
smaller as the sample size grows larger.
E.g. \( \bar{X} \) is a consistent estimator of \( \mu \) because:
\[ V(\bar{X}) = \sigma^2 / n \]
That is, as n grows larger, the variance of \( \bar{X} \) grows smaller.
80/103
Relative Efficiency
If there are two unbiased estimators of a parameter, the one
whose variance is smaller is said to be relatively efficient.
E.g. both the sample median and the sample mean are
unbiased estimators of the population mean; however, the
sample median has a greater variance than the sample mean,
so we choose \( \bar{X} \) since it is relatively efficient when
compared to the sample median.
81/103
Estimating \( \mu \) when \( \sigma \) is known
We can calculate an interval estimator from a sampling
distribution, by:
Drawing a sample of size n from the population
Calculating its mean, \( \bar{x} \)
And, by the central limit theorem, we know that \( \bar{X} \) is
normally (or approximately normally) distributed, so
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
will have a standard normal (or approximately standard normal)
distribution.
82/103
Estimating \( \mu \) when \( \sigma \) is known
Looking at this in more detail:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
Z: known, i.e. standard normal distribution
\( \bar{X} \): known, i.e. sample mean
\( \mu \): unknown, i.e. we want to estimate the population mean
\( \sigma \): known, i.e. it's assumed we know the population standard deviation
n: known, i.e. the number of items sampled
83/103
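Estimating \( \mu \) when \( \sigma \) is known
The confidence interval is
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
and the sample mean is in the center of the interval.
Thus, the probability that the interval
\[ \left[ \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \; \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right] \]
contains the population mean \( \mu \) is \( 1 - \alpha \). This is a
confidence interval estimator for \( \mu \).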
84/103
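Confidence Interval Estimator for \( \mu \)

The probability \( 1 - \alpha \) is called the confidence level.
Confidence interval estimator, usually represented with a plus/minus (\( \pm \)) sign:
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
Lower confidence limit (LCL): \( \bar{x} - z_{\alpha/2} \sigma/\sqrt{n} \)
Upper confidence limit (UCL): \( \bar{x} + z_{\alpha/2} \sigma/\sqrt{n} \)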
85/103
Graphically
Here is the confidence interval for \( \mu \).
[Figure: the interval drawn on a number line with its width marked.]
86/103
Graphically
the actual location of the population mean
may be here
or here
or possibly even here
The population mean is a fixed but unknown quantity. It's incorrect to interpret the
confidence interval estimate as a probability statement about \( \mu \). The interval acts as the
lower and upper limits of the interval estimate of the population mean.
87/103
Table: Four Commonly Used Confidence Levels and \( z_{\alpha/2} \)
1 - alpha   alpha   alpha/2   z_{alpha/2}
.90         .10     .05       z_{.05}  = 1.645
.95         .05     .025      z_{.025} = 1.96
.98         .02     .01       z_{.01}  = 2.33
.99         .01     .005      z_{.005} = 2.575
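The table entries can be regenerated from the inverse of the standard normal CDF. The helper name `z_critical` is a hypothetical choice; the computed values (1.645, 1.960, 2.326, 2.576) agree with the table up to its rounding.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a given confidence level 1 - alpha."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.2f}  {z_critical(level):.3f}")
```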
88/103
Example
A computer company samples demand during lead time over
25 time periods:
235  421  394  261  386
374  361  439  374  316
309  514  348  302  296
499  462  344  466  332
253  369  330  535  334
It is known that the standard deviation of demand over lead
time is 75 computers. We want to estimate the mean demand
over lead time with 95% confidence in order to set inventory
levels.
89/103
Example
contd
We want to estimate the mean demand over lead time with
95% confidence in order to set inventory levels.
IDENTIFY
Thus, the parameter to be estimated is the population mean \( \mu \),
and so our confidence interval estimator will be:
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
90/103
Example contd
CALCULATE
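In order to use our confidence interval estimator, we need the
following pieces of data:
\( \bar{x} = 370.16 \) (calculated from the data)
\( z_{\alpha/2} = z_{.025} = 1.96 \)
\( \sigma = 75 \) (given)
n = 25 (given)
therefore:
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 370.16 \pm 1.96 \frac{75}{\sqrt{25}} = 370.16 \pm 29.40 \]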
The lower and upper confidence limits are 340.76 and 399.56.
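The whole CALCULATE step can be reproduced from the 25 observations. The sketch below uses the table value z = 1.96 so that the limits match the slide; variable names are illustrative.

```python
from math import sqrt
from statistics import fmean

# Demand during lead time over 25 periods (data from the example).
demand = [
    235, 421, 394, 261, 386,
    374, 361, 439, 374, 316,
    309, 514, 348, 302, 296,
    499, 462, 344, 466, 332,
    253, 369, 330, 535, 334,
]
sigma = 75        # known population standard deviation
z = 1.96          # z_{.025} for 95% confidence

xbar = fmean(demand)
half_width = z * sigma / sqrt(len(demand))
lcl, ucl = xbar - half_width, xbar + half_width

print(f"x-bar = {xbar:.2f}, LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```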
91/103
Example contd
INTERPRET
The estimate for the mean demand during lead time lies
between 340.76 and 399.56; we can use this as input in
developing an inventory policy.
That is, we estimated that the mean demand during lead
time falls between 340.76 and 399.56, and this type of
estimator is correct 95% of the time. That also means that
5% of the time the estimator will be incorrect.
Incidentally, the media often refer to the 95% figure as 19
times out of 20, which emphasizes the long-run aspect of
the confidence level.
92/103
Interval Width
A wide interval provides little information.
For example, suppose we estimate with 95% confidence that
an accountant's average starting salary is between $15,000
and $100,000.
Contrast this with: a 95% confidence interval estimate of
starting salaries between $42,000 and $45,000.
The second estimate is much narrower, providing accounting
students more precise information about starting salaries.
93/103
Interval Width
The width of the confidence interval estimate is a function of
the confidence level, the population standard deviation, and
the sample size
94/103
Interval Width
A larger confidence level produces a wider
confidence interval.
95/103
Interval Width
Larger values of the standard deviation produce wider
confidence intervals.
96/103
Interval Width
Increasing the sample size decreases the width of the
confidence interval while the confidence level can
remain unchanged.
Note: this also increases the cost of obtaining additional data
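The three effects on interval width can be seen by evaluating the width \( 2 z_{\alpha/2} \sigma / \sqrt{n} \) under different inputs. The helper `ci_width` is a hypothetical name; the baseline values (95%, \( \sigma = 75 \), n = 25) are borrowed from the lead time example, and the alternative values are arbitrary.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(confidence, sigma, n):
    """Full width of the z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)


base = ci_width(0.95, 75, 25)
print(f"baseline (95%, sigma=75, n=25): {base:.1f}")
print(f"higher confidence (99%):        {ci_width(0.99, 75, 25):.1f}")
print(f"larger sigma (150):             {ci_width(0.95, 150, 25):.1f}")
print(f"larger sample (n=100):          {ci_width(0.95, 75, 100):.1f}")
```

Quadrupling the sample size halves the width, because the width is proportional to \( 1/\sqrt{n} \).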
97/103
Selecting the Sample Size

Earlier we pointed out that sampling error is the difference
between an estimator and a parameter.
We can also define this difference as the error of
estimation.
In this chapter this can be expressed as the difference
between \( \bar{x} \) and \( \mu \).
98/103

The bound on the error of estimation is
\[ B = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
With a little algebra we find the sample size needed to estimate a
mean:
\[ n = \left( \frac{z_{\alpha/2} \, \sigma}{B} \right)^2 \]
99/103

To illustrate, suppose that the manager had decided
that he needed to estimate the mean demand during lead time
to within 16 units, which is the bound on the error of
estimation.
We also have \( 1 - \alpha = .95 \) and \( \sigma = 75 \). We calculate
\[ n = \left( \frac{z_{\alpha/2} \, \sigma}{B} \right)^2 = \left( \frac{(1.96)(75)}{16} \right)^2 = 84.41 \]
100/103

Because n must be an integer and because we want the
bound on the error of estimation to be no more than 16 any
non-integer value must be rounded up.
Thus, the value of n is rounded to 85, which means that to be
95% confident that the error of estimation will be no larger
than 16, we need to randomly sample 85 lead time intervals.
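The sample size rule, including the rounding up, fits in a one-line function. The name `sample_size` is a hypothetical choice for this sketch, and the z value is passed in from the table rather than computed.

```python
from math import ceil


def sample_size(z, sigma, bound):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the bound B."""
    return ceil((z * sigma / bound) ** 2)


# Lead time example: 95% confidence, sigma = 75, B = 16.
n_required = sample_size(z=1.96, sigma=75, bound=16)
print(n_required)
```

Rounding up rather than to the nearest integer guarantees that the error bound is met.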
101/103
Example
A lumber company must estimate the mean diameter of
trees to determine whether or not there is sufficient lumber
to harvest an area of forest. They need to estimate this to
within 1 inch at a confidence level of 99%. The tree
diameters are normally distributed with a standard
deviation of 6 inches.
How many trees need to be sampled?
102/103
Example
contd
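B = 1, \( \sigma = 6 \)
\( 1 - \alpha = .99 \), \( \alpha = .01 \), \( \alpha/2 = .005 \). From the table,
\( z_{.005} = 2.575 \).
We compute
\[ n = \left( \frac{z_{\alpha/2} \, \sigma}{B} \right)^2 = \left( \frac{(2.575)(6)}{1} \right)^2 = 238.7, \text{ rounded up to } 239 \]
That is, we will need to sample at least 239 trees to have a
99% confidence interval of \( \bar{x} \pm 1 \).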
103/103
