Professional Documents
Culture Documents
Variables
Consider the following table of sales,
divided into intervals of 1000 units each,
interval
(0,1000]
(1000,2000]
(2000,3000]
(3000,4000]
(4000,5000]
(5000,6000]
(6000,7000]
and the relative frequency of each interval.
relative
interval
freq.
(0,1000] 0
(1000,2000] 0.05
(2000,3000] 0.25
(3000,4000] 0.30
(4000,5000] 0.25
(5000,6000] 0.10
(6000,7000] 0.05
1.00
We’re going to divide the relative frequencies
by the width of the cells (which here is 1000).
This will make the graph have an area of 1.
relative f(x) relative freq.
interval
freq. cell width
(0,1000] 0 0
(1000,2000] 0.05 0.00005
(2000,3000] 0.25 0.00025
(3000,4000] 0.30 0.00030
(4000,5000] 0.25 0.00025
(5000,6000] 0.10 0.00010
(6000,7000] 0.05 0.00005
Graph
f(x) = p(x)
relative freq.
interval f(x)
cell width 0.00030
(0,1000] 0
0.00025
(1000,2000] 0.00005
0.00020
(2000,3000] 0.00025
(3000,4000] 0.00030 0.00015
(4000,5000] 0.00025 0.00010
(5000,6000] 0.00010
0.00005
(6000,7000] 0.00005
0
0 1000 2000 3000 4000 5000 6000 7000
sales
The area of each bar is the frequency of the category,
so the total area is 1.
Graph
f(x) = p(x)
relative freq.
interval f(x)
cell width 0.00030
(0,1000] 0
0.00025
(1000,2000] 0.00005
0.00020
(2000,3000] 0.00025
(3000,4000] 0.00030 0.00015
(4000,5000] 0.00025 0.00010
(5000,6000] 0.00010
0.00005
(6000,7000] 0.00005
0
0 1000 2000 3000 4000 5000 6000 7000
sales
f(x) = p(x)
sales
If we made the intervals infinitesimally small, the
bars and the frequency polygon would become
smooth, looking something like this:
This what the distribution
f(x) = p(x) of a continuous random
variable looks like.
This curve is denoted f(x)
or p(x) and is called the
probability density
function.
sales
pmf versus pdf
For a discrete random variable, we had a
probability mass function (pmf).
The pmf looked like a bunch of spikes, and
probabilities were represented by the
heights of the spikes.
For a continuous random variable, we have
a probability density function (pdf).
The pdf looks like a curve, and probabilities
are represented by areas under the curve.
Pr(a < X < b)
f(x) = p(x)
a b sales
A continuous random variable has an infinite
number of possible values & the probability
of any one particular value is zero.
If X is a continuous random variable,
which of the following probabilities is largest?
(Hint: This is a trick question.)
1. Pr(a < X < b)
2. Pr(a ≤ X < b)
3. Pr(a < X ≤ b)
4. Pr(a ≤ X ≤ b)
They’re all equal.
They differ only in whether they include the
individual values a and b, and any one particular
value has zero probability!
Properties of
probability density functions (pdfs)
1. f(x) ≥ 0 for values of x
This means that when we draw the pdf
curve, while it may be on the left side of
the vertical axis (have negative values of
x), it can not go below the horizontal axis,
where f would be negative.
Pr( - ∞ < X < ∞) = 1
The total area under the pdf curve, which
corresponds to the total probability, is 1.
Example
f(x) = 2 if 1 ≤ x ≤ 1.5 and
f(x) = 0 otherwise
This function satisfies both the properties of pdfs.
f(x) First, it’s never negative.
Second, the total area under the curve is (1/2) (2) = 1.
2.0
0 1.0 1.5 x
Cumulative Distribution Function for a
Continuous Random Variable
f(x)
2.0
0 1.0 1.5 x
Rectangle Example: What is F(1.2)?
F(1.2) = Pr(X ≤ 1.2)
= the area under the pdf up to where x is 1.2.
= (0.2) (2.0)
f(x) = 0.4
2.0
1
f ( x) if a �x �b
b-a
f ( x) 0 otherwise
f ( x)
Notice that the area of the rectangle
will always be 1 because
1
area = length • width
b-a
�1 �
(b - a) � � 1
�b - a �
0 a b x
Mean, variance, and standard deviation
of the continuous uniform distribution
a+b
mean: m
2
(b - a ) 2
variance: s2
12
(b - a)2
standard deviation: s
12
The most famous distribution is the
Normal or Gaussian distribution.
Its probability density function (pdf) is
1 1
- [( x - m ) / s ]2
f ( x) e 2
for - x
s 2
m is the mean of the distribution, s is the standard deviation,
e 2.718, and 3.14 .
m1 m2 m3
If you had the same mean but different standard
deviations, it would look like this:
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0
0.1
0.2
2.6 .9957
3.0
Do not memorize a lot of rules.
You just need to remember 2 easy facts.
1. The graph is symmetric about 0.
2. The total area under the curve is 1.
0 Z
Example
Pr(Z < 1.85)
0 1.85 Z
Example
0.9678
0 1.85 Z
Example
0.9678
0 1.85 Z
Example
0.5 0.4678
0.9678
0 1.85 Z
Example
0.9678
?
0 1.85 Z
Example
0.9678
0.0322
0 1.85 Z
Example
Pr(Z < -1.85)
There are two ways to do this.
0.9678
0.0322
-1.85 0 1.85 Z
-1.85 0 Z
-1.85 0 1.85 Z
Example
-1.85 0 1.85 Z
Example
Pr(-1< Z < 2)
-1.00 0 2.00 Z
Example
0.158
7 0.9772
-1.00 0 2.00 Z
Example
Pr(1< Z < 2)
0 1 2 Z
Example
0.841
3 0.9772
0 1 2 Z
Example
Pr( Z < 7)
0 7
Z
Example
0 7
Z
Example
Pr( Z > 7)
0 7
Z
Example
0 7
Z
Example
0 7
Z
Example
0 7
Z
Example
What is the value of a such that Pr(Z < a) = 0.9207 ?
0.9207
0 a Z
Example
What is the value of a such that Pr(Z < a) = 0.9207 ?
0.9207
0 1.41 Z
Example
What is the value of b such that Pr(Z > b) = 0.0250 ?
0.0250
0 b Z
Example
What is the value of b such that Pr(Z > b) = 0.0250 ?
1 – 0.0250 = 0.9750
0.0250
0 b Z
Example
What is the value of b such that Pr(Z > b) = 0.0250 ?
1 – 0.0250 = 0.9750
0.0250
0 1.96 Z
Example
What is the value of k such that Pr(0 < Z < k) = 0.4750 ?
0.4750
0 k Z
Example
What is the value of k such that Pr(0 < Z < k) = 0.4750 ?
0.4750
0 1.96 Z
Example:
If X is N(2, 9), determine Pr(X ≤ 5).
Pr(X 5)
X-m 5-m
Pr
s s
X-m 5-2
Pr 0.8413
s 3
0 1.00 Z
Pr( Z 1)
0.8413
Useful Fact
Pr(X 60)
X - m 60 - m
Pr
s s
X - m 60 - 64 -1.33 0 Z
Pr
s 3
x x
sample sample mean sample mean probability
1,1 1.0 1.0 1/9
1,2 1.5 1.5 2/9
2,1 1.5
2.0 3/9
1,3 2.0
3,1 2.0 2.5 2/9
2,2 2.0 3.0 1/9
2,3 2.5
3,2 2.5
3,3 3.0
Graph the distribution of the sample mean.
x Probability
sample mean probability
1.0 1/9 3/9
1.5 2/9
2.0 3/9 2/9
2.5 2/9
1/9
3.0 1/9
x
Graph the distribution of the original population of chips.
Since there are three equally likely values (1, 2, and 3),
the distribution looks like this.
Probability
1/3
0 1 2 3 x
What are the mean & variance
of the original population?
x p(x) xp(x)
1 1/3 1/3
2 1/3 2/3
3 1/3 3/3
m=6/3=2
What are the mean & variance
of the original population?
x p( x ) xp( x )
1.0 1/9 1/9
1.5 2/9 3/9
2.0 3/9 6/9
2.5 2/9 5/9
3.0 1/9 3/9
E ( X ) 18 / 9 2
What are the mean & variance of the sample mean X ?
2 2
x p( x ) xp( x ) x x p( x )
1.0 1/9 1/9
1.5 2/9 3/9
1.00 0.111
2.0 3/9 6/9 2.25 0.500
2.5 2/9 5/9
3.0 1/9 3/9
4.00 1.333
6.25 1.389
9.00 1.000
E ( X ) 18 / 9 2 2
E ( X ) 4.333
2 2
V ( X ) E ( X ) - [ E ( X )]2
4.333 - 2 2
0.333 or 1/3
In our chip example, we found that
E( X ) 2 E( X ) 2
2 1
V (X ) V (X )
3 3
The expected value (or mean) of the original population
and the sample mean are the same.
However, the variance of the sample mean is smaller
than the variance of the original population.
In general,
E( X ) m E( X ) m
s 2
V (X ) s 2
V (X )
n
The mean of the original population and the mean of the
sample are the same.
However, as long as the sample size is more than one,
the variance of the sample mean is smaller than the
variance of the original population.
Intuition: Suppose that each person in the class
randomly sampled 50 men from a population of
men whose average height is 70 inches. Each
student then calculated his/her sample mean.
Pr Z 2.78
0.9973
= 1 - 0.9973
0 2.78 Z = 0.0027
If the population is normal, what is the probability
that an individual student drawn at random
will have a grade over 77?
X - m 77 - 72
Pr( X 77) Pr
s 9
Pr Z 0.56
0.7123
= 1 - 0.7123
0 0.56 Z = 0.2877
Notice
The probability of selecting an individual at
random who has a grade five points above
the class mean is much greater than the
probability of randomly selecting a sample
of 25 students who have an average that
is five points above the class mean.
(0.2877 versus 0.0027)
Problem
Suppose we are sampling (without replacement) and we
sample the entire population.
Then our sample mean will always be the same as the
population mean. No spread. Zero variance.
If we don’t sample the entire population but do sample a
large part of it, our variance will not be zero but it will be
very small.
Thus, as the sample size n approaches the population size
N, the variance of the sample mean approaches zero.
Our formula for the variance of the sample mean was s2/n.
s2/n approaches zero as n approaches infinity, not as n
approaches N.
So we need to make some adjustment when we sample a
large part of our population.
Finite Population Correction Factor
When we sample a substantial part of
our population (more than 5%), we need N -n
to multiply the standard deviation of our N -1
sample mean by this factor:
s
So the standard deviation of the
sample mean which was: n
s N -n
becomes:
n N -1
The standard deviation of the sample
mean adjusted for a large sample
s N -n
n N -1
Notice that if you sample just one observation
(n=1), the square root part becomes 1, and the
s
formula becomes the old formula:
n
If you sample the entire population (n=N), the
formula has a value of 0.
So the adjustment works the way it should.
So our standardized X is now
X-m
s N-n
n N -1
Example: What is the probability that the mean of a
sample of 36 observations, from a population of 300, will be
less than 14, if the population mean & population standard
deviation are 15.30 & 4.10 respectively?
If we sample more than 5% of our population, we need to use the finite
population correction factor, and here we are sampling 12%.
X -m 14.00 - 15.30
Pr( X 14.00) Pr
s N -n 4.10 300 - 36
n N -1 36 300 - 1
Pr Z -2.02
= 0.0217
-2.02 0 Z
Up until now, as long as our sample X -m
was not too large, we have used this Z
s
standardization formula: n
However, what if we don’t know
what the population standard
deviation s is?
n
s i 1
100
∞
Example: What is the probability that the sample
mean is less than 800, if the population mean is
843.3419 and the sample standard deviation is
105, based on a sample of 25.
Pr( X 800)
� �
X - m 800 - 843.3419 �
Pr �
�s �
� 105 �
� n 25 �
-2.0639 0 t24
Pr( t24 -2.0639 )
0.025
Using Excel to solve t distribution problems
0.9978
-2.85 0 Z