Equation (1) is called the joint frequency distribution function of $X$ and $Y$; for each
unique $(x_j, y_k)$ pair it computes the fraction of sample observations where $X_i = x_j$
and $Y_i = y_k$. Put differently, it computes the frequency of the event $X_i = x_j$ and
$Y_i = y_k$ in the sample. Thus 7.07 percent of the workers in the sample have six years of
schooling and earn between two and three thousand Lempiras per month.
If we sum (1) over all possible $j, k$ pairs we get
$$\sum_{j=1}^{J} \sum_{k=1}^{K} f(x_j, y_k) = 1,$$
i.e., the sum of all the individual frequencies equals one. Since Table 1 enumerates all possible
joint realizations of $X$ and $Y$, the individual event frequencies must sum to one.
This is just a more complex manifestation of the fact that for any number of coin flips, the
sum of the fraction tails and fraction heads will always equal one!
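The adding-up property can be checked on any sample. The following is a minimal sketch using a small made-up sample (not the Honduran data); the names `sample` and `joint_freq` are hypothetical, and exact fractions are used so the sum is exactly one rather than one up to floating-point error.

```python
from collections import Counter
from fractions import Fraction
import random

random.seed(0)  # arbitrary seed so the illustration is reproducible

# Hypothetical toy sample of (schooling level, earnings bracket) pairs.
sample = [(random.randint(0, 3), random.randint(0, 2)) for _ in range(200)]
n = len(sample)

# Joint frequency f(x_j, y_k): the fraction of observations with
# X_i = x_j and Y_i = y_k.
counts = Counter(sample)
joint_freq = {pair: Fraction(count, n) for pair, count in counts.items()}

# Summing the joint frequencies over every observed (x_j, y_k) pair
# recovers the whole sample, so the total is one.
total = sum(joint_freq.values())
print(total)
```

Every observation falls in exactly one cell, which is why the cell fractions cannot sum to anything other than one.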
Table 1 completely characterizes the joint distribution of schooling and earnings in our
sample. What can it tell us about the relationship between these two variables? The table is
a bit bewildering, but with a little work we can discern an important regularity. Note that
most of the cells in the lower left-hand and upper right-hand regions of the table are close
to zero. There are few workers in our sample with few years of schooling and high monthly
earnings; conversely, there are few workers in the sample with many years of schooling and
low monthly earnings. Most of the frequency mass is concentrated on the diagonal of cells
running from the upper left-hand to the lower right-hand portions of the table. This suggests
that higher levels of schooling tend to be associated with higher levels of earnings in our
sample.
What does Table 1 tell us about the distribution of years of schooling in our sample,
irrespective of a worker's earnings? The marginal frequency distribution function of $X$ is
$$f(x_j) = \sum_{k=1}^{K} f(x_j, y_k). \tag{2}$$
Equation (2) computes the marginal frequency of the event $X_i = x_j$ in our sample. By
marginal we mean that we are only interested in the frequency distribution of $X$ alone,
independent of any possible relationship between $X$ and $Y$. To calculate the fraction of workers
with exactly $X_i = x_j$ years of schooling in the sample we sum over the $k = 1, \ldots, K$
frequencies for earnings and schooling combinations where schooling is held fixed at $x_j$. Looking at
Table 1, we simply sum up all the elements in a given column. For example, the marginal
2
A comment on notation before we continue: a capitalized variable indicates a random draw from the
population. For example
$$\sum_{k=1}^{K} y_k f(y_k \mid 6) = 500 \times 0.151 + 1{,}500 \times 0.213 + \cdots + 10{,}500 \times 0.024 = 2{,}956.$$
The average monthly earnings among workers with exactly six years of schooling in our
sample is 2,956 Lempiras. The conditional mean is simply an average among individuals for
which the conditioning criterion is true. It is an average over a subsample. From Table 2 and
the second panel of Figure 1 we know that some workers with six years of schooling earn more
and others less, but the conditional mean is equal to 2,956 Lempiras. If we calculate mean
earnings among workers with no schooling in our sample using (5) we get 1,721 Lempiras;
for workers with nine years of schooling we get 3,617 Lempiras. These three conditional
means are plotted as dashed vertical lines in Figure 1. Among workers with higher levels of
schooling we observe a higher mean level of earnings (i.e., the conditional mean vertical lines
shift rightward as we move from the first to the third panels).
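The conditional mean for six years of schooling can be roughly reproduced from the six-year column of Table 1. The sketch below is only an illustration: it assumes the conditional frequency is the joint frequency divided by the column total (which is how the Table 2 entries relate to Table 1), and it uses bracket midpoints of 500, 1,500, ..., 10,500 Lempiras as in the calculation above. All variable names are hypothetical.

```python
# Joint frequencies (in percent) for X = 6 years of schooling, read from
# the six-year column of Table 1, one entry per earnings bracket.
joint_pct_x6 = [4.39, 6.16, 7.07, 5.07, 2.30, 1.73, 0.66, 0.50, 0.32, 0.11, 0.68]

# Bracket midpoints in Lempiras: 500, 1500, ..., 10500.
midpoints = [500 + 1000 * k for k in range(11)]

# Marginal frequency of six years of schooling (the column total).
marginal_x6 = sum(joint_pct_x6)

# Conditional frequency of each earnings bracket given six years of schooling.
cond_freq_x6 = [f / marginal_x6 for f in joint_pct_x6]

# Conditional mean: midpoints weighted by the conditional frequencies.
cond_mean_x6 = sum(y * f for y, f in zip(midpoints, cond_freq_x6))
print(round(cond_mean_x6))
```

Because the table entries are rounded to two decimals, the result lands within a few Lempiras of 2,956 rather than matching it exactly.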
Conditional means provide a simple way to summarize the relationship between schooling
and earnings in our sample. Figure 2 plots $m_{Y|X}(x_j)$ for each of the $x_1, \ldots, x_J$ possible
schooling levels. The figure clearly shows that, in our sample, mean earnings are higher for
workers with higher levels of schooling. I have connected the individual points in Figure 2
to clarify the sample trend.
The conditional mean function provides a simple summary of the relationship between
schooling and earnings among workers in our sample. It is certainly far clearer than the 231
joint frequencies we began with in Table 1! It is also easier to read than Table 2 and Figure
1. However, this clarity comes at a cost. The conditional frequency distribution, in addition
to showing how earnings tend to increase with schooling in our sample, also shows the range
of earnings realizations among workers with the same level of schooling. By simply plotting
$m_{Y|X=x}$
$$s^2_{Y|X}(x_j) = \sum_{k=1}^{K} \left( y_k - m_{Y|X}(x_j) \right)^2 f(y_k \mid x_j). \tag{6}$$
Equation (6) gives the average of the squared differences between an individual's actual
earnings and the mean earnings of individuals with the same schooling level; this is a
common measure of variability around the mean. For workers with six years of schooling we
have, using the information in Table 2, a conditional sample variance of
$$s^2_{Y|X}(6) = (500 - 2{,}956)^2 \times 0.151 + (1{,}500 - 2{,}956)^2 \times 0.213 + \cdots + (10{,}500 - 2{,}956)^2 \times 0.024 \approx 4.53 \times 10^{6}.$$
The square root of (6) gives what is called root mean square error (rmse). For six years of
schooling we have a rmse of $\sqrt{4.53 \times 10^{6}} \approx 2{,}130$ Lempiras.
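The variance calculation can be retraced numerically. This is a sketch under stated assumptions, not the author's code: it takes the six-year column of Table 1, forms conditional frequencies by dividing by the column total, and uses bracket midpoints of 500, 1,500, ..., 10,500 Lempiras. Variable names are hypothetical. With these rounded inputs the variance comes out at roughly 4.5 million squared Lempiras and its square root a little above 2,100 Lempiras.

```python
import math

# Joint frequencies (in percent) for X = 6 years of schooling (Table 1).
joint_pct_x6 = [4.39, 6.16, 7.07, 5.07, 2.30, 1.73, 0.66, 0.50, 0.32, 0.11, 0.68]

# Bracket midpoints in Lempiras: 500, 1500, ..., 10500.
midpoints = [500 + 1000 * k for k in range(11)]

# Conditional frequencies given six years of schooling.
marginal_x6 = sum(joint_pct_x6)
cond_freq_x6 = [f / marginal_x6 for f in joint_pct_x6]

# Conditional mean, then the frequency-weighted average squared deviation
# from that mean, as in equation (6).
cond_mean_x6 = sum(y * f for y, f in zip(midpoints, cond_freq_x6))
cond_var_x6 = sum((y - cond_mean_x6) ** 2 * f for y, f in zip(midpoints, cond_freq_x6))

# Root mean square error: the square root of the conditional variance.
rmse_x6 = math.sqrt(cond_var_x6)
print(round(cond_var_x6), round(rmse_x6))
```

The variance is in squared Lempiras, which is hard to interpret directly; taking the square root returns the measure to Lempiras, the same units as the conditional mean.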
Figure 2, in addition to plotting $m_{Y|X}(x_j)$ for $x_1, \ldots, x_J$, also plots
$m_{Y|X}(x_j) \pm 0.675 \sqrt{s^2_{Y|X}(x_j)}$ for $j = 1, \ldots, J$. These dashed lines give us a sense of the sample variability in
earnings around each conditional mean earnings level. Later in the course we will learn how
to interpret these dashed lines in a precise way and why, for example, I chose to multiply
rmse by 0.675 and not some other number, or no number at all.
Our goal was to characterize the relationship between schooling and earnings in our
sample of 4,400 Honduran male workers. We began with the joint frequency distribution
of schooling and earnings in Table 1. This table gives the fraction of workers in our sample in each
of the $J \times K = 21 \times 11 = 231$ possible schooling-by-earnings cells. Using these joint
frequencies we calculated the marginal frequency distribution of years of schooling in our
sample. The conditional frequency distribution of earnings given schooling helped to clarify
the relationship between schooling and earnings in our sample. We found that workers with
more years of schooling tended to be in the higher earnings brackets, while workers with
few years of schooling tended to be in the lower earnings brackets. We then computed the
conditional mean of earnings given schooling. We found that the average or mean earnings of
individuals with little schooling was substantially lower than the mean earnings of individuals
with lots of schooling. Finally, we ended by defining and calculating the conditional sample
variance, a measure of earnings variability. Our final product is Figure 2, which provides
a compact summary of the relationship between schooling and earnings in our sample.
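The marginal step in this recap amounts to a column sum of Table 1. As a small check, the sketch below sums three columns of Table 1 (zero, six and nine years of schooling) and compares the totals with the bottom row of the table. The dictionary names are hypothetical, and the comparison is only expected to hold up to rounding.

```python
# Columns of Table 1 (joint frequencies in percent), keyed by years of
# schooling; each list runs over the eleven earnings brackets.
joint_pct = {
    0: [4.55, 2.82, 1.73, 0.75, 0.16, 0.23, 0.00, 0.07, 0.00, 0.00, 0.20],
    6: [4.39, 6.16, 7.07, 5.07, 2.30, 1.73, 0.66, 0.50, 0.32, 0.11, 0.68],
    9: [0.61, 0.93, 1.23, 1.16, 0.36, 0.45, 0.23, 0.20, 0.16, 0.05, 0.25],
}

# Marginal frequency of each schooling level: sum over earnings brackets
# with schooling held fixed.
marginal_pct = {x: sum(column) for x, column in joint_pct.items()}

# Marginal frequencies as printed in the bottom row of Table 1.
reported_pct = {0: 10.5, 6: 29.0, 9: 5.6}

for x in sorted(marginal_pct):
    print(x, round(marginal_pct[x], 2), reported_pct[x])
```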
2 Samples, populations and the analogy principle
All the information in Tables 1 and 2 and Figures 1 and 2 pertains to our sample of 4,400
Honduran workers. Ideally we'd like to learn about more than just the workers actually in
our sample; we'd like to learn about the entire population of Honduran male workers aged
20 to 50.
In this course we will use random samples to learn about characteristics of an underlying
population. The basic idea is quite straightforward. Say we are interested in computing
the average years of schooling for individuals aged 30 to 35 who live in the United States.
One way to compute this would be to conduct a census: we could interview every single
individual between the ages of 30 and 35 living in the United States, ask them about their
educational attainment, and average the results. This would give us the population mean
years of schooling, where our population includes all individuals between the ages of 30
and 35 living in the United States. Alternatively, we could draw a random sample of such
individuals and ask them about their schooling. The average years of schooling for individuals
in our (random) sample (i.e., the sample mean) provides an estimate of the true or actual
population mean.
Let $S$ equal years of schooling and let
$$\mu_0 = \mathbb{E}[S_i]$$
equal the expected years of schooling for the $i$th randomly drawn individual from the
population of interest. $\mathbb{E}[\cdot]$ denotes the expectations operator. Imagine we randomly sampled
an individual from the population we are interested in learning about (to make this process
concrete we can imagine every individual in the population is assigned a unique number and
that we use a computer to randomly generate numbers and hence select one of these
individuals).$^{4}$ Once we sample an individual we observe her years of schooling. Prior to making
this observation, however, we would expect her years of schooling to equal $\mu_0$. By expect
we mean that if we were to repeatedly (randomly) sample individuals an infinite number of
times, the average of their observed years of schooling would equal $\mu_0$ (we will define the
expectation of a random variable more carefully in the lectures that follow).
Say we sample $N$ individuals. This gives us the random sample of schooling observations
$$S_1, S_2, \ldots, S_N.$$
If we take the sample average we get
$$\hat{\mu}_N = \frac{1}{N} \sum_{i=1}^{N} S_i.$$
Our sample average, $\hat{\mu}_N$, is an estimate of the population mean, $\mu_0$. It is intuitive that as $N$
gets very large (approaches infinity) our estimate will be close to $\mu_0$. Unfortunately, for
any given random sample of individuals we do not know for certain whether our estimate, $\hat{\mu}_N$,
is close to the population mean, $\mu_0$, that we are actually interested in. Intuitively, our belief
is that if we randomly sample a large (but finite) number of individuals from the relevant
population, our estimate will be pretty close to the truth (i.e., to $\mu_0$), but there is
always some chance that this will not be the case. Later in the course we will learn how to
(approximately) characterize the sampling variability of our estimate.
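A small simulation makes the idea concrete. The sketch below builds a made-up population of schooling values (not real data), computes its mean, and then draws random samples with replacement of increasing size. The population, the seed and the function name `sample_mean` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(141)  # arbitrary seed so the illustration is reproducible

# Hypothetical population of years of schooling, one entry per individual.
population = [random.choice([0, 3, 6, 6, 6, 9, 12, 12, 16]) for _ in range(50_000)]

# Population mean: what a full census would deliver.
pop_mean = statistics.fmean(population)


def sample_mean(n):
    """Draw n individuals at random, with replacement, and average them."""
    draws = random.choices(population, k=n)
    return statistics.fmean(draws)


# Sample means for increasingly large random samples.
estimates = {n: sample_mean(n) for n in (10, 100, 10_000, 100_000)}

for n, estimate in estimates.items():
    print(n, round(estimate, 3), round(abs(estimate - pop_mean), 3))
```

Small samples can miss the population mean by a wide margin, while the largest sample is typically very close to it. Any single draw can still be unlucky, which is the sampling variability the text refers to.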
Our estimator for the average years of schooling in the population of interest is the sample
mean of a random sample of schooling observations. This estimator is an example of the
analogy principle in action. We are interested in the expectation (i.e., population mean) of
the random variable $S$. To estimate $\mu_0 = \mathbb{E}[S]$ we replace the population mean $\mathbb{E}[S]$
with its sample analog, the sample mean $N^{-1} \sum_{i=1}^{N} S_i$. Intuitively, we believe that the
sample mean years of schooling is a sensible estimate of the true population mean years of
schooling. All of the estimators we will learn about in this course are based on the idea of
replacing population means (i.e., expectations of random variables) with sample means (i.e.,
averages of a random sample of observations).
$^{4}$ For technical reasons we assume that we sample with replacement. That is, once we sample an individual
and observe her years of schooling she is placed back into the population and can in theory be selected
again by our computer.
We have already used the analogy principle extensively in this Lecture. Consider the
population joint frequency distribution function or joint probability mass function (pmf)
associated with our sample of Honduran workers,
$$p(x_j, y_k) = \Pr(X_i = x_j, Y_i = y_k). \tag{7}$$
Equation (7) gives the probability that a randomly sampled male Honduran worker between
the ages of 20 and 50 has both years of schooling equal to $x_j$ and earnings equal to $y_k$.
Put differently, $p(x_j, y_k)$ gives the fraction of workers in the entire population for which
$X_i = x_j$ and $Y_i = y_k$. The joint frequency distribution associated with our sample is an
analog estimator for the true joint probability mass function underlying the population
from which our sample was collected. To see this, note that
$$\begin{aligned}
p(x_j, y_k) &= \Pr(X_i = x_j, Y_i = y_k) \\
&= \Pr\left(\mathbf{1}(X_i = x_j) \times \mathbf{1}(Y_i = y_k) = 1\right) \\
&= \mathbb{E}\left[\mathbf{1}(X_i = x_j) \times \mathbf{1}(Y_i = y_k)\right].
\end{aligned}$$
Then observe that (1) is simply the sample analog of $\mathbb{E}\left[\mathbf{1}(X_i = x_j) \times \mathbf{1}(Y_i = y_k)\right]$.
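The last step can be illustrated directly: averaging the product of the two indicator functions over a sample gives the same number as counting the matching observations and dividing by the sample size. The sketch below uses a made-up sample; `indicator`, `analog_estimate` and the chosen cell are hypothetical names and values.

```python
from collections import Counter
import random

random.seed(7)  # arbitrary seed so the illustration is reproducible

# Hypothetical sample of (schooling level, earnings bracket) pairs.
sample = [(random.randint(0, 3), random.randint(0, 2)) for _ in range(500)]
n = len(sample)


def indicator(condition):
    """Return 1 if the condition holds and 0 otherwise."""
    return 1 if condition else 0


def analog_estimate(x_j, y_k):
    """Sample mean of 1(X_i = x_j) * 1(Y_i = y_k)."""
    return sum(indicator(x == x_j) * indicator(y == y_k) for x, y in sample) / n


# Joint frequency computed the direct way: cell count over sample size.
counts = Counter(sample)
x_j, y_k = 2, 1
direct_frequency = counts[(x_j, y_k)] / n

print(analog_estimate(x_j, y_k), direct_frequency)
```

The two numbers agree for every cell, which is the sense in which the sample joint frequency replaces a population expectation with a sample average.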
3 Introduced concepts
1. joint, marginal and conditional frequency distribution
2. marginal and conditional mean
3. conditional sample variance and root mean square error (rmse)
4. population
5. random samples
6. expected value
7. the analogy principle
Table 1: Joint frequency distribution (jfd) of Y (= Monthly Wages) and X (= Years of Schooling)
Y\X 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0≤Y<1 4.55 1.07 2.05 2.57 1.36 1.05 4.39 0.23 0.14 0.61 0.07 0.05 0.23 0.05 0.05 0.00 0.02 0.00 0.00 0.00 0.00
1≤Y<2 2.82 0.68 1.66 1.86 1.30 1.07 6.16 0.57 0.41 0.93 0.39 0.34 0.68 0.05 0.00 0.05 0.05 0.05 0.00 0.02 0.00
2≤Y<3 1.73 0.55 1.00 1.68 1.02 1.05 7.07 0.45 0.73 1.23 0.39 0.41 1.86 0.23 0.05 0.14 0.11 0.02 0.00 0.00 0.00
3≤Y<4 0.75 0.16 0.61 1.02 0.50 0.45 5.07 0.45 0.73 1.16 0.20 0.57 1.77 0.39 0.20 0.09 0.18 0.11 0.05 0.02 0.00
4≤Y<5 0.16 0.14 0.30 0.45 0.23 0.32 2.30 0.16 0.25 0.36 0.23 0.34 1.41 0.20 0.20 0.16 0.16 0.16 0.05 0.00 0.02
5≤Y<6 0.23 0.07 0.07 0.20 0.05 0.07 1.73 0.14 0.16 0.45 0.11 0.23 0.91 0.16 0.14 0.14 0.11 0.16 0.07 0.02 0.00
6≤Y<7 0.00 0.05 0.09 0.00 0.18 0.05 0.66 0.07 0.07 0.23 0.07 0.20 0.89 0.11 0.20 0.05 0.14 0.18 0.00 0.00 0.02
7≤Y<8 0.07 0.02 0.02 0.02 0.05 0.02 0.50 0.05 0.09 0.20 0.09 0.30 0.66 0.16 0.16 0.09 0.14 0.23 0.00 0.00 0.02
8≤Y<9 0.00 0.00 0.05 0.07 0.05 0.00 0.32 0.00 0.00 0.16 0.00 0.11 0.30 0.02 0.05 0.05 0.07 0.02 0.07 0.02 0.02
9≤Y<10 0.00 0.02 0.00 0.02 0.02 0.07 0.11 0.05 0.07 0.05 0.02 0.14 0.34 0.05 0.09 0.00 0.09 0.16 0.07 0.05 0.05
Y≥10 0.20 0.02 0.07 0.18 0.14 0.05 0.68 0.11 0.02 0.25 0.07 0.48 0.84 0.09 0.16 0.25 0.89 1.27 0.36 0.16 0.23
f(x_j) 10.5 2.8 5.9 8.1 4.9 4.2 29.0 2.3 2.7 5.6 1.6 3.2 9.9 1.5 1.3 1.0 2.0 2.4 0.7 0.3 0.4
Notes: Empirical frequencies calculated from a random sample of 4,400 non-farm Honduran male workers aged 20 to 50. Raw frequencies are
multiplied by one hundred. Monthly wage data is expressed in thousands of Lempiras. In May of 2004, when the data were collected, 1 US$ bought
approximately 18 Honduran Lempiras. Data collected as part of the biennial Encuesta Permanente de Hogares de Propósitos Múltiples (EPHPM).
Table 2: Conditional frequency distribution (cfd) of Y (= Monthly Wages) given X (= Years of Schooling)
Y\X 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0≤Y<1 43.3 38.5 34.6 31.7 27.9 25.0 15.1 10.0 5.1 10.9 4.2 1.4 2.3 3.0 3.5 0.0 1.2 0.0 0.0 0.0 0.0
1≤Y<2 26.8 24.6 28.1 23.0 26.5 25.5 21.3 25.0 15.4 16.5 23.6 10.8 6.9 3.0 0.0 4.5 2.3 1.9 0.0 7.7 0.0
2≤Y<3 16.5 19.7 16.9 20.8 20.9 25.0 24.4 20.0 27.4 21.8 23.6 12.9 18.9 15.2 3.5 13.6 5.8 1.0 0.0 0.0 0.0
3≤Y<4 7.1 5.7 10.4 12.6 10.2 10.9 17.5 20.0 27.4 20.6 12.5 18.0 17.9 25.8 15.8 9.1 9.3 4.8 6.9 7.7 0.0
4≤Y<5 1.5 4.9 5.0 5.6 4.7 7.6 7.9 7.0 9.4 6.5 13.9 10.8 14.3 13.6 15.8 15.9 8.1 6.7 6.9 0.0 6.2
5≤Y<6 2.2 2.5 1.2 2.5 0.9 1.6 6.0 6.0 6.0 8.1 6.9 7.2 9.2 10.6 10.5 13.6 5.8 6.7 10.3 7.7 0.0
6≤Y<7 0.0 1.6 1.5 0.0 3.7 1.1 2.3 3.0 2.6 4.0 4.2 6.5 9.0 7.6 15.8 4.5 7.0 7.7 0.0 0.0 6.2
7≤Y<8 0.6 0.8 0.4 0.3 0.9 0.5 1.7 2.0 3.4 3.6 5.6 9.4 6.7 10.6 12.3 9.1 7.0 9.6 0.0 0.0 6.2
8≤Y<9 0.0 0.0 0.8 0.8 0.9 0.0 1.1 0.0 0.0 2.8 0.0 3.6 3.0 1.5 3.5 4.5 3.5 1.0 10.3 7.7 6.2
9≤Y<10 0.0 0.8 0.0 0.3 0.5 1.6 0.4 2.0 2.6 0.8 1.4 4.3 3.4 3.0 7.0 0.0 4.7 6.7 10.3 15.4 12.5
Y≥10 1.9 0.8 1.2 2.2 2.8 1.1 2.4 5.0 0.9 4.4 4.2 15.1 8.5 6.1 12.3 25.0 45.3 53.8 55.2 53.8 62.5
Notes: See notes to Table 1 for details of the data.
Figure 1: Conditional frequency distribution of earnings given
schooling equal to zero, six and nine years respectively.
Notes: Based on conditional frequencies reported in columns 1, 7, and 10 of Table 2.
Figure 2: Conditional mean of monthly earnings given years of schooling
Notes: Calculated using data in Table 2 and equations (5) and (6).