Simple linear regression is probably the starting point of constructing econometric theory. The idea behind regression is to explain the dependence of one variable $y_i$, usually referred to as dependent, on another variable $x_i$ (independent) using the simple form
$$y_i = \alpha + \beta x_i + \varepsilon_i,$$
where $\alpha$ is the intercept, $\beta$ is the slope, and $\varepsilon_i$ is an unobserved random variable called the error. The subscript $i$ indicates that the observation belongs to individual $i$. The parameters $\alpha$ and $\beta$ are fixed and unknown to the researcher. Since $\alpha$ and $E\varepsilon_i$ are not separately identified, the restriction $E\varepsilon_i = 0$ is imposed. Then
$$E y_i = \alpha + \beta x_i.$$
Define the sample means and sums of squares
$$\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i \quad \text{and} \quad \bar y = \frac{1}{n}\sum_{i=1}^{n} y_i, \tag{1.1}$$
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar x)^2, \quad S_{yy} = \sum_{i=1}^{n} (y_i - \bar y)^2, \quad S_{xy} = \sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y).$$
The least squares estimates of $\alpha$ and $\beta$ minimize the sum of squared deviations
$$\sum_{i=1}^{n} \left( y_i - (\alpha + \beta x_i) \right)^2.$$
Theorem.
$$\min_a \sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} (x_i - \bar x)^2,$$
with the minimum attained at $a = \bar x$.

Apply this result with $y_i - b x_i$ in place of $x_i$: for any given slope $b$, the minimizing intercept is
$$a = \frac{1}{n}\sum_{i=1}^{n} (y_i - b x_i) = \bar y - b \bar x.$$
Then
$$\sum_{i=1}^{n} \left( (y_i - \bar y) - b (x_i - \bar x) \right)^2 = S_{yy} - 2 b S_{xy} + b^2 S_{xx}.$$
Taking the derivative of this with respect to $b$ and setting it to zero leads to the solution
$$\hat b = \frac{S_{xy}}{S_{xx}},$$
which is a minimum since the coefficient on $b^2$, namely $S_{xx}$, is positive. The least squares estimates of $\alpha$ and $\beta$ are
$$\hat\beta_{OLS} = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2}, \qquad \hat\alpha_{OLS} = \bar y - \hat\beta_{OLS}\, \bar x. \tag{1.2}$$
When there is no intercept in the regression, the estimator of the slope coefficient collapses to
$$\hat\beta_{OLS} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}.$$
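These closed-form expressions are easy to check numerically. Below is a minimal sketch in Python with made-up data (the comparison against NumPy's `polyfit` is only a sanity check, not part of the text):

```python
import numpy as np

# Made-up illustrative data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

xbar, ybar = x.mean(), y.mean()
Sxy = ((x - xbar) * (y - ybar)).sum()   # S_xy
Sxx = ((x - xbar) ** 2).sum()           # S_xx

beta_ols = Sxy / Sxx                    # slope estimate S_xy / S_xx
alpha_ols = ybar - beta_ols * xbar      # intercept estimate ybar - beta * xbar

# Sanity check against NumPy's own least squares fit
slope, intercept = np.polyfit(x, y, 1)
print(beta_ols, alpha_ols)              # ~0.99 and ~1.05, matching polyfit
```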
Next we show that the least squares estimator is the best linear unbiased estimator (BLUE). Consider the no-intercept model, so that $E y_i = \beta x_i$. The estimator is linear because it can be presented in the form
$$\hat\beta = \sum_{i=1}^{n} d_i y_i, \qquad d_i = \frac{x_i}{\sum_{j=1}^{n} x_j^2}.$$
It is also unbiased:
$$E\hat\beta = \frac{\sum_{i=1}^{n} x_i\, E y_i}{\sum_{i=1}^{n} x_i^2} = \beta\, \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} x_i^2} = \beta.$$
Now consider an arbitrary linear estimator $\tilde\beta = \sum_{i=1}^{n} d_i y_i$. Since
$$E\tilde\beta = \sum_{i=1}^{n} d_i\, E y_i = \beta \sum_{i=1}^{n} d_i x_i,$$
unbiasedness requires $\sum_{i=1}^{n} d_i x_i = 1$, and the variance of $\tilde\beta$ is $\sigma^2 \sum_{i=1}^{n} d_i^2$. We therefore minimize the Lagrangian
$$\sum_{i=1}^{n} d_i^2 - \lambda \left( \sum_{i=1}^{n} d_i x_i - 1 \right),$$
whose first-order condition is $2 d_i - \lambda x_i = 0$. Then, since $\sum_{i=1}^{n} d_i x_i = 1$,
$$\lambda = \frac{2}{\sum_{i=1}^{n} x_i^2}.$$
Then
$$d_i = \frac{x_i}{\sum_{j=1}^{n} x_j^2},$$
so the variance-minimizing unbiased weights reproduce the least squares estimator
$$\hat\beta_{OLS} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}. \tag{1.3}$$
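The Lagrangian argument can be illustrated numerically: any unbiased weight vector differs from the OLS weights by a vector orthogonal to $x$, and such a perturbation can only increase the sum of squared weights. A minimal sketch (the perturbation construction and the data are made up for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])       # made-up regressors (no-intercept model)

d_ols = x / (x ** 2).sum()               # OLS weights d_i = x_i / sum_j x_j^2
print((d_ols * x).sum())                 # ~1: the unbiasedness constraint holds

# Build another unbiased weight vector: add a perturbation orthogonal to x,
# which leaves sum_i d_i x_i = 1 unchanged
rng = np.random.default_rng(0)
v = rng.normal(size=x.size)
v -= (v @ x) / (x @ x) * x               # project out x, so v @ x = 0
d_alt = d_ols + 0.1 * v

print((d_alt * x).sum())                 # still ~1 (up to rounding)
print((d_ols ** 2).sum() < (d_alt ** 2).sum())  # True: OLS weights have minimal variance
```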
We made very few assumptions to derive the least squares estimator. Specifically, we specified the conditional mean of $y_i$ and its variance, and assumed statistical independence of the individual observations. As a result of these assumptions we were able to derive the least squares estimator and show its unbiasedness and efficiency.
Now let us make a much stronger assumption that
$$y_i \sim N\left( \alpha + \beta x_i,\ \sigma^2 \right), \quad i = 1, \dots, n,$$
or, equivalently, $y_i = \alpha + \beta x_i + \varepsilon_i$ with $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, $i = 1, \dots, n$. The likelihood of the sample is
$$\prod_{i=1}^{n} f\left( y_i \mid \alpha, \beta, \sigma^2 \right) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(y_i - \alpha - \beta x_i)^2}{2\sigma^2} \right] = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[ -\frac{\sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2}{2\sigma^2} \right],$$
so the log-likelihood is
$$\log f\left( y \mid \alpha, \beta, \sigma^2 \right) = -\frac{n}{2}\log\sigma^2 - \frac{n}{2}\log(2\pi) - \frac{\sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2}{2\sigma^2}.$$
Maximizing over $\alpha$ and $\beta$ amounts to minimizing the sum of squared deviations, so the ML estimators coincide with least squares:
$$\hat\beta_{ML} = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} \quad \text{and} \quad \hat\alpha_{ML} = \bar y - \hat\beta_{ML}\, \bar x.$$
When the derivative of
$$-\frac{n}{2}\log\sigma^2 - \frac{\sum_{i=1}^{n} (y_i - \hat\alpha - \hat\beta x_i)^2}{2\sigma^2}$$
with respect to $\sigma^2$ is set to $0$, we obtain
$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat\alpha - \hat\beta x_i \right)^2,$$
which is the RSS evaluated at the least squares line divided by the sample size. Define the residuals from the regression to be
$$\hat\varepsilon_i = y_i - \hat\alpha - \hat\beta x_i.$$
The ML estimator $\hat\sigma^2$ is biased; an unbiased estimator of $\sigma^2$ would be
$$s^2 = \frac{n}{n-2}\, \hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^{n} \hat\varepsilon_i^2.$$
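The relation $s^2 = n\hat\sigma^2/(n-2)$ and the first-order conditions can be checked on any simulated sample; a minimal sketch (all numbers made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
x = np.arange(1.0, n + 1)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=n)  # made-up data-generating process

xbar, ybar = x.mean(), y.mean()
b = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
a = ybar - b * xbar

resid = y - a - b * x                  # residuals eps_hat_i
sigma2_ml = (resid ** 2).sum() / n     # ML estimator: RSS / n
s2 = (resid ** 2).sum() / (n - 2)      # unbiased estimator: RSS / (n - 2)

print(s2 / sigma2_ml)                  # equals n / (n - 2)
print(resid.sum(), (resid * x).sum())  # both ~0: the first-order conditions
```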
Theorem.
The sampling distributions of the estimators $\hat\alpha$, $\hat\beta$ for the conditional normal model are
$$\hat\alpha \sim N\left( \alpha,\ \frac{\sigma^2}{n S_{xx}} \sum_{i=1}^{n} x_i^2 \right), \qquad \hat\beta \sim N\left( \beta,\ \frac{\sigma^2}{S_{xx}} \right), \qquad cov\left( \hat\alpha, \hat\beta \right) = -\frac{\sigma^2 \bar x}{S_{xx}},$$
and
$$\frac{(n-2)\, s^2}{\sigma^2} \sim \chi^2_{n-2}.$$
Proof
Since $\hat\alpha$ and $\hat\beta$ are linear combinations of normally distributed $y_i$, they must be normal. Then
$$E\hat\beta_{OLS} = \frac{\sum_{i=1}^{n} (x_i - \bar x)\, E(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\beta \sum_{i=1}^{n} (x_i - \bar x)(x_i - \bar x)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \beta,$$
and, since $\sum_{i=1}^{n} (x_i - \bar x)\, \bar y = 0$, we can write $\hat\beta = \sum_{i=1}^{n} \frac{(x_i - \bar x)}{S_{xx}}\, y_i$, so
$$Var\,\hat\beta = \frac{\sum_{i=1}^{n} (x_i - \bar x)^2\, Var\, y_i}{S_{xx}^2} = \frac{\sigma^2}{S_{xx}}.$$
For the intercept, $\hat\alpha = \sum_{i=1}^{n} c_i y_i$, where
$$c_i = \frac{1}{n} - \frac{(x_i - \bar x)\, \bar x}{S_{xx}},$$
and $Var\,\hat\alpha_{OLS} = \frac{\sigma^2}{n S_{xx}} \sum_{i=1}^{n} x_i^2$. Similarly one can show that $cov\left( \hat\alpha, \hat\beta \right) = -\frac{\sigma^2 \bar x}{S_{xx}}$, which I leave as an exercise.
The second part of the theorem is somewhat more difficult to prove because it involves considerable algebraic manipulation. In fact, it is much easier to prove using matrix algebra, which we will do later. Here I just give a sketch of the proof.
The random variable $(n-2)s^2/\sigma^2$ is a function of the estimated residuals $\hat\varepsilon_i = y_i - \hat\alpha - \hat\beta x_i$. The regression itself could be thought of as a decomposition of the observed values $y_i$ into the predicted value $\hat y_i = \hat\alpha + \hat\beta x_i$ and the residual $\hat\varepsilon_i = y_i - \hat y_i$. They are uncorrelated because
$$\sum_{i=1}^{n} \hat y_i \hat\varepsilon_i = \hat\alpha \sum_{i=1}^{n} \hat\varepsilon_i + \hat\beta \sum_{i=1}^{n} x_i \hat\varepsilon_i = 0,$$
where $\sum_{i=1}^{n} \hat\varepsilon_i = 0$ and $\sum_{i=1}^{n} x_i \hat\varepsilon_i = 0$ because these are the first-order conditions of the ML (equivalently, least squares) problem, and so hold by construction. And
$$\sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2 = \sum_{i=1}^{n} \left( y_i - \hat y_i + \hat y_i - \alpha - \beta x_i \right)^2 = \sum_{i=1}^{n} \hat\varepsilon_i^2 + \sum_{i=1}^{n} \left( \hat y_i - \alpha - \beta x_i \right)^2,$$
where the cross term vanishes by the orthogonality of the residuals. The left-hand side divided by $\sigma^2$ is $\chi^2_n$, while the second term on the right absorbs the two estimated parameters; this is what reduces the degrees of freedom, so that
$$\frac{(n-2)\, s^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n} \hat\varepsilon_i^2 \sim \chi^2_{n-2}.$$
Consequently,
$$\frac{\hat\alpha - \alpha}{s \sqrt{\sum_{i=1}^{n} x_i^2 / (n S_{xx})}} \sim t_{n-2} \quad \text{and} \quad \frac{\hat\beta - \beta}{s / \sqrt{S_{xx}}} \sim t_{n-2}.$$
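The sampling variances in the theorem can be verified by Monte Carlo simulation; a sketch under made-up parameter values and a fixed design:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma, alpha, beta = 20, 1.0, 1.0, 2.0   # made-up true values
x = np.linspace(0.0, 1.0, n)                # fixed design
Sxx = ((x - x.mean()) ** 2).sum()

betas, s2s = [], []
for _ in range(20000):
    y = alpha + beta * x + rng.normal(scale=sigma, size=n)
    b = ((x - x.mean()) * (y - y.mean())).sum() / Sxx
    a = y.mean() - b * x.mean()
    rss = ((y - a - b * x) ** 2).sum()
    betas.append(b)
    s2s.append(rss / (n - 2))

print(np.var(betas), sigma ** 2 / Sxx)  # close: Var(beta_hat) = sigma^2 / Sxx
print(np.mean(s2s), sigma ** 2)         # close: E s^2 = sigma^2
```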
A test of
$$H_0: \beta = 0 \quad \text{against} \quad H_1: \beta \ne 0$$
can be based on
$$\frac{\hat\beta}{s / \sqrt{S_{xx}}} \sim t_{n-2},$$
or equivalently on
$$\frac{\hat\beta^2}{s^2 / S_{xx}} \sim F_{1, n-2}.$$
In fact,
$$\frac{\hat\beta^2}{s^2 / S_{xx}} = \frac{\left( S_{xy}/S_{xx} \right)^2 S_{xx}}{s^2} = \frac{S_{xy}^2 / S_{xx}}{RSS / (n-2)}.$$
2
The quantity in the numerator Sxy
=Sxx has a special name: Regression sum of
squares. In the identity
n
X
n
X
y) =
(yi
n
X
y) +
(b
yi
i=1
i=1
ybi ) ;
(yi
i=1
2
y) = Sxy
=Sxx :
(b
yi
i=1
(b
yi
y)
=
2
(yi
y)
2
Sxy
:
Sxx Syy
i=1
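The equivalence $t^2 = F$, the analysis-of-variance identity, and the $R^2$ expression can all be confirmed numerically; a short sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = rng.normal(size=n)
y = 0.5 + 0.8 * x + rng.normal(size=n)  # made-up data

xbar, ybar = x.mean(), y.mean()
Sxx = ((x - xbar) ** 2).sum()
Syy = ((y - ybar) ** 2).sum()
Sxy = ((x - xbar) * (y - ybar)).sum()

b = Sxy / Sxx
a = ybar - b * xbar
rss = ((y - a - b * x) ** 2).sum()
s2 = rss / (n - 2)

t = b / np.sqrt(s2 / Sxx)     # t-statistic for H0: beta = 0
F = b ** 2 / (s2 / Sxx)       # F-statistic
reg_ss = Sxy ** 2 / Sxx       # regression sum of squares

print(t ** 2 - F)             # ~0: t^2 equals F
print(Syy - rss - reg_ss)     # ~0: total SS = regression SS + RSS
print(Sxy ** 2 / (Sxx * Syy) - reg_ss / Syy)  # ~0: two expressions for R^2
```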
1.4 Consistency
Substituting $y_i - \bar y = \beta (x_i - \bar x) + (u_i - \bar u)$ into the formula for the slope estimator,
$$\hat\beta = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar x)\left[ \beta (x_i - \bar x) + (u_i - \bar u) \right]}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2},$$
where we used $\sum_{i=1}^{n} (x_i - \bar x)\, \bar u = 0$. Think of $(x_i - \bar x)\, u_i$ as the sampling sequence, such that
$$E\, (x_i - \bar x)\, u_i = 0 \quad \text{and} \quad Var\, (x_i - \bar x)\, u_i = \sigma^2 (x_i - \bar x)^2 < \infty.$$
An extension of the law of large numbers discussed before, applied to an independently but not identically distributed sequence of random variables, can be utilized here to establish that
$$\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i \overset{p}{\to} E\left[ \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i \right] = 0.$$
Therefore,
$$\hat\beta = \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2} \overset{p}{\to} \beta.$$

1.5 Asymptotic Normality
$$\sqrt{n}\left( \hat\beta - \beta \right) = \frac{\frac{1}{\sqrt n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2}.$$
Denote
$$z_n = \frac{1}{\sqrt n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i,$$
such that
$$E z_n = 0 \quad \text{and} \quad Var\, z_n = \frac{\sigma^2}{n}\sum_{i=1}^{n} (x_i - \bar x)^2 < \infty.$$
Then a central limit theorem could be applied to such an i.n.i.d. series of random variables, such that
$$\frac{z_n - E z_n}{\sqrt{Var\, z_n}} \overset{d}{\to} N(0, 1),$$
or
$$z_n \overset{d}{\to} N\left( 0,\ \sigma^2 Q_{xx} \right),$$
where $Q_{xx} = \lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2$. Then
$$\sqrt{n}\left( \hat\beta - \beta \right) = \frac{\frac{1}{\sqrt n}\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2} \overset{d}{\to} N\left( 0,\ \sigma^2 Q_{xx}^{-1} \right).$$
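Both the consistency of Section 1.4 and the root-$n$ scaling behind the CLT can be illustrated by simulation; a sketch with a made-up design and error distribution:

```python
import numpy as np

rng = np.random.default_rng(4)
beta = 2.0  # made-up true slope

def beta_hat(n):
    """One draw of the OLS slope from a sample of size n."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = 1.0 + beta * x + rng.normal(size=n)
    return ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()

stds = {}
for n in (50, 500, 5000):
    draws = np.array([beta_hat(n) for _ in range(500)])
    stds[n] = draws.std()
    print(n, round(stds[n], 3))  # spread shrinks roughly like 1/sqrt(n)
```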
Dental Expenditure
We investigate the effect of income on total dental expenditure. The data set is derived from the Medical Expenditure Panel Survey (MEPS). MEPS is a nationally representative survey of health care, including dental, expenditure, sources of payment, and insurance coverage for the US civilian non-institutionalized population. We use data from the 1996, 1997, 1998, 1999 and 2000 surveys. The sampling scheme of the MEPS data is a two-year overlapping panel, i.e., in each calendar year after the first survey year, one sample of persons is in its second year of responses while another sample of persons is in its first year of responses. The sample is restricted to the U.S. population between the ages of 25 and 64 years and those who are employed. Further, we take only those observations whose total dental expenditure and income are positive. To reduce the heterogeneity coming from the fact that the effect of income should be structurally different for those with and without dental insurance, we restrict our sample to those without dental insurance. The total number of observations in the data set is 2737. We use a logarithmic transformation of income and dental expenditure to obtain more symmetric distributions. Table 1 gives summary statistics.
Here is the plot of the data, indicating that Lntdcexp and Lnincome are potentially bivariate normally distributed. When two random variables $(y_i, x_i) \sim BN\left( \mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \rho \right)$, the conditional distribution of $y_i$ given $x_i$ is normal with mean
$$E(y_i \mid x_i) = \mu_y + \rho\, \frac{\sigma_y}{\sigma_x}\, (x_i - \mu_x).$$
This means that the bivariate normal model implies that the conditional mean of $y_i$ is a linear function of $x_i$. The estimated regression is reported below.
[Figure: scatter plot of lntdcexp against lnincome]
      Source |       SS         df       MS            Number of obs =    2737
-------------+------------------------------          F(  1,  2735) =   32.37
       Model |  44.8104909     1  44.8104909          Prob > F      =  0.0000
    Residual |   3786.1013  2735  1.38431492          R-squared     =  0.0117
-------------+------------------------------          Adj R-squared =  0.0113
       Total |  3830.91179  2736  1.40018706          Root MSE      =  1.1766

    lntdcexp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnincome |    .140637   .0247188     5.69   0.000     .0921676    .1891064
       _cons |   4.630147   .0869183    53.27   0.000     4.459715    4.800579
[Figure: scatter plot of lntdcexp against lnincome with the fitted values from the regression]
The null hypothesis
$$H_0: \beta = 0$$
is rejected with a t-ratio of 5.69 and a p-value smaller than 0.1%. The plotted regression line is shown in the figure above. As you can see from the regression results, the linear model is a poor fit for the data: just roughly 1% of the variation in expenditure is explained by income. More explanatory variables are needed to improve the fit.
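The MEPS extract is not reproduced here, so the sketch below generates a synthetic stand-in of the same sample size (the variable names lnincome and lntdcexp come from the output above; every number in the simulated data is made up, so the results will not match the Stata table) and computes the same regression quantities with the formulas of this chapter:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2737                                   # sample size from the text
lnincome = rng.normal(3.4, 0.5, size=n)    # synthetic stand-in, not MEPS data
lntdcexp = 4.63 + 0.14 * lnincome + rng.normal(0.0, 1.18, size=n)

X = np.column_stack([np.ones(n), lnincome])
coef, *_ = np.linalg.lstsq(X, lntdcexp, rcond=None)
resid = lntdcexp - X @ coef
s2 = resid @ resid / (n - 2)               # s^2 = RSS / (n - 2)
se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())
tstat = coef / se

print(coef)   # [_cons, lnincome] point estimates
print(se)     # standard errors (the Std. Err. column)
print(tstat)  # t-ratios
```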
Problems
1. Prove that
$$\min_a \sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} (x_i - \bar x)^2.$$
2. Show that $\hat\alpha_{OLS} = \bar y - \hat\beta_{OLS}\, \bar x$ can be expressed as
$$\hat\alpha_{OLS} = \sum_{i=1}^{n} c_i y_i, \quad \text{where} \quad c_i = \frac{1}{n} - \frac{(x_i - \bar x)\, \bar x}{S_{xx}},$$
and that
$$Var\,\hat\alpha_{OLS} = \frac{\sigma^2}{n S_{xx}} \sum_{i=1}^{n} x_i^2.$$
Also derive $Var\,\hat\sigma^2_{ML}$.
Solution:
$$\hat\alpha = \bar y - \hat\beta\, \bar x = \sum_{i=1}^{n} \frac{1}{n}\, y_i - \frac{\sum_{i=1}^{n} (x_i - \bar x)\, y_i}{S_{xx}}\, \bar x = \sum_{i=1}^{n} \left[ \frac{1}{n} - \frac{(x_i - \bar x)\, \bar x}{S_{xx}} \right] y_i.$$
Then
$$Var\,\hat\alpha = \sigma^2 \sum_{i=1}^{n} c_i^2 = \sigma^2 \sum_{i=1}^{n} \left[ \frac{1}{n^2} - \frac{2 (x_i - \bar x)\, \bar x}{n S_{xx}} + \frac{(x_i - \bar x)^2\, \bar x^2}{S_{xx}^2} \right] = \sigma^2 \left[ \frac{1}{n} + \frac{\bar x^2}{S_{xx}} \right] = \frac{\sigma^2}{n S_{xx}} \sum_{i=1}^{n} x_i^2,$$
since the middle term sums to zero and $S_{xx} + n \bar x^2 = \sum_{i=1}^{n} x_i^2$.
3. Show that $cov\left( \hat\alpha, \hat\beta \right) = -\frac{\sigma^2 \bar x}{S_{xx}}$.
Solution: First note that
$$\hat\beta_{OLS} = \frac{\sum_{i=1}^{n} (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar x)\, y_i - \bar y \sum_{i=1}^{n} (x_i - \bar x)}{\sum_{i=1}^{n} (x_i - \bar x)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar x)\, y_i}{\sum_{i=1}^{n} (x_i - \bar x)^2}.$$
Then
$$cov\left( \hat\alpha, \hat\beta \right) = cov\left( \sum_{i=1}^{n} \left[ \frac{1}{n} - \frac{(x_i - \bar x)\, \bar x}{S_{xx}} \right] y_i,\ \frac{\sum_{i=1}^{n} (x_i - \bar x)\, y_i}{S_{xx}} \right) = \sigma^2 \sum_{i=1}^{n} \left[ \frac{1}{n} - \frac{(x_i - \bar x)\, \bar x}{S_{xx}} \right] \frac{(x_i - \bar x)}{S_{xx}} = -\frac{\sigma^2 \bar x}{S_{xx}}.$$
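The covariance formula can also be checked by Monte Carlo; a sketch with made-up design and parameter values:

```python
import numpy as np

rng = np.random.default_rng(6)
n, sigma = 15, 1.0
x = np.linspace(1.0, 3.0, n)        # fixed design with xbar > 0 (made up)
xbar = x.mean()
Sxx = ((x - xbar) ** 2).sum()

a_hats, b_hats = [], []
for _ in range(20000):
    y = 0.5 + 1.5 * x + rng.normal(scale=sigma, size=n)
    b = ((x - xbar) * (y - y.mean())).sum() / Sxx
    a_hats.append(y.mean() - b * xbar)
    b_hats.append(b)

emp_cov = np.cov(a_hats, b_hats)[0, 1]
print(emp_cov, -sigma ** 2 * xbar / Sxx)  # close: cov = -sigma^2 * xbar / Sxx
```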
4. Show that
$$\sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + \sum_{i=1}^{n} (y_i - \hat y_i)^2.$$
Solution: The cross term is
$$\sum_{i=1}^{n} (\hat y_i - \bar y)(y_i - \hat y_i) = \sum_{i=1}^{n} \left( \hat\alpha + \hat\beta x_i - \bar y \right)\left( y_i - \hat\alpha - \hat\beta x_i \right).$$
Substitute $\hat\alpha = \bar y - \hat\beta\, \bar x$ to get
$$\sum_{i=1}^{n} \hat\beta (x_i - \bar x) \left[ (y_i - \bar y) - \hat\beta (x_i - \bar x) \right] = \hat\beta\, S_{xy} - \hat\beta^2 S_{xx} = 0.$$
5. Show that
$$\sum_{i=1}^{n} (\hat y_i - \bar y)^2 = S_{xy}^2 / S_{xx}.$$
Solution:
$$\sum_{i=1}^{n} (\hat y_i - \bar y)^2 = \hat\beta^2 \sum_{i=1}^{n} (x_i - \bar x)^2 = S_{xy}^2 / S_{xx}.$$
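Both identities can be verified numerically on any data set; a short sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=25)
y = 1.0 + 2.0 * x + rng.normal(size=25)  # made-up data

xbar, ybar = x.mean(), y.mean()
Sxx = ((x - xbar) ** 2).sum()
Sxy = ((x - xbar) * (y - ybar)).sum()
b = Sxy / Sxx
yhat = (ybar - b * xbar) + b * x         # fitted values

total = ((y - ybar) ** 2).sum()
explained = ((yhat - ybar) ** 2).sum()
residual = ((y - yhat) ** 2).sum()

print(total - explained - residual)      # ~0 (Problem 4)
print(explained - Sxy ** 2 / Sxx)        # ~0 (Problem 5)
```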
6. Run four simple regressions for the data sets provided in the table below. Report your results and plot the data sets on four different graphs. What conclusion can you make from this? Anscombe's quartet:

 x1    y1     x2    y2     x3    y3     x4    y4
10.0   8.04  10.0   9.14  10.0   7.46   8.0   6.58
 8.0   6.95   8.0   8.14   8.0   6.77   8.0   5.76
13.0   7.58  13.0   8.74  13.0  12.74   8.0   7.71
 9.0   8.81   9.0   8.77   9.0   7.11   8.0   8.84
11.0   8.33  11.0   9.26  11.0   7.81   8.0   8.47
14.0   9.96  14.0   8.10  14.0   8.84   8.0   7.04
 6.0   7.24   6.0   6.13   6.0   6.08   8.0   5.25
 4.0   4.26   4.0   3.10   4.0   5.39  19.0  12.50
12.0  10.84  12.0   9.13  12.0   8.15   8.0   5.56
 7.0   4.82   7.0   7.26   7.0   6.42   8.0   7.91
 5.0   5.68   5.0   4.74   5.0   5.73   8.0   6.89
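For reference, the four fits can be run in a few lines (np.polyfit stands in for the hand formulas; the data are the quartet values from the table above):

```python
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
ys = [
    [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
    [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
]
xs = [x123, x123, x123, x4]

for x, y in zip(xs, ys):
    slope, intercept = np.polyfit(x, y, 1)
    print(round(slope, 2), round(intercept, 1))
# All four data sets give nearly identical estimates (about 0.50 and 3.0),
# even though their scatter plots look completely different.
```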
7. Let $x$ and $y$ be independent random variables with means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$, $\sigma_2^2$. Determine the correlation coefficient of $x$ and $z = x - y$ in terms of $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$.
8. Suppose we estimate the model
$$y_i = \alpha + u_i, \quad \text{where } u_i \sim N(0, \sigma^2), \quad i = 1, \dots, N.$$
9. Suppose
$$y = (\alpha + \beta x)\, e,$$
where $y$ and $x$ are scalar observables and $e$ is unobservable. Let $E[e \mid x] = 1$ and $Var[e \mid x] = 1$. How would you estimate $\alpha$ and $\beta$ by OLS? How would you construct the standard errors?
10. Let
$$z_n \overset{d}{\to} N\left( 0,\ \sigma^2 Q_{xx} \right) \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2 \underset{n \to \infty}{\longrightarrow} Q_{xx}.$$
Show that
$$\frac{z_n}{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar x)^2} \overset{d}{\to} N\left( 0,\ \sigma^2 Q_{xx}^{-1} \right).$$