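[Figure: population regression line $Y = \beta_0 + \beta_1 X$ with population slope $\beta_1$. Points are marked as chosen in sample or not chosen in sample; $u_i$ is the population error for observation $i$.]
Sample Regression Equation
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$, where $\hat{\beta}_1$ is the estimated slope and $\hat{\beta}_0$ is the estimated intercept.
[Figure: sample regression line plotted against the population line. $\hat{u}_3 = Y_3 - \hat{Y}_3$ is the estimated error (residual) for $X_3$.]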
The OLS estimator solves
$$\min_{b_0,\, b_1} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_i) \right]^2$$
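The minimization above has a well-known closed-form solution, which can be sketched in a few lines of Python. The helper names (`ols_fit`, `ssr`) and the six data points are made up for illustration; they are not the California data.

```python
def ols_fit(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance and variance terms (the 1/(n-1) factors cancel).
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ssr(x, y, b0, b1):
    """Sum of squared residuals for a candidate line b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical illustrative data (class size, test score).
x = [15.0, 17.0, 19.0, 21.0, 23.0, 25.0]
y = [680.0, 665.0, 660.0, 650.0, 648.0, 640.0]

b0_hat, b1_hat = ols_fit(x, y)
print("intercept:", b0_hat, "slope:", b1_hat)

# Any other candidate line should have a larger sum of squared residuals.
print(ssr(x, y, b0_hat, b1_hat) <= ssr(x, y, b0_hat + 1.0, b1_hat - 0.1))
```

Nudging the candidate intercept or slope away from the fitted values and recomputing `ssr` is a quick way to see that the fitted line is the minimizer.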
California Test Score/Class Size data
Interpretations
Predicted values & residuals:
OLS regression: STATA output
regress testscr str, robust
Regression with robust standard errors Number of obs = 420
F( 1, 418) = 19.26
Prob > F = 0.0000
R-squared = 0.0512
Root MSE = 18.581
-------------------------------------------------------------------------
| Robust
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------+----------------------------------------------------------------
str | -2.279808 .5194892 -4.39 0.000 -3.300945 -1.258671
_cons | 698.933 10.36436 67.44 0.000 678.5602 719.3057
-------------------------------------------------------------------------
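As a rough illustration of how to read this output, the sketch below plugs the reported coefficients into the fitted line. The numbers are copied from the table above; the function name `predict_testscr` and the example class sizes are assumptions chosen for illustration.

```python
# Coefficients and standard error as reported in the STATA output above.
coef_str = -2.279808    # estimated slope on str (student-teacher ratio)
se_str = 0.5194892      # robust standard error of the slope
coef_cons = 698.933     # estimated intercept


def predict_testscr(str_value):
    """Predicted test score from the fitted line for a given student-teacher ratio."""
    return coef_cons + coef_str * str_value


# Predicted score for a hypothetical district with 20 students per teacher.
pred_20 = predict_testscr(20.0)

# Predicted change in test score if class size falls by 2 students per teacher.
change_minus_2 = coef_str * (-2.0)

# The t column is the coefficient divided by its standard error.
t_str = coef_str / se_str

print(pred_20, change_minus_2, round(t_str, 2))
```

The ratio `t_str` rounds to the value shown in the `t` column, which is a useful sanity check when copying numbers out of regression output.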
Measures of Fit
The Standard Error of the
Regression (SER)
Root Mean Squared Error (RMSE)
$R^2$ and SER Example
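A minimal sketch of how the three measures of fit can be computed from the residuals, using the standard definitions: $R^2 = 1 - SSR/TSS$, the SER divides the sum of squared residuals by $n - 2$, and the RMSE divides it by $n$. The function name `fit_measures` and the data are hypothetical.

```python
import math


def fit_measures(x, y):
    """Return (R^2, SER, RMSE) for the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Residuals: actual value minus predicted value.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ssr = sum(u ** 2 for u in residuals)        # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares

    r_squared = 1.0 - ssr / tss
    ser = math.sqrt(ssr / (n - 2))   # degrees-of-freedom correction for two coefficients
    rmse = math.sqrt(ssr / n)        # no degrees-of-freedom correction
    return r_squared, ser, rmse


# Hypothetical illustrative data (not the California data set).
x = [15.0, 17.0, 19.0, 21.0, 23.0, 25.0]
y = [680.0, 665.0, 660.0, 650.0, 648.0, 640.0]
r_squared, ser, rmse = fit_measures(x, y)
print(r_squared, ser, rmse)
```

With a large sample the SER and RMSE are nearly identical, since dividing by $n - 2$ or by $n$ makes little difference.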
The Least Squares Assumptions
LSA #1: E(u|X = x) = 0
LSA #2: $(X_i, Y_i)$, $i = 1, \ldots, n$ are i.i.d.
LSA #3: $E(X^4) < \infty$ and $E(Y^4) < \infty$
OLS can be sensitive to an outlier
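A small sketch of this sensitivity, using made-up data: ten points lie exactly on a line with slope 2, and then a single value is replaced by a data-entry-style error. The helper `ols_slope` and all numbers are assumptions for illustration.

```python
def ols_slope(x, y):
    """OLS slope from the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Clean data: every point lies exactly on Y = 1 + 2 X.
x = [float(i) for i in range(1, 11)]
y_clean = [1.0 + 2.0 * xi for xi in x]

# Same data, but the last observation is recorded as -50 instead of 21.
y_outlier = list(y_clean)
y_outlier[-1] = -50.0

slope_clean = ols_slope(x, y_clean)
slope_outlier = ols_slope(x, y_outlier)
print(slope_clean, slope_outlier)
```

In this example one bad observation out of ten is enough to flip the sign of the estimated slope, because the outlier sits far from the mean of X and so has high leverage.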
Sampling Distribution of $\hat{\beta}_1$
Some Preliminary Algebra
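$$\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(u_i - \bar{u})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})\, u_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$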
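The identity $\hat{\beta}_1 - \beta_1 = \sum (X_i - \bar{X}) u_i / \sum (X_i - \bar{X})^2$ can be checked numerically on simulated data, where the true coefficients and errors are known. The parameter values, sample size, and seed below are arbitrary assumptions.

```python
import random

rng = random.Random(42)

# Assumed "population" values, chosen only for this check.
beta0, beta1 = 3.0, -1.5
n = 200

x = [rng.uniform(0.0, 10.0) for _ in range(n)]
u = [rng.gauss(0.0, 2.0) for _ in range(n)]
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

x_bar = sum(x) / n
y_bar = sum(y) / n
u_bar = sum(u) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# OLS slope estimate from the simulated sample.
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Left-hand side and the two equivalent right-hand sides of the identity.
lhs = beta1_hat - beta1
rhs_centered = sum((xi - x_bar) * (ui - u_bar) for xi, ui in zip(x, u)) / sxx
rhs_uncentered = sum((xi - x_bar) * ui for xi, ui in zip(x, u)) / sxx

print(lhs, rhs_centered, rhs_uncentered)
```

The centered and uncentered versions agree because the deviations $X_i - \bar{X}$ sum to zero, so the $\bar{u}$ term drops out.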
Now we can calculate $E(\hat{\beta}_1)$ and $\mathrm{var}(\hat{\beta}_1)$
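$$E[\hat{\beta}_1] = E[\beta_1] + E\left[ \frac{\sum_{i=1}^{n} (X_i - \bar{X})\, u_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right]$$
$$E[\hat{\beta}_1] = \beta_1 + E\left[ \frac{\sum_{i=1}^{n} (X_i - \bar{X})\, E[u_i \mid X_i]}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right]$$
By LSA #1, $E[u_i \mid X_i] = 0$, so $E[\hat{\beta}_1] = \beta_1$: the OLS estimator is unbiased.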
Next calculate $\mathrm{var}(\hat{\beta}_1)$
The larger the variance of X, the smaller the variance of $\hat{\beta}_1$
There are the same number of black and blue dots. Using which would you get a more accurate regression line?
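One way to explore this question is a small Monte Carlo experiment: draw many samples with a narrow spread of X and many with a wide spread, and compare how much the estimated slope varies across samples. Everything below (true coefficients, sample size, error distribution, seed) is an assumption chosen for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope from the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(x_sd, n=50, reps=2000, beta0=1.0, beta1=2.0, seed=0):
    """Estimated slopes from `reps` simulated samples with X ~ N(0, x_sd^2)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0.0, x_sd) for _ in range(n)]
        # Errors have the same standard deviation (1.0) in both designs.
        y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


var_narrow = statistics.variance(simulate_slopes(x_sd=1.0))
var_wide = statistics.variance(simulate_slopes(x_sd=3.0))
print("narrow X:", var_narrow, "wide X:", var_wide)
```

In this simulation the samples with the wider spread of X produce slope estimates that vary much less from sample to sample, in line with the slide's claim.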
What is the sampling distribution of $\hat{\beta}_1$?
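A simulation can give a feel for the answer. The sketch below draws repeated samples from an assumed population with non-normal (uniform) errors, estimates the slope in each, and summarizes the resulting distribution. The population values, sample size, and seed are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope from the regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(123)
beta0, beta1 = 1.0, 2.0   # assumed population intercept and slope
n, reps = 100, 2000

slopes = []
for _ in range(reps):
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Uniform errors: mean zero, independent of X, but clearly not normal.
    y = [beta0 + beta1 * xi + rng.uniform(-2.0, 2.0) for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = statistics.mean(slopes)
sd_slope = statistics.stdev(slopes)

# Share of estimates within 1.96 standard deviations of their mean;
# a value near 0.95 is what a normal distribution would give.
share_within = sum(abs(b - mean_slope) <= 1.96 * sd_slope for b in slopes) / reps

print(mean_slope, sd_slope, share_within)
```

Under these assumptions the estimates center on the true slope and roughly 95% fall within 1.96 standard deviations of the mean, which is consistent with an approximately normal sampling distribution in large samples.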
We are now ready to turn to hypothesis tests & confidence
intervals