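1. Use a calculator to compute the sample least squares regression line for
the model \( y = \beta_0 + \beta_1 x + \varepsilon \), given the following six observations.

y    2    8    6   12    9   11
x    1    4    3   10   10    8

The sample means are
\[ \bar{x} = \frac{1 + 4 + 3 + 10 + 10 + 8}{6} = 6; \qquad \bar{y} = \frac{2 + 8 + 6 + 12 + 9 + 11}{6} = 8 \]
and the sums of squares and cross-products of the deviations are
\[ \sum_{i=1}^{6} (x_i - \bar{x})^2 = 74, \qquad \sum_{i=1}^{6} (x_i - \bar{x})(y_i - \bar{y}) = (1 - 6)(2 - 8) + \cdots + (8 - 6)(11 - 8) = 62. \]
Hence
\[ b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{62}{74} = 0.8378, \qquad b_0 = \bar{y} - b_1 \bar{x} = 8 - 0.8378 \times 6 = 2.9732. \]
Thus the sample regression line is
\[ \hat{y} = 2.9732 + 0.8378\,x. \]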
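The calculator steps can be reproduced with a short Python sketch. It uses only the six observations given in the question; the variable names are illustrative choices, not part of the original solution.

```python
# Six observations from the question.
x = [1, 4, 3, 10, 10, 8]
y = [2, 8, 6, 12, 9, 11]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sum of cross-products and sum of squares of deviations from the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = s_xy / s_xx          # slope estimate
b0 = y_bar - b1 * x_bar   # intercept estimate

print(f"x_bar = {x_bar}, y_bar = {y_bar}, s_xy = {s_xy}, s_xx = {s_xx}")
print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")
```

Carrying the unrounded slope 62/74 through gives an intercept of 2.9730; the 2.9732 in the hand calculation arises from using the slope already rounded to 0.8378.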
2. Suppose a sample of observations yields least squares estimates of
\( b_0 = -32 \) and \( b_1 = 0.82 \).
(a) What does \( \varepsilon \) represent in the model?
(b) State the basic (classical) assumptions made about the \( \varepsilon \)s in this
model. Explain in words what the assumptions mean.
(i) \( E(\varepsilon_i \mid x_i) = 0 \) for all observations. The conditional mean of the
disturbance does not depend on x and is normalized to zero. Note that this is
different from Keller, who only mentions the normalization to zero. That the
conditional mean of the disturbances does not depend on x ensures unbiasedness
of the OLS estimator and so is the much more important component of this
assumption. Relating back to the previous part of the question, it implies
that omitted factors that might affect expenditure but appear in the
disturbance are assumed to be uncorrelated with x.
(ii) The observations \( (x_i, y_i) \), \( i = 1, \ldots, n \), are drawn by simple random
sampling and hence are iid.
(iii) The standard deviation of \( \varepsilon \) is constant for all observations. It is
denoted by \( \sigma_\varepsilon \) and we say the disturbances are homoskedastic. Here that
implies the variability in food expenditure does not depend on income, which
is possibly problematic in practice.
(iv) The disturbances for any two observations are independent. This implies,
in particular, that there is no correlation between disturbances associated
with different observations. In this example the factors in the disturbance
for household i are not correlated with those for household j.
(v) \( \varepsilon \) is normally distributed for all observations.
Does the estimate of \( b_0 = -32 \) make sense? If not, does this necessarily
invalidate the model? Explain your answer.
The estimate indicates that if a household had a zero weekly income then, on
average, such a household would have negative consumption, which does not make
sense. However, this does not necessarily invalidate the model. It may be that
the linear model is only a reasonable approximation for some range of
household incomes, not including incomes near zero. In particular, the
relationship may be nonlinear for values of x near zero. The conclusion is
that we should be careful in interpreting the intercept term, as it may not be
very meaningful in some cases.
(c) Interpret both \( \beta_1 \) and \( b_1 \). What does the model predict would be the
change in y following a $10 increase in x from some initial level?
\( \beta_1 \) is the (unknown) population change in the value of y resulting from a
one unit increase in x, whereas \( b_1 = 0.82 \) is an estimate of \( \beta_1 \). In this
particular example this is the marginal propensity to consume that would be
discussed in economics courses. The predicted change in y following a $10
increase in x would be \( 10 \times b_1 = 10 \times 0.82 \), that is, $8.20.
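If we let \( b_1^* \) be the estimated slope coefficient when the variables are
measured in cents, we have
\[ b_1^* = \frac{\sum (100 x_i - 100 \bar{x})(100 y_i - 100 \bar{y})}{\sum (100 x_i - 100 \bar{x})^2} = \frac{100^2 \sum (x_i - \bar{x})(y_i - \bar{y})}{100^2 \sum (x_i - \bar{x})^2} = b_1. \]
Also, if we denote by \( b_0^* \) the estimated intercept in this case, then we have
\[ b_0^* = 100 \bar{y} - b_1^* \, 100 \bar{x} = 100 (\bar{y} - b_1 \bar{x}) = 100\, b_0. \]
Thus estimation of this model (with the same, but rescaled, data) would lead
to an unchanged \( b_1 \), whilst the intercept term would become
\( 100\, b_0 = -3200 \) (in cents).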
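(e) Suppose y were measured in dollars but x were measured in cents.
What effects would this have on the estimated coefficient of x?
With \( b_1^* \) and \( b_0^* \) now denoting the estimates obtained when only x is
measured in cents, we have
\[ b_1^* = \frac{\sum (100 x_i - 100 \bar{x})(y_i - \bar{y})}{\sum (100 x_i - 100 \bar{x})^2} = \frac{100 \sum (x_i - \bar{x})(y_i - \bar{y})}{100^2 \sum (x_i - \bar{x})^2} = \frac{b_1}{100}, \]
\[ b_0^* = \bar{y} - b_1^* \, 100 \bar{x} = \bar{y} - \frac{b_1}{100}\, 100 \bar{x} = b_0. \]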
Now estimation of this model would lead to the estimated coefficient of the
income variable being 0.0082, and the estimated intercept would be unchanged.
This makes sense since:
If income is measured in dollars, we predict expenditure (in dollars) will
increase by $0.82 if household income increases by one dollar.
If income is measured in cents, we predict expenditure (in dollars) will
increase by $0.0082 if household income increases by one cent.
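Both rescaling results can be checked numerically with a small Python sketch. The household data behind these estimates are not given, so the six observations from question 1 stand in purely as illustrative dollar amounts (an assumption); the helper name `ols` is likewise hypothetical.

```python
def ols(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Illustrative data only (assumption): question 1's observations,
# treated here as dollar amounts.
x_dollars = [1, 4, 3, 10, 10, 8]
y_dollars = [2, 8, 6, 12, 9, 11]
x_cents = [100 * v for v in x_dollars]
y_cents = [100 * v for v in y_dollars]

b0, b1 = ols(x_dollars, y_dollars)       # both in dollars
b0_cc, b1_cc = ols(x_cents, y_cents)     # both in cents
b0_dc, b1_dc = ols(x_cents, y_dollars)   # y in dollars, x in cents

print("both in cents:   slope ratio", b1_cc / b1, " intercept ratio", b0_cc / b0)
print("only x in cents: slope ratio", b1_dc / b1, " intercept ratio", b0_dc / b0)
```

Up to floating-point rounding, the ratios should come out as 1 and 100 when both variables are in cents, and as 0.01 and 1 when only x is in cents.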
We can think of the residual \( e_i \) as an estimate of the true random
disturbance \( \varepsilon_i \) associated with observation i.
3. Computing Exercise #4
Refer to the Computing Work document in Course Documents on the
Blackboard website. Answer the 2 questions associated with simple linear
regression on pages 21 and 22.
As indicated below in the line fit plot produced for the first part of the
question, there is a positive correlation between the returns on Intel stock
and the overall market return. However, there is considerable variation around
the superimposed linear relationship.
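Discussion:
i) What is the sample regression line?
From the Excel regression output below:
\[ \widehat{\text{INTEL}} = 0.022 + 1.472\,\text{INDEX}, \]
where INTEL is the return on Intel stock and INDEX is the return on the total
market.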
ii) Is there sufficient evidence to infer at the 5% significance level that
there is a linear relationship between the return on Intel
Corporation stock and the return on the total market?
The appropriate hypothesis to be tested is:
\[ H_0: \beta_1 = 0; \qquad H_1: \beta_1 \neq 0 \]
which, according to the Excel output, yields a p-value of 0.0069, and so for any
significance level greater than 0.0069 (which includes 5%) we would reject
the null and conclude there is evidence to suggest a linear relationship.
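In a simple regression the two-sided t test of a zero slope and the ANOVA F test are equivalent, so the Significance F figure in the Excel output can be read as the p-value for this test. The sketch below, which uses only figures from that output (variable names are illustrative), checks that the squared t statistic matches F up to rounding.

```python
# Figures taken from the Excel summary output.
b1 = 1.47163       # estimated coefficient on INDEX
se_b1 = 0.52052    # its standard error
f_stat = 7.993182  # F statistic from the ANOVA table

# t statistic for H0: beta_1 = 0.
t_stat = b1 / se_b1

print(f"t = {t_stat:.4f}")
print(f"t squared = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

Small differences in the last digits are expected because the coefficient and standard error are reported to five decimal places.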
iii) Is there sufficient evidence to infer at the 5% significance level that
Intel Corporation stock is more sensitive than the average stock?
Now the appropriate hypothesis to be tested is:
\[ H_0: \beta_1 = 1; \qquad H_1: \beta_1 > 1 \]
The standardized test statistic for this hypothesis is:
\[ t = \frac{b_1 - 1}{s_{b_1}} = \frac{1.47163 - 1}{0.52052} = 0.9061 \]
Using a t critical value with 40 degrees of freedom (actually \( n - 2 = 46 \)
degrees of freedom, but this value is not in the tables) yields a rejection
region of t > 1.684. Alternatively, with a relatively large sample size we can
invoke the CLT and use the 5% normal critical value of 1.645.
In either case the calculated test statistic falls well short of the rejection
region and we cannot reject the null hypothesis.
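The test can be sketched in Python as follows. The slope and standard error come from the Excel output, the t critical value is the tabulated 40-degrees-of-freedom figure quoted in the text, and the one-sided 5% normal critical value is computed with the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

b1 = 1.47163       # estimated slope from the Excel output
se_b1 = 0.52052    # its standard error
beta1_null = 1.0   # slope value under H0

# Standardized test statistic for H0: beta_1 = 1 against H1: beta_1 > 1.
t_stat = (b1 - beta1_null) / se_b1

t_crit_40 = 1.684                     # tabulated value quoted in the text
z_crit = NormalDist().inv_cdf(0.95)   # one-sided 5% normal critical value

print(f"t = {t_stat:.4f}")
print(f"normal critical value = {z_crit:.3f}")
# Compare against the smaller (less demanding) of the two critical values.
print("reject H0" if t_stat > min(t_crit_40, z_crit) else "cannot reject H0")
```

Either critical value leads to the same decision here, since the statistic is below both.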
iv) Discuss the significance of the findings.
While there is strong evidence of a positive relationship between the returns,
the evidence on whether the Intel stock is more or less sensitive to the
market is weak. The point estimate of 1.472 indicates evidence in favour of
being more sensitive, but we cannot exclude the possibility that it is in fact
less sensitive. The 95% CI provided by Excel is (0.424, 2.519) and hence
includes values consistent with both possibilities.
v) Explain the meaning of the regression and residual sum of squares.
The total sum of squares (0.4446), representing the total variation in the
dependent variable (returns on Intel stock), can be decomposed into two
parts: a regression sum of squares (0.0658), representing the part explained
by the regression model, and the residual sum of squares (0.3788),
representing the part left over and unexplained by the model. In this case the
latter is large relative to the former, leading to an \( R^2 \) of 0.148, which
indicates that only 14.8% of the variation in Intel stock returns is being
explained by the market model.
This is consistent with our initial observation from the scatter plot that
there was considerable variation around the trend line. See also the line fit
plot that overlays the estimated market model on the bivariate scatter.
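The decomposition and the statistics derived from it can be verified with a few lines of Python. All inputs are read from the ANOVA section of the Excel output; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression = 0.065822161
ss_residual = 0.378800255
ss_total = 0.444622416
df_regression, df_residual = 1, 46

# Share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Mean squares, F statistic and standard error of the regression.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
std_error = ms_residual ** 0.5

print(f"SSR + SSE = {ss_regression + ss_residual:.9f} (SST = {ss_total})")
print(f"R squared = {r_squared:.4f}")
print(f"F = {f_stat:.4f}, standard error = {std_error:.4f}")
```

These reproduce the R Square, F and Standard Error entries reported by Excel.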
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.3848
R Square             0.1480
Adjusted R Square    0.1295
Standard Error       0.0907
Observations         48

ANOVA
              df    SS             MS          F           Significance F
Regression     1    0.065822161    0.065822    7.993182    0.0069287
Residual      46    0.378800255    0.008235
Total         47    0.444622416

              Coefficients    Standard Error    t Stat
Intercept     0.02192         0.01508           1.45365
INDEX         1.47163         0.52052           2.82722
[Figure: INDEX Line Fit Plot. INTEL and Predicted INTEL (vertical axis) plotted against INDEX (horizontal axis).]