
UNIVERSIDADE FEDERAL DA PARAÍBA

CENTRO DE CIÊNCIAS SOCIAIS APLICADAS
DEPARTAMENTO DE ECONOMIA

ECONOMETRICS

LECTURE 01: Introduction

Professor: Erik Figueiredo


Quotes ... and more quotes
• All our science, measured against reality, is primitive and
childlike, and yet it is the most precious thing we have.
Albert Einstein

• I could be bounded in a nutshell and count myself a king of infinite
space. William Shakespeare, “Hamlet”, act II, scene II.

• It does take maturity to realize that models are to be used but not
to be believed. Henri Theil

• Inspect every piece of pseudoscience and you will find a security
blanket, a thumb to suck, a skirt to hold. Isaac Asimov
Lecture Outline

• Introduction;
• Method of inquiry;
• Cross-sectional data;
• Time series;
• Panel data;
• Regression and causality.
Logic of an Econometric Study
• Vienna Circle, 1929: the advent of logical
positivism.

• Kurt Gödel, Werner Karl Heisenberg
• Albert Einstein: Does God play dice?

• Einstein and Gödel, Princeton (1950), USA

Formalist method:
deterministic and stochastic
• Heisenberg and Niels Bohr: Yes, He does play dice.
What is more, with Erwin Schrödinger's cat on
His lap.

(Heisenberg) (Niels Bohr) (Erwin Schrödinger)

Formalist method: deterministic and stochastic
• The linear model of Gauss (Johann Carl
Friedrich Gauss, 1777-1855)

• Y=Xβ+u
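
As a reminder of how the linear model is usually taken to data, the block below states the least-squares estimator in matrix form. The symbols \hat{\beta} and \hat{u}, and the full-column-rank assumption on X, are additions for illustration and do not appear on the slide.

```latex
% Linear model from the slide: Y = X\beta + u
% Assumption (not on the slide): X has full column rank, so X'X is invertible.
\[
\hat{\beta} = \arg\min_{b}\,(Y - Xb)'(Y - Xb) = (X'X)^{-1}X'Y,
\qquad
\hat{u} = Y - X\hat{\beta}.
\]
```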

And so, at last, econometrics ...


Introduction
Francis Galton
• Sir Francis Galton (16 February 1822 – 17 January 1911), cousin of
Douglas Strutt Galton, cousin of Charles Darwin, was an English
Victorian polymath: anthropologist, eugenicist, tropical explorer,
geographer, inventor, meteorologist, proto-geneticist,
psychometrician, and statistician. He was knighted in 1909. AND ...
Simplified Galtonian Problem
Height of parents and children
• How much does the parents' height
affect the children's height?

Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
Intercept    23.94153     2.81088    8.517    <2e-16 ***
parent        0.64629     0.04114   15.711    <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05

R-squared: 0.2105, Adjusted R-squared: 0.2096
F-statistic: 246.8 on 1 and 926 DF, p-value: < 0.00
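
To make the estimates concrete, the sketch below plugs a few parent heights into the fitted line. Only the two coefficients come from the output above; the function names and the example heights are hypothetical, and heights are assumed to be in inches, which the slide does not state.

```python
# Fitted line reported in the regression output above.
INTERCEPT = 23.94153
SLOPE = 0.64629  # coefficient on `parent`


def predicted_child_height(parent_height: float) -> float:
    """Average child height implied by the fitted line for a given parent height."""
    return INTERCEPT + SLOPE * parent_height


def fixed_point() -> float:
    """Parent height at which the predicted child height equals the parent height."""
    return INTERCEPT / (1.0 - SLOPE)


if __name__ == "__main__":
    for parent in (64.0, 68.0, 72.0):  # hypothetical parent heights
        print(parent, round(predicted_child_height(parent), 2))
    print("fixed point:", round(fixed_point(), 2))
```

Because the slope is below one, parents taller than the fixed point are predicted to have children shorter than themselves on average, and shorter parents the opposite, which is the regression toward the mean that Galton described.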
Tall parents, short child.

Tall child,
short parents.
Rules Hold on Average
• Rules are established ON AVERAGE.
• Individually, the rules can be
broken:

• YES, they can!


The Random Term
Residuals of the Galtonian Regression
[Figure: kernel density of the residuals, density.default(x = resid). Vertical axis: Density. N = 928, Bandwidth = 0.5134]


Specification
• Factors omitted from the regression:
• Diet, participation in physical
activities, health ...

• The dependent variable consists of:

• Y = [AN EXPLAINED PART (which I want to
maximize)] + [an unexplained part]
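
The split of Y into an explained and an unexplained part can be sketched numerically. The example below is a minimal illustration on made-up data, assuming a simple regression with an intercept; the helper names `ols_simple` and `decompose` are hypothetical.

```python
from statistics import fmean


def ols_simple(x, y):
    """Closed-form OLS for y = a + b*x; returns (a, b)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def decompose(x, y):
    """Split the variation in y into explained and unexplained sums of squares."""
    a, b = ols_simple(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    my = fmean(y)
    tss = sum((yi - my) ** 2 for yi in y)       # total variation
    ess = sum((fi - my) ** 2 for fi in fitted)  # explained part
    rss = sum(r ** 2 for r in resid)            # unexplained part
    return tss, ess, rss


# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
tss, ess, rss = decompose(x, y)
r_squared = ess / tss  # share of the variation that is explained
print(round(r_squared, 4))
```

Maximizing the explained part is the same as minimizing the residual sum of squares, and the ratio `ess / tss` is the R-squared reported in regression output.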
Residuals
• They should be purely random.
• Common term: “white noise”.

• What is random cannot be predicted!


Residuals
• The random term follows some known
distribution (parametric econometrics), for
example:
• Normal, Student's t, Pareto ...
• This is where knowledge of probability
theory comes in.
• i.i.d. data?
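
One informal way to look at the "cannot be predicted" idea is to ask whether a series is correlated with its own past. The sketch below compares simulated white noise with a random walk built from the same shocks. The simulation design, the seed, and the function name `lag1_autocorr` are assumptions made for illustration; a near-zero lag-1 autocorrelation is only a necessary symptom of white noise, not a proof of i.i.d. data.

```python
import random
from statistics import fmean


def lag1_autocorr(series):
    """Sample autocorrelation at lag 1."""
    m = fmean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series[:-1], series[1:]))
    den = sum((v - m) ** 2 for v in series)
    return num / den


rng = random.Random(0)  # fixed seed so the run is reproducible
white_noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# A random walk built from the same shocks: strongly tied to its own past.
random_walk = []
level = 0.0
for shock in white_noise:
    level += shock
    random_walk.append(level)

print(round(lag1_autocorr(white_noise), 3))
print(round(lag1_autocorr(random_walk), 3))
```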
Cross-Sectional Data
• The problem of analyzing a single moment. For
example, results may differ depending on
the phase of the business cycle.
Time Series
• What are they?
• A deterministic series. Let us see.
Determinism?
Panel Data
• Do individual characteristics matter?

• People, states, countries ... They differ
from one another!
• Assume the wage equation:

Wage_it = β + α (Education)_it + ... + η_i + u_it
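
To see why the individual term η_i matters, the sketch below builds a tiny noise-free panel from the wage equation with α = 0.5 and β = 1.0, where the individual with more education also has a larger η_i. All numbers, identifiers, and helper names are hypothetical. Pooled OLS attributes the effect of η_i to education, while subtracting each individual's time average (the within, or fixed-effects, transformation) removes η_i and recovers α.

```python
from statistics import fmean

# Hypothetical panel: (individual id, education, wage). Numbers are made up.
# Wages are built as 1.0 + 0.5 * education + eta_i, with eta_i larger for
# the individual who also has more education, and no idiosyncratic noise.
ETA = {"a": 0.0, "b": 4.0}
EDUC = {"a": [8.0, 9.0, 10.0], "b": [12.0, 13.0, 14.0]}
panel = [(i, e, 1.0 + 0.5 * e + ETA[i]) for i in EDUC for e in EDUC[i]]


def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def within_transform(rows):
    """Subtract each individual's time average from education and wage."""
    ids = {i for i, _, _ in rows}
    mean_e = {i: fmean(e for j, e, _ in rows if j == i) for i in ids}
    mean_w = {i: fmean(w for j, _, w in rows if j == i) for i in ids}
    x = [e - mean_e[i] for i, e, _ in rows]
    y = [w - mean_w[i] for i, _, w in rows]
    return x, y


pooled = slope([e for _, e, _ in panel], [w for _, _, w in panel])
fixed_effects = slope(*within_transform(panel))
print(round(pooled, 3), round(fixed_effects, 3))
```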


Panel Data
• Do individual motivations not matter?
After all, Gisele Bündchen has a twin
sister:
Panel Data
Same dependent variable
(only time varies)

Same explanatory variable

Same vector of BETAS

Hence, panel data are a special case
of SUR
Panel Data

Contemporaneous
exogeneity

Strict exogeneity
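
In the notation of the wage equation, these two conditions are usually written as below. This is a standard textbook formulation, assumed here rather than taken from the slide; x_{it} is a hypothetical label for the regressors of individual i at time t.

```latex
% Assumed textbook statements; x_{it} stands for the regressors of individual i
% at time t and \eta_i for the individual effect, as in the wage equation.
\[
\text{Contemporaneous exogeneity:}\quad
E[u_{it} \mid x_{it}, \eta_i] = 0, \quad t = 1,\dots,T
\]
\[
\text{Strict exogeneity:}\quad
E[u_{it} \mid x_{i1},\dots,x_{iT}, \eta_i] = 0, \quad t = 1,\dots,T
\]
```

Strict exogeneity is the stronger requirement: the error in period t must be unrelated to the regressors in every period, not only in the current one.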

Final Remarks

• Importance (but be careful!);
• Always grounded in theory;
• Pursuit of robustness;
• The idea trumps the method.
Part II
• Interpretation;
• P-value;
• Assumptions (Gujarati).
Data
scores (lgmath) and female. For these examples, we have taken the
natural log (ln). All the examples are done in Stata, but they can be easily
generated in any statistical package. In the examples below, the variable
write or its log transformed version will be used as the outcome variable.
The examples are used for illustrative purposes and are not intended to
make substantive sense. Here is a table of different types of means for
variable write.

    Variable |       Type   Obs       Mean     [95% Conf. Interval]
-------------+----------------------------------------------------------
       write | Arithmetic   200     52.775     51.45332    54.09668
             |  Geometric   200    51.8496     50.46854    53.26845
             |   Harmonic   200   50.84403     49.40262    52.37208
------------------------------------------------------------------------
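
The three means in the table can be computed with Python's standard library. The sketch below uses a small made-up sample, not the `write` data, and shows the same ordering as in the table: arithmetic, then geometric, then harmonic.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Hypothetical positive scores, made up for illustration (not the `write` data).
scores = [35.0, 44.0, 52.0, 59.0, 62.0, 67.0]

arithmetic = fmean(scores)
geometric = geometric_mean(scores)
harmonic = harmonic_mean(scores)

print(round(arithmetic, 3), round(geometric, 3), round(harmonic, 3))
```

For positive data that are not all equal, the arithmetic mean exceeds the geometric mean, which in turn exceeds the harmonic mean.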

Outcome variable is log transformed

://stats.idre.ucla.edu/other/mult-pkg/faq/general/faqhow-do-i-…terpret-a-regression-model-when-some-variables-are-log-transformed/ Page 1
Exponential Model
Very often, a linear relationship is hypothesized between a log
transformed outcome variable and a group of predictor variables (a
log-linear model). Mathematically, the relationship follows the equation

log(y_i) = β0 + β1*x1 + … + βk*xk + e_i

where y is the outcome variable and x1, .., xk are the predictor variables. In
other words, we assume that log(y) – x'β is normally distributed (or y is
log-normal conditional on all the covariates). Since this is just an ordinary
least squares regression, we can easily interpret a regression coefficient,
say β1, as the expected change in log of y with respect to a one-unit
increase in x1 holding all other variables at any fixed value, assuming that
x1 enters the model only as a main effect. But what if we want to know
what happens to the outcome variable y itself for a one-unit increase in x1?
The natural way to do this is to interpret the exponentiated regression
coefficients, exp(β), since exponentiation is the inverse of the logarithm
function.
Let’s start with the intercept-only model, log(write) = β0.

------------------------------------------------------------------------------
     lgwrite |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   intercept |   3.948347   .0136905   288.40   0.000      3.92135    3.975344
------------------------------------------------------------------------------

We can say that 3.95 is the unconditional expected mean of log of write.
Therefore the exponentiated value is exp(3.948347) = 51.85. This is the
geometric mean of write. The emphasis here is that it is the geometric
mean instead of the arithmetic mean. OLS regression of the original
variable y is used to estimate the expected arithmetic mean and OLS
regression of the log transformed outcome variable is used to estimate the
expected geometric mean of the original variable.
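
The link between the intercept-only model and the geometric mean can be checked directly. In the sketch below only `b0` comes from the table; the sample `y` is made up for illustration, and the intercept-only OLS estimate is taken to be the sample mean of log(y).

```python
import math
from statistics import fmean, geometric_mean

# Intercept of the intercept-only model reported above.
b0 = 3.948347
print(round(math.exp(b0), 2))  # exponentiated intercept

# Hypothetical positive outcomes, made up for illustration.
y = [31.0, 40.0, 47.0, 54.0, 60.0, 65.0]

# In an intercept-only OLS fit the estimate is the sample mean of log(y).
b0_hat = fmean(math.log(v) for v in y)
gm_from_regression = math.exp(b0_hat)
gm_direct = geometric_mean(y)
print(round(gm_from_regression, 4), round(gm_direct, 4), round(fmean(y), 4))
```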
Exponential Model

FAQ How do I interpret a regression model when some variables are log transformed? - IDRE Stats 18/07/17 11'05

------------------------------------------------------------------------------
     lgwrite |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |   .1032614   .0265669     3.89   0.000      .050871    .1556518
   intercept |    3.89207   .0196128   198.45   0.000     3.853393    3.930747
------------------------------------------------------------------------------

log(write) = β0 + β1*female = 3.89 + .10*female

Before diving into the interpretation of these parameters, let’s get the
means of our dependent variable, write, by gender.

Now we can map the parameter estimates to the geometric means for the
two groups. The intercept of 3.89 is the log of the geometric mean of write
when female = 0, i.e., for males. Therefore, its exponentiated value is
the geometric mean for the male group: exp(3.892) = 49.01. What can we
say about the coefficient for female? In the log scale, it is the difference in
the expected means of the log of write between the female students and
the male students (equivalently, the difference in the logs of the two
geometric means). In the original scale of the variable write, it is
the ratio of the geometric mean of write for female students over the
geometric mean of write for male students, exp(.1032614) =
54.34383/49.01222 = 1.11. In terms of percent change, we can say that
switching from male students to female students, we expect to see about
an 11% increase in the geometric mean of writing scores.
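
The arithmetic in this paragraph can be reproduced from the two estimates in the table. The variable names below are hypothetical; the numbers are the ones reported above.

```python
import math

# Estimates from the regression of lgwrite on female reported above.
b0 = 3.89207      # intercept
b1 = 0.1032614    # coefficient on female

gm_male = math.exp(b0)          # geometric mean when female = 0
gm_female = math.exp(b0 + b1)   # geometric mean when female = 1
ratio = gm_female / gm_male     # algebraically equal to exp(b1)

print(round(gm_male, 2), round(gm_female, 2), round(ratio, 2))
```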
Last, let’s look at a model with multiple predictor variables.

------------------------------------------------------------------------------
     lgwrite |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      female |    .114718   .0195341     5.87   0.000      .076194     .153242
        read |   .0066305   .0012689     5.23   0.000     .0041281    .0091329
        math |   .0076792   .0013873     5.54   0.000     .0049432    .0104152
   intercept |   3.135243   .0598109    52.42   0.000     3.017287    3.253198
------------------------------------------------------------------------------

log(write) = β0 + β1*female + β2*read + β3*math


The exponentiated coefficient exp(β1) for female is the ratio of the
expected geometric mean for the female students group over the
expected geometric mean for the male students group, when read and
math are held at some fixed value. Of course, the expected geometric
means for the male and female students group will be different for
different values of read and math. However, their ratio is a constant:
exp(β1). In our example, exp(β1) = exp(.114718) = 1.12. We can say that writing
scores will be 12% higher for the female students than for the male
students. For the variable read, we can say that for a one-unit increase in
read, we expect to see about a 0.7% increase in writing score, since
exp(.0066305) = 1.006653. For a ten-unit increase in read, we expect to
see about a 6.9% increase in writing score, since exp(.0066305*10) =
1.0685526.
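
The percent changes quoted above all follow the same rule, which can be wrapped in a small helper. The function name `percent_change` is hypothetical; the coefficients are those in the regression table.

```python
import math


def percent_change(coef: float, delta: float = 1.0) -> float:
    """Percent change in the geometric mean of y when x rises by `delta` units."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# Coefficients from the regression table above.
print(round(percent_change(0.114718), 1))       # female vs. male
print(round(percent_change(0.0066305), 2))      # read, one-unit increase
print(round(percent_change(0.0066305, 10), 2))  # read, ten-unit increase
```

For small coefficients the result is close to 100 times the coefficient itself, which is why the coefficient on read, .0066305, reads as roughly 0.7%.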

Statistical aspects
Example: the log normal distribution
Example: Log Normal
• Mean:

• Median:
The estimator
• Look up Jensen's inequality
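
As a starting point for that exercise, the sketch below uses the standard formulas for a log-normal variable, mean exp(μ + σ²/2) and median exp(μ), and checks Jensen's inequality on simulated data. The parameter values, the seed, and the function names are assumptions made for illustration; none of them come from the slides.

```python
import math
import random
from statistics import fmean


def lognormal_mean(mu: float, sigma: float) -> float:
    """E[y] when log(y) ~ Normal(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2.0)


def lognormal_median(mu: float) -> float:
    """Median of y when log(y) ~ Normal(mu, sigma^2)."""
    return math.exp(mu)


mu, sigma = 1.0, 0.5    # hypothetical parameters
rng = random.Random(0)  # fixed seed for reproducibility
log_y = [rng.gauss(mu, sigma) for _ in range(10_000)]
y = [math.exp(v) for v in log_y]

# Jensen's inequality for the convex function exp: mean(exp(x)) >= exp(mean(x)).
naive = math.exp(fmean(log_y))  # estimates the median (geometric mean) of y
arithmetic = fmean(y)           # estimates the mean of y
print(round(naive, 3), round(arithmetic, 3))
```

Exponentiating the average of log(y) therefore understates the arithmetic mean of y, which is why exponentiated coefficients from a log regression are read in terms of geometric means.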
