Decomposition Methods
Walter Sosa Escudero
(wsosa@udesa.edu.ar) Universidad de San Andres
November 7, 2011
Source: Fortin, Lemieux and Firpo, 2011, Decomposition Methods in Economics, in Handbook of Labor Economics.
Most of this lecture is based on this chapter. I will not follow their order and will not cover everything.
Between 1992 and 1998 the Gini index for Argentina rose from 0.45 to 0.50. How much of this change is due to changes in the characteristics that determine income (education, age, etc.), and how much to changes in the way these characteristics are paid?
Idea: Y is determined by X through a structural relation, Y = g(X), so Y changes because of changes in X or in g(.).
Decomposition: split the change in some feature of Y (mean, Gini, variance, poverty rate) into a part arising from changes in X and a part arising from changes in the structure.
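A first look at the Oaxaca-Blinder decomposition.
Two periods, A and B, and two regression models, one estimated by OLS for each period. Taking averages:
$$\bar y_A = \bar x_A'\hat\beta_A, \qquad \bar y_B = \bar x_B'\hat\beta_B$$
Subtracting:
$$\bar y_A - \bar y_B = \bar x_A'\hat\beta_A - \bar x_B'\hat\beta_B$$
Subtract and add $\bar x_A'\hat\beta_B$:
$$\bar y_A - \bar y_B = \bar x_A'\hat\beta_A - \bar x_A'\hat\beta_B + \bar x_A'\hat\beta_B - \bar x_B'\hat\beta_B$$
$$\bar y_A - \bar y_B = \bar x_A'(\hat\beta_A - \hat\beta_B) + \hat\beta_B'(\bar x_A - \bar x_B)$$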
$$\bar y_A - \bar y_B = \bar x_A'(\hat\beta_A - \hat\beta_B) + \hat\beta_B'(\bar x_A - \bar x_B)$$
This is the Oaxaca-Blinder decomposition. It decomposes the change in the mean into changes in the structure (the coefficients) and changes in the determinants ($\bar x_A$ and $\bar x_B$).
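The identity above can be checked numerically. The following is a minimal sketch on simulated data; the sample sizes, coefficients and the `simulate` helper are hypothetical and only serve the illustration. Because each regression includes a constant, the OLS residuals average to zero and the decomposition holds exactly in the sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, beta, x_mean):
    """Hypothetical data: a constant plus one regressor, linear outcome."""
    x = np.column_stack([np.ones(n), rng.normal(x_mean, 1.0, n)])
    y = x @ beta + rng.normal(0.0, 0.5, n)
    return x, y

x_a, y_a = simulate(500, np.array([1.0, 0.8]), x_mean=12.0)
x_b, y_b = simulate(500, np.array([0.5, 0.6]), x_mean=10.0)

# OLS, separately for each period
beta_a = np.linalg.lstsq(x_a, y_a, rcond=None)[0]
beta_b = np.linalg.lstsq(x_b, y_b, rcond=None)[0]

xbar_a, xbar_b = x_a.mean(axis=0), x_b.mean(axis=0)

gap = y_a.mean() - y_b.mean()
structure = xbar_a @ (beta_a - beta_b)      # xbar_A'(beta_A - beta_B)
composition = beta_b @ (xbar_a - xbar_b)    # beta_B'(xbar_A - xbar_B)

print(gap, structure + composition)         # the two numbers coincide
```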
Some questions and points:
Go beyond the mean: this decomposes the mean. What about other features (variance, Gini, etc.)?
Invariance issues: what if, instead of $\bar x_A'\hat\beta_B$, we subtract and add $\bar x_B'\hat\beta_A$?
Identification issues: this looks like a rather mechanical exercise. Does it refer to any meaningful population magnitude? Can it be given a causal interpretation?
Detailed decomposition: can we decompose further at the variable level (the effect of each variable)?
Non-linearities: the linear structure seems to play a major role. How restrictive is this?
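The invariance question can be illustrated with a small numerical sketch. The group means and coefficients below are made-up numbers, not estimates from any data. Both versions add up to the same gap, but they split it differently between structure and composition.

```python
import numpy as np

# Hypothetical group means (first entry is the constant) and coefficients.
xbar_a = np.array([1.0, 12.0])
xbar_b = np.array([1.0, 10.0])
beta_a = np.array([1.0, 0.8])
beta_b = np.array([0.5, 0.6])

gap = xbar_a @ beta_a - xbar_b @ beta_b

# Version 1: subtract and add xbar_A' beta_B.
structure_1 = xbar_a @ (beta_a - beta_b)
composition_1 = beta_b @ (xbar_a - xbar_b)

# Version 2: subtract and add xbar_B' beta_A.
structure_2 = xbar_b @ (beta_a - beta_b)
composition_2 = beta_a @ (xbar_a - xbar_b)

print(round(gap, 6))
print(round(structure_1, 6), round(composition_1, 6))
print(round(structure_2, 6), round(composition_2, 6))
```

With these numbers the structure term is about 2.9 under the first version and about 2.5 under the second, so the split depends on the chosen counterfactual.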
Goal: decompose the difference in mean wages between two groups, A and B (gender, union status, periods, etc.).
Wage structure: $Y_g = X'\beta_g + \epsilon_g$, $g = A, B$, where $X$ is a vector of $M$ variables and $E(\epsilon_g | X) = 0$.
$D_B$: group B membership dummy.
Define the overall mean wage gap as
$$\Delta_O = E(Y_B | D_B = 1) - E(Y_A | D_B = 0)$$
Now, a conceptually important step. Use the law of iterated expectations (LIE) to get
$$\Delta_O = E[E(Y_B | X, D_B = 1) | D_B = 1] - E[E(Y_A | X, D_B = 0) | D_B = 0]$$
Now use the wage structure and the exogeneity assumption to get
$$\Delta_O = E(X | D_B = 1)'\beta_B - E(X | D_B = 0)'\beta_A$$
$$\Delta_O = E(X | D_B = 1)'\beta_B - E(X | D_B = 0)'\beta_A$$
Subtract and add $E(X | D_B = 1)'\beta_A$ (the counterfactual mean wage that group B would earn under the structure of group A) to get
$$\Delta_O = E(X | D_B = 1)'(\beta_B - \beta_A) + [E(X | D_B = 1) - E(X | D_B = 0)]'\beta_A = \Delta_S + \Delta_X$$
This is the aggregate Oaxaca-Blinder decomposition. $\Delta_S$ is the structure (unexplained, discrimination) effect, and $\Delta_X$ is the composition effect.
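Detailed decomposition: each effect is a sum of the contributions of the $M$ variables,
$$\Delta_S = \sum_{k=1}^{M} \bar X_{Bk} (\beta_{Bk} - \beta_{Ak})$$
$$\Delta_X = \sum_{k=1}^{M} (\bar X_{Bk} - \bar X_{Ak}) \beta_{Ak}$$
where $\bar X_{gk}$ is the mean of variable $k$ in group $g$.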
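A minimal sketch of the variable-by-variable split, using the B minus A convention of the aggregate decomposition. The data are simulated and the variable names (`education`, `experience`) and all parameter values are hypothetical. Each entry of `structure_k` and `composition_k` is the contribution of one variable, and the entries add up to the mean gap.

```python
import numpy as np

rng = np.random.default_rng(1)
names = ["const", "education", "experience"]

def simulate(n, beta, means):
    """Hypothetical wage data with a constant and two regressors."""
    x = np.column_stack([
        np.ones(n),
        rng.normal(means[0], 2.0, n),
        rng.normal(means[1], 5.0, n),
    ])
    y = x @ beta + rng.normal(0.0, 0.3, n)
    return x, y

x_a, y_a = simulate(800, np.array([1.0, 0.08, 0.02]), (10.0, 15.0))
x_b, y_b = simulate(800, np.array([1.1, 0.10, 0.02]), (12.0, 18.0))

beta_a = np.linalg.lstsq(x_a, y_a, rcond=None)[0]
beta_b = np.linalg.lstsq(x_b, y_b, rcond=None)[0]
xbar_a, xbar_b = x_a.mean(axis=0), x_b.mean(axis=0)

# One term per variable k
structure_k = xbar_b * (beta_b - beta_a)        # Xbar_Bk (beta_Bk - beta_Ak)
composition_k = (xbar_b - xbar_a) * beta_a      # (Xbar_Bk - Xbar_Ak) beta_Ak

for name, s, c in zip(names, structure_k, composition_k):
    print(f"{name:12s} structure={s: .4f} composition={c: .4f}")

gap = y_b.mean() - y_a.mean()
```

The constant contributes nothing to the composition effect, since its mean is one in both groups.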
The survey does a great job here. I won't go into details.
Omitted group problem I: in the detailed decomposition, categorical variables (e.g., levels of education) do not come with a natural base group, so the choice of the omitted category involves some discretion.
Omitted group problem II: in the structure effect, one cannot distinguish the part due to group membership from the part due to differences in the coefficient of the omitted category.
The choice of the counterfactual is not trivial: pooled, average, etc.
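The omitted group problem can be seen in a small sketch with three education levels. The data are constructed by hand (wages equal to hypothetical cell means, so OLS fits exactly), and the helper names are illustrative. Changing the base category leaves the total structure effect unchanged but moves it between the constant and the education dummies.

```python
import numpy as np

def dummies(level, base, n_levels=3):
    """Design matrix: a constant plus one dummy per level except `base`."""
    cols = [np.ones(len(level))]
    cols += [(level == k).astype(float) for k in range(n_levels) if k != base]
    return np.column_stack(cols)

def structure_effect(level_a, y_a, level_b, y_b, base):
    """Return (constant part, education-dummy part) of the structure effect."""
    x_a, x_b = dummies(level_a, base), dummies(level_b, base)
    beta_a = np.linalg.lstsq(x_a, y_a, rcond=None)[0]
    beta_b = np.linalg.lstsq(x_b, y_b, rcond=None)[0]
    terms = x_b.mean(axis=0) * (beta_b - beta_a)
    return terms[0], terms[1:].sum()

# Hypothetical samples: education level 0, 1 or 2; wage = cell mean.
level_a = np.repeat([0, 1, 2], [50, 30, 20])
level_b = np.repeat([0, 1, 2], [30, 40, 30])
y_a = np.array([1.0, 1.5, 2.0])[level_a]
y_b = np.array([1.2, 1.9, 2.8])[level_b]

const0, edu0 = structure_effect(level_a, y_a, level_b, y_b, base=0)
const2, edu2 = structure_effect(level_a, y_a, level_b, y_b, base=2)

print(round(const0 + edu0, 6), round(const2 + edu2, 6))  # same total
print(round(edu0, 6), round(edu2, 6))                    # different education part
```

Here the education dummies account for about 0.26 of the structure effect when level 0 is omitted and about -0.34 when level 2 is omitted, while the total is about 0.46 in both cases.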
General decompositions
Let $T(F_Y)$ be any functional of the CDF of $Y$ (wages, etc.).
Consider $F_{Y_g | D_s}$, $g, s = A, B$. When $g = s$ this is an observable CDF, and when $g \neq s$ it is a counterfactual CDF.
The overall difference is
$$\Delta_O = T(F_{Y_B | D_B}) - T(F_{Y_A | D_A})$$
This will be the subject of the decomposition.
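As an illustration of a functional $T$ other than the mean, the sketch below computes the Gini index from a sample and the overall difference between two groups. The function names and the income vectors are hypothetical; the Gini formula used is the standard rank-based sample expression.

```python
import numpy as np

def gini(y):
    """Sample Gini index: sum_i (2i - n - 1) y_(i) / (n * sum_i y_i)."""
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    ranks = np.arange(1, n + 1)
    return np.sum((2 * ranks - n - 1) * y) / (n * y.sum())

def overall_difference(T, y_b, y_a):
    """Delta_O = T(F_B) - T(F_A), with T evaluated on the two samples."""
    return T(y_b) - T(y_a)

# Hypothetical incomes for the two groups
y_a = np.array([10.0, 12.0, 15.0, 20.0, 43.0])
y_b = np.array([8.0, 10.0, 15.0, 22.0, 60.0])

print(overall_difference(gini, y_b, y_a))     # change in the Gini index
print(overall_difference(np.mean, y_b, y_a))  # change in the mean
```

Any other statistic of the distribution (variance, a quantile, a poverty rate) can be passed as `T` in the same way.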
Relevant counterfactuals
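The idea is to decompose this change into a part arising from changes in X (composition) and a part arising from the way X affects Y (structure). We need to let X speak, so we use the LIE. Note that
$$F_{Y_g | D_g}(y) = \int F_{Y_g | X, D_g}(y | X = x) \, dF_{X | D_g}(x)$$
$F_{Y_g | D_g}$ is observable; a counterfactual CDF such as $F_{Y_A | D_B}$ is not.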
Counterfactual structure: the counterfactual structure is $m_A$ for workers in B and $m_B$ for workers in A.
Overlapping support: for all points in the support of $(X, \epsilon)$, $0 < P(D_B = 1 | X, \epsilon) < 1$.
Conditional independence: $D_g \perp \epsilon \mid X$ (unconfoundedness, selection on observables).
The decomposition
$\Delta_S = T(F_{Y_B | D_B}) - T(F_{Y^C_{A,B}})$ solely reflects differences between $m_A$ and $m_B$.
$\Delta_X = T(F_{Y^C_{A,B}}) - T(F_{Y_A | D_A})$ solely reflects differences in the distributions of $X$ and $\epsilon$ between A and B.
Implementing decompositions
Consequently, the practical idea consists in finding a way to compute the counterfactual distribution
$$F_{Y^C_{A,B}}(y) = \int F_{Y_A | X, D_A}(y | X = x) \, dF_{X | D_B}(x)$$
Residual imputation
Juhn, Murphy and Pierce (1993).
Assume: $Y_{gi} = X_i'\beta_g + \epsilon_{gi}$.
We would need to compute a counterfactual wage for B using the characteristics of B, but $\beta_A$ and $\epsilon_A$. $\beta_A$ can be estimated by OLS. The residuals are more difficult. Assume:
Rank preservation: let $\epsilon_{gi} | X$ be the residuals. Individuals are assumed to hold the same residual rank in A and in B, within their respective conditional distributions.
Independence: the residuals are independent of $X$.
So:
1. Compute $T$ for A and B separately.
2. Estimate linear models for A and B separately.
3. Get the $\hat\beta$s and the residuals.
4. Compute a counterfactual wage, $Y^C_{Bi} = X_{Bi}'\hat\beta_A + \hat\epsilon_{Ai}$, where $\hat\epsilon_{Ai}$ is the estimated residual in group A that has the same rank as the residual of $i$ in group B.
5. Compute $T$ for the counterfactual data.
6. Compute the decomposition.
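The steps above can be sketched as follows. This is only an illustration on simulated data: the parameter values and helper names are hypothetical, $T$ is taken to be the variance, and, relying on the independence assumption, residual ranks are computed in the unconditional residual distribution of each group.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate(n, beta, sigma):
    """Hypothetical data: constant plus one regressor, homoskedastic errors."""
    x = np.column_stack([np.ones(n), rng.normal(0.0, 1.0, n)])
    return x, x @ beta + rng.normal(0.0, sigma, n)

def ols(x, y):
    """Return OLS coefficients and residuals."""
    beta = np.linalg.lstsq(x, y, rcond=None)[0]
    return beta, y - x @ beta

x_a, y_a = simulate(1000, np.array([1.0, 0.5]), sigma=0.4)
x_b, y_b = simulate(1000, np.array([1.2, 0.7]), sigma=0.6)

T = np.var                                   # step 1: the statistic of interest
beta_a, res_a = ols(x_a, y_a)                # steps 2-3: models and residuals
beta_b, res_b = ols(x_b, y_b)

# Step 4: rank of each B residual within B, mapped into (0, 1) ...
rank_b = (np.argsort(np.argsort(res_b)) + 0.5) / len(res_b)
# ... and the A residual found at that same rank.
res_a_imputed = np.quantile(res_a, rank_b)
y_c = x_b @ beta_a + res_a_imputed           # counterfactual wages for B

# Steps 5-6: statistic on counterfactual data, then the decomposition.
delta_s = T(y_b) - T(y_c)                    # structure effect
delta_x = T(y_c) - T(y_a)                    # composition effect
print(delta_s, delta_x, T(y_b) - T(y_a))
```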
Machado and Mata (2005). Simulate wages from quantile regressions (QR):
1. Generate a random number $\theta_i$ from the uniform distribution on (0, 1).
2. Estimate the QR for the quantile $\theta_i$.
3. Compute $Y_i = X_i'\hat\beta(\theta_i)$.
Observed for B: as before.
Counterfactual: use the X for B, but predict using the QR model estimated for A.
Does not assume independence or rank preservation.
Leads to a detailed decomposition of the structure component, obtained by changing the coefficients of one variable at a time (while keeping all others constant).
Reweighting Methods
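Recall that the counterfactual CDF is
$$F_{Y^C_{A,B}}(y) = \int F_{Y_A | X, D_A}(y | X = x) \, dF_{X | D_B}(x)$$
which can be rewritten as
$$F_{Y^C_{A,B}}(y) = \int F_{Y_A | X, D_A}(y | X = x) \, \Psi(x) \, dF_{X | D_A}(x)$$
Then the counterfactual CDF for B is a reweighted version of that for A. Note that
$$\Psi(x) = \frac{f_{X|B}(x)}{f_{X|A}(x)}$$
So
$$\Psi(X) = \frac{Pr(X | D_B = 1)}{Pr(X | D_B = 0)}$$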
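Recall Bayes' rule:
$$Pr(a, b) = Pr(a | b) Pr(b) = Pr(b | a) Pr(a)$$
So
$$Pr(X | D_B = 0) = \frac{Pr(D_B = 0 | X) Pr(X)}{Pr(D_B = 0)}$$
and
$$Pr(X | D_B = 1) = \frac{Pr(D_B = 1 | X) Pr(X)}{Pr(D_B = 1)}$$
Replacing:
$$\frac{Pr(X | D_B = 1)}{Pr(X | D_B = 0)} = \frac{Pr(D_B = 1 | X) / Pr(D_B = 1)}{Pr(D_B = 0 | X) / Pr(D_B = 0)}$$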
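$$\Psi(X) = \frac{Pr(X | D_B = 1)}{Pr(X | D_B = 0)} = \frac{Pr(D_B = 1 | X) / Pr(D_B = 1)}{Pr(D_B = 0 | X) / Pr(D_B = 0)}$$
$$\Psi(X) = \frac{P_{D|X} / P_D}{(1 - P_{D|X}) / (1 - P_D)}$$
where $P_{D|X} = Pr(D_B = 1 | X)$ and $P_D = Pr(D_B = 1)$.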
Only two quantities to estimate:
$Pr(D_B = 1)$ is just the probability of being in group B.
$Pr(D_B = 1 | X)$ can be estimated with a logit/probit model on the pooled data, regressing the binary indicator of group membership on $X$.
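With a discrete $X$ the two expressions for the weight can be compared directly, without any model. The sketch below uses a tiny made-up pooled sample: it computes the ratio $Pr(X | D_B = 1) / Pr(X | D_B = 0)$ from cell frequencies and then the Bayes' rule version based on $Pr(D_B = 1 | X)$ and $Pr(D_B = 1)$. Every value of $X$ must appear in both groups (overlapping support).

```python
import numpy as np

# Hypothetical pooled sample: a discrete X and the group B dummy.
x = np.array([0, 0, 0, 1, 1, 2, 0, 1, 1, 2, 2, 2])
d_b = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

p_d = d_b.mean()                          # Pr(D_B = 1)
psi_direct, psi_bayes = [], []
for v in np.unique(x):
    pr_x_b = np.mean(x[d_b == 1] == v)    # Pr(X = v | D_B = 1)
    pr_x_a = np.mean(x[d_b == 0] == v)    # Pr(X = v | D_B = 0)
    psi_direct.append(pr_x_b / pr_x_a)

    p_dx = d_b[x == v].mean()             # Pr(D_B = 1 | X = v)
    psi_bayes.append((p_dx / p_d) / ((1 - p_dx) / (1 - p_d)))

print(np.round(psi_direct, 4))
print(np.round(psi_bayes, 4))             # same weights, cell by cell
```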
Implementation
1. Compute $T(.)$ for both groups separately.
2. Run a probit/logit of $D_{Bi}$ on $X_i$ using all data.
3. For all observations in A, construct $\hat\Psi_i$ by inserting the predicted probabilities (as before).
4. Compute the counterfactual $T(.)$ using individuals in group A, reweighted by $\hat\Psi_i$.
Counterfactual mean:
$$\frac{1}{N_A} \sum_{i \in A} \hat\Psi_i Y_i$$
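The reweighting procedure can be sketched end to end for the mean. Everything below is illustrative: the data are simulated, the parameter values are hypothetical, and the logit is fitted with a short hand-written Newton-Raphson routine (`fit_logit`, an assumption of this sketch) rather than a packaged estimator.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_logit(x, d, n_iter=25):
    """Logit coefficients by Newton-Raphson (x must include a constant)."""
    beta = np.zeros(x.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-x @ beta))
        grad = x.T @ (d - p)
        hess = (x * (p * (1.0 - p))[:, None]).T @ x
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical data: X differs across groups, and so does the wage structure.
n = 5000
x_a = rng.normal(0.0, 1.0, n)
x_b = rng.normal(0.5, 1.0, n)
y_a = 1.0 + 0.5 * x_a + rng.normal(0.0, 0.3, n)
y_b = 1.2 + 0.7 * x_b + rng.normal(0.0, 0.3, n)

# Step 1: T (here the mean) for both groups.
mean_a, mean_b = y_a.mean(), y_b.mean()

# Step 2: logit of the group B dummy on X, pooled data.
x_pool = np.column_stack([np.ones(2 * n), np.concatenate([x_a, x_b])])
d_b = np.concatenate([np.zeros(n), np.ones(n)])
gamma = fit_logit(x_pool, d_b)

# Step 3: weights for the observations in A.
p_dx = 1.0 / (1.0 + np.exp(-(gamma[0] + gamma[1] * x_a)))
p_d = d_b.mean()
psi = (p_dx / p_d) / ((1.0 - p_dx) / (1.0 - p_d))

# Step 4: counterfactual mean, (1 / N_A) * sum over A of psi_i * Y_i.
mean_c = np.mean(psi * y_a)
delta_s = mean_b - mean_c      # structure effect
delta_x = mean_c - mean_a      # composition effect
print(delta_s, delta_x)
```

A quick sanity check on the weights is that the reweighted mean of $X$ in group A, `np.mean(psi * x_a)`, should be close to the mean of $X$ in group B.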