4 Difference-in-differences analyses

Published

Last modified: 2026-05-07: 6:37:20 (AM)

Many approaches to causal inference assume exchangeability (Definition 1.10) and exploit its consequence (Theorem 1.1):

\[\text{E}{\left[Y(x) | X = x'\right]} = \text{E}{\left[Y(x) | X = x\right]}\]

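Difference-in-differences (DiD) makes a weaker exchangeability assumption, often called parallel trends. It concerns only the change in the untreated potential outcome between a pre-treatment period \(t'\) and a post-treatment period \(t\):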

\[\text{E}{\left[Y_t(0) - Y_{t'}(0) | X = 1\right]} = \text{E}{\left[Y_t(0) - Y_{t'}(0) | X = 0\right]}\]
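As a rough illustration of how this assumption is used, the sketch below simulates a two-group, two-period panel in which untreated outcomes share a common trend, then recovers the effect on the treated as a difference of mean changes. All names (`x`, `y_pre`, `y_post`, `trend`, `effect`) and the data-generating values are hypothetical and chosen only for illustration.

```r
# Hypothetical simulated panel: X = 1 is the treated group; "pre" plays the role
# of t' and "post" the role of t.
set.seed(1)
n <- 2000
x <- rbinom(n, 1, 0.5)                 # treatment group indicator
y_pre <- 1 + 2 * x + rnorm(n)          # groups may differ in level before treatment
trend <- 0.5                           # common change in untreated outcomes (assumed)
effect <- 1.5                          # true effect on the treated (assumed)
y_post <- y_pre + trend + effect * x + rnorm(n, sd = 0.5)

# DiD: mean change among the treated minus mean change among the controls
change <- y_post - y_pre
did <- mean(change[x == 1]) - mean(change[x == 0])

# The same number from a regression of the change score on the group indicator
fit <- lm(change ~ x)
c(did = did, regression = unname(coef(fit)["x"]))
```

The level difference between groups (the `2 * x` term) drops out because only changes are compared, which is the sense in which the assumption is weaker than exchangeability of outcome levels.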

4.1 Change in Changes

The Change in Changes (CiC) model (Athey and Imbens 2006) is a DiD-type method that estimates the Quantile Treatment Effect on the Treated (QTET), that is, treatment effects at quantiles across the distribution of outcomes rather than only at the mean (Callaway 2024).

CiC requires two periods of data (a pre-treatment and a post-treatment period). The data can be either repeated cross-sections or panel data (Callaway 2024).

4.1.1 Assumption

Rather than the standard parallel trends assumption (that average outcomes would have followed parallel paths absent treatment), CiC assumes that the distribution of untreated potential outcomes evolves over time in the same way for both treated and control groups (Athey and Imbens 2006).
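One way to make this concrete, for continuous outcomes and without covariates, is through the counterfactual quantile of the treated group in the post-treatment period. Writing \(F_{gt}\) for the outcome distribution in group \(g\) and period \(t\) (this subscript notation is an assumption of the sketch, not the chapter's), the counterfactual is built from the three untreated cells:

\[q^{N}_{11}(\tau) = F^{-1}_{01}\left(F_{00}\left(F^{-1}_{10}(\tau)\right)\right)\]

The following is a minimal base-R sketch of that idea on simulated data, not the qte implementation. The function name `cic_qtet` and the data-generating process are hypothetical.

```r
# Minimal CiC sketch for continuous outcomes without covariates.
# Argument naming yGT: G = group (1 = treated), T = period (1 = post).
cic_qtet <- function(y00, y01, y10, y11, probs) {
  q10 <- quantile(y10, probs, type = 1, names = FALSE)   # treated, pre-period quantiles
  ranks <- ecdf(y00)(q10)                                # their ranks among controls, pre-period
  q_cf <- quantile(y01, ranks, type = 1, names = FALSE)  # same ranks among controls, post-period
  q11 <- quantile(y11, probs, type = 1, names = FALSE)   # observed treated, post-period quantiles
  q11 - q_cf                                             # QTET at each level in probs
}

# Simulated repeated cross-sections: untreated outcome is h(U, t), increasing in U,
# and the distribution of U within each group does not change over time.
set.seed(2)
n <- 5000
h <- function(u, t) if (t == 0) u else 1 + 2 * u
y00 <- h(rnorm(n), 0)
y01 <- h(rnorm(n), 1)
y10 <- h(rnorm(n, mean = 0.5), 0)
y11 <- h(rnorm(n, mean = 0.5), 1) + 3      # constant treatment effect of 3 (assumed)

probs <- c(0.25, 0.5, 0.75)
est <- cic_qtet(y00, y01, y10, y11, probs)

# Mean-based DiD for comparison; mean parallel trends fails in this design
did_mean <- (mean(y11) - mean(y10)) - (mean(y01) - mean(y00))
round(c(est, did_mean = did_mean), 2)
```

In this design the spread of untreated outcomes doubles between periods, so the two groups' mean changes differ and the mean-based DiD is biased, while the CiC estimates should stay near the assumed effect of 3 at each quantile.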

4.1.2 Covariate adjustment

CiC can condition on covariates in three steps: first fit a linear model for outcomes given group–time indicators and covariates, then residualize by removing the fitted covariate contribution, and finally apply the CiC estimator to these quasi-residuals (Athey and Imbens 2006; Callaway 2024).
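The three steps can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: it assumes that residualizing means subtracting only the fitted covariate part while keeping the group-time effects in the adjusted outcome, and the data frame `d`, the covariate `age`, and the helper `cic_qtet` are all hypothetical.

```r
# Hypothetical repeated cross-sections with one covariate
set.seed(3)
n <- 8000
d <- data.frame(
  g = rbinom(n, 1, 0.5),        # group indicator (1 = treated)
  t = rbinom(n, 1, 0.5),        # period indicator (1 = post)
  age = runif(n, 20, 60)
)
d$y <- 0.1 * d$age + d$g + d$t + 2 * d$g * d$t + rnorm(n)  # effect of 2 (assumed)

# Step 1: linear model with one dummy per group-time cell plus the covariate
d$cell <- interaction(d$g, d$t)
fit <- lm(y ~ 0 + cell + age, data = d)

# Step 2: remove the fitted covariate contribution, keeping group-time effects
d$y_adj <- d$y - unname(coef(fit)["age"]) * d$age

# Step 3: apply a basic CiC estimator to the adjusted outcomes
cic_qtet <- function(y00, y01, y10, y11, probs) {
  q10 <- quantile(y10, probs, type = 1, names = FALSE)
  q_cf <- quantile(y01, ecdf(y00)(q10), type = 1, names = FALSE)
  quantile(y11, probs, type = 1, names = FALSE) - q_cf
}
cell_y <- function(g, t) d$y_adj[d$g == g & d$t == t]
est_adj <- cic_qtet(cell_y(0, 0), cell_y(0, 1), cell_y(1, 0), cell_y(1, 1),
                    probs = c(0.25, 0.5, 0.75))
round(est_adj, 2)
```

Because the covariate enters through a single linear term shared by all cells, this adjustment is only as good as that linear specification.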

4.1.3 R implementation

The CiC() function in the qte R package (Callaway 2024) implements this estimator. Its key arguments are listed below; formla and xformla are the package's actual spellings, not typos:

| Argument | Description |
|---|---|
| formla | y ~ d, where y is the outcome and d is a binary treatment indicator |
| xformla | Optional one-sided formula for additional covariates (e.g., ~ age + education) |
| t | Post-treatment time period |
| tmin1 | Pre-treatment time period |
| tname | Name of the column containing the time variable |
| data | Data frame containing all variables |
| panel | TRUE if the dataset is panel data |
| probs | Vector of quantile levels at which to estimate the QTET |
| iters | Number of bootstrap iterations for standard errors |

The function returns a QTE object with QTET estimates and (optionally) bootstrap confidence intervals at each quantile in probs.
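To give a sense of what bootstrap iterations buy, the sketch below resamples each group-period cell with replacement, recomputes a basic CiC estimate each time, and uses the spread across iterations as a standard error. How qte resamples internally is not described here, so this scheme is an assumption; the helper `cic_qtet`, the simulated cells, and the normal-approximation intervals are likewise hypothetical.

```r
# Basic CiC estimator for continuous outcomes (illustrative helper)
cic_qtet <- function(y00, y01, y10, y11, probs) {
  q10 <- quantile(y10, probs, type = 1, names = FALSE)
  q_cf <- quantile(y01, ecdf(y00)(q10), type = 1, names = FALSE)
  quantile(y11, probs, type = 1, names = FALSE) - q_cf
}

# Hypothetical repeated cross-sections with a constant effect of 2
set.seed(4)
n <- 1000
y00 <- rnorm(n)
y01 <- 1 + rnorm(n)
y10 <- 0.5 + rnorm(n)
y11 <- 3.5 + rnorm(n)

probs <- c(0.25, 0.5, 0.75)
iters <- 200                                   # number of bootstrap iterations
point <- cic_qtet(y00, y01, y10, y11, probs)

# Resample within each cell and re-estimate; result is a length(probs) x iters matrix
boot <- replicate(iters, cic_qtet(
  sample(y00, replace = TRUE), sample(y01, replace = TRUE),
  sample(y10, replace = TRUE), sample(y11, replace = TRUE),
  probs
))

se <- apply(boot, 1, sd)                       # bootstrap standard error per quantile
ci <- cbind(tau = probs, estimate = point,
            lower = point - 1.96 * se, upper = point + 1.96 * se)
round(ci, 2)
```

More iterations reduce the Monte Carlo noise in the standard errors at the cost of run time, which is the trade-off an argument such as iters controls.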

4.1.4 Example

The following example from Callaway (2024) estimates the QTET for the National Supported Work Demonstration job-training program using the lalonde.psid.panel dataset, conditioning on several pre-treatment characteristics:

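library(qte)
data(lalonde)  # loads the lalonde datasets, including lalonde.psid.panel

# Outcome: real earnings (re); treatment: job-training participation (treat).
# 1975 is the pre-treatment period and 1978 the post-treatment period.
c1 <- CiC(
  re ~ treat,
  t = 1978, tmin1 = 1975, tname = "year",
  xformla = ~ age + I(age^2) + education + black + hispanic + married + nodegree,
  data = lalonde.psid.panel, idname = "id",
  se = FALSE, probs = seq(0.05, 0.95, 0.05)  # point estimates only, no bootstrap
)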

summary(c1)
#> 
#> Quantile Treatment Effect:
#>      
#> tau  QTE
#> 0.05    0.00
#> 0.1     0.00
#> 0.15    0.00
#> 0.2     0.00
#> 0.25  485.23
#> 0.3   929.88
#> 0.35 1460.36
#> 0.4  2321.11
#> 0.45 3462.56
#> 0.5  4232.31
#> 0.55 5010.34
#> 0.6  6210.67
#> 0.65 7458.11
#> 0.7  7508.97
#> 0.75 8210.84
#> 0.8  8332.69
#> 0.85 8515.98
#> 0.9  8822.00
#> 0.95 8273.74
#> 
#> Average Treatment Effect:    4629.24

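The point estimates are zero at the 5th through 20th percentiles of the earnings distribution and rise steadily to roughly $8,200 to $8,800 at the 75th percentile and above, with an estimated average treatment effect of roughly $4,600 (Callaway 2024). Because the call sets se = FALSE, no standard errors are reported, so these figures describe point estimates only and do not by themselves show which effects are statistically distinguishable from zero.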