The Causal Review

1 Overview

The difference-in-differences (DiD) literature has been unusually productive over the past three years. What began with the identification of the negative-weighting pathology in two-way fixed effects (TWFE) regressions has grown into a rich body of work on efficient estimation, Bayesian approaches, and the proper interpretation of event-study plots. This issue summarises four papers that extend and refine modern DiD practice.

2 Paper 1: Roth and Sant'Anna (2023) Efficient Estimation Under Staggered Adoption

Citation

Roth and Sant'Anna [2023]: "Efficient Estimation for Staggered Rollout Designs." Journal of Political Economy: Microeconomics, 1(4):669-709.

Research Question

What is the most efficient estimator of the average treatment effect in a staggered adoption design, and how much precision can be gained relative to existing methods?

Identification Strategy

The paper studies staggered rollouts in which the timing of treatment is (as-if) randomly assigned, an assumption stronger than parallel trends, combined with no anticipation. Under random timing, it derives the most efficient estimator within a class that nests existing estimators such as Callaway-Sant'Anna and Sun-Abraham, and shows that those existing estimators generally do not attain this minimum variance.

Key Results

The efficient estimator is a weighted average of cohort-specific ATTs, with weights proportional to cohort size and the information content of the comparison group. The paper shows that existing estimators can lose 20-40% in efficiency relative to the bound, and that the efficiency gap is largest when there is heterogeneity in cohort sizes.

The paper also establishes that simple pooling of group-time ATTs with equal weights, as is often done in applied practice, is substantially inefficient when cohort sizes are unequal.
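The cost of equal-weight pooling can be illustrated with a small variance calculation. The sketch below assumes, purely for illustration, that cohort-level ATT estimates are independent and that each has variance sigma2 / n_g. The cohort sizes and sigma2 are hypothetical, and this is plain weighted-average arithmetic, not the paper's efficient estimator.

```python
# Hypothetical cohort sizes and an assumed common outcome variance.
cohort_sizes = [500, 100, 20]
sigma2 = 1.0

def aggregate_variance(weights, sizes, sigma2):
    """Variance of sum_g w_g * ATT_g, assuming independent cohort
    estimates with Var(ATT_g) = sigma2 / n_g."""
    return sum(w * w * sigma2 / n for w, n in zip(weights, sizes))

n_cohorts = len(cohort_sizes)
total_units = sum(cohort_sizes)

# Equal weights: every cohort counts the same, however small it is.
equal_weights = [1.0 / n_cohorts] * n_cohorts
# Size weights: each cohort counts in proportion to its number of units.
size_weights = [n / total_units for n in cohort_sizes]

var_equal = aggregate_variance(equal_weights, cohort_sizes, sigma2)
var_size = aggregate_variance(size_weights, cohort_sizes, sigma2)
relative_efficiency = var_size / var_equal

print(f"variance, equal weights: {var_equal:.5f}")
print(f"variance, size weights:  {var_size:.5f}")
print(f"ratio (size / equal):    {relative_efficiency:.3f}")
```

With one very small cohort, the equal-weight aggregate is dominated by that cohort's noisy estimate. Note that the two weighting schemes target different averages when effects differ across cohorts, so the comparison is cleanest when the estimand of interest is the unit-weighted one.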

Takeaway

Applied researchers who care about precision should use the efficient staggered estimator, implemented in the staggered R package. For standard significance testing, the gains matter most when cohort size variation is large.

3 Paper 2: Callaway, Goodman-Bacon, and Sant'Anna (2024) DiD with Continuous Treatment

Citation

Callaway et al. [2024]: "Difference-in-Differences with a Continuous Treatment." Working paper (NBER Working Paper No. 32117).

Research Question

How can the DiD framework be extended to settings where treatment is a continuous variable (a dose) rather than binary?

Identification Strategy

The paper introduces conditional parallel trends assumptions for continuous treatment: potential outcomes Y(d) are continuous in the dose d, and untreated trends do not vary with the dose received. Under these assumptions, the average dose-response function (ADRF) is identified.

The estimators are doubly robust, combining outcome regression and weighting (generalised propensity score), and nest the binary DiD estimator as a special case.

Key Results

The paper derives the identifying assumptions, proposes estimators for the ADRF in staggered adoption settings, and shows that TWFE-based linear dose-response estimates can be severely biased when the ADRF is nonlinear or when treatment heterogeneity interacts with dose levels. The continuous-treatment framework also reveals additional pathologies in TWFE beyond the binary treatment case.
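A stylised two-period example shows how a linear summary can mislead when the dose-response is nonlinear. The sketch assumes everyone is untreated in period 1, receives a dose in period 2, shares a common trend, and has a quadratic dose-response. All numbers and names are hypothetical. In this setup the TWFE coefficient on dose equals the OLS slope of the outcome change on dose.

```python
from statistics import mean

def adrf(dose):
    """Assumed (hypothetical) nonlinear dose-response: effect of dose d versus dose 0."""
    return dose ** 2

doses = [0, 1, 2, 3, 4]   # one unit per dose level, for simplicity
common_trend = 0.5        # untreated change in outcome, identical for all doses

# Change in outcome between period 1 and period 2 for each unit (noise-free).
delta_y = [common_trend + adrf(d) for d in doses]

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator

twfe_slope = ols_slope(doses, delta_y)

# Compare the effect implied by the linear slope with the true effect at each dose.
linear_fit = {d: twfe_slope * d for d in doses}
true_effect = {d: adrf(d) for d in doses}
for d in doses:
    print(f"dose {d}: linear-implied effect {linear_fit[d]:.1f}, true effect {true_effect[d]}")
```

The single slope overstates the effect at low doses and implies a constant marginal effect, whereas the assumed marginal effect grows with the dose. Real applications add noise, staggered timing, and selection into dose, none of which this sketch models.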

Takeaway

Many applied DiD settings involve continuous treatment intensity (funding amounts, tariff rates, pollution concentrations). The contdid R package implements the estimators. Applied researchers using TWFE with a continuous treatment should assess sensitivity to the linearity assumption.

4 Paper 3: Borusyak, Jaravel, and Spiess (2024) Revisiting Event Studies with Imputation

Citation

Borusyak et al. [2024]: "Revisiting Event-Study Designs: Robust and Efficient Estimation." Review of Economic Studies, 91(6):3253-3285.

Research Question

Can the identification problems in staggered DiD be solved using an imputation approach, and is this approach more efficient than existing alternatives?

Identification Strategy

The paper proposes estimating untreated potential outcomes using an imputation approach: fit a two-way fixed-effects model on untreated observations (pre-treatment periods and never-treated units), then use the fitted values as imputed counterfactuals for treated observations.

The treatment effect for each treated observation is then the actual minus imputed outcome, and the ATT is the average of these individual treatment effects.
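The three steps (fit on untreated observations, impute, average) can be sketched on a tiny noise-free panel. Everything here is a hypothetical illustration: the panel, the effect pattern, and the simple alternating-update solver for the fixed effects are assumptions of this sketch, not the authors' implementation.

```python
# Hypothetical panel: 6 units, 5 periods, staggered adoption.
# first_treated[i] is unit i's adoption period (None = never treated).
first_treated = [2, 2, 3, 3, None, None]
n_units, n_periods = len(first_treated), 5
unit_effect = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
time_effect = [0.0, 0.5, 1.0, 1.5, 2.0]

def is_treated(i, t):
    g = first_treated[i]
    return g is not None and t >= g

# Observed outcomes: additive untreated outcome plus an assumed dynamic
# effect of 1 + (periods since adoption) for treated observations.
y = {}
for i in range(n_units):
    for t in range(n_periods):
        effect = 1.0 + (t - first_treated[i]) if is_treated(i, t) else 0.0
        y[(i, t)] = unit_effect[i] + time_effect[t] + effect

# Step 1: fit unit and period fixed effects on untreated observations only,
# by alternating between updating unit effects and period effects.
untreated = [(i, t) for (i, t) in y if not is_treated(i, t)]
alpha = [0.0] * n_units
lam = [0.0] * n_periods
for _ in range(500):
    for i in range(n_units):
        resid = [y[(j, s)] - lam[s] for (j, s) in untreated if j == i]
        alpha[i] = sum(resid) / len(resid)
    for t in range(n_periods):
        resid = [y[(j, s)] - alpha[j] for (j, s) in untreated if s == t]
        lam[t] = sum(resid) / len(resid)

# Step 2: impute the untreated outcome for treated observations and take
# actual minus imputed as the observation-level effect.
effects = {
    (i, t): y[(i, t)] - (alpha[i] + lam[t])
    for (i, t) in y
    if is_treated(i, t)
}

# Step 3: average the observation-level effects to get the ATT.
att = sum(effects.values()) / len(effects)
print(f"imputation ATT: {att:.3f}")
```

Because the data are noise-free and the untreated outcome is exactly additive, the imputed counterfactuals are exact here. With real data the same steps apply, but inference requires the standard errors developed in the paper or provided by dedicated software.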

Key Results

The imputation estimator is shown to be efficient within a broad class of estimators that aggregate individual treatment effects by weighting. It avoids the negative-weighting pathology of TWFE, produces valid event-study plots (pre-treatment periods show placebo effects, not TWFE artifacts), and is straightforward to implement in did2s or fixest using two-stage estimation.

A key practical finding: event-study coefficients built from the imputation estimator recover the true dynamic pattern under the identifying assumptions, whereas the conventional TWFE event-study regression can contaminate the estimate for one horizon with effects from other periods, mixing pre-treatment and post-treatment dynamics.

Takeaway

The imputation approach of Borusyak et al. [2024] is an efficient alternative to Callaway-Sant'Anna that is well-suited to settings with many treatment cohorts and a long panel. The did2s R package provides a clean implementation.

5 Paper 4: de Chaisemartin and D'Haultfœuille (2024) Interpreting Event-Study Plots

Citation

de Chaisemartin and D'Haultfœuille [2024]: "Difference-in-Differences Estimators for Treatments Continuously Distributed at Every Period." Working paper.

Research Question

How should the default event-study plots from modern DiD software be interpreted, and do they match the traditional TWFE event-study specification?

Identification Strategy

The paper shows analytically that, even in non-staggered settings (a single common treatment date), the default plots from did, fixest using Sun-Abraham, and staggered do not replicate the standard TWFE event-study plot. They differ in how the comparison group is constructed and how time-period averages are formed.

Key Results

The differences across methods are not just cosmetic: they reflect substantively different estimands. Some methods plot the ATT relative to the period before treatment (k = −1), others relative to the average pre-treatment period. The choice affects both the level and the shape of the event-study plot, and can lead to apparent pre-trends in one specification that are absent in another.
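The role of the reference period can be seen with a few lines of arithmetic. The sketch below takes a hypothetical set of treated-minus-control outcome gaps by event time and normalises them two ways: relative to k = -1 and relative to the average pre-treatment period. The numbers are invented for illustration, and the sketch captures only the choice of baseline, not differences in how comparison groups are constructed.

```python
# Hypothetical treated-minus-control outcome gaps by event time k.
gap_by_event_time = {-3: -0.3, -2: -0.2, -1: -0.1, 0: 0.5, 1: 0.6, 2: 0.7}

def normalise(gaps, baseline):
    """Express every event-time gap relative to a chosen baseline level."""
    return {k: v - baseline for k, v in gaps.items()}

# Baseline 1: the last pre-treatment period (k = -1).
baseline_last_pre = gap_by_event_time[-1]

# Baseline 2: the average over all pre-treatment periods (k < 0).
pre_gaps = [v for k, v in gap_by_event_time.items() if k < 0]
baseline_avg_pre = sum(pre_gaps) / len(pre_gaps)

rel_last_pre = normalise(gap_by_event_time, baseline_last_pre)
rel_avg_pre = normalise(gap_by_event_time, baseline_avg_pre)

for k in sorted(gap_by_event_time):
    print(f"k={k:+d}  vs k=-1: {rel_last_pre[k]:+.2f}   vs pre-average: {rel_avg_pre[k]:+.2f}")
```

In this example the two normalisations differ by a constant 0.1 at every event time, so the post-treatment estimates shift in level while the same underlying pre-trend is read against a different zero line. Shape differences of the kind discussed above arise from additional choices that this sketch does not model.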

Takeaway

Researchers using modern DiD software should carefully document which event-study estimand they are plotting. Pre-trend tests based on these plots assess different hypotheses depending on the software used. Comparing event-study plots across papers requires attention to the specification, not just the research design.

References

Borusyak, K., Jaravel, X., and Spiess, J. (2024). Revisiting event-study designs: Robust and efficient estimation. Review of Economic Studies, 91(6):3253-3285.
Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
Callaway, B., Goodman-Bacon, A., and Sant'Anna, P. H. C. (2024). Difference-in-differences with a continuous treatment. NBER Working Paper No. 32117.
de Chaisemartin, C. and D'Haultfœuille, X. (2020). Two-way fixed effects estimators with heterogeneous treatment effects. American Economic Review, 110(9):2964-2996.
de Chaisemartin, C. and D'Haultfœuille, X. (2024). Difference-in-differences estimators for treatments continuously distributed at every period. Working paper.
Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics, 225(2):254-277.
Roth, J. and Sant'Anna, P. H. C. (2023). Efficient estimation for staggered rollout designs. Journal of Political Economy: Microeconomics, 1(4):669-709.
Sun, L. and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2):175-199.

Recent Results: DiD Frontiers and Staggered Treatment Estimation (2023-2026)
