The Causal Review

1 Introduction

Difference-in-differences (DiD) is one of the most widely used identification strategies in empirical economics. The standard estimator, which compares average outcomes before and after treatment for treated and control units, is unbiased for the average treatment effect on the treated under the parallel trends assumption. When parallel trends holds only conditional on covariates, researchers must adjust for those covariates in estimation. The natural approaches are outcome regression (OR) and inverse probability weighting (IPW), but each is consistent only if its own nuisance model is correctly specified.

Sant'Anna and Zhao [2020] propose a doubly robust DiD estimator that combines OR and IPW in a way that is consistent whenever either model is correctly specified, not necessarily both. This robustness property, familiar from the program evaluation literature [Robins et al., 1994, Scharfstein et al., 1999], substantially weakens the requirements for valid causal inference.

This article introduces the doubly robust DiD framework, derives the key estimator, discusses its properties, and describes the drdid R package that implements it.

2 Setup and the Problem with Standard Adjustments

2.1 Notation and Assumptions

Consider a two-period DiD setting with pre-period t=0 and post-period t=1. Let Dᵢ ∈ {0,1} denote the treatment indicator (Dᵢ = 1 if unit i is treated between the two periods), and let Yᵢₜ(d) denote the potential outcome for unit i at time t under treatment status d. The estimand is the average treatment effect on the treated (ATT):

τᴬᵀᵀ = E[Yᵢ₁(1) - Yᵢ₁(0) | Dᵢ = 1].

The key identifying assumption is conditional parallel trends:

Assumption 1 (Conditional Parallel Trends). E[Yᵢ₁(0) - Yᵢ₀(0) | Xᵢ, Dᵢ = 1] = E[Yᵢ₁(0) - Yᵢ₀(0) | Xᵢ, Dᵢ = 0] for all values of Xᵢ in the support of the treated group.

This says that, conditional on pre-treatment characteristics Xᵢ, the trend in untreated potential outcomes would have been the same for treated and control units. Assumption 1 is weaker than unconditional parallel trends in the sense that matters for applied work: it allows raw trends to differ between the two groups, as long as the difference is explained by Xᵢ.
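A small simulation makes the distinction concrete. The sketch below uses a hypothetical data-generating process that is not taken from the paper: one binary covariate, illustrative coefficients, and a true ATT of 1. Parallel trends holds within each covariate cell but fails unconditionally, because treated units are more likely to have x = 1 and the untreated trend is steeper when x = 1.

```python
import random

def simulate(n, seed=0, att=1.0):
    """Hypothetical two-period panel: returns (x, d, dy) with dy = Y_i1 - Y_i0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)                    # binary pre-treatment covariate
        d = int(rng.random() < (0.7 if x else 0.3))    # treatment is likelier when x = 1
        trend0 = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)   # untreated trend depends on x
        rows.append((x, d, trend0 + att * d))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_did(rows):
    """Unconditional DiD: difference in mean outcome changes, ignoring x."""
    treated = [dy for _, d, dy in rows if d == 1]
    control = [dy for _, d, dy in rows if d == 0]
    return mean(treated) - mean(control)

def stratified_did(rows):
    """DiD within each x cell, averaged over the treated units' x values."""
    total, n_treated = 0.0, 0
    for x in (0, 1):
        treated = [dy for xi, d, dy in rows if xi == x and d == 1]
        control = [dy for xi, d, dy in rows if xi == x and d == 0]
        total += len(treated) * (mean(treated) - mean(control))
        n_treated += len(treated)
    return total / n_treated

rows = simulate(20000)
naive = naive_did(rows)
stratified = stratified_did(rows)
print(f"naive DiD:      {naive:.3f}")
print(f"stratified DiD: {stratified:.3f}")
```

In this design the share of x = 1 is 0.7 among the treated and 0.3 among controls, so the unconditional contrast converges to 1 + 2 × (0.7 - 0.3) = 1.8 rather than the true value of 1, while the within-cell comparison is centred on 1.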

2.2 Two Standard Approaches and Their Failure Modes

Outcome Regression (OR). Specify a parametric model m₀(Xᵢ; θ) for the untreated outcome trend E[Yᵢ₁(0) - Yᵢ₀(0) | Xᵢ, Dᵢ = 0]. The OR-DiD estimator is:

\[
\hat{\tau}^{OR} = \frac{1}{n_1} \sum_{i: D_i = 1} \left[ (Y_{i1} - Y_{i0}) - m_0(X_i; \hat{\theta}) \right], \tag{1}
\]

where n₁ is the number of treated units. This is consistent if m₀ is correctly specified.
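The OR recipe in (1) can be sketched in a few lines. The example below is illustrative only: the data-generating process, the single continuous covariate, and the helper names `simulate`, `fit_ols`, and `or_did` are assumptions made for this sketch, and the linear model for the control trend is correctly specified by construction.

```python
import random

def simulate(n, seed=0, att=1.0):
    """Hypothetical panel: returns (x, d, dy) with dy = Y_i1 - Y_i0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()                               # continuous covariate on [0, 1]
        d = int(rng.random() < 0.2 + 0.6 * x)          # treatment probability rises with x
        trend0 = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)   # untreated trend is linear in x
        rows.append((x, d, trend0 + att * d))
    return rows

def fit_ols(xs, ys):
    """Least-squares intercept and slope for a single regressor."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def or_did(rows):
    """Equation (1): treated outcome changes minus the fitted control trend."""
    controls = [(x, dy) for x, d, dy in rows if d == 0]
    intercept, slope = fit_ols([x for x, _ in controls], [dy for _, dy in controls])
    gaps = [dy - (intercept + slope * x) for x, d, dy in rows if d == 1]
    return sum(gaps) / len(gaps)

rows = simulate(20000)
tau_or = or_did(rows)
print(f"OR-DiD estimate: {tau_or:.3f} (true ATT in this simulation: 1.0)")
```

The trend model is fitted on control units only and then evaluated at the covariate values of the treated units, which is what distinguishes (1) from simply adding covariates to a pooled regression.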

Inverse Probability Weighting (IPW). Specify a propensity score model p(Xᵢ; γ) = P(Dᵢ = 1 | Xᵢ). The IPW-DiD estimator reweights control units by their odds of being treated:

\[
\hat{\tau}^{IPW} = \frac{1}{n_1} \sum_{i: D_i = 1} (Y_{i1} - Y_{i0}) - \frac{1}{n_1} \sum_{i: D_i = 0} \frac{\hat{p}(X_i)}{1 - \hat{p}(X_i)} (Y_{i1} - Y_{i0}), \tag{2}
\]

where, in practice, the weights on control units are normalised to sum to one. This is consistent if the propensity score model is correctly specified. Each estimator is generally inconsistent when its own nuisance model is misspecified. In applied work, both models are typically estimated parametrically (logit for the propensity score, OLS for the outcome regression), so this is a practical concern rather than a technicality: whichever estimator is chosen, its parametric model has to be right.
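A minimal sketch of the IPW estimator with normalised weights follows. It assumes a hypothetical design with one binary covariate, so the propensity score can be estimated by the treated share in each covariate cell; the function names and parameter values are illustrative and not part of any package. The second call deliberately uses a constant propensity score to show the failure mode described above.

```python
import random

def simulate(n, seed=1, att=1.0):
    """Hypothetical two-period panel: returns (x, d, dy) with dy = Y_i1 - Y_i0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        d = int(rng.random() < (0.7 if x else 0.3))    # true propensity: 0.7 or 0.3
        trend0 = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)   # untreated trend depends on x
        rows.append((x, d, trend0 + att * d))
    return rows

def fit_propensity(rows):
    """Saturated propensity model: share of treated units within each x cell."""
    p_hat = {}
    for x in (0, 1):
        cell = [d for xi, d, _ in rows if xi == x]
        p_hat[x] = sum(cell) / len(cell)
    return p_hat

def ipw_did(rows, p_hat):
    """Equation (2) with the control weights normalised to sum to one."""
    treated = [dy for _, d, dy in rows if d == 1]
    controls = [(x, dy) for x, d, dy in rows if d == 0]
    # Odds weights p/(1-p) make the control group resemble the treated group in x.
    weights = [p_hat[x] / (1.0 - p_hat[x]) for x, _ in controls]
    weighted_control = sum(w * dy for w, (_, dy) in zip(weights, controls)) / sum(weights)
    return sum(treated) / len(treated) - weighted_control

rows = simulate(20000)
ipw = ipw_did(rows, fit_propensity(rows))
ipw_wrong = ipw_did(rows, {0: 0.5, 1: 0.5})   # misspecified: constant score ignores x
print(f"IPW-DiD, propensity by cell:  {ipw:.3f}")
print(f"IPW-DiD, constant propensity: {ipw_wrong:.3f}")
```

With a constant score every control receives the same weight, so the estimator collapses to the unconditional DiD and inherits its bias (a limit of 1.8 instead of 1 in this design).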

3 The Doubly Robust DiD Estimator

Sant'Anna and Zhao [2020] combine OR and IPW via the semiparametric efficiency theory of Hahn [1998] and Robins et al. [1994]. The doubly robust DiD estimator is:

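\[
\hat{\tau}^{DR} = \frac{1}{n_1} \sum_{i: D_i = 1} \left[ (Y_{i1} - Y_{i0}) - m_0(X_i; \hat{\theta}) \right] - \frac{1}{n_1} \sum_{i: D_i = 0} \frac{\hat{p}(X_i)}{1 - \hat{p}(X_i)} \left[ (Y_{i1} - Y_{i0}) - m_0(X_i; \hat{\theta}) \right]. \tag{3}
\]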

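The estimator in (3) augments the OR estimator with an IPW-weighted correction built from the control residuals. If m₀ is correctly specified, this correction is zero in expectation, because the residual (Yᵢ₁ - Yᵢ₀) - m₀(Xᵢ) is mean-zero among controls conditional on Xᵢ, so τ̂ᴰᴿ inherits the consistency of the OR estimator. If instead p(Xᵢ) is correctly specified, the correction is generally not zero: the odds weights reweight the control residuals to the covariate distribution of the treated group, so the correction converges to the average error of m₀ among the treated and exactly offsets the bias of the OR term.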
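The logic can be traced through the probability limit of (3). What follows is a heuristic calculation rather than the formal proof in the paper. The notation is introduced here for illustration: ΔY = Y₁ - Y₀, π = P(D = 1), m₀(X) = E[ΔY | X, D = 0] is the true control trend, and m* and p* denote the limits of the fitted, possibly misspecified, models. Regularity conditions are left implicit.

```latex
\begin{align*}
\hat{\tau}^{DR} &\xrightarrow{p} E[\Delta Y - m^*(X) \mid D = 1]
  - \frac{1}{\pi} E\!\left[ (1 - D) \frac{p^*(X)}{1 - p^*(X)} \big( \Delta Y - m^*(X) \big) \right]. \\
\intertext{Case 1: $m^* = m_0$. The control residual has conditional mean zero, so the second term vanishes:}
E[\Delta Y - m_0(X) \mid X, D = 0] &= 0
  \;\Longrightarrow\; \hat{\tau}^{DR} \xrightarrow{p} E[\Delta Y \mid D = 1] - E[m_0(X) \mid D = 1]. \\
\intertext{Case 2: $p^* = p$. Using $E[1 - D \mid X] = 1 - p(X)$ and $E[p(X) g(X)] = \pi E[g(X) \mid D = 1]$:}
\frac{1}{\pi} E\!\left[ (1 - D) \frac{p(X)}{1 - p(X)} \big( \Delta Y - m^*(X) \big) \right]
  &= \frac{1}{\pi} E\big[ p(X) \big( m_0(X) - m^*(X) \big) \big]
   = E[m_0(X) - m^*(X) \mid D = 1], \\
\hat{\tau}^{DR} &\xrightarrow{p} E[\Delta Y \mid D = 1] - E[m_0(X) \mid D = 1].
\end{align*}
```

In both cases the limit is E[ΔY | D = 1] - E[m₀(X) | D = 1], which equals τᴬᵀᵀ under Assumption 1, given the usual no-anticipation condition that pre-period outcomes of treated units are untreated outcomes.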

Theorem 1 (Double Robustness). Under Assumption 1 and the overlap condition P(Dᵢ = 1 | Xᵢ) < 1 a.s., τ̂ᴰᴿ is a consistent estimator of τᴬᵀᵀ if either the outcome regression model m₀(·) or the propensity score model p(·) is correctly specified (but not necessarily both). Moreover, when both models are correctly specified, τ̂ᴰᴿ achieves the semiparametric efficiency bound for the ATT under conditional parallel trends: no regular estimator has a smaller asymptotic variance under the same assumptions.
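The consistency part of Theorem 1 can be checked numerically. The sketch below is a toy illustration under assumed conditions: a hypothetical design with one binary covariate, "correct" nuisance models fitted by covariate cell, and "misspecified" models that ignore the covariate. All function names are made up for the example.

```python
import random

def simulate(n, seed=2, att=1.0):
    """Hypothetical two-period panel: returns (x, d, dy) with dy = Y_i1 - Y_i0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        d = int(rng.random() < (0.7 if x else 0.3))
        trend0 = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, d, trend0 + att * d))
    return rows

def mean(values):
    return sum(values) / len(values)

def fit_nuisance(rows, by_x):
    """Fit m0 (mean control trend) and p (treated share), by x cell or pooled."""
    m0, p = {}, {}
    for x in (0, 1):
        cell = [(d, dy) for xi, d, dy in rows if xi == x or not by_x]
        m0[x] = mean([dy for d, dy in cell if d == 0])
        p[x] = mean([d for d, _ in cell])
    return m0, p

def dr_did(rows, m0, p):
    """Equation (3): OR term minus the odds-weighted control residuals."""
    n1 = sum(d for _, d, _ in rows)
    or_part = sum(dy - m0[x] for x, d, dy in rows if d == 1)
    correction = sum(p[x] / (1.0 - p[x]) * (dy - m0[x]) for x, d, dy in rows if d == 0)
    return (or_part - correction) / n1

rows = simulate(20000)
m0_good, p_good = fit_nuisance(rows, by_x=True)    # correctly specified
m0_bad, p_bad = fit_nuisance(rows, by_x=False)     # misspecified: ignores x
results = {
    "both correct": dr_did(rows, m0_good, p_good),
    "only p correct": dr_did(rows, m0_bad, p_good),
    "only m0 correct": dr_did(rows, m0_good, p_bad),
    "both wrong": dr_did(rows, m0_bad, p_bad),
}
for label, estimate in results.items():
    print(f"{label:>16}: {estimate:.3f}")
```

The first three configurations should land near the true ATT of 1, while the last reduces to the unconditional DiD and is centred on 1.8 in this design. Double robustness offers two chances to get a model right, not protection when both are wrong.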

4 Panel vs Repeated Cross-Sections

A notable feature of Sant'Anna and Zhao [2020] is that they develop the doubly robust DiD framework for two data structures:

Balanced panel data: The same units are observed in both periods. Here ΔYᵢ = Yᵢ₁ - Yᵢ₀ is well-defined and the estimator in (3) applies directly.

Repeated cross-sections: Different (potentially overlapping) samples of units are drawn in each period. The pre-post difference ΔYᵢ is not observed for any individual; instead, the trend must be inferred from the population moments in each period. The doubly robust estimator takes a different algebraic form that achieves the same double robustness property.

For repeated cross-sections, the estimator involves a two-sample adjustment using period indicators and propensity scores both for treatment and for period:

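\[
\hat{\tau}^{DR,rcs} = \mathbb{E}_n \left[ \frac{D_i - \hat{p}(X_i)}{1 - \hat{p}(X_i)} \cdot \frac{2T_i - 1}{P(T_i = 1)} \cdot Y_i \cdot \psi(X_i, T_i; \hat{\theta}) \right], \tag{4}
\]

where 𝔼ₙ denotes the sample average, Tᵢ indicates the period in which unit i is observed, and ψ(·) is an augmentation term analogous to the panel case. Details are provided in Sant'Anna and Zhao [2020].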
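To illustrate how a trend is recovered from period-specific moments when no unit is observed twice, the sketch below computes a covariate-stratified DiD from group-by-period cell means. This is a simple regression-type adjustment, not the doubly robust estimator in (4). The data-generating process, the binary covariate, and the function names are assumptions made for the example, and the covariate and treatment distributions are taken to be the same in both periods.

```python
import random

def simulate_rcs(n, seed=3, att=1.0):
    """Hypothetical repeated cross-sections: rows are (x, d, t, y), one outcome per unit."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        d = int(rng.random() < (0.7 if x else 0.3))
        t = int(rng.random() < 0.5)                    # period in which the unit is sampled
        y = 0.5 * d + x + t * (1.0 + 2.0 * x + att * d) + rng.gauss(0.0, 1.0)
        rows.append((x, d, t, y))
    return rows

def cell_mean(rows, x, d, t):
    """Mean outcome among units with covariate x, treatment status d, period t."""
    ys = [y for xi, di, ti, y in rows if (xi, di, ti) == (x, d, t)]
    return sum(ys) / len(ys)

def stratified_rcs_did(rows):
    """Within-x DiD of cell means, averaged over the treated units' x values."""
    total, n_treated = 0.0, 0
    for x in (0, 1):
        treated_trend = cell_mean(rows, x, 1, 1) - cell_mean(rows, x, 1, 0)
        control_trend = cell_mean(rows, x, 0, 1) - cell_mean(rows, x, 0, 0)
        weight = sum(1 for xi, di, _, _ in rows if xi == x and di == 1)
        total += weight * (treated_trend - control_trend)
        n_treated += weight
    return total / n_treated

rows = simulate_rcs(40000)
tau_rcs = stratified_rcs_did(rows)
print(f"stratified RCS DiD: {tau_rcs:.3f} (true ATT in this simulation: 1.0)")
```

Each trend is a difference of two means taken over different units, which is why repeated cross-sections are typically less precise than a panel of the same size.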

5 Connection to Callaway-Sant'Anna

The doubly robust DiD estimator is the building block of the staggered treatment estimator of Callaway and Sant'Anna [2021]. In that paper, the group-time average treatment effect, ATT(g,t), for cohort g at time t is estimated using a doubly robust procedure applied to a comparison of cohort-g units against "clean controls" (never-treated or not-yet-treated units). The did R package, which implements Callaway and Sant'Anna, uses the DR-DiD estimator of Sant'Anna and Zhao [2020] as its default estimation method.

Understanding DR-DiD is therefore essential background for understanding the most widely used modern staggered DiD methodology.
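The 2x2 comparison behind each ATT(g,t) can be shown on a toy panel. The sketch below is a stripped-down illustration, not the did package's procedure: it uses made-up data, no covariates, never-treated units as controls, and period g - 1 as the base period, and it replaces the doubly robust step with a plain difference in mean changes.

```python
# Hypothetical panel: unit -> (cohort g, outcomes in periods 1..4); g = 0 means never treated.
panel = {
    "a": (3, [1.0, 2.0, 5.0, 6.5]),
    "b": (3, [2.0, 3.0, 6.0, 7.5]),
    "c": (0, [0.0, 1.0, 2.0, 3.0]),
    "d": (0, [1.0, 2.0, 3.0, 4.0]),
}

def att_gt(panel, g, t):
    """Unconditional DiD for cohort g at time t >= g, relative to base period g - 1."""
    def mean_change(cohort):
        # Periods are 1-based, so period t sits at index t - 1 and period g - 1 at index g - 2.
        changes = [y[t - 1] - y[g - 2] for c, y in panel.values() if c == cohort]
        return sum(changes) / len(changes)
    return mean_change(g) - mean_change(0)

att_33 = att_gt(panel, g=3, t=3)
att_34 = att_gt(panel, g=3, t=4)
print(f"ATT(3,3) = {att_33}, ATT(3,4) = {att_34}")
```

Each call is a two-group, two-period problem of exactly the kind treated in Sections 2 and 3, which is why a covariate-adjusted version can plug the DR-DiD estimator into the same slot.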

6 Inference

Sant'Anna and Zhao [2020] derive the asymptotic distribution of τ̂ᴰᴿ under both correctly specified and misspecified nuisance models. The key result is:

\[
\sqrt{n} \left( \hat{\tau}^{DR} - \tau^{ATT} \right) \xrightarrow{d} N(0, V^{DR}), \tag{5}
\]

where Vᴰᴿ equals the semiparametric efficiency bound when both models are correctly specified, and is generally larger (but finite) when only one model is correctly specified. Bootstrap inference (both the nonparametric bootstrap and the multiplier bootstrap) provides valid standard errors and can accommodate within-cluster correlation.
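As a rough illustration of bootstrap inference, the sketch below applies the nonparametric bootstrap to the panel estimator in (3): units are resampled with replacement and both nuisance models are re-fitted in every draw. The simulated design (one binary covariate, cell-mean nuisance estimates), the number of draws, and the function names are assumptions made for the example. With clustered data one would resample whole clusters rather than individual units.

```python
import random
import statistics

def simulate(n, seed=4, att=1.0):
    """Hypothetical two-period panel: returns (x, d, dy) with dy = Y_i1 - Y_i0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        d = int(rng.random() < (0.7 if x else 0.3))
        trend0 = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((x, d, trend0 + att * d))
    return rows

def dr_did(rows):
    """Equation (3) with cell-mean nuisance estimates for a binary covariate."""
    m0, odds = {}, {}
    for x in (0, 1):
        cell = [(d, dy) for xi, d, dy in rows if xi == x]
        controls = [dy for d, dy in cell if d == 0]
        m0[x] = sum(controls) / len(controls)                    # fitted control trend in cell x
        odds[x] = (len(cell) - len(controls)) / len(controls)    # p_hat / (1 - p_hat) in cell x
    n1 = sum(d for _, d, _ in rows)
    or_part = sum(dy - m0[x] for x, d, dy in rows if d == 1)
    correction = sum(odds[x] * (dy - m0[x]) for x, d, dy in rows if d == 0)
    return (or_part - correction) / n1

def bootstrap_se(rows, n_boot=200, seed=5):
    """Resample units with replacement and re-estimate, nuisance models included."""
    rng = random.Random(seed)
    draws = [dr_did(rng.choices(rows, k=len(rows))) for _ in range(n_boot)]
    return statistics.stdev(draws)

rows = simulate(2000)
tau_dr = dr_did(rows)
se = bootstrap_se(rows)
ci = (tau_dr - 1.96 * se, tau_dr + 1.96 * se)
print(f"DR-DiD = {tau_dr:.3f}, bootstrap SE = {se:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Re-fitting the nuisance models inside each draw matters: holding them fixed would ignore the estimation error they contribute to the final estimate.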

7 The drdid Package

The drdid package for R [Sant'Anna and Zhao, 2020] implements the doubly robust DiD estimator for both panel and repeated cross-section data. Core functions include:

drdid_panel(): DR-DiD for balanced panel data.

drdid_rc(): DR-DiD for repeated cross-sections.

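drdid_imp_panel(): An improved variant for panel data that is also doubly robust for inference; it estimates the propensity score by inverse probability tilting and the outcome regression by weighted least squares.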

drdid_imp_rc(): Improved version for repeated cross-sections.

Covariates enter both the propensity score model (logistic regression) and the outcome regression model (linear regression). By default, standard errors are computed analytically from the estimated influence function; bootstrap standard errors can be requested instead. The package integrates with the did package, which calls drdid functions internally.

8 Practical Guidance

When should researchers prefer DR-DiD over pure OR or pure IPW?

Whenever covariates are needed for conditional parallel trends. If unconditional parallel trends is implausible and covariate adjustment is required, DR-DiD protects against misspecification of either nuisance model.
When the propensity score is difficult to model. High-dimensional or complex covariates make propensity score misspecification more likely. The OR component of DR-DiD provides protection.
When the outcome trend model is uncertain. If the functional form of covariate effects on the trend is unclear, the IPW component provides protection.
As a default in staggered DiD. Since the did package uses DR-DiD internally, researchers using Callaway and Sant'Anna are already using DR-DiD.

The main limitation is computational: DR-DiD requires estimating two nuisance models, and bootstrap inference adds further computational cost. For large datasets with many covariates, this can be slow.

9 Conclusion

The doubly robust DiD estimator of Sant'Anna and Zhao [2020] provides a principled solution to covariate adjustment in difference-in-differences designs. By combining outcome regression and inverse probability weighting, it achieves consistency under the union of the two models' specification assumptions, a significant improvement in robustness over either approach alone. It also achieves the semiparametric efficiency bound when both models are correct. As the foundation of the widely used Callaway-Sant'Anna staggered DiD estimator, DR-DiD is now central to applied practice. Applied economists working with observational DiD designs should consider it the default approach whenever covariate adjustment is needed.

References

Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
Hahn, J. (1998). On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica, 66(2):315-331.
Imbens, G. W. (2015). Matching methods in practice: Three examples. Journal of Human Resources, 50(2):373-419.
Robins, J. M., Rotnitzky, A., and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427):846-866.
Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41-55.
Sant'Anna, P. H. C. and Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of Econometrics, 219(1):101-122.
Scharfstein, D. O., Rotnitzky, A., and Robins, J. M. (1999). Adjusting for nonignorable drop-out using semiparametric nonresponse models. Journal of the American Statistical Association, 94(448):1096-1120.

Doubly Robust Difference-in-Differences: Sant'Anna and Zhao (2020)
