Toolbox

The drdid Package in R: Doubly Robust Difference-in-Differences

1 What Problem Does This Tool Solve?

When the parallel trends assumption holds only after conditioning on covariates, researchers must adjust for those covariates in DiD estimation. The standard approaches— outcome regression (OR) and inverse probability weighting (IPW)— each require their respective nuisance model to be correctly specified. Misspecifying either model introduces bias.

The drdid package implements the doubly robust DiD (DR-DiD) estimator of Sant'Anna and Zhao [2020], which combines OR and IPW so that the estimator is consistent whenever either model is correctly specified— a much weaker requirement than needing both. The package handles both balanced panel data and repeated cross-sections, and it serves as the estimation engine inside the widely used did package [Callaway and Sant'Anna, 2021].

2 Installation and Setup

# Install from CRAN
install.packages("drdid")

library(drdid)

# Also useful for comparison:
install.packages("did")
library(did)

The package ships with the nsw_long dataset— a panel version of the National Supported Work (NSW) training programme data from LaLonde [1986], commonly used to benchmark DiD estimators.

3 Core Functions

The package provides four main estimation functions:

Function Data structure Notes
drdid_panel() Balanced panel Standard DR-DiD
drdid_imp_panel() Balanced panel Improved (normalised IPW)
drdid_rc() Repeated cross-sections Standard DR-DiD
drdid_imp_rc() Repeated cross-sections Improved (normalised IPW)
Table 1: Core drdid Functions

The "improved" variants use normalised IPW weights (weights sum to one within each group) and are generally preferred in practice because they are better-behaved in finite samples, especially when propensity scores are close to zero or one.

4 A Minimal Working Example: Panel Data

library(drdid)
data(nsw_long)

# Dataset structure: individual panel data, two periods (pre/post)
# Variables: id, year, experimental (treatment indicator), re (real earnings),
# age, education, black, married, nodegree, hisp

# Prepare the data: keep the two periods used in the LaLonde study
nsw_use <- subset(nsw_long, year %in% c(1975, 1978))

# Estimate DR-DiD (improved panel estimator)
dr_result <- drdid_imp_panel(
  y1 = nsw_use$re[nsw_use$year == 1978], # post-period outcome
  y0 = nsw_use$re[nsw_use$year == 1975], # pre-period outcome
  D = nsw_use$experimental[nsw_use$year == 1978], # treatment indicator
  covariates = model.matrix(~ age + education + black + married + nodegree + hisp, data = nsw_use[nsw_use$year == 1978, ]),
  i.weightvar = NULL, # no sampling weights
  boot = TRUE, # use bootstrap for inference
  boot.type = "weighted", # multiplier (wild) bootstrap
  nboot = 999
)

# Summary
summary(dr_result)
# $ATT: estimated ATT
# $se: bootstrap standard error
# $lci, $uci: 95% confidence interval

4.1 Interpreting the Output

The main output is:

  • ATT: the doubly robust estimate of the average treatment effect on the treated.
  • se: the bootstrap standard error.
  • lci, uci: 95% confidence interval from the percentile bootstrap.
  • boots: the vector of bootstrap replications (useful for custom inference).

5 Repeated Cross-Sections

When the same individuals are not tracked across periods (e.g. repeated surveys), use the repeated cross-section functions. The key difference is that the pre-period and post-period observations are from different samples, so the outcome difference Yᵢ₁ − Yᵢ₀ is not directly observed:

# Simulate a repeated cross-section from the panel data
# In practice, you would have separate pre and post survey samples
set.seed(42)
n <- nrow(nsw_long)

# Randomly assign each observation to "pre" or "post" sample
nsw_rcs <- nsw_long
nsw_rcs$sample_period <- sample(c(0, 1), n, replace = TRUE, prob = c(0.5, 0.5))

# Keep observations in their designated period
nsw_rcs_use <- subset(nsw_rcs, (year == 1975 & sample_period == 0) | (year == 1978 & sample_period == 1))

# DR-DiD for repeated cross-sections
dr_rc_result <- drdid_imp_rc(
  y = nsw_rcs_use$re,
  post = as.integer(nsw_rcs_use$year == 1978),
  D = nsw_rcs_use$experimental,
  covariates = model.matrix(~ age + education + black + married + nodegree, data = nsw_rcs_use),
  boot = TRUE, boot.type = "weighted", nboot = 999
)

summary(dr_rc_result)

6 Connection to the did Package

The drdid package is the estimation engine inside the did package of Callaway and Sant'Anna [2021]. When you call att_gt() with the default est_method = "dr" (doubly robust), it internally calls drdid_imp_panel() or drdid_imp_rc() for each (group, time) cell. Understanding drdid directly helps you interpret what did is doing and allows you to customise the estimation if needed.

library(did)
# The did package uses drdid internally with est_method = "dr"
out_did <- att_gt(
  yname = "re",
  tname = "year",
  idname = "id",
  gname = "first_treat", # year of first treatment (or 0 if never treated)
  xformla = ~ age + education + black + married + nodegree,
  data = nsw_long,
  est_method = "dr" # calls drdid_imp_panel() internally
)
summary(out_did)

7 Key Options and Pitfalls

7.1 Bootstrap Type

drdid supports two bootstrap types:

  • boot.type = "weighted": the multiplier bootstrap (recommended). Faster and has better finite-sample properties.
  • boot.type = "normal": the standard nonparametric bootstrap with replacement. More familiar but slower.

7.2 Overlap Violations

If the propensity score p̂(Xᵢ) is near 1 for some units (some control units look just like treated units), IPW weights become extreme, inflating variance. Inspect the propensity score distribution before running DR-DiD:

# Fit propensity score model manually to inspect overlap
ps_model <- glm(experimental ~ age + education + black + married + nodegree, data = nsw_use[nsw_use$year == 1978,], family = binomial(link = "logit"))
ps_scores <- predict(ps_model, type = "response")
hist(ps_scores[nsw_use$experimental[nsw_use$year==1978] == 0], main = "Propensity scores for control units", xlab = "Propensity score")

Extreme scores (above 0.9 or below 0.1 for controls) signal overlap problems. Consider trimming observations with extreme scores or using alternative methods.

7.3 Model Specification

By default, drdid uses logistic regression for the propensity score and OLS for the outcome regression. Both models are linear in the provided covariates. To capture nonlinearity:

  • Include polynomial terms or interactions in the covariates matrix.
  • Use the DoubleML package if you want machine learning nuisance estimation (cross-fitting with nonparametric learners).

8 Comparison to Alternatives

Package Method DR? ML nuisance?
drdid
did
DR-DiD (panel/RC)
CS staggered (uses drdid)
Yes
Yes
No
No
DRDID (Python) DR-DiD Yes Optional
DoubleML DML for ATE/ATT Yes (sort of) Yes
fixest TWFE with controls No No
Table 2: DiD Estimators with Covariate Adjustment in R

9 Conclusion

The drdid package implements the doubly robust DiD estimator of Sant'Anna and Zhao [2020] with a clean interface for panel and repeated cross-section data. Its double robustness property— consistency when either the propensity score or the outcome model is correctly specified— makes it the preferred estimator for covariate-adjusted DiD in observational settings. As the workhorse inside the did package, it underpins much of the modern staggered DiD literature. Applied researchers should prefer drdid_imp_panel() or drdid_imp_rc() over simpler alternatives whenever covariate adjustment is needed.

References

  1. Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
  2. LaLonde, R. J. (1986). Evaluating the econometric evaluations of training programs with experimental data. American Economic Review, 76(4):604-620.
  3. Robins, J. M., Rotnitzky, A., and Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427):846-866.
  4. Sant'Anna, P. H. C. and Zhao, J. (2020). Doubly robust difference-in-differences estimators. Journal of Econometrics, 219(1):101-122.
  5. Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41-55.

Continue Reading

Browse All Sections →
Home
This is some text inside of a div block.
This is some text inside of a div block.
This is some text inside of a div block.

Article Title