The Causal Review

Synthetic Difference-in-Differences: Arkhangelsky et al. (2021)

1 Motivation: Combining Two Approaches

Difference-in-differences (DiD) and synthetic control (SC) are two of the most widely used quasi-experimental methods in economics. DiD compares treated units to control units before and after treatment, relying on parallel trends. SC constructs a weighted combination of control units to match the pre-treatment trajectory of the treated unit, without requiring parallel trends but requiring a good pre-treatment fit. Both methods have well-known limitations: DiD can fail when control units follow different trends; SC is designed for a single treated unit and can be sensitive to the choice of donor pool.

Arkhangelsky et al. [2021] proposed the synthetic difference-in-differences (SDiD) estimator, which combines the two approaches. SDiD re-weights control units as in SC (to achieve pre-treatment parallel trends) while also taking differences over time as in DiD (to remove time-invariant confounders). The result is an estimator that performs well in settings where neither pure DiD nor pure SC is ideal, and that has desirable efficiency properties under a factor model for the outcome.

2 Setup

Consider a balanced panel of N units observed over T periods, with N_tr treated units that adopt treatment at period T₀+1 and N_co = N − N_tr control units that never adopt. Denote by Y_it(0) the potential untreated outcome and by Y_it(1) the potential treated outcome, so that τ_it = Y_it(1) − Y_it(0) is the treatment effect for unit i in period t. The observed outcome is

$$ Y_{it} = Y_{it}(0) + D_{it} \cdot \tau_{it}, $$

(1)

where D_it = 1 if unit i is treated in period t and D_it = 0 otherwise. The estimand is the average treatment effect on the treated (ATT):

$$ \tau^{\text{ATT}} = \frac{1}{N_{\text{tr}}(T - T_0)} \sum_{i \in \mathcal{T}} \sum_{t > T_0} \tau_{it}, $$

(2)

where 𝒯 denotes the set of treated units (not to be confused with the number of periods T).
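To make the notation concrete, the following R sketch builds a small simulated block-design panel and evaluates equations (1) and (2) directly. All names (`N_co`, `Y0`, `tau`, and so on) and all simulated values are illustrative assumptions, not taken from the paper.

```r
# Toy block-design panel: controls in the first rows, treated units last.
# All numbers here are made up for illustration.
set.seed(42)
N_co <- 6; N_tr <- 2; T0 <- 8; T_post <- 4
N  <- N_co + N_tr
TT <- T0 + T_post

# Untreated potential outcomes Y_it(0): unit effect + time effect + noise
alpha <- rnorm(N)
beta  <- seq(0, 1, length.out = TT)
Y0 <- outer(alpha, beta, "+") + matrix(rnorm(N * TT, sd = 0.1), N, TT)

# Treatment indicator D_it: 1 for treated units in post-treatment periods
D <- matrix(0, N, TT)
D[(N_co + 1):N, (T0 + 1):TT] <- 1

# Unit-period effects tau_it (hypothetical: a constant effect of -2)
tau <- matrix(-2, N, TT)

# Observed outcome, equation (1)
Y <- Y0 + D * tau

# ATT, equation (2): average of tau_it over treated unit-periods
att <- sum(D * tau) / (N_tr * (TT - T0))
```

With a constant effect the ATT equals that constant; with heterogeneous `tau` the same line averages over the treated block only.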

3 The SDiD Estimator

The SDiD estimator solves a weighted DiD regression. Define:

ω̂^sdid = (ω̂^co_1, ..., ω̂^co_Nco): unit weights for control units (as in SC)
λ̂^sdid = (λ̂_1, ..., λ̂_T₀): time weights for pre-treatment periods

The estimator is:

$$ \hat{\tau}^{\text{sdid}} = \arg \min_{\tau, \alpha_i, \beta_t} \sum_{i=1}^N \sum_{t=1}^T \hat{\omega}_i^{\text{co}} \hat{\lambda}_t (Y_{it} - \alpha_i - \beta_t - \tau D_{it})^2, $$

(3)

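where the unit weights ω̂_i^co and time weights λ̂_t are estimated in a preliminary step. By convention, treated units receive weight 1/N_tr and post-treatment periods receive weight 1/(T − T₀), so the estimated weights only reshape the control units and the pre-treatment periods.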

3.1 Step 1: Unit Weights

Unit weights are chosen to make the pre-treatment trends of the weighted control units match the trends of the treated units. Formally:

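$$ (\hat{\omega}_0, \hat{\omega}^{\text{sdid}}) = \arg \min_{\omega_0 \in \mathbb{R},\, \omega \ge 0,\, \sum_i \omega_i = 1} \sum_{t=1}^{T_0} \left( \bar{Y}_t^{\text{tr}} - \omega_0 - \sum_{i \in \text{co}} \omega_i Y_{it} \right)^2 + \zeta^2 T_0 \|\omega\|_2^2, $$

(4)

where Ȳ_t^tr is the mean outcome of treated units in period t, ω₀ is an intercept, and ζ is a regularisation parameter. The intercept lets the weighted controls track the treated units up to a constant level shift, which is why the weights match trends rather than levels; the shift is later absorbed by the unit fixed effects. The L₂ penalty discourages extreme weights and prevents overfitting.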
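The unit-weight problem is a ridge-penalised least squares problem on the simplex, so any solver for that class will do. Below is a minimal base-R sketch using a Frank-Wolfe iteration with exact line search, which is one common choice for simplex constraints. The function names `fw_simplex` and `unit_weights`, the `intercept` argument (which follows the intercept used in Arkhangelsky et al. [2021]), and the toy data are assumptions made for illustration; this is not the code of any particular package.

```r
# Minimise ||A w - b||^2 + eta * ||w||^2 subject to w >= 0, sum(w) = 1
# using Frank-Wolfe steps with exact line search (illustrative sketch).
fw_simplex <- function(A, b, eta, iters = 5000) {
  w <- rep(1 / ncol(A), ncol(A))                  # start from uniform weights
  for (k in seq_len(iters)) {
    grad <- 2 * drop(crossprod(A, A %*% w - b)) + 2 * eta * w
    d <- -w
    j <- which.min(grad)
    d[j] <- d[j] + 1                              # direction towards vertex e_j
    curv <- 2 * (sum((A %*% d)^2) + eta * sum(d^2))
    if (curv <= 0) break                          # already at that vertex
    step <- min(1, max(0, -sum(grad * d) / curv)) # exact minimiser on [0, 1]
    w <- w + step * d                             # stays inside the simplex
  }
  w
}

# Y: N x T outcome matrix, controls in the first N_co rows,
# pre-treatment periods in the first T0 columns (assumed layout).
unit_weights <- function(Y, N_co, T0, zeta, intercept = TRUE) {
  A <- t(Y[1:N_co, 1:T0, drop = FALSE])                     # T0 x N_co
  b <- colMeans(Y[(N_co + 1):nrow(Y), 1:T0, drop = FALSE])  # treated mean per period
  if (intercept) {        # profiling out the intercept = centring over time
    A <- sweep(A, 2, colMeans(A))
    b <- b - mean(b)
  }
  fw_simplex(A, b, eta = zeta^2 * T0)
}

# Toy check: the treated unit is a level-shifted mix of controls 1 and 2
set.seed(1)
N_co <- 5; T0 <- 10
Y <- matrix(rnorm((N_co + 1) * T0), N_co + 1, T0)
Y[N_co + 1, ] <- 0.7 * Y[1, ] + 0.3 * Y[2, ] + 3
omega <- unit_weights(Y, N_co, T0, zeta = 0)
```

Because the intercept is profiled out by centring, the level shift of 3 in the toy data does not affect the weights; setting `intercept = FALSE` would instead force the controls to match levels as in classical SC.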

3.2 Step 2: Time Weights

Time weights up-weight pre-treatment periods that best predict the post-treatment period:

$$ (\hat{\lambda}_0, \hat{\lambda}^{\text{sdid}}) = \arg \min_{\lambda_0 \in \mathbb{R},\, \lambda \ge 0,\, \sum_t \lambda_t = 1} \sum_{i \in \text{co}} \left( \bar{Y}_{i,\text{post}} - \lambda_0 - \sum_{t=1}^{T_0} \lambda_t Y_{it} \right)^2 + \zeta'^2 T_0^{-1} \|\lambda\|_2^2, $$

(5)

where Ȳ_i,post is the mean post-treatment outcome for control unit i, λ₀ is an intercept, and ζ′ is a regularisation parameter. This step has no analogue in pure SC or pure DiD; it allows SDiD to down-weight pre-treatment periods that are poor predictors of the post-treatment period.
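The time-weight problem has the same structure as the unit-weight problem with the roles of units and periods swapped. As a rough sketch, it can be handed to base R's general-purpose `optim()` by writing the weights as a softmax of unconstrained parameters, which keeps them positive and summing to one. This reparameterisation is a convenience chosen here, not the solver described in the paper; `time_weights`, `zeta_time`, the `intercept` argument, and the random-walk toy data are all assumptions for illustration.

```r
# Time weights via a softmax reparameterisation and optim() (illustrative sketch).
# Y: N x T outcome matrix, controls in the first N_co rows,
# pre-treatment periods in the first T0 columns (assumed layout).
time_weights <- function(Y, N_co, T0, zeta_time = 0, intercept = TRUE) {
  A <- Y[1:N_co, 1:T0, drop = FALSE]                        # N_co x T0
  b <- rowMeans(Y[1:N_co, (T0 + 1):ncol(Y), drop = FALSE])  # post mean per control
  if (intercept) {        # profiling out the intercept = centring across units
    A <- sweep(A, 2, colMeans(A))
    b <- b - mean(b)
  }
  softmax <- function(theta) {
    e <- exp(theta - max(theta))   # subtract max for numerical stability
    e / sum(e)
  }
  objective <- function(theta) {
    lambda <- softmax(theta)
    sum((b - A %*% lambda)^2) + zeta_time^2 / T0 * sum(lambda^2)
  }
  fit <- optim(rep(0, T0), objective, method = "BFGS")  # starts at uniform weights
  softmax(fit$par)
}

# Toy data: each control follows a random walk over 6 pre and 3 post periods
set.seed(2)
N_co <- 20; T0 <- 6; T_post <- 3
Y <- t(apply(matrix(rnorm(N_co * (T0 + T_post)), N_co), 1, cumsum))
lambda <- time_weights(Y, N_co, T0)
```

A softmax can only approach zero weights rather than reach them exactly, so a dedicated simplex solver is preferable when exact sparsity matters.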

3.3 Step 3: Weighted DiD Regression

Given the weights, τ̂^sdid is obtained by weighted least squares of Y_it on unit and time fixed effects and the treatment indicator D_it, using weights ω̂_i^co λ̂_t.
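In a block design this weighted regression has a closed form: a double difference of weighted means. The base-R sketch below checks the equivalence numerically on simulated data. The weights `omega` and `lambda` are arbitrary illustrative values rather than solutions of the Step 1 and Step 2 problems, treated units and post-treatment periods are given uniform weights, and all variable names are assumptions for this example.

```r
set.seed(1)
N_co <- 4; N_tr <- 2; T0 <- 5; T_post <- 3
N <- N_co + N_tr; TT <- T0 + T_post
Y <- matrix(rnorm(N * TT), N, TT)        # controls first, pre-periods first

omega  <- c(0.4, 0.3, 0.2, 0.1)          # illustrative unit weights (sum to 1)
lambda <- c(0.05, 0.10, 0.15, 0.30, 0.40) # illustrative time weights (sum to 1)

# Closed form: (treated post - treated weighted pre)
#            - (weighted control post - weighted control weighted pre)
a <- c(-omega,  rep(1 / N_tr,   N_tr))
b <- c(-lambda, rep(1 / T_post, T_post))
tau_dd <- drop(t(a) %*% Y %*% b)

# Same quantity via weighted two-way fixed effects regression
df <- data.frame(
  y    = as.vector(Y),                   # column-major: units vary fastest
  unit = factor(rep(1:N,  times = TT)),
  time = factor(rep(1:TT, each  = N))
)
df$D <- as.numeric(as.integer(df$unit) > N_co & as.integer(df$time) > T0)
w_unit <- c(omega,  rep(1 / N_tr,   N_tr))
w_time <- c(lambda, rep(1 / T_post, T_post))
df$w <- w_unit[as.integer(df$unit)] * w_time[as.integer(df$time)]

fit <- lm(y ~ D + unit + time, data = df, weights = w)
tau_wls <- unname(coef(fit)["D"])

all.equal(tau_dd, tau_wls)               # the two estimates coincide
```

Strictly positive weights are used here so that every fixed effect is identified in `lm()`; with estimated weights some controls or periods may receive zero weight, in which case the closed form is the more convenient route.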

4 Comparison to DiD and SC

Table 1 summarises how SDiD relates to its predecessors.

Table 1: Comparison of DiD, SC, and SDiD
Feature	DiD	SC	SDiD
Unit weights	Equal	Optimised	Optimised
Time weights	Equal	Equal	Optimised
Removes unit FE	Yes	No	Yes
Removes time FE	Yes	No	Yes
Pre-trend requirement	Parallel trends	Exact pre-fit	Approximate pre-fit
Number of treated units	Many	One	One or many
Variance estimation	Standard	Placebo (permutation)	Bootstrap/jackknife/placebo


The key insight is that SDiD combines the strengths of both methods: unit weights match pre-treatment trends (as in SC), and the double-differencing removes additive unit and time effects (as in DiD). Arkhangelsky et al. [2021] show that under a factor model with two-way fixed effects

$$ Y_{it}(0) = \alpha_i + \beta_t + \mathbf{u}_i' \mathbf{v}_t + \varepsilon_{it}, $$

(6)

SDiD consistently estimates the ATT under conditions where both DiD and SC may be biased.
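One way to see where this robustness comes from is to substitute (6) into the estimator. The lines below are a sketch that treats the weights as fixed and uses the fact that, in a block design, the weighted regression (3) reduces to a weighted double difference; the shorthand $\bar{\mathbf{u}}_{\text{tr}}$, $\bar{\mathbf{v}}_{\text{post}}$ for the average treated loading and average post-treatment factor is introduced here for illustration.

$$ \hat{\tau}^{\text{sdid}} = \Big( \bar{Y}_{\text{tr},\text{post}} - \sum_{t \le T_0} \hat{\lambda}_t \bar{Y}_t^{\text{tr}} \Big) - \sum_{i \in \text{co}} \hat{\omega}_i \Big( \bar{Y}_{i,\text{post}} - \sum_{t \le T_0} \hat{\lambda}_t Y_{it} \Big) $$

Because both sets of weights sum to one, the additive effects $\alpha_i$ and $\beta_t$ cancel, leaving

$$ \hat{\tau}^{\text{sdid}} - \tau^{\text{ATT}} = \Big( \bar{\mathbf{u}}_{\text{tr}} - \sum_{i \in \text{co}} \hat{\omega}_i \mathbf{u}_i \Big)' \Big( \bar{\mathbf{v}}_{\text{post}} - \sum_{t \le T_0} \hat{\lambda}_t \mathbf{v}_t \Big) + \text{(weighted average of } \varepsilon_{it}\text{)}. $$

The systematic error is a product of two imbalances, so it is small if either the unit weights balance the loadings or the time weights balance the factors. With uniform weights (plain DiD) the same expression becomes $(\bar{\mathbf{u}}_{\text{tr}} - \bar{\mathbf{u}}_{\text{co}})'(\bar{\mathbf{v}}_{\text{post}} - \bar{\mathbf{v}}_{\text{pre}})$, which has no reason to vanish when treated units load differently on trending factors.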

5 Statistical Inference

Arkhangelsky et al. [2021] propose three variance estimators for τ̂^sdid.

(a) Bootstrap over units. Resample units with replacement; recompute τ̂^sdid on each bootstrap sample.

(b) Jackknife over units. Leave one unit out at a time; estimate the variance from the leave-one-out distribution of τ̂^sdid.

(c) Placebo. Set the treated units aside, randomly assign placebo treatment to N_tr of the control units, and re-estimate; the variance of the placebo estimates serves as the variance estimate. This is the only one of the three that remains applicable when there is a single treated unit.

The placebo approach mirrors inference in SC [Abadie et al., 2010]. It is the natural choice when there are very few treated units, where the bootstrap and jackknife are unreliable or undefined, but it relies on the control units having a noise distribution similar to that of the treated units.
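The placebo procedure in (c) can be sketched in a few lines of base R. To keep the block self-contained, a plain block-design DiD (`did_block`) stands in for the estimator; in an actual SDiD analysis the `estimator` argument would be the full procedure, with weights re-estimated on every placebo draw. The function names, the default of 200 replications, and the simulated data are assumptions for illustration.

```r
# Plain DiD on a block-design matrix: controls in the first N_co rows,
# pre-treatment periods in the first T0 columns. Stand-in for tau-hat.
did_block <- function(Y, N_co, T0) {
  tr   <- (N_co + 1):nrow(Y)
  post <- (T0 + 1):ncol(Y)
  (mean(Y[tr, post]) - mean(Y[tr, 1:T0])) -
    (mean(Y[1:N_co, post]) - mean(Y[1:N_co, 1:T0]))
}

# Placebo standard error: relabel N_tr controls as treated, re-estimate,
# and take the spread of the placebo estimates.
placebo_se <- function(Y, N_co, T0, estimator = did_block, reps = 200) {
  N_tr <- nrow(Y) - N_co
  stopifnot(N_co > N_tr)          # controls must remain after relabelling
  taus <- replicate(reps, {
    idx <- sample(N_co)           # permute control rows; treated rows are dropped
    # the last N_tr rows of the permuted controls act as pseudo-treated units
    estimator(Y[idx, , drop = FALSE], N_co - N_tr, T0)
  })
  sqrt((reps - 1) / reps) * sd(taus)
}

# Toy usage: 10 controls, 1 treated unit, 8 pre-periods, 4 post-periods
set.seed(3)
Y  <- matrix(rnorm(11 * 12), 11, 12)
se <- placebo_se(Y, N_co = 10, T0 = 8)
```

Since the treated rows never enter the placebo estimates, the resulting standard error is only meaningful if the treated units are about as noisy as the controls.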

6 Application: California's Tobacco Tax

Arkhangelsky et al. [2021] replicate and extend Abadie et al. [2010]'s canonical application: the effect of California's Proposition 99 tobacco tax, passed in 1988 and in effect from January 1989, on per-capita cigarette sales. In the original SC analysis, California's per-capita sales dropped from roughly 100 packs per year before 1989 to around 60 packs by 2000, while the synthetic control (a weighted average of other states) remained near 90-95 packs.

SDiD produces a point estimate close to SC's (about −15.6 packs per capita versus −19.6, compared with −27.3 for DiD) with a smaller standard error (about 8.4, versus 9.9 for SC and 17.7 for DiD). The gain in precision reflects the time weights, which down-weight early pre-treatment periods, and the double-differencing, which removes unit-level heterogeneity.

7 Staggered Adoption

The original SDiD paper focused on a single adoption cohort. Extending to staggered treatment, where different units adopt at different times, requires care. Arkhangelsky et al. [2021] propose estimating cohort-specific treatment effects and aggregating them, analogous to the approach of Callaway and Sant'Anna [2021] in DiD. The synthdid package in R is built around the single-cohort (block) design, so a staggered analysis has to apply the estimator cohort by cohort and then average the results.
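The aggregation step can be sketched as a weighted average of cohort-specific estimates. Weighting each cohort by its number of treated unit-periods is one natural choice, since it mirrors the ATT definition in equation (2), but it is an assumption made here rather than a prescription from the paper. The function name `aggregate_cohorts` and the numbers in the example are hypothetical.

```r
# Combine cohort-specific estimates into one ATT, weighting each cohort by
# its treated unit-periods (illustrative choice of weights).
aggregate_cohorts <- function(tau, n_treated, t_post) {
  stopifnot(length(tau) == length(n_treated), length(tau) == length(t_post))
  w <- n_treated * t_post          # treated unit-periods per cohort
  sum(w * tau) / sum(w)
}

# Hypothetical example: two adoption cohorts
att <- aggregate_cohorts(
  tau       = c(-10, -20),         # made-up cohort estimates
  n_treated = c(1, 3),             # treated units per cohort
  t_post    = c(4, 2)              # post-treatment periods per cohort
)
```

Here the cohorts contribute 4 and 6 treated unit-periods, so the later cohort's estimate receives the larger weight.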

8 Available Software

The synthdid package in R (available on GitHub) implements:

synthdid_estimate(): compute τ̂^sdid; the companion functions sc_estimate() and did_estimate() return the SC and DiD estimates from the same inputs

vcov.synthdid_estimate(): variance estimation via bootstrap, jackknife, or placebo

synthdid_plot(): visualise treated and synthetic control trajectories with unit and time weights

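# synthdid is installed from GitHub (repository: synth-inference/synthdid)
# install.packages("remotes")
remotes::install_github("synth-inference/synthdid")
library(synthdid)

# California tobacco data (long panel: state x year)
data(california_prop99)
setup <- panel.matrices(california_prop99)

# Estimate SDiD, SC, DiD on the same outcome matrix
tau_sdid <- synthdid_estimate(setup$Y, setup$N0, setup$T0)
tau_sc   <- sc_estimate(setup$Y, setup$N0, setup$T0)
tau_did  <- did_estimate(setup$Y, setup$N0, setup$T0)

# Standard error via the placebo method: with a single treated unit
# (California) the bootstrap and jackknife variances are not available
se_sdid <- sqrt(vcov(tau_sdid, method = "placebo"))

cat(sprintf("SDiD: %.2f (SE: %.2f)\n", as.numeric(tau_sdid), as.numeric(se_sdid)))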

9 Conclusion

Synthetic DiD represents a meaningful methodological advance: it inherits SC's ability to construct data-driven comparison groups that match pre-treatment trends, while adding DiD's robustness to time-invariant confounders. Under a factor model, it is consistent and asymptotically normal, which justifies conventional confidence intervals. The method is particularly valuable in settings with moderate numbers of control units and pre-treatment periods, where pure SC may overfit and pure DiD may rest on implausible parallel trends assumptions.

References

Abadie, A., Diamond, A., and Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program. Journal of the American Statistical Association, 105(490):493-505.
Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., and Wager, S. (2021). Synthetic difference-in-differences. American Economic Review, 111(12):4088-4118.
Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
