The Causal Review

Synthetic Difference-in-Differences: Arkhangelsky et al. (2021)

Motivation: Combining Two Approaches

Difference-in-differences (DiD) and synthetic control (SC) are two of the most widely used quasi-experimental methods in economics. DiD compares treated units to control units before and after treatment, relying on parallel trends. SC constructs a weighted combination of control units to match the pre-treatment trajectory of the treated unit, without requiring parallel trends but requiring a good pre-treatment fit. Both methods have well-known limitations: DiD can fail when control units follow different trends; SC is designed for a single treated unit and can be sensitive to the choice of donor pool.

Arkhangelsky et al. (2021) proposed the synthetic difference-in-differences (SDiD) estimator, which combines the two approaches. SDiD re-weights control units as in SC (so that the weighted controls run parallel to the treated units before treatment) while also taking differences over time as in DiD (to remove time-invariant confounders). The result is an estimator that performs well in settings where neither pure DiD nor pure SC is ideal, and that has desirable efficiency properties under a factor model for the outcome.

Setup

Consider a balanced panel of \(N\) units observed over \(T\) periods, with \(N_{\text{tr}}\) treated units that adopt treatment at period \(T_0 + 1\) and \(N_{\text{co}} = N - N_{\text{tr}}\) control units that never adopt. Denote \(Y_{it}(0)\) as the potential untreated outcome and \(Y_{it}(1)\) as the potential treated outcome.

We observe: \[\begin{equation} Y_{it} = Y_{it}(0) + D_{it} \cdot \tau_{it}, \end{equation}\] where \(D_{it} = 1\) if unit \(i\) is treated in period \(t\) (and 0 otherwise) and \(\tau_{it} = Y_{it}(1) - Y_{it}(0)\) is the unit-period treatment effect. The estimand is the average treatment effect on the treated (ATT): \[\begin{equation} \tau^{\text{ATT}} = \frac{1}{N_{\text{tr}}(T - T_0)} \sum_{i \in \mathcal{T}} \sum_{t > T_0} \tau_{it}, \label{eq:att} \end{equation}\] where \(\mathcal{T}\) is the set of treated units.
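The notation can be made concrete with a small Python sketch. It assumes a block design stored as an \(N \times T\) array with control units in the first rows and pre-treatment periods in the first \(T_0\) columns; the function names and the made-up effect values are hypothetical and serve only to illustrate the ATT formula.

```python
import numpy as np

def block_treatment_matrix(n_units, n_periods, n_treated, t0):
    """D[i, t] = 1 for the last `n_treated` units in the periods after the first t0."""
    D = np.zeros((n_units, n_periods), dtype=int)
    D[n_units - n_treated:, t0:] = 1
    return D

def att(tau, D):
    """Average of the unit-period effects tau[i, t] over treated cells only."""
    return tau[D == 1].mean()

# Hypothetical example: N = 5 units, T = 4 periods, N_tr = 2, T_0 = 2
D = block_treatment_matrix(5, 4, 2, 2)
tau = np.arange(20, dtype=float).reshape(5, 4)  # made-up unit-period effects
print(D.sum(), att(tau, D))  # number of treated cells, and their average effect
```

The number of treated cells is \(N_{\text{tr}}(T - T_0)\), which is the denominator in the ATT definition.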

The SDiD Estimator

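The SDiD estimator solves a weighted two-way fixed effects (DiD) regression. Define:

\(\hat{\omega}^{\text{sdid}} = (\hat{\omega}_1^{\text{co}}, \ldots, \hat{\omega}_{N_{\text{co}}}^{\text{co}})\): unit weights for control units (as in SC), non-negative and summing to one
\(\hat{\lambda}^{\text{sdid}} = (\hat{\lambda}_1, \ldots, \hat{\lambda}_{T_0})\): time weights for pre-treatment periods, non-negative and summing to one

Treated units and post-treatment periods receive uniform weights: \(\hat{\omega}_i = 1/N_{\text{tr}}\) for \(i \in \mathcal{T}\) and \(\hat{\lambda}_t = 1/(T - T_0)\) for \(t > T_0\), while \(\hat{\omega}_i = \hat{\omega}_i^{\text{co}}\) for control units. The estimator is: \[\begin{equation} \left(\hat{\tau}^{\text{sdid}}, \hat{\alpha}, \hat{\beta}\right) = \arg\min_{\tau, \alpha, \beta} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{\omega}_i \hat{\lambda}_t \left(Y_{it} - \alpha_i - \beta_t - \tau D_{it}\right)^2, \label{eq:sdid} \end{equation}\] where the control-unit weights \(\hat{\omega}_i^{\text{co}}\) and the pre-treatment time weights \(\hat{\lambda}_t\) are estimated in a preliminary step.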

Step 1: Unit Weights

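Unit weights are chosen so that the weighted average of control units tracks the pre-treatment path of the treated units up to a constant shift \(\omega_0\). Formally: \[\begin{equation} \left(\hat{\omega}_0, \hat{\omega}^{\text{sdid}}\right) = \arg\min_{\omega_0 \in \mathbb{R},\, \omega \geq 0,\, \sum \omega = 1} \sum_{t=1}^{T_0} \left(\bar{Y}_{t}^{\text{tr}} - \omega_0 - \sum_{i \in \text{co}} \omega_i Y_{it}\right)^2 + \zeta^2 T_0 \|\omega\|_2^2, \label{eq:unitweights} \end{equation}\] where \(\bar{Y}_{t}^{\text{tr}}\) is the mean outcome of treated units in period \(t\), and \(\zeta\) is a regularisation parameter. The intercept \(\omega_0\) is what distinguishes these weights from standard SC weights: only trends, not levels, need to match, because the unit fixed effects in the final regression absorb level differences. The \(L_2\) penalty discourages concentrating weight on a few units and helps prevent overfitting.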
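For the regularisation parameter, Arkhangelsky et al. (2021) suggest a data-driven choice. It is restated here as a sketch in this article's notation; the symbols \(\Delta_{it}\), \(\bar{\Delta}\) and \(\hat{\sigma}\) are introduced only for this purpose: \[\begin{equation} \zeta = \left(N_{\text{tr}} (T - T_0)\right)^{1/4} \hat{\sigma}, \qquad \hat{\sigma}^2 = \frac{1}{N_{\text{co}} (T_0 - 1)} \sum_{i \in \text{co}} \sum_{t=1}^{T_0 - 1} \left(\Delta_{it} - \bar{\Delta}\right)^2, \end{equation}\] where \(\Delta_{it} = Y_{i,t+1} - Y_{it}\) is the one-period change for control unit \(i\) and \(\bar{\Delta}\) is the average of these changes over control units and pre-treatment periods. The penalty therefore scales with the typical one-period movement in control outcomes and grows slowly with the number of treated cells.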

Step 2: Time Weights

Time weights up-weight the pre-treatment periods that best predict post-treatment outcomes among control units, again up to a constant shift \(\lambda_0\): \[\begin{equation} \left(\hat{\lambda}_0, \hat{\lambda}^{\text{sdid}}\right) = \arg\min_{\lambda_0 \in \mathbb{R},\, \lambda \geq 0,\, \sum \lambda = 1} \sum_{i \in \text{co}} \left(\bar{Y}_{i,\text{post}} - \lambda_0 - \sum_{t=1}^{T_0} \lambda_t Y_{it}\right)^2 + \zeta^{\prime 2} N_{\text{co}} \|\lambda\|_2^2, \label{eq:timeweights} \end{equation}\] where \(\bar{Y}_{i,\text{post}}\) is the mean post-treatment outcome for control unit \(i\) and \(\zeta^{\prime}\) is a very small regularisation parameter whose main role is to make the solution unique. This step has no analogue in pure SC or pure DiD; it allows SDiD to down-weight pre-treatment periods that look unlike the post-treatment period.

Step 3: Weighted DiD Regression

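Given the weights, \(\hat{\tau}^{\text{sdid}}\) is obtained by weighted least squares of \(Y_{it}\) on unit and time fixed effects and the treatment indicator \(D_{it}\), using weights \(\hat{\omega}_i \hat{\lambda}_t\). Control units and pre-treatment periods carry the estimated weights from Steps 1 and 2; treated units and post-treatment periods carry uniform weights \(1/N_{\text{tr}}\) and \(1/(T - T_0)\).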
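The three steps can be sketched end to end in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the panel is an \(N \times T\) array with controls in the first `n_co` rows and pre-treatment periods in the first `t0` columns; the solver is plain projected gradient descent, chosen for brevity; the regularisation values are left as arguments; and all function names are hypothetical. Step 3 uses the fact that, in a block design, the weighted regression's \(\hat{\tau}\) equals a weighted double difference.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def simplex_ridge(A, b, ridge, n_iter=5000):
    """Minimise ||w0 + A w - b||^2 + ridge * ||w||^2 over simplex weights w
    and a free intercept w0 (profiled out by centring A's columns and b)."""
    A = A - A.mean(axis=0)
    b = b - b.mean()
    w = np.full(A.shape[1], 1.0 / A.shape[1])
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    step = 1.0 / (2.0 * (np.linalg.norm(A, 2) ** 2 + ridge) + 1e-12)
    for _ in range(n_iter):
        grad = 2.0 * (A.T @ (A @ w - b) + ridge * w)
        w = project_simplex(w - step * grad)
    return w

def double_diff(Y, n_co, t0, omega, lam):
    """Weighted double difference (Step 3 in closed form for a block design)."""
    change = Y[:, t0:].mean(axis=1) - Y[:, :t0] @ lam  # post mean minus weighted pre mean
    return change[n_co:].mean() - omega @ change[:n_co]

def sdid(Y, n_co, t0, zeta=0.0, ridge_time=1e-6):
    """Illustrative SDiD: unit weights, time weights, then the double difference."""
    Y_co_pre = Y[:n_co, :t0]
    # Step 1: controls (columns) reproduce the treated mean path over pre-periods
    omega = simplex_ridge(Y_co_pre.T, Y[n_co:, :t0].mean(axis=0), zeta ** 2 * t0)
    # Step 2: pre-periods (columns) reproduce each control's post-treatment mean
    lam = simplex_ridge(Y_co_pre, Y[:n_co, t0:].mean(axis=1), ridge_time)
    return double_diff(Y, n_co, t0, omega, lam), omega, lam

# Noise-free toy panel (made-up numbers): Y(0) = alpha_i + u_i * v_t, true effect = 3.
alpha = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 1.0])
u = np.array([0.0, 1.0, 2.0, 3.0, 2.5, 2.5])  # last two units are treated
v = np.arange(6.0)                             # trending common factor
n_co, t0 = 4, 4
Y = alpha[:, None] + u[:, None] * v[None, :]
Y[n_co:, t0:] += 3.0

tau_sdid, omega, lam = sdid(Y, n_co, t0)
tau_did = double_diff(Y, n_co, t0, np.full(n_co, 1 / n_co), np.full(t0, 1 / t0))
print("DiD:", tau_did, "SDiD:", tau_sdid)
```

In this toy panel the treated units load more heavily on the trending factor than the average control, so uniform weights (plain DiD) overstate the effect, while the unit weights can rebalance the controls to remove the discrepancy. Passing uniform weights to `double_diff` recovers DiD as a special case.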

Comparison to DiD and SC

The table below summarises how SDiD relates to its predecessors.

Feature	DiD	SC	SDiD
Unit weights	Equal	Optimised	Optimised
Time weights	Equal	Equal	Optimised
Removes unit FE	Yes	No	Yes
Removes time FE	Yes	Yes	Yes
Pre-treatment requirement	Parallel trends	Close pre-fit in levels	Close pre-fit up to a constant shift
Number of treated units	Many	One (typically)	One or many
Variance estimation	Clustered standard errors	Placebo/permutation	Bootstrap/jackknife/placebo

Comparison of DiD, SC, and SDiD

The key insight is that SDiD combines the strengths of both methods: unit weights match pre-treatment trends (as in SC), and the double-differencing removes additive unit and time effects (as in DiD). Arkhangelsky et al. (2021) show that under a latent factor model with two-way fixed effects \[\begin{equation} Y_{it}(0) = \alpha_i + \beta_t + \mathbf{u}_i' \mathbf{v}_t + \varepsilon_{it}, \end{equation}\] SDiD consistently estimates the ATT under conditions where both DiD and SC may be biased.
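A short derivation shows where this robustness comes from. It is a sketch that treats the weights as fixed, and the symbols \(a_i\) and \(c_t\) are introduced here only for illustration. In the block design, the solution of the weighted regression is a weighted double difference, \(\hat{\tau}^{\text{sdid}} = \sum_i \sum_t a_i c_t Y_{it}\), with \(a_i = 1/N_{\text{tr}}\) for treated units and \(a_i = -\hat{\omega}_i^{\text{co}}\) for controls, and \(c_t = 1/(T - T_0)\) for post-treatment periods and \(c_t = -\hat{\lambda}_t\) for pre-treatment periods. Because \(\sum_i a_i = 0\) and \(\sum_t c_t = 0\), substituting the factor model eliminates \(\alpha_i\) and \(\beta_t\) and leaves \[\begin{equation} \hat{\tau}^{\text{sdid}} - \tau^{\text{ATT}} = \Big(\bar{\mathbf{u}}_{\text{tr}} - \sum_{i \in \text{co}} \hat{\omega}_i^{\text{co}} \mathbf{u}_i\Big)' \Big(\bar{\mathbf{v}}_{\text{post}} - \sum_{t=1}^{T_0} \hat{\lambda}_t \mathbf{v}_t\Big) + \sum_{i=1}^{N} \sum_{t=1}^{T} a_i c_t \varepsilon_{it}, \end{equation}\] where \(\bar{\mathbf{u}}_{\text{tr}}\) is the average loading of the treated units and \(\bar{\mathbf{v}}_{\text{post}}\) is the average factor over post-treatment periods. The first term is the product of a unit imbalance and a time imbalance, so it is small whenever either set of weights balances well. DiD corresponds to uniform weights, for which neither imbalance need be small.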

Statistical Inference

Arkhangelsky et al. (2021) propose three variance estimators for \(\hat{\tau}^{\text{sdid}}\):

Bootstrap over units. Resample units with replacement; recompute \(\hat{\tau}^{\text{sdid}}\) on each bootstrap sample. This works best when there are many treated units.
Jackknife over units. Leave one unit out at a time; estimate variance from the leave-one-out distribution of \(\hat{\tau}\). It is not defined when there is a single treated unit.
Placebo (permutation) inference. Drop the treated units, randomly assign placebo treatment to \(N_{\text{tr}}\) control units, and use the variance of the placebo \(\hat{\tau}\) across draws. This is especially useful when \(N_{\text{tr}}\) is small.

The placebo approach mirrors inference in SC (Abadie et al. (2010)) and is the only one of the three that remains available with a single treated unit. Its cost is an added assumption: treated and control units must share the same noise distribution (homoskedasticity across units).
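The placebo procedure can be sketched in a few lines of Python. The sketch assumes an \(N \times T\) array with controls in the first `n_co` rows and takes the estimator as an argument; a plain DiD function stands in for SDiD to keep the block self-contained, and all function names and the simulated panel are hypothetical.

```python
import numpy as np

def did(Y, n_co, t0):
    """Plain DiD on an (N, T) panel with controls in the first n_co rows."""
    change = Y[:, t0:].mean(axis=1) - Y[:, :t0].mean(axis=1)
    return change[n_co:].mean() - change[:n_co].mean()

def placebo_variance(Y, n_co, t0, estimator, n_reps=200, seed=0):
    """Variance of placebo estimates computed on control units only."""
    rng = np.random.default_rng(seed)
    n_tr = Y.shape[0] - n_co
    if n_co <= n_tr:
        raise ValueError("placebo inference needs more control than treated units")
    Y_co = Y[:n_co]  # the actual treated rows are discarded
    draws = np.empty(n_reps)
    for r in range(n_reps):
        perm = rng.permutation(n_co)
        # After shuffling, the last n_tr control rows act as pseudo-treated units
        draws[r] = estimator(Y_co[perm], n_co - n_tr, t0)
    return draws.var()

# Hypothetical panel: 10 controls, 2 treated units, 5 pre- and 3 post-periods
rng = np.random.default_rng(1)
Y = rng.normal(size=(12, 8))
var_hat = placebo_variance(Y, n_co=10, t0=5, estimator=did)
print("placebo standard error:", var_hat ** 0.5)
```

Any estimator with the same `(Y, n_co, t0)` signature can be passed in; for SDiD, that function would re-estimate the unit and time weights on each placebo draw.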

Application: California's Tobacco Tax

Arkhangelsky et al. (2021) revisit the canonical application of Abadie et al. (2010): the effect of California's Proposition 99, a tobacco tax and control programme passed in November 1988 and in effect from January 1989, on per-capita cigarette sales. In the original SC analysis, California's per-capita sales fell from about 90 packs per year in 1988 to roughly 40 packs by 2000, and the estimated gap relative to the synthetic control (a weighted average of other states) reached about 26 packs by 2000.

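In their estimates of the average post-treatment effect, SDiD gives a reduction of roughly 15.6 packs per capita (standard error 8.4), compared with roughly 19.6 (9.9) for SC and 27.3 (17.7) for DiD. SDiD's estimate is therefore somewhat smaller in magnitude than SC's and has the smallest standard error of the three. This is consistent with an efficiency gain from time weights that down-weight early pre-treatment periods and from the double-differencing that removes unit-level heterogeneity.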

Staggered Adoption

The original SDiD paper focused on a single adoption cohort. Extending to staggered treatment, where different units adopt at different times, requires care. Arkhangelsky et al. (2021) propose estimating cohort-specific treatment effects and aggregating them, analogous to the approach of Callaway and Sant'Anna (2021) in DiD. The synthdid package in R is built around the single-adoption (block) design, so a staggered analysis is typically carried out by applying the estimator cohort by cohort and then combining the results.
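The aggregation step can be illustrated with a small Python sketch. It assumes one natural choice of weights, namely each cohort's share of treated unit-period cells, which matches the ATT definition used above; the function name and the numbers are hypothetical, and the cohort-level estimates are taken as given.

```python
def aggregate_cohorts(estimates, n_treated_units, n_post_periods):
    """Average cohort-specific estimates, weighting by treated unit-period cells."""
    cells = [n * t for n, t in zip(n_treated_units, n_post_periods)]
    total = sum(cells)
    return sum(c * e for c, e in zip(cells, estimates)) / total

# Hypothetical cohorts: 2 units treated for 5 periods, 1 unit treated for 10 periods
overall = aggregate_cohorts(
    estimates=[-10.0, -20.0],
    n_treated_units=[2, 1],
    n_post_periods=[5, 10],
)
print(overall)
```

Each cohort-specific estimate would come from running the estimator on that cohort's treated units together with units not yet treated or never treated.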

Available Software

The synthdid package in R (available on GitHub) implements:

synthdid_estimate(): compute \(\hat{\tau}^{\text{sdid}}\); the companion functions sc_estimate() and did_estimate() compute the SC and DiD estimates on the same inputs
vcov.synthdid_estimate(): variance estimation via bootstrap, jackknife, or placebo
synthdid_plot(): visualise treated and synthetic control trajectories with unit and time weights

Conclusion

Synthetic DiD represents a meaningful methodological advance: it inherits SC's ability to construct data-driven comparison groups that match pre-treatment trends, while adding DiD's robustness to time-invariant confounders. Under a latent factor model, Arkhangelsky et al. (2021) show that it is consistent and asymptotically normal. The method is particularly valuable in settings with moderate numbers of control units and pre-treatment periods, where pure SC may overfit and pure DiD may rest on implausible parallel trends assumptions.

References

Abadie, A., Diamond, A., and Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program. Journal of the American Statistical Association, 105(490):493–505.
Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., and Wager, S. (2021). Synthetic difference-in-differences. American Economic Review, 111(12):4088–4118.
Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200–230.
