1 What Problem Does This Tool Solve?
The synthetic control method [Abadie et al., 2010] addresses the evaluation of a policy applied to a single aggregate unit (a state, country, or city) when no single comparison unit is a credible counterfactual. Instead, it constructs a weighted combination of control units, a "synthetic" version of the treated unit, whose pre-treatment characteristics closely match those of the treated unit. The Synth package [Abadie et al., 2011] is the reference implementation of the original Abadie-Diamond-Hainmueller (ADH) estimator. It provides functions for:
- Constructing the synthetic control by solving the nested optimisation problem.
- Summarising the balance between the treated unit and its synthetic counterpart.
- Producing the canonical "path plot" (treated vs synthetic trajectories) and "gap plot" (difference over time).
- Conducting permutation inference (placebo studies in space).
2 Installation and Setup
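A minimal setup sketch, assuming Synth is installed from CRAN and that the basque data set bundled with the package (the panel behind the Basque Country study) serves as the running example. The CRAN mirror URL is an assumption; any mirror works.

```r
# Install Synth from CRAN only if it is not already available.
if (!requireNamespace("Synth", quietly = TRUE)) {
  install.packages("Synth", repos = "https://cloud.r-project.org")
}
library(Synth)

# Load the example panel shipped with the package (long format:
# one row per region-year).
data(basque)
str(basque[, c("regionno", "regionname", "year", "gdpcap")])
```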
3 Data Preparation with dataprep()
The dataprep() function structures your panel data into the matrices required by the optimisation routine. It requires:
- A balanced panel (long format: one row per unit-period).
- A single treated unit with a known treatment start period.
- A donor pool of untreated units.
The special.predictors argument allows you to include lagged values of the outcome at specific time points as predictors, which is standard practice for improving pre-treatment fit on the outcome trajectory.
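As an illustration, the call below follows the Basque Country example documented with the package. The variable names, years, and unit identifiers are specific to the bundled basque data set and would need to be replaced for other panels; treat it as a sketch of the argument structure rather than a recommended specification.

```r
library(Synth)
data(basque)

dataprep.out <- dataprep(
  foo = basque,
  # Ordinary predictors, averaged over time.predictors.prior.
  predictors = c("school.illit", "school.prim", "school.med",
                 "school.high", "school.post.high", "invest"),
  predictors.op = "mean",
  time.predictors.prior = 1964:1969,
  # Special predictors: list(variable, years, operator). A single-year
  # outcome lag would be written as list("gdpcap", 1965, "mean").
  special.predictors = list(
    list("gdpcap", 1960:1969, "mean"),
    list("sec.agriculture", seq(1961, 1969, 2), "mean"),
    list("sec.energy", seq(1961, 1969, 2), "mean"),
    list("sec.industry", seq(1961, 1969, 2), "mean"),
    list("sec.construction", seq(1961, 1969, 2), "mean"),
    list("sec.services.venta", seq(1961, 1969, 2), "mean"),
    list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
    list("popdens", 1969, "mean")
  ),
  dependent = "gdpcap",
  unit.variable = "regionno",
  unit.names.variable = "regionname",
  time.variable = "year",
  treatment.identifier = 17,          # Basque Country
  controls.identifier = c(2:16, 18),  # donor pool (region 1 is Spain as a whole)
  time.optimize.ssr = 1960:1969,      # periods over which outcome fit is optimised
  time.plot = 1955:1997               # periods kept for plotting
)

# The returned list holds the matrices used by the optimiser.
dim(dataprep.out$X0)   # predictors x donors
dim(dataprep.out$Z0)   # pre-treatment outcome periods x donors
```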
4 Running the Optimisation with synth()
The synth() function solves the nested optimisation: for a given predictor weight vector v, it finds the unit weights w(v) that minimise the v-weighted predictor imbalance; it then searches over v to minimise the pre-treatment mean squared prediction error (MSPE) of the outcome variable. The result is a list containing:
- synth.out$solution.w: the optimal unit weights wⱼ (non-negative, summing to one).
- synth.out$solution.v: the optimal predictor weights vₛ.
- synth.out$loss.v: the pre-treatment MSPE of the outcome.
- synth.out$loss.w: the v-weighted predictor imbalance at the optimal unit weights.
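The nested structure described above can be sketched in base R on toy data. This is not Synth's implementation: Synth solves the inner problem with a constrained quadratic programming routine, whereas the sketch below swaps in a softmax reparameterisation so that plain optim() keeps the weights on the simplex. All object names and data are hypothetical.

```r
set.seed(42)

# Toy data: 3 predictors (rows) and 5 pre-treatment outcome periods (rows)
# for 4 donors (columns). The treated unit is built as a mix of donors 1 and 2.
X0 <- matrix(c(1.0, 2.0, 3.0, 4.0,
               2.0, 1.0, 4.0, 3.0,
               0.5, 1.5, 1.0, 2.0), nrow = 3, byrow = TRUE)
Z0 <- matrix(rnorm(5 * 4, mean = 10), nrow = 5)
true_w <- c(0.6, 0.4, 0, 0)
X1 <- X0 %*% true_w
Z1 <- Z0 %*% true_w

# Map unconstrained parameters to non-negative weights that sum to one.
softmax <- function(theta) {
  e <- exp(theta - max(theta))
  e / sum(e)
}

# Inner problem: for fixed predictor weights v, choose unit weights w
# to minimise the v-weighted predictor imbalance.
inner_w <- function(v, X1, X0) {
  loss <- function(theta) {
    d <- X1 - X0 %*% softmax(theta)
    sum(v * d^2)
  }
  fit <- optim(rep(0, ncol(X0)), loss, method = "BFGS")
  softmax(fit$par)
}

# Outer problem: choose v to minimise the pre-treatment MSPE of the outcome.
outer_loss <- function(log_v, X1, X0, Z1, Z0) {
  w <- inner_w(softmax(log_v), X1, X0)
  mean((Z1 - Z0 %*% w)^2)
}

outer <- optim(rep(0, nrow(X0)), outer_loss,
               X1 = X1, X0 = X0, Z1 = Z1, Z0 = Z0,
               method = "Nelder-Mead", control = list(maxit = 200))

v_hat <- softmax(outer$par)
w_hat <- inner_w(v_hat, X1, X0)
mspe  <- mean((Z1 - Z0 %*% w_hat)^2)

round(w_hat, 3)
mspe
```

With real data the same two quantities correspond to solution.w and the pre-treatment MSPE reported by synth().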
5 Summarising Results with synth.tab()
The tab.pred component of the synth.tab() output shows the pre-treatment values of each predictor for the treated unit, the synthetic control, and the simple donor-pool average. A good synthetic control should closely match the treated unit on the predictors, and should do so better than the donor-pool average; substantial imbalance on a predictor is a warning sign. The tab.w table lists all donor units and their weights. In the Basque study, the synthetic Basque Country is composed almost entirely of Catalonia (about 85%) and Madrid (about 15%).
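To make the three columns of the balance table concrete, the sketch below rebuilds them by hand in base R. The predictor names, donor labels, values, and weights are all made up for illustration; with real results the same comparison comes from synth.tab(dataprep.res = ..., synth.res = ...).

```r
# Hypothetical predictor values: rows are predictors, columns are units.
preds <- c("invest", "school.high", "gdpcap.lag")
X1 <- matrix(c(10, 0.30, 5.2), ncol = 1, dimnames = list(preds, "treated"))
X0 <- matrix(c(9.00, 12.00, 20.0,
               0.28,  0.36,  0.1,
               5.00,  5.60,  3.0),
             nrow = 3, byrow = TRUE,
             dimnames = list(preds, c("A", "B", "C")))

# Hypothetical unit weights (non-negative, summing to one).
w <- c(A = 0.7, B = 0.3, C = 0)

balance <- data.frame(
  treated     = X1[, 1],
  synthetic   = as.numeric(X0 %*% w),  # weighted donor combination
  sample_mean = rowMeans(X0)           # unweighted donor-pool average
)
balance$abs_gap <- abs(balance$treated - balance$synthetic)
balance
```

In this toy table the synthetic column sits much closer to the treated column than the unweighted donor average does, which is the pattern to look for.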
6 Plotting: Path and Gap Plots
A good synthetic control will show:
- Path plot: The synthetic control trajectory closely overlapping the treated unit's trajectory before the treatment year.
- Gap plot: The gap hovering near zero in the pre-treatment period, then diverging after treatment.
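Both plots are built from the same two series: the treated outcome and the weighted donor outcome. The base R sketch below computes them on toy data and draws rough equivalents of the path and gap plots; the data, weights, and treatment year are invented, and the treated series is constructed to fit perfectly before treatment.

```r
years <- 2001:2010
treat_year <- 2007                           # first treated period (assumed)

# Donor outcomes (periods x donors) and hypothetical unit weights.
Y0 <- cbind(A = 1 * (1:10), B = 2 * (1:10))
w  <- c(A = 0.5, B = 0.5)

synthetic <- as.numeric(Y0 %*% w)            # synthetic control trajectory
Y1 <- synthetic + ifelse(years >= treat_year, -1, 0)  # treated trajectory
gap <- Y1 - synthetic                        # treated minus synthetic

# Path plot: treated (solid) versus synthetic (dashed).
plot(years, Y1, type = "l", xlab = "year", ylab = "outcome")
lines(years, synthetic, lty = 2)
abline(v = treat_year, lty = 3)

# Gap plot: difference over time, with a zero reference line.
plot(years, gap, type = "l", xlab = "year", ylab = "gap (treated - synthetic)")
abline(h = 0, lty = 2)
abline(v = treat_year, lty = 3)
```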
7 Permutation Inference: Placebo Studies
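To assess statistical significance, apply the synthetic control method to each control unit in turn, as if it had been treated ("placebos in space").
The key diagnostic is the ratio of post-treatment MSPE to pre-treatment MSPE. If the treated unit's ratio is larger than that of all (or nearly all) placebo units, an effect of that size is unlikely to have arisen by chance. The permutation p-value is the share of units, treated unit included, whose ratio is at least as large as the treated unit's, so with J donors it cannot fall below 1/(J+1). When comparing raw gap trajectories rather than ratios, discard placebos with poor pre-treatment fit; a cutoff of twice the treated unit's pre-treatment MSPE is one common choice, and Abadie et al. [2010] show results for several cutoffs.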
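The ratio-based test can be sketched in base R once the gap series are in hand. The matrix below is invented toy data with one treated unit and four placebo units; in practice each placebo column would come from re-running the synthetic control with that control unit treated as if it had received the intervention.

```r
# Rows are periods, columns are units (gap = actual minus synthetic).
gaps <- cbind(
  treated = c(0.1, -0.1,  0.1, -0.1, -1.0, -1.2, -1.1),
  c1      = c(0.2, -0.2,  0.2, -0.2,  0.3, -0.3,  0.2),
  c2      = c(0.1,  0.1, -0.1, -0.1,  0.1, -0.2,  0.1),
  c3      = c(0.5, -0.5,  0.4, -0.4,  0.5, -0.6,  0.4),
  c4      = c(0.3, -0.2,  0.2, -0.3,  0.2,  0.3, -0.2)
)
pre  <- 1:4   # pre-treatment rows
post <- 5:7   # post-treatment rows

mspe_pre  <- colMeans(gaps[pre, ]^2)
mspe_post <- colMeans(gaps[post, ]^2)
ratio     <- mspe_post / mspe_pre

# Permutation p-value: share of units (treated included) whose ratio is
# at least as large as the treated unit's.
p_value <- mean(ratio >= ratio["treated"])

round(sort(ratio, decreasing = TRUE), 2)
p_value
```

Here the treated unit has the largest ratio, so the p-value equals 1/5, the smallest value attainable with four placebo units.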
8 Key Options and Pitfalls
8.1 Predictor Choice
Include pre-treatment lags of the outcome (typically 3-4 time points spanning the pre-period) as special predictors. This ensures the synthetic control matches the treated unit's outcome trajectory. Poor pre-treatment fit on the outcome is the primary warning sign of an invalid synthetic control.
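A short sketch of how such lags can be generated programmatically. Each entry follows the list(variable, years, operator) format used by the special.predictors argument; the outcome name "gdpcap" and the three years are assumptions borrowed from the Basque example.

```r
# Three outcome lags spread across a 1960-1969 pre-treatment window.
lag_years <- c(1960, 1965, 1969)
outcome_lags <- lapply(lag_years, function(yr) list("gdpcap", yr, "mean"))

str(outcome_lags)
```

The resulting list can be passed as special.predictors = outcome_lags, or combined with further entries using c().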
8.2 Balanced Panel Requirement
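Synth requires a balanced panel: every unit must be observed in every time period. If your data have gaps, impute the missing values or restrict the sample to units and periods with complete data. Do not assume that other packages lift this requirement: SCtools, for instance, builds on Synth's own routines, so check a package's documentation (for example, that of augsynth) before using it with unbalanced data.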
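A quick check along these lines can be run before calling dataprep(). The helper below is hypothetical, written in base R, and only tests the two conditions named in its comment.

```r
# Hypothetical helper: TRUE when every unit appears exactly once in every
# period and the outcome column has no missing values.
is_balanced <- function(df, unit, time, outcome) {
  counts <- table(df[[unit]], df[[time]])
  all(counts == 1) && !anyNA(df[[outcome]])
}

# Toy panel: 3 units observed over 4 years.
panel <- expand.grid(unit = c("A", "B", "C"), year = 2001:2004)
panel$y <- seq_len(nrow(panel))

is_balanced(panel, "unit", "year", "y")        # complete panel: TRUE
is_balanced(panel[-5, ], "unit", "year", "y")  # one unit-year dropped: FALSE
```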
8.3 Optimisation Failures
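The nested optimisation occasionally fails to converge or produces degenerate solutions. By default synth() already tries two starting values for the predictor weights and keeps the better result. To check robustness further, rerun it with optimxmethod = "All", which tries every optimisation algorithm available through optimx and keeps the best one, or with genoud = TRUE, which adds a global search step. Large changes in the weights or in the pre-treatment MSPE across runs indicate an unstable solution.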
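One way to organise such a check is a small wrapper that refits the same dataprep() object under two optimiser settings and compares the results. The helper name and its return structure are hypothetical; the synth() arguments used (data.prep.obj, optimxmethod) and the output components (loss.v, solution.w) are the package's own.

```r
library(Synth)

# Hypothetical helper: refit with the default optimisers and with all
# optimx algorithms, then report how much the solutions differ.
compare_optimisers <- function(dataprep.out) {
  fit_default <- synth(data.prep.obj = dataprep.out)
  fit_all     <- synth(data.prep.obj = dataprep.out, optimxmethod = "All")

  list(
    # Pre-treatment MSPE under each setting.
    loss = c(default = as.numeric(fit_default$loss.v),
             all     = as.numeric(fit_all$loss.v)),
    # Largest absolute change in any donor weight between the two fits.
    max_weight_change = max(abs(fit_default$solution.w - fit_all$solution.w))
  )
}
```

Calling compare_optimisers(dataprep.out) on the object returned by dataprep() can take noticeably longer than a single fit, since the second run cycles through many algorithms.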
8.4 Comparison to Modern Extensions
The original Synth estimator minimises pre-treatment MSPE but does not correct for residual imbalance. For settings with many pre-treatment periods and some remaining imbalance, the augmented synthetic control (augsynth) or synthetic DiD (synthdid) may produce lower bias.
9 Comparison to Alternatives
10 Conclusion
The Synth package is the reference R implementation of the Abadie et al. [2010] synthetic control estimator. Its dataprep() -> synth() -> synth.tab() -> path.plot() -> gaps.plot() workflow is straightforward and well-documented. For the canonical single-treated-unit comparative case study with a long pre-treatment panel, Synth remains the natural starting point.
References
- Abadie, A. and Gardeazabal, J. (2003). The economic costs of conflict: A case study of the Basque Country. American Economic Review, 93(1):113-132.
- Abadie, A., Diamond, A., and Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program. Journal of the American Statistical Association, 105(490):493-505.
- Abadie, A., Diamond, A., and Hainmueller, J. (2011). Synth: An R package for synthetic control methods in comparative case studies. Journal of Statistical Software, 42(13):1-17.
- Abadie, A. (2021). Using synthetic controls: Feasibility, data requirements, and methodological aspects. Journal of Economic Literature, 59(2):391-425.
- Ben-Michael, E., Feller, A., and Rothstein, J. (2021). The augmented synthetic control method. Journal of the American Statistical Association, 116(536):1789-1803.
- Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., and Wager, S. (2021). Synthetic difference-in-differences. American Economic Review, 111(12):4088-4118.