The Causal Review

Background: SCM and ASCM

The Synthetic Control Method (SCM; Abadie et al. (2010)) constructs a counterfactual for a single treated unit by finding a convex combination of donor (control) units that best matches the treated unit's pre-treatment outcome trajectory. The gap between the treated unit's observed outcome and its synthetic control estimates the treatment effect on the treated unit in each post-treatment period.
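The idea of a convex combination that matches the pre-treatment trajectory can be sketched in base R on simulated data. Everything below is an illustrative assumption: the toy data, the object names, and in particular the softmax reparametrisation handed to `optim()`, which is just a convenient way to keep the weights on the simplex and is not how `augsynth` solves the problem.

```r
# Toy data (simulated): 1 treated unit, 4 donors, 6 pre-treatment periods
set.seed(1)
n_pre <- 6
donors_pre <- matrix(rnorm(n_pre * 4, mean = 10), nrow = n_pre,
                     dimnames = list(NULL, paste0("donor", 1:4)))

# Build the treated unit as a known convex combination plus small noise
true_w <- c(0.6, 0.4, 0, 0)
treated_pre <- as.vector(donors_pre %*% true_w) + rnorm(n_pre, sd = 0.01)

# Softmax maps unconstrained parameters to weights with w >= 0 and sum(w) = 1
softmax <- function(theta) {
  e <- exp(theta - max(theta))
  e / sum(e)
}

# Objective: squared pre-treatment gap between treated and synthetic unit
imbalance <- function(theta) {
  w <- softmax(theta)
  sum((treated_pre - donors_pre %*% w)^2)
}

# Start from uniform weights (theta = 0) and minimise the imbalance
fit <- optim(par = rep(0, 4), fn = imbalance, method = "BFGS")
scm_weights <- setNames(softmax(fit$par), colnames(donors_pre))
round(scm_weights, 3)
```

Because the weights are constrained to the simplex, the synthetic unit can only interpolate between donors; it never extrapolates beyond them.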

The key limitation of classical SCM is that pre-treatment fit may be imperfect, especially when the treated unit lies outside the convex hull of the donor pool or when the pre-treatment period is short. Ben-Michael et al. (2021) propose the Augmented Synthetic Control Method (ASCM), which adds a ridge regression bias-correction term to the SCM estimate. Under a linear factor model, the correction reduces the bias from imperfect pre-treatment fit, so the estimator combines desirable properties of both SCM and regression.
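In symbols, the augmented estimate of the treated unit's untreated outcome in post-treatment period $T$ can be written as follows. The notation is an assumption introduced here for illustration: unit 1 is treated, $W_i = 0$ marks donors, $\hat{\gamma}^{\text{scm}}_i$ are the SCM weights, $X_i$ is unit $i$'s vector of pre-treatment outcomes, and $\hat{m}$ is the fitted outcome model (ridge regression in the default case).

```latex
\hat{Y}^{\text{aug}}_{1T}(0)
  = \sum_{W_i = 0} \hat{\gamma}^{\text{scm}}_i \, Y_{iT}
  + \Bigl( \hat{m}(X_1) - \sum_{W_i = 0} \hat{\gamma}^{\text{scm}}_i \, \hat{m}(X_i) \Bigr)
```

The first term is the classical SCM estimate. The term in parentheses is the bias correction: when $\hat{m}$ is linear and the synthetic control matches $X_1$ exactly, it is zero and ASCM coincides with SCM.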

Installation

The augsynth package is available on GitHub:

# install.packages("remotes")
remotes::install_github("ebenmichael/augsynth")
library(augsynth)
library(dplyr)
library(ggplot2)

Data Requirements

augsynth expects a long-format panel data frame with one row per unit and time period. For single-unit SCM, it needs:

A unit identifier column and a time column.
A binary treatment indicator that equals 1 for the treated unit in post-treatment periods and 0 otherwise (including all donor units in every period).
The outcome variable.

For multisynth (staggered adoption), the data must be in long format with unit, time, a binary treatment indicator, and the outcome.
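A minimal sketch of that layout in base R is shown here. The unit labels, treatment date, and outcome values are hypothetical placeholders; only the column structure matters.

```r
# Hypothetical toy panel: 3 units observed for 4 periods;
# unit "A" is treated from period 3 onward
panel <- expand.grid(unit = c("A", "B", "C"), time = 1:4,
                     stringsAsFactors = FALSE)
treated_unit <- "A"
first_treated_period <- 3

# Treatment indicator: 1 only for the treated unit in post-treatment periods
panel$treated <- as.integer(panel$unit == treated_unit &
                              panel$time >= first_treated_period)

# Placeholder outcome so the frame has all four required columns
set.seed(42)
panel$y <- rnorm(nrow(panel), mean = 10)

panel <- panel[order(panel$unit, panel$time), ]
head(panel, 8)

# Sanity checks: one row per unit-period, and donors are never treated
stopifnot(!any(duplicated(panel[c("unit", "time")])))
stopifnot(all(panel$treated[panel$unit != treated_unit] == 0))
```

Running checks like these before fitting catches the most common data problems: duplicated unit-period rows and a treatment indicator that is switched on for donor units.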

Single-Unit Synthetic Control

Using the `augsynth()` Function

The main function is augsynth(). We use the built-in kansas dataset (about the Kansas tax cut) included in the package for illustration:

# Load built-in data
data(kansas)
# kansas: quarterly panel of US states; the outcome used below is
# lngdpcapita (log GDP per capita)
# Treatment: Kansas tax cut in 2012 (treated = 1 for Kansas in post-treatment periods)
head(kansas)

# Fit augmented synthetic control
syn_out <- augsynth(
  form = lngdpcapita ~ treated,  # outcome ~ treatment indicator
  unit = state,
  time = year_qtr,               # quarterly data: year alone does not identify a period
  data = kansas,
  progfunc = "Ridge",            # bias correction: ridge regression
  scm = TRUE                     # include standard SCM weights
)
summary(syn_out)
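To make concrete what a ridge bias correction adds to a weighted donor average, here is a base-R sketch on simulated data. It is an illustration under stated assumptions, not the package's internal code: the data are made up, the SCM weights are replaced by uniform placeholders, and `ridge_predictor()` is a hypothetical helper with a fixed penalty `lambda = 1`.

```r
set.seed(2)
n_donors <- 8
n_pre <- 5
X0 <- matrix(rnorm(n_donors * n_pre), nrow = n_donors)  # donors' pre-treatment outcomes
y0 <- as.vector(X0 %*% rep(0.3, n_pre)) +
  rnorm(n_donors, sd = 0.1)                             # donors' post-period outcome
x1 <- rnorm(n_pre, mean = 0.5)                          # treated unit's pre-treatment outcomes
w  <- rep(1 / n_donors, n_donors)                       # placeholder SCM weights (uniform)

# Hypothetical helper: centred ridge regression of y on X, returned as a predictor
ridge_predictor <- function(X, y, lambda) {
  x_means <- colMeans(X)
  Xc <- sweep(X, 2, x_means)
  eta <- solve(crossprod(Xc) + lambda * diag(ncol(X)),
               crossprod(Xc, y - mean(y)))
  function(x_new) {
    x_new <- matrix(x_new, ncol = ncol(X))
    as.vector(mean(y) + sweep(x_new, 2, x_means) %*% eta)
  }
}

m_hat <- ridge_predictor(X0, y0, lambda = 1)

scm_estimate    <- sum(w * y0)                      # weighted donor average
bias_correction <- m_hat(x1) - sum(w * m_hat(X0))   # model-predicted imbalance
ascm_estimate   <- scm_estimate + bias_correction

c(scm = scm_estimate, correction = bias_correction, ascm = ascm_estimate)
```

Because the ridge predictor is linear, the correction vanishes whenever the weighted donors reproduce the treated unit's pre-treatment outcomes exactly; it only matters when the fit is imperfect.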

The progfunc argument specifies the bias correction model. Options include:

"Ridge": ridge regression (the default augmentation in Ben-Michael et al.).
"None": classical SCM with no bias correction.
"EN": elastic net.
"RF": random forest (non-parametric bias correction).

Examining the Weights

# Extract synthetic control weights
weights <- syn_out$weights
print(round(weights, 3))
# Donors with non-zero weight form the synthetic control
# Classical SCM weights are non-negative and sum to 1;
# with ridge augmentation the combined weights can be negative

Pre-Treatment Fit

plot(syn_out) +
  labs(
    title = "Augmented Synthetic Control: Kansas Tax Cut",
    subtitle = "Outcome: log GDP per capita",
    x = "Year",
    y = "Log GDP per capita"
  ) +
  theme_bw()

The plot shows the outcome trajectory for Kansas and its synthetic control. In the pre-treatment period, the synthetic control should track Kansas closely (indicating good fit). Post-treatment, any divergence is attributed to the tax cut.
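Visual fit can be complemented with a numeric summary. The sketch below uses hypothetical outcome paths (illustrative numbers, not taken from the kansas data) to compute the pre-treatment root mean squared error and the average post-treatment gap.

```r
years <- 2005:2014
first_treated_year <- 2012

# Hypothetical outcome paths for the treated unit and its synthetic control
observed  <- c(10.10, 10.12, 10.15, 10.11, 10.13, 10.16, 10.18, 10.17, 10.18, 10.19)
synthetic <- c(10.11, 10.12, 10.14, 10.12, 10.13, 10.15, 10.18, 10.20, 10.22, 10.24)

gap    <- observed - synthetic
is_pre <- years < first_treated_year

pre_rmse <- sqrt(mean(gap[is_pre]^2))  # pre-treatment fit: smaller is better
post_gap <- mean(gap[!is_pre])         # average post-treatment gap (estimated effect)

c(pre_rmse = pre_rmse, post_gap = post_gap)
```

A post-treatment gap that is large relative to the pre-treatment RMSE is more convincing than one of similar size, since pre-treatment gaps indicate how far the two series drift apart without any treatment.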

Inference by Permutation

Since there is typically only one treated unit, large-sample standard errors are not available. The standard approach is a permutation (placebo) test: re-run the synthetic control for each donor unit pretending it was treated at the same time, and compare the estimated treatment effect for Kansas to the distribution of placebo effects.

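# Note: confirm that permutation_inference() is exported by your installed
# augsynth version; the package also exposes inference options through the
# inf_type argument of summary()
syn_inf <- permutation_inference(syn_out, n_perm = 1000)
plot(syn_inf) +
  labs(
    title = "Permutation Inference: Kansas",
    x = "Year",
    y = "Estimated treatment effect"
  ) +
  theme_bw()
# The p-value is the fraction of placebo effects at least as
# large as the actual treatment effect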
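The p-value calculation itself is simple enough to sketch in base R. The effect sizes below are hypothetical, and `placebo_p_value()` is an illustrative helper rather than a package function; it compares absolute effects, which corresponds to a two-sided test.

```r
set.seed(3)

# Hypothetical average post-treatment effects:
# one treated unit and 20 donors, each re-fitted as if treated
effect_treated  <- -0.04
effects_placebo <- rnorm(20, mean = 0, sd = 0.02)

# Fraction of placebo effects at least as large (in absolute value)
# as the effect estimated for the treated unit
placebo_p_value <- function(actual, placebo) {
  mean(abs(placebo) >= abs(actual))
}

p_val <- placebo_p_value(effect_treated, effects_placebo)
p_val
```

With a donor pool of size $J$, the p-value moves in steps of $1/J$, so small donor pools limit how much evidence a placebo test can provide.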

Staggered Adoption with `multisynth`

The multisynth() function extends ASCM to settings with multiple treated units treated at different times. It fits synthetic control weights for each treated unit, partially pooling between separate unit-by-unit fits and a single fit for the average treated unit, and then averages the estimated treatment effects across treated units.

Data Preparation

set.seed(123)
n_units <- 40
n_times <- 12

# Assign treatment timing: 10 units treated at t=5, 10 at t=8, 20 never
cohorts <- c(rep(5, 10), rep(8, 10), rep(Inf, 20))
unit_ids <- 1:n_units
unit_fe_draws <- rnorm(n_units)  # one fixed effect per unit

staggered_panel <- expand.grid(unit = unit_ids, time = 1:n_times) %>%
  left_join(data.frame(unit = unit_ids, first_treat = cohorts), by = "unit") %>%
  mutate(
    treated = as.integer(time >= first_treat),
    # Index by unit: expand.grid() varies unit fastest, so rep(..., each = n_times)
    # would not keep the fixed effect constant within a unit
    unit_fe = unit_fe_draws[unit],
    att = ifelse(treated == 1, 3 + 0.5 * (time - first_treat), 0),
    y = unit_fe + 0.5 * time + att + rnorm(n())
  )

Fitting `multisynth`

ms_out <- multisynth(
  form = y ~ treated,
  unit = unit,
  time = time,
  data = staggered_panel,
  lambda = NULL,  # auto-select ridge penalty
  n_leads = 4     # number of post-treatment periods to estimate
)
summary(ms_out)

The output reports the estimated ATT for each treated unit (or each treatment-time cohort $g$, when units are grouped by adoption time) at each horizon $\ell$ since treatment, as well as an averaged estimate across treated units. This is directly analogous to the Callaway–Sant'Anna event-study aggregation, but using synthetic control rather than DiD-style comparisons.
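The aggregation step can be illustrated without the package. The sketch below starts from hypothetical unit-level effect estimates (simulated here to mirror the design above, with a true effect of 3 + 0.5 per period since treatment; they are not multisynth output) and averages them by cohort and by horizon in base R.

```r
# Hypothetical unit-level effect estimates for 20 treated units in two cohorts
cohorts <- data.frame(unit = 1:20, first_treat = rep(c(5, 8), each = 10))

# Cross join: every treated unit at horizons 0..3 after treatment
unit_effects <- merge(cohorts, data.frame(horizon = 0:3), by = NULL)

set.seed(4)
unit_effects$effect <- 3 + 0.5 * unit_effects$horizon +
  rnorm(nrow(unit_effects), sd = 0.1)  # true effect plus estimation noise

# ATT by cohort and horizon
by_cohort <- aggregate(effect ~ first_treat + horizon,
                       data = unit_effects, FUN = mean)

# Average ATT by horizon across all treated units
average <- aggregate(effect ~ horizon, data = unit_effects, FUN = mean)

by_cohort
average
```

Averaging over units weights each cohort by its number of treated units; with equal-sized cohorts, as here, this coincides with a simple average of the cohort-level estimates.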

Plotting Multisynth Results

# The averaged estimate is labelled "Average" (capitalised) in the package vignette
ms_plot <- plot(ms_out, levels = "Average") +
  labs(
    title = "Augmented Synthetic Control: Average ATT by Event Time",
    x = "Periods since treatment",
    y = "Average treatment effect"
  ) +
  theme_bw()
print(ms_plot)

# Plot by cohort
# Check ?plot.multisynth for the level names accepted by your installed version
ms_plot_cohort <- plot(ms_out, levels = "individual") +
  facet_wrap(~ Level) +
  theme_bw()
print(ms_plot_cohort)

Adding Covariates

Both augsynth() and multisynth() allow covariates to be included in the balance constraints:

# Assume the data includes a covariate x1 measured pre-treatment
# (x1 is hypothetical; the simulated panel above does not create it)
syn_cov <- augsynth(
  form = y ~ treated | x1,  # | separates outcome ~ treatment from covariates
  unit = unit,
  time = time,
  data = staggered_panel,
  progfunc = "Ridge",
  scm = TRUE
)

Covariates after the | are included in the pre-treatment balance optimisation: the synthetic control weights are chosen to match both the pre-treatment outcome trajectory and the covariate values.

Choosing Between SCM, ASCM, and DiD

The choice between synthetic control methods and difference-in-differences depends on the setting:

Use SCM/ASCM when there is one (or a small number of) treated units and a larger donor pool. SCM is especially suited to comparative case studies where no single donor unit is a natural comparison.
Use DiD when there are many treated and control units, and you want to leverage the panel structure for efficiency. DiD's parallel trends assumption is transparent; SCM's factor model assumption may be harder to assess.
ASCM bridges the two: it inherits the SCM's flexibility in choosing comparison units and the bias reduction from ridge regression when the pre-treatment fit is imperfect.

Ben-Michael et al. (2021) provide guidance on when ASCM outperforms SCM: primarily when classical SCM cannot achieve close pre-treatment fit, as can happen when the pre-treatment period is short.

Conclusion

The augsynth package provides a streamlined interface to both classical and augmented synthetic control estimation in R. For single treated units, the workflow is: (1) call augsynth() with progfunc = "Ridge"; (2) examine pre-treatment fit; (3) run permutation inference. For staggered adoption, use multisynth() and plot event-study results by cohort. Together, these tools implement the state-of-the-art comparative case study methodology in empirical social science.

References

Abadie, A., Diamond, A., and Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program. Journal of the American Statistical Association, 105(490):493--505.
Ben-Michael, E., Feller, A., and Rothstein, J. (2021). The augmented synthetic control method. Journal of the American Statistical Association, 116(536):1789--1803.
Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200--230.
Abadie, A. (2021). Using synthetic controls: Feasibility, data requirements, and methodological aspects. Journal of Economic Literature, 59(2):391--425.
Doudchenko, N. and Imbens, G. W. (2016). Balancing, regression, difference-in-differences and synthetic control methods: A synthesis. NBER Working Paper No. 22791.

The `augsynth` Package in R: Synthetic Control and ASCM
