Toolbox

fixest in R: Fast Fixed Effects, Event Studies, and Sun-Abraham DiD

1 What Problem Does fixest Solve?

Panel data regressions with multiple high-dimensional fixed effects firm, worker, time, and industry fixed effects simultaneously are computationally demanding. The standard approach of adding dummy variables for each group becomes infeasible with millions of observations and thousands of fixed effect levels. fixest [Bergé, 2018] solves this by implementing the within-group demeaning algorithm of Guimaraes and Portugal [2010], which iteratively demeans observations by each set of fixed effects until convergence, avoiding the explicit formation of a large dummy variable matrix. The result is estimation that is typically 100-1000 faster than 1m() or plm() for large panels. Beyond speed, fixest has become the go-to package for event study designs and, criti- cally, for implementing the Sun and Abraham [2021] interaction-weighted (IW) DiD estima- tor that corrects for heterogeneous treatment effects in staggered DiD.

2 Installation and Setup

install.packages("fixest")

library (fixest)

fixest uses a distinctive formula syntax: fixed effects are specified after a | separator, and multi-way clustering is handled through the cluster argument or the vcov argument.

3 Basic Fixed Effects Regression

Simulated panel: 500 firms, 10 periods

set.seed(123)

N<500; $T<-10$

panel <- data.frame

firm = rep (1: N, each $=T$, year = rep (1:T, N), treat = rbinom (NT, 1, 0.3), y = rnorm(N * T) )

Two-way fixed effects (firm + year FE)

ols_twfe <- feols (y treat firm + year, data = panel, cluster = firm)

summary (ols_twfe)

Poisson PPML with fixed effects (for count outcomes or log-linear models)

pois_fe <- fepois (y data treat = panel, firm + year, cluster = "firm)

The feols() function is the main workhorse. The formula y treat firm + year means: regress y on treat with firm and year fixed effects absorbed (not explicitly included as dummies).

4 Event Study with TWFE

Event studies examine dynamic treatment effects how outcomes evolve in the periods be- fore and after treatment. In a TWFE framework, this means interacting treatment indicators with relative-time dummies.

Generate staggered adoption panel

library (data.table)

N< 200; $T<-12$

cohorts <- sample (c(5, 7, 9, NA), N, replace = TRUE) #NA $NA=$ never treated

panel2 < CJ(unit $=1:N$, year $=1:T)$

panel2[, cohort := cohorts [unit]]

panel2[, treat_year:= cohort]

panel2[, post := (year >= cohort) & !is.na (cohort)]

panel2[, rel_time := ifelse (is.na (cohort), Inf, year cohort)]

panel2[, $y:=0.5$ * post + 0.1 * (year 6) + rnorm(.N)]

TWFE event study (note: biased under treatment effect heterogeneity!)

es_twfe <- feols (y i(rel_time, ref $=-1)$ | unit + year, data panel2 [rel_time >= -4 & (rel_time < $<=4$ | is. na (cohort))], cluster = "unit)

iplot(es_twfe, main = "TWFE Event Study", xlab = "Periods relative to treatment")

The i(rel_time, ref = -1) syntax creates the full set of relative-time interactions, with period -1 (one period before treatment) as the reference. iplot() produces a clean event-study plot with confidence intervals.

5 The Sun-Abraham Estimator

The TWFE event study above is biased when treatment effects are heterogeneous across cohorts or over time. Sun and Abraham [2021] show that TWFE event-study coefficients are weighted averages of group-time average treatment effects with potentially negative weights. Their interaction-weighted (IW) estimator corrects this by estimating cohort-specific effects and aggregating them with proper non-negative weights.

fixest implements the Sun-Abraham estimator natively via the sunab() function:

Sun-Abraham estimator (corrects for heterogeneous treatment effects)

es_sa <- feols (y sunab (cohort, year) | unit + year, data = panel2 [!is.na(cohort) | TRUE], cluster = unit)

iplot(es_sa, main = "Sun-Abraham Event Study", xlab = "Periods relative to treatment")

Compare TWFE and Sun-Abraham estimates:

iplot(list(twfe = es_twfe, 'Sun-Abraham' = es_sa), main = "TWFEVS.Sun-Abraham")

The sunab (cohort, year) term instructs fixest to:

(1) Create all cohort relative-time interactions.

(2) Weight them using the share of each cohort in the treated sample.

(3) Aggregate to produce clean event-study coefficients.

6 Multiple Hypothesis Testing with etable

fixest includes etable() for publication-quality regression tables across multiple specifica- tions:

Compare OLS, TWFE, and IV

m1 <- feols (y treat, data panel, cluster = firm)

m2 <- feols (y treat firm, data = panel, cluster = ~firm)

m3 <- feols (y treat firm year, data panel, cluster = firm)

etable (m1, m2, m3, headers ers = c("OLS", "Firm FE", "Two-way FE"), tex FALSE)

7 IV Estimation

fixest also handles IV via the two-part formula syntax y x1 instrument:

fe1fe2 | endog_var

IV: endog_var instrumented by z, with firm and year FE

iv_est <- feols (y x1 firm + year treat data = panel, cluster = ~firm) Z,

summary(iv_est, stage = 1:2) # Show both stages

The stage $=1:2$ argument reports first and second-stage results separately.

8 Key Options and Pitfalls

8.1 Clustering

Always cluster standard errors at the unit level (or higher) in panel regressions:

#Two-way clustering (firm and year)

feols (y treat firm year, data = panel, VCOV = firm + year)

8.2 The ref argument in event studies

Omitting the reference period leads to multicollinearity. The convention is $=-1$ (one period before treatment). Verify that pre-trend coefficients are small and insignificant.

8.3 Never-treated units

In the Sun-Abraham estimator, never-treated units serve as the clean comparison group. Ensure you have enough of them. If all units are eventually treated, consider using "not- yet-treated" units as controls (available via the panel. id option).

9 Comparison to Alternatives

Table 1: Event study and staggered DiD packages in R

PackageEstimatorBest forfixest didTWFE, Sun-Abraham Callaway-Sant'Anna

Fast TWFE; IW estimator

Doubly robust group-time ATTS

staggeredRoth-Sant'AnnaEfficiency under staggered adoptionHonestDiDRambachan-RothSensitivity analysis to pre-trendsdidimputationBorusyak-Jaravel-SpiessImputation-based DiD

fixest is the natural first step for panel regressions and event studies. For staggered DiD with rigorous treatment of heterogeneous effects, combine it with did (Callaway-Sant'Anna) or use the sunab() estimator built into fixest.

10 Conclusion

fixest has become the workhorse R package for panel econometrics: it is fast, flexible, and provides a clean interface for fixed effects, event studies, IV, and the Sun-Abraham staggered DiD estimator. The iplot() function makes producing publication-quality event study figures easy. Every applied economist working with panel data should have fixest in their toolkit.

References

  1. Bergé, L. (2018). Efficient estimation of maximum likelihood models with multiple fixed- effects: The R package FENmlm. CREA Discussion Paper Series, 13.
  2. Guimaraes, P. and Portugal, P. (2010). A simple feasible procedure to fit models with high-dimensional fixed effects. Stata Journal, 10(4):628-649.
  3. Sun, L. and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2):175-199.
  4. Callaway, B. and Sant'Anna, P. Н. С. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
  5. Rambachan, A. and Roth, J. (2023). A more credible approach to parallel trends. Review of Economic Studies, 90(5):2555-2591.

Continue Reading

Browse All Sections →
Home
This is some text inside of a div block.
This is some text inside of a div block.
This is some text inside of a div block.

Article Title