1 What Problem Does fixest Solve?
Panel data regressions with multiple high-dimensional fixed effects (for example, firm, worker, time, and industry fixed effects simultaneously) are computationally demanding. The standard approach of adding a dummy variable for each group becomes infeasible with millions of observations and thousands of fixed-effect levels. fixest [Bergé, 2018] solves this by implementing an iterative within-group demeaning algorithm along the lines of Guimaraes and Portugal [2010]: observations are demeaned by each set of fixed effects in turn until convergence, which avoids explicitly forming a large dummy-variable matrix. The result is estimation that is typically 100 to 1000 times faster than lm() or plm() for large panels.

Beyond speed, fixest has become the go-to package for event study designs and, critically, for implementing the Sun and Abraham [2021] interaction-weighted (IW) estimator, which corrects for heterogeneous treatment effects in staggered difference-in-differences (DiD) designs.
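The demeaning idea described above can be illustrated in a few lines of base R. This is a minimal sketch under stated assumptions, not fixest's actual implementation (which is compiled and uses acceleration techniques): the simulated data, the helper name demean_twoway, and the tolerance are all hypothetical choices made for illustration. It shows that alternately subtracting firm and year means reproduces the coefficient obtained from the explicit dummy-variable regression.

```r
# Illustrative sketch only: alternating demeaning for two fixed effects (base R)
set.seed(1)
n  <- 2000
df <- data.frame(
  firm = sample(1:50, n, replace = TRUE),
  year = sample(1:10, n, replace = TRUE),
  x    = rnorm(n)
)
df$y <- 2 * df$x + df$firm / 10 + df$year / 5 + rnorm(n)

# Hypothetical helper: subtract group means for each fixed effect in turn
# until a full sweep no longer changes the vector
demean_twoway <- function(v, g1, g2, tol = 1e-10, max_iter = 1000) {
  for (iter in seq_len(max_iter)) {
    v_old <- v
    v <- v - ave(v, g1)  # remove firm means
    v <- v - ave(v, g2)  # remove year means
    if (max(abs(v - v_old)) < tol) break
  }
  v
}

y_dm <- demean_twoway(df$y, df$firm, df$year)
x_dm <- demean_twoway(df$x, df$firm, df$year)

# Slope from the demeaned regression vs. the explicit dummy-variable regression
beta_demeaned <- unname(coef(lm(y_dm ~ x_dm - 1)))
beta_dummies  <- unname(coef(lm(y ~ x + factor(firm) + factor(year), data = df))["x"])
all.equal(beta_demeaned, beta_dummies, tolerance = 1e-6)
```

The dummy-variable regression here builds a design matrix with one column per firm and year, which is exactly what becomes infeasible at scale; the demeaned regression only ever touches vectors of length n.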
2 Installation and Setup
install.packages("fixest")
library(fixest)
fixest uses a distinctive formula syntax: fixed effects are specified after a | separator, and multi-way clustering is handled through the cluster argument or the vcov argument.
3 Basic Fixed Effects Regression
# Simulated panel: 500 firms, 10 periods
set.seed(123)
N <- 500; T <- 10
panel <- data.frame(
  firm  = rep(1:N, each = T),
  year  = rep(1:T, N),
  treat = rbinom(N * T, 1, 0.3),
  y     = rnorm(N * T)
)

# Two-way fixed effects (firm + year FE)
ols_twfe <- feols(y ~ treat | firm + year, data = panel, cluster = ~firm)
summary(ols_twfe)

# Poisson PPML with fixed effects (for count outcomes or log-linear models)
# fepois() needs a non-negative outcome; y above is Gaussian, so a count
# outcome (y_count) is simulated here purely for illustration
panel$y_count <- rpois(N * T, lambda = 2)
pois_fe <- fepois(y_count ~ treat | firm + year, data = panel, cluster = ~firm)
The feols() function is the main workhorse. The formula y ~ treat | firm + year means: regress y on treat with firm and year fixed effects absorbed (not explicitly included as dummies).
4 Event Study with TWFE
Event studies examine dynamic treatment effects: how outcomes evolve in the periods before and after treatment. In a TWFE framework, this means interacting treatment indicators with relative-time dummies.
# Generate staggered adoption panel
library(data.table)
N <- 200; T <- 12
cohorts <- sample(c(5, 7, 9, NA), N, replace = TRUE)  # NA = never treated
panel2 <- CJ(unit = 1:N, year = 1:T)
panel2[, cohort := cohorts[unit]]
panel2[, treat_year := cohort]
panel2[, post := (year >= cohort) & !is.na(cohort)]
# Never-treated units get a finite placeholder (-1000) that is declared as a reference below
panel2[, rel_time := ifelse(is.na(cohort), -1000, year - cohort)]
panel2[, y := 0.5 * post + 0.1 * (year - 6) + rnorm(.N)]

# TWFE event study (note: biased under treatment effect heterogeneity!)
es_twfe <- feols(y ~ i(rel_time, ref = c(-1, -1000)) | unit + year,
                 data = panel2[(rel_time >= -4 & rel_time <= 4) | is.na(cohort)],
                 cluster = ~unit)
iplot(es_twfe, main = "TWFE Event Study", xlab = "Periods relative to treatment")
The i(rel_time, ref = c(-1, -1000)) term creates the full set of relative-time indicators, with period -1 (one period before treatment) as the reference. The second reference value, -1000, is the placeholder assigned to never-treated units, so they receive no indicator of their own and serve purely as controls. iplot() produces a clean event-study plot with confidence intervals.
5 The Sun-Abraham Estimator
The TWFE event study above is biased when treatment effects are heterogeneous across cohorts or over time. Sun and Abraham [2021] show that TWFE event-study coefficients are weighted averages of group-time average treatment effects with potentially negative weights. Their interaction-weighted (IW) estimator corrects this by estimating cohort-specific effects and aggregating them with proper non-negative weights.
fixest implements the Sun-Abraham estimator natively via the sunab() function:
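# Sun-Abraham estimator (corrects for heterogeneous treatment effects)
# sunab() drops rows with a missing cohort, so never-treated units are given a
# treatment period far outside the sample (10000 here) to keep them as controls
panel2[, cohort_sa := ifelse(is.na(cohort), 10000, cohort)]
es_sa <- feols(y ~ sunab(cohort_sa, year) | unit + year, data = panel2, cluster = ~unit)
iplot(es_sa, main = "Sun-Abraham Event Study", xlab = "Periods relative to treatment")

# Compare TWFE and Sun-Abraham estimates
iplot(list(twfe = es_twfe, "Sun-Abraham" = es_sa), main = "TWFE vs. Sun-Abraham")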
The sunab() term instructs fixest to:
(1) Create all cohort × relative-time interactions.
(2) Weight them, for each relative period, by the share of each cohort among the treated units observed at that period.
(3) Aggregate the weighted cohort-specific estimates into one event-study coefficient per relative period.
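The three steps above can be mimicked by hand in base R. The sketch below is an illustration under stated assumptions, not fixest's internal code: the simulated panel, the variable names (cell, cells, iw), and the effect sizes are hypothetical. Two treated cohorts of equal size have different treatment effects (1 and 3), so the aggregated coefficient at relative period 0 should be close to their equally weighted average of 2.

```r
# Illustrative sketch only: interaction-weighted aggregation by hand (base R)
set.seed(42)
n_units <- 90; n_years <- 8
cohort_of_unit <- rep(c(4, 6, NA), each = n_units / 3)  # NA = never treated
d <- expand.grid(unit = 1:n_units, year = 1:n_years)
d$cohort <- cohort_of_unit[d$unit]
d$rel    <- d$year - d$cohort                            # NA for never treated

# Heterogeneous effects: cohort 4 gains 1, cohort 6 gains 3 once treated
treated_post <- !is.na(d$rel) & d$rel >= 0
effect <- ifelse(treated_post, ifelse(d$cohort == 4, 1, 3), 0)
d$y <- effect + 0.2 * d$year + rnorm(nrow(d), sd = 0.2)

# Step 1: cohort x relative-time cells; never treated and period -1 form the reference
d$cell <- ifelse(is.na(d$rel) | d$rel == -1, "ref",
                 paste0("c", d$cohort, "_l", d$rel))
d$cell <- relevel(factor(d$cell), ref = "ref")
fit <- lm(y ~ cell + factor(unit) + factor(year), data = d)

# Step 2: number of treated units from each cohort observed at each relative period
cells <- unique(d[!is.na(d$rel) & d$rel != -1, c("cohort", "rel")])
cells$n_units <- mapply(
  function(g, l) sum(d$cohort == g & d$rel == l, na.rm = TRUE),
  cells$cohort, cells$rel
)
cells$beta <- coef(fit)[paste0("cellc", cells$cohort, "_l", cells$rel)]

# Step 3: cohort-share-weighted average of cell estimates, per relative period
iw <- sapply(split(cells, cells$rel),
             function(s) sum(s$beta * s$n_units / sum(s$n_units)))
round(iw, 2)
```

Because the weights are cohort shares, they are non-negative and sum to one within each relative period, which is the property that the plain TWFE event-study coefficients lack.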
6 Comparing Multiple Specifications with etable
fixest includes etable() for publication-quality regression tables across multiple specifications:
# Compare pooled OLS, firm FE, and two-way FE
m1 <- feols(y ~ treat, data = panel, cluster = ~firm)
m2 <- feols(y ~ treat | firm, data = panel, cluster = ~firm)
m3 <- feols(y ~ treat | firm + year, data = panel, cluster = ~firm)
etable(m1, m2, m3, headers = c("OLS", "Firm FE", "Two-way FE"), tex = FALSE)
7 IV Estimation
fixest also handles IV via a multi-part formula syntax: y ~ x1 | fe1 + fe2 | endog_var ~ instrument
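# x1 (exogenous control) and z (instrument) are not in the simulated panel above;
# they are generated here purely so that the example runs
panel$x1 <- rnorm(nrow(panel))
panel$z  <- panel$treat + rnorm(nrow(panel))

# IV: treat (the endogenous variable) instrumented by z, with firm and year FE
iv_est <- feols(y ~ x1 | firm + year | treat ~ z, data = panel, cluster = ~firm)
summary(iv_est, stage = 1:2)  # Show both stages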
The stage = 1:2 argument reports first and second-stage results separately.
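To make the two stages concrete, the following base R sketch runs them by hand on simulated data. Everything in it (the variable names, the data-generating process, and the true effect of 1.5) is an assumption made for illustration, and fixed effects are left out to keep it short. It shows why the instrument matters: OLS is biased by the unobserved confounder, while the second-stage slope on the first-stage fitted values recovers the true effect.

```r
# Illustrative sketch only: two-stage least squares by hand (base R)
set.seed(7)
n <- 5000
z <- rnorm(n)                          # instrument
u <- rnorm(n)                          # unobserved confounder
endog <- 0.8 * z + u + rnorm(n)        # endogenous regressor
y <- 1.5 * endog - 2 * u + rnorm(n)    # true effect of endog is 1.5

# Stage 1: project the endogenous regressor on the instrument
first_stage <- lm(endog ~ z)
endog_hat   <- fitted(first_stage)

# Stage 2: regress the outcome on the first-stage fitted values
second_stage <- lm(y ~ endog_hat)

beta_ols  <- unname(coef(lm(y ~ endog))["endog"])
beta_2sls <- unname(coef(second_stage)["endog_hat"])
c(ols = beta_ols, two_stage = beta_2sls)
```

The standard errors printed by the manual second stage are not valid, because they ignore that endog_hat is itself estimated; this is one reason to let a dedicated IV routine handle both stages.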
8 Key Options and Pitfalls
8.1 Clustering
Always cluster standard errors at the unit level (or higher) in panel regressions:
# Two-way clustering (firm and year)
feols(y ~ treat | firm + year, data = panel, vcov = ~firm + year)
8.2 The ref argument in event studies
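Omitting the reference period leads to multicollinearity: the full set of relative-time indicators sums to a treated-unit indicator, which the unit fixed effects already span. The convention is ref = -1 (one period before treatment), so every coefficient is read relative to that period. Verify that pre-trend coefficients are small and insignificant.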
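The collinearity can be checked directly on a toy design matrix. The sketch below uses base R only; the six-unit panel and the object names (D, FE, X_all, X_drop) are hypothetical and chosen for illustration. It compares the rank of the design matrix with and without the indicator for relative period -1.

```r
# Illustrative sketch only: why one relative period must be omitted (base R)
n_units <- 6; n_years <- 5
d <- expand.grid(unit = 1:n_units, year = 1:n_years)
cohort <- c(3, 3, 4, 4, NA, NA)[d$unit]   # NA = never treated
rel    <- d$year - cohort

# One 0/1 column per relative period; never-treated rows are all zeros
rel_levels <- sort(unique(rel[!is.na(rel)]))
D <- sapply(rel_levels, function(l) as.numeric(!is.na(rel) & rel == l))
colnames(D) <- paste0("rel_", rel_levels)

# Intercept plus unit and year dummies (each with one baseline level dropped)
FE <- model.matrix(~ factor(unit) + factor(year), data = d)

X_all  <- cbind(FE, D)                              # every relative period
X_drop <- cbind(FE, D[, colnames(D) != "rel_-1"])   # period -1 omitted

c(cols_all  = ncol(X_all),  rank_all  = qr(X_all)$rank,
  cols_drop = ncol(X_drop), rank_drop = qr(X_drop)$rank)
```

With every relative period included the matrix has one more column than its rank, so one coefficient cannot be identified; dropping period -1 restores full column rank. This toy panel contains never-treated units, which is what makes a single omitted period sufficient here.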
8.3 Never-treated units
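In the Sun-Abraham estimator, never-treated units serve as the clean comparison group. Ensure you have enough of them. If all units are eventually treated, Sun and Abraham [2021] suggest using the last-treated cohort as the comparison group; in sunab(), additional reference cohorts can be declared through the ref.c argument. Estimators that use "not-yet-treated" units as controls, such as did with control_group = "notyettreated", are another option. Note that the panel.id option does not do this: it declares the panel structure used by lag and lead operators.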
9 Comparison to Alternatives
Table 1: Event study and staggered DiD packages in R
| Package | Estimator | Best for |
|---|---|---|
| fixest | TWFE, Sun-Abraham | Fast TWFE; IW estimator |
| did | Callaway-Sant'Anna | Doubly robust group-time ATTs |
| staggered | Roth-Sant'Anna | Efficiency under staggered adoption |
| HonestDiD | Rambachan-Roth | Sensitivity analysis to pre-trends |
| didimputation | Borusyak-Jaravel-Spiess | Imputation-based DiD |
fixest is the natural first step for panel regressions and event studies. For staggered DiD with rigorous treatment of heterogeneous effects, combine it with did (Callaway-Sant'Anna) or use the sunab() estimator built into fixest.
10 Conclusion
fixest has become the workhorse R package for panel econometrics: it is fast, flexible, and provides a clean interface for fixed effects, event studies, IV, and the Sun-Abraham staggered DiD estimator. The iplot() function makes producing publication-quality event study figures easy. Every applied economist working with panel data should have fixest in their toolkit.
References
- Bergé, L. (2018). Efficient estimation of maximum likelihood models with multiple fixed-effects: The R package FENmlm. CREA Discussion Paper Series, 13.
- Guimaraes, P. and Portugal, P. (2010). A simple feasible procedure to fit models with high-dimensional fixed effects. Stata Journal, 10(4):628-649.
- Sun, L. and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2):175-199.
- Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
- Rambachan, A. and Roth, J. (2023). A more credible approach to parallel trends. Review of Economic Studies, 90(5):2555-2591.