What Problem Does fixest
Solve?
Panel data regressions with multiple high-dimensional fixed effects — firm, worker, time, and industry fixed effects simultaneously — are computationally demanding. The standard approach of adding dummy variables for each group becomes infeasible with millions of observations and thousands of fixed effect levels.
fixest (Berg\'e(2018)) solves this by implementing the within-group demeaning algorithm of Guimaraes and Portugal(2010), which iteratively demeans observations by each set of fixed effects until convergence, avoiding the explicit formation of a large dummy variable matrix. The result is estimation that is typically 100–1000\(\times\) faster than lm() or plm() for large panels.
Beyond speed, fixest has become the go-to package for event study designs and, critically, for implementing the Sun and Abraham(2021) interaction-weighted (IW) DiD estimator that corrects for heterogeneous treatment effects in staggered DiD.
Installation and Setup
install.packages("fixest")
library(fixest)
fixest uses a distinctive formula syntax: fixed effects are specified after a | separator, and multi-way clustering is handled through the cluster argument or the vcov argument.
Basic Fixed Effects Regression
# Simulated panel: 500 firms, 10 periods
set.seed(123)
N <- 500; T <- 10
panel <- data.frame(
firm = rep(1:N, each = T),
year = rep(1:T, N),
treat = rbinom(N * T, 1, 0.3),
y = rnorm(N * T)
)
# Two-way fixed effects (firm + year FE) ols_twfe <- feols(y treat | firm + year, data = panel, cluster = firm) summary(ols_twfe)
# Poisson PPML with fixed effects (for count outcomes or log-linear models) pois_fe <- fepois(y treat | firm + year, data = panel, cluster = firm)
The feols() function is the main workhorse. The formula y \ treat | firm + year means: regress y on treat with firm and year fixed effects absorbed (not explicitly included as dummies).
Event Study with TWFE
Event studies examine dynamic treatment effects — how outcomes evolve in the periods before and after treatment. In a TWFE framework, this means interacting treatment indicators with relative-time dummies.
# Generate staggered adoption panel
library(data.table)
N <- 200; T <- 12
cohorts <- sample(c(5, 7, 9, NA), N, replace = TRUE) # NA = never treated
panel2 <- CJ(unit = 1:N, year = 1:T)
panel2[, cohort := cohorts[unit]]
panel2[, treat_year := cohort]
panel2[, post := (year >= cohort) & !is.na(cohort)]
panel2[, rel_time := ifelse(is.na(cohort), -Inf, year - cohort)]
panel2[, y := 0.5 * post + 0.1 * (year - 6) + rnorm(.N)]
# TWFE event study (note: biased under treatment effect heterogeneity!) es_twfe <- feols(y i(rel_time, ref = -1) | unit + year, data = panel2[rel_time >= -4 & (rel_time <= 4 | is.na(cohort))], cluster = unit) iplot(es_twfe, main = "TWFE Event Study", xlab = "Periods relative to treatment")
The i(rel\_time, ref = -1) syntax creates the full set of relative-time interactions, with period \(-1\) (one period before treatment) as the reference. iplot() produces a clean event-study plot with confidence intervals.
The Sun–Abraham Estimator
The TWFE event study above is biased when treatment effects are heterogeneous across cohorts or over time. Sun and Abraham(2021) show that TWFE event-study coefficients are weighted averages of group-time average treatment effects with potentially negative weights. Their interaction-weighted (IW) estimator corrects this by estimating cohort-specific effects and aggregating them with proper non-negative weights.
fixest implements the Sun–Abraham estimator natively via the sunab() function:
# Sun-Abraham estimator (corrects for heterogeneous treatment effects)
es_sa <- feols(y sunab(cohort, year) | unit + year,
data = panel2[!is.na(cohort) | TRUE],
cluster = unit)
iplot(es_sa, main = "Sun-Abraham Event Study", xlab = "Periods relative to treatment")
# Compare TWFE and Sun-Abraham estimates: iplot(list(twfe = es_twfe, `Sun-Abraham` = es_sa), main = "TWFE vs. Sun-Abraham")
The sunab(cohort, year) term instructs fixest to:
- Create all cohort \(\times\) relative-time interactions.
- Weight them using the share of each cohort in the treated sample.
- Aggregate to produce clean event-study coefficients.
Multiple Hypothesis Testing with etable
fixest includes etable() for publication-quality regression tables across multiple specifications:
# Compare OLS, TWFE, and IV
m1 <- feols(y treat, data = panel, cluster = firm)
m2 <- feols(y treat | firm, data = panel, cluster = firm)
m3 <- feols(y treat | firm + year, data = panel, cluster = firm)
etable(m1, m2, m3, headers = c("OLS", "Firm FE", "Two-way FE"), tex = FALSE)
IV Estimation
fixest also handles IV via the two-part formula syntax y \ x1 | fe1 + fe2 | endog\_var \ instrument:
# IV: endog_var instrumented by z, with firm and year FE
iv_est <- feols(y x1 | firm + year | treat z,
data = panel,
cluster = firm)
summary(iv_est, stage = 1:2) # Show both stages
The stage = 1:2 argument reports first- and second-stage results separately.
Key Options and Pitfalls
Clustering
Always cluster standard errors at the unit level (or higher) in panel regressions:
# Two-way clustering (firm and year)
feols(y treat | firm + year, data = panel,
vcov = firm + year)
The ref
argument in event studies
Omitting the reference period leads to multicollinearity. The convention is ref = -1 (one period before treatment). Verify that pre-trend coefficients are small and insignificant.
Never-treated units
In the Sun–Abraham estimator, never-treated units serve as the clean comparison group. Ensure you have enough of them. If all units are eventually treated, consider using "not-yet-treated" units as controls (available via thepanel.id option).
Comparison to Alternatives
| Package | Estimator | Best for |
|---|---|---|
fixest | TWFE, Sun–Abraham | Fast TWFE; IW estimator |
did | Callaway–Sant'Anna | Doubly robust group-time ATTs |
staggered | Roth–Sant'Anna | Efficiency under staggered adoption |
HonestDiD | Rambachan–Roth | Sensitivity analysis to pre-trends |
didimputation | Borusyak–Jaravel–Spiess | Imputation-based DiD |
fixest is the natural first step for panel regressions and event studies. For staggered DiD with rigorous treatment of heterogeneous effects, combine it with did (Callaway–Sant'Anna) or use the sunab() estimator built into fixest.
Conclusion
fixest has become the workhorse R package for panel econometrics: it is fast, flexible, and provides a clean interface for fixed effects, event studies, IV, and the Sun–Abraham staggered DiD estimator. The iplot() function makes producing publication-quality event study figures easy. Every applied economist working with panel data should have fixest in their toolkit.
References
- Berg\'e, L. (2018). Efficient estimation of maximum likelihood models with multiple fixed-effects: The R package FENmlm. CREA Discussion Paper Series, 13.
- Guimaraes, P. and Portugal, P. (2010). A simple feasible procedure to fit models with high-dimensional fixed effects. Stata Journal, 10(4):628--649.
- Sun, L. and Abraham, S. (2021). Estimating dynamic treatment effects in event studies with heterogeneous treatment effects. Journal of Econometrics, 225(2):175--199.
- Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200--230.
- Rambachan, A. and Roth, J. (2023). A more credible approach to parallel trends. Review of Economic Studies, 90(5):2555--2591.