The Causal Review

1 The Problem: Unobserved Heterogeneity

Imagine you want to estimate the effect of a job-training programme on wages. You observe wages and participation status for 500 workers. Workers who participated have higher wages on average, but are their wages higher because of the training, or because workers who self-select into training are more motivated, more skilled, or better-connected in the first place?

The selection problem is severe when the unobserved differences between groups are the very things that explain both treatment receipt and the outcome. In the job-training example, ability and motivation are exactly such confounders: they are unobserved, they predict participation, and they predict wages. A cross-sectional regression of wages on a training indicator will capture the compound effect of training and selection on unobservables, not the causal effect of training alone.
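The selection problem can be made concrete with a small simulation. The sketch below is illustrative only: the sample size matches the 500 workers of the example, but the true effect of 2 (in thousands of dollars), the ability coefficient of 5, and the selection rule are assumptions chosen for the demonstration, not values from any study.

```r
# Illustrative simulation; every parameter value here is an assumption.
set.seed(2024)
n <- 500
ability  <- rnorm(n)                            # unobserved by the analyst
training <- as.integer(ability + rnorm(n) > 0)  # more able workers select into training
true_effect <- 2                                # assumed causal effect, in $1,000s
wage <- 30 + true_effect * training + 5 * ability + rnorm(n)

# Naive cross-sectional regression: mixes the training effect with selection
naive <- unname(coef(lm(wage ~ training))["training"])

# Infeasible benchmark: controlling for ability, which real data would not contain
oracle <- unname(coef(lm(wage ~ training + ability))["training"])

c(naive = naive, oracle = oracle)
```

Under these assumptions the naive coefficient lands well above the true effect, while the benchmark that conditions on ability recovers something close to 2. The benchmark is only available because the data are simulated.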

The fixed effects (FE) estimator uses repeated observations of the same unit over time to solve this problem, under a crucial condition: the unobserved confounders must be time-invariant. This article explains the logic, the estimator, its assumptions, and when it works.

2 Panel Data and the Fixed Effects Model

A panel dataset follows N units (individuals, firms, countries) over T time periods. For unit i and time t, we observe (Yᵢₜ, Dᵢₜ, Xᵢₜ): an outcome, a treatment indicator, and a set of observed controls.

The fixed effects model is:

Y_it = α_i + βD_it + γ^TX_it + ε_it (1)

where αᵢ is a unit fixed effect: a time-invariant intercept that absorbs all unobserved, time-invariant characteristics of unit i. This single term captures everything that is stable over time for unit i: ability, culture, geography, initial endowments. The coefficient β is the parameter of interest: the causal effect of the treatment Dᵢₜ, holding fixed the unit-specific level αᵢ.

2.1 A Numerical Example

Suppose we observe two workers, Alice and Bob, in two years:

Table 1: Wages and training participation for Alice and Bob

Worker	Year	Wage ($)	Training	Δ Wage ($)
Alice	2022	30,000	0	n/a
Alice	2023	35,000	1	+5,000
Bob	2022	50,000	0	n/a
Bob	2023	53,000	1	+3,000

Cross-sectionally, Bob earns more than Alice in both years, but this reflects his higher unobserved ability, not the effect of training. The fixed effects estimator compares each worker to themselves over time. Alice's wage rose by $5,000 when she took training; Bob's rose by $3,000. The FE estimate of the training effect is the average of these within-person changes: ($5,000 + $3,000) / 2 = $4,000.

The crucial point: we never directly compare Alice to Bob. We only ask: how did each person's wage change when they went from untreated to treated? The stable unobserved factor αᵢ (ability, motivation) cancels out in the within-person comparison.
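The arithmetic in Table 1 can be reproduced in a few lines of base R. This is a minimal sketch: the data frame name `wages` and its column names are chosen here for illustration. It computes the average within-person change by hand and then checks that a regression with one dummy per worker returns the same number.

```r
# Data from Table 1 (object and column names are illustrative)
wages <- data.frame(
  worker   = c("Alice", "Alice", "Bob", "Bob"),
  year     = c(2022, 2023, 2022, 2023),
  wage     = c(30000, 35000, 50000, 53000),
  training = c(0, 1, 0, 1)
)

# Within-person wage change for each worker, then the average
delta <- tapply(wages$wage, wages$worker, diff)
fe_by_hand <- mean(delta)  # 4000

# Same estimate from a regression with a dummy for each worker
fit <- lm(wage ~ training + factor(worker), data = wages)
fe_lm <- unname(coef(fit)["training"])  # 4000

c(by_hand = fe_by_hand, regression = fe_lm)
```

The two numbers coincide because every worker switches from untreated to treated exactly once across two periods, so the within estimator reduces to the mean first difference.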

3 The Within Estimator

The fixed effects estimator is often called the within estimator because it exploits within-unit variation. To see why, take the time-average of equation (1) for unit i:

Ȳ_i = α_i + βD̄_i + γ^TX̄_i + ε̄_i (2)

where bars denote time averages, e.g. Ȳᵢ = T⁻¹ Σₜ Yᵢₜ. Subtracting (2) from (1) gives:

(Y_it - Ȳ_i) = β(D_it - D̄_i) + γ^T(X_it - X̄_i) + (ε_it - ε̄_i) (3)

The fixed effect αᵢ has vanished. Equation (3) is the demeaned model: all variables are expressed as deviations from their unit-specific means. Running OLS on (3) is numerically equivalent to including a dummy variable for each unit i (the Frisch-Waugh-Lovell theorem guarantees this).

The intuition: by subtracting each unit's mean, we strip away everything that is constant for that unit, including the time-invariant unobservables. What remains is only the within-unit variation in treatment and outcome, and it is this variation that identifies β.

4 The Identifying Assumption

The FE estimator is consistent under the following condition:

Assumption (Strict Exogeneity): E[ε̃_it | D̃_is, X̃_is, ∀s] = 0 for all t,

where tildes denote the demeaned variables of equation (3), e.g. ε̃_it = ε_it - ε̄_i.

In plain language, after removing unit means, the treatment Dᵢₜ must be uncorrelated with the residual in every time period, whether past, present, or future.

This rules out:
• Feedback from past outcomes. If this year's wage affects next year's training participation, strict exogeneity fails.
• Anticipation effects. If workers adjust behaviour before training starts (because they know it's coming), the pre-period outcome is endogenous.
• Time-varying confounders. If ability grows differentially across workers (some learn faster), and faster learners also participate in training, the time-varying ability is an uncontrolled confounder.

The FE estimator controls for time-invariant unobservables only. It does not solve the endogeneity problem if the relevant confounders change over time. This is perhaps its most important limitation.
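A simulation illustrates this limitation. The sketch below is built on assumptions made for the demonstration: each worker has a stable ability level and also a personal learning speed (`growth`), and workers with above-average learning speed take training from period 3 onward. The helper `within_beta()` is a hypothetical function written here; it applies the within transformation for a single regressor.

```r
# Illustrative design; all parameter values are assumptions.
set.seed(1)
n_units <- 500
n_periods <- 4
id   <- rep(1:n_units, each = n_periods)
time <- rep(1:n_periods, times = n_units)
alpha  <- rep(rnorm(n_units, sd = 2),   each = n_periods)  # stable ability (time-invariant)
growth <- rep(rnorm(n_units, sd = 0.5), each = n_periods)  # learning speed
D   <- as.numeric(time >= 3 & growth > 0)  # fast learners train from period 3
eps <- rnorm(n_units * n_periods)

Y_clean      <- alpha + 2 * D + eps                  # only time-invariant confounding
Y_confounded <- alpha + growth * time + 2 * D + eps  # ability also grows at unit-specific speed

# Hypothetical helper: within estimator with one regressor and unit fixed effects
within_beta <- function(y, d, unit) {
  y_dm <- y - ave(y, unit)  # subtract unit means
  d_dm <- d - ave(d, unit)
  sum(d_dm * y_dm) / sum(d_dm^2)
}

beta_clean      <- within_beta(Y_clean, D, id)
beta_confounded <- within_beta(Y_confounded, D, id)
c(clean = beta_clean, confounded = beta_confounded)
```

With only the stable component present, the within estimate sits near the assumed effect of 2. Once learning speed drives both wage growth and participation, the estimate drifts upward even though unit means have been removed.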

5 Two-Way Fixed Effects

Many applications add both unit fixed effects αᵢ and time fixed effects δₜ:

Y_it = α_i + δ_t + βD_it + γ^TX_it + ε_it (4)

The time fixed effects δₜ absorb aggregate shocks common to all units, such as macroeconomic fluctuations or policy changes that affect everyone. By including both, we control for the combination of stable unit characteristics and common time trends. This is the two-way fixed effects (TWFE) model, ubiquitous in applied economics.

The TWFE estimator compares the within-unit change in the outcome to the average within-unit change across all units at the same time, which is a difference-in-differences logic at its core. In fact, the DiD estimator with two groups and two periods is a special case of TWFE.
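The two-group, two-period equivalence can be checked numerically. The following sketch uses a simulated balanced panel in which all design choices (100 units, half of them treated in the second period, an effect of 1.5) are assumptions made for illustration. It computes the difference-in-differences from the four cell means and compares it with the TWFE coefficient from a dummy-variable regression.

```r
# Illustrative 2x2 design; all parameter values are assumptions.
set.seed(7)
n_units <- 100
id      <- rep(1:n_units, each = 2)
post    <- rep(c(0, 1), times = n_units)  # period indicator
treated <- as.numeric(id <= 50)           # first 50 units form the treated group
D       <- treated * post                 # treatment switches on in the second period
Y <- rep(rnorm(n_units), each = 2) + 0.5 * post + 1.5 * D +
  rnorm(2 * n_units, sd = 0.5)

# Difference-in-differences from the four group-by-period means
cell <- tapply(Y, list(treated, post), mean)
did  <- unname((cell["1", "1"] - cell["1", "0"]) - (cell["0", "1"] - cell["0", "0"]))

# TWFE: unit dummies plus a period dummy
twfe <- unname(coef(lm(Y ~ D + factor(id) + factor(post)))["D"])

all.equal(did, twfe)
```

The match is exact up to floating-point error in this balanced two-period case; it should not be expected to carry over to designs with more periods or staggered timing.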

Recent literature has shown that with staggered treatment adoption (units receiving treatment at different times), TWFE can produce misleading estimates when treatment effects vary across units or over time, because it uses already-treated units as controls for later-treated units [Goodman-Bacon, 2021; Callaway and Sant'Anna, 2021]. For a detailed treatment of staggered designs, see earlier issues of The Causal Review.

6 What Fixed Effects Cannot Do

Understanding the limits of FE is as important as understanding what it achieves:

Time-invariant treatments. If a treatment never changes within a unit (e.g., gender, country of birth), it is perfectly collinear with the unit fixed effect and cannot be estimated. Fixed effects "difference out" any variable that does not change over time.
Time-varying confounders. FE does not control for variables that change over time and are correlated with treatment. For example, if training participation responds to wage shocks in the same year (reverse causality), FE will not fix the bias.
Incidental parameters problem. With many units and few time periods (N ≫ T), the fixed effects α̂ᵢ are estimated imprecisely. In nonlinear models (logit, probit), this imprecision transmits into a bias in β̂ of order 1/T. For linear models, this problem does not arise.
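The first limitation, time-invariant treatments, is easy to see in code. In this sketch the variable `female` and all parameter values are hypothetical. Demeaning wipes the variable out entirely, and a dummy-variable regression cannot estimate its coefficient.

```r
# Illustrative data; variable names and parameter values are assumptions.
set.seed(11)
n_units <- 50
n_periods <- 3
id     <- rep(1:n_units, each = n_periods)
time   <- rep(1:n_periods, times = n_units)
female <- rep(as.numeric(rbinom(n_units, 1, 0.5)), each = n_periods)  # fixed within unit
D      <- as.numeric(time == 3 & id %% 2 == 0)  # hypothetical time-varying treatment
Y <- rep(rnorm(n_units), each = n_periods) + 1 * female + 2 * D +
  rnorm(n_units * n_periods)

# The within transformation removes all variation in a time-invariant variable
female_dm <- female - ave(female, id)
all(female_dm == 0)

# With unit dummies entered first, lm() reports NA for the collinear regressor
fit <- lm(Y ~ factor(id) + D + female)
coef(fit)[c("D", "female")]
```

The coefficient on `D` is still estimated, because treatment varies within units. The coefficient on `female` is returned as `NA`, which is how `lm()` flags a regressor that is an exact linear combination of the columns before it.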

7 A Simple Code Example in R

library(fixest)  # fast fixed effects estimation

# Simulate a balanced panel: 200 units observed over 5 periods
set.seed(42)
N <- 200
n_periods <- 5  # avoids overwriting T, which R uses as shorthand for TRUE
id   <- rep(1:N, each = n_periods)
time <- rep(1:n_periods, times = N)
alpha_i <- rep(rnorm(N, mean = 0, sd = 2), each = n_periods)  # unit fixed effects
delta_t <- rep(c(0, 0.5, 1.0, 1.5, 2.0), times = N)           # common time trend
D <- as.integer(time >= 3 & id <= 100)  # units 1-100 are treated from period 3 onward
Y <- alpha_i + delta_t + 2 * D + rnorm(N * n_periods, sd = 1)  # true effect is 2
df <- data.frame(Y, D, id, time)

# Two-way fixed effects, with standard errors clustered by unit
fit <- feols(Y ~ D | id + time, data = df, cluster = ~id)
summary(fit)

The key function feols() from the fixest package [Bergé, 2018] implements the within transformation via the algorithm of Guimarães and Portugal [2010], which is far faster than including explicit dummy variables. The formula Y ~ D | id + time specifies that id and time fixed effects are to be absorbed.
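The equivalence stated in Section 3, that OLS on demeaned data matches a regression with one dummy per unit, can also be verified without any package. The sketch below uses base R only and a smaller simulated panel with unit fixed effects alone; the design is an assumption made for illustration and is not the simulation above.

```r
# Illustrative panel with unit fixed effects only; parameter values are assumptions.
set.seed(42)
n_units <- 50
n_periods <- 5
id    <- rep(1:n_units, each = n_periods)
time  <- rep(1:n_periods, times = n_units)
alpha <- rep(rnorm(n_units, sd = 2), each = n_periods)
D <- as.numeric(time >= 3 & id <= 25)
Y <- alpha + 2 * D + rnorm(n_units * n_periods)

# (a) Within transformation: subtract each unit's time average, as in equation (3)
Y_dm <- Y - ave(Y, id)
D_dm <- D - ave(D, id)
beta_within <- unname(coef(lm(Y_dm ~ 0 + D_dm)))

# (b) Dummy-variable regression: one indicator per unit
beta_dummies <- unname(coef(lm(Y ~ D + factor(id)))["D"])

all.equal(beta_within, beta_dummies)
```

Only the point estimates are compared here. The standard errors that `lm()` reports for the demeaned regression do not account for the degrees of freedom used up by the unit means, which is one reason to prefer a dedicated routine in practice.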

8 Common Mistakes

• Confusing FE with controlling for observable characteristics. Adding a unit FE is not the same as controlling for all unit-level variables. FE eliminates every time-invariant characteristic of a unit, whether or not it is observed, but it does nothing about characteristics that change over time. Adding individual observable controls on top of FE is fine but redundant for time-invariant variables (they are subsumed by αᵢ).
• Interpreting the FE coefficient as a structural parameter. The within estimator identifies the short-run effect of changing treatment status within a unit. In a long panel, this may differ from the long-run effect if adjustment is gradual.
• Forgetting to cluster standard errors. With panel data, residuals εᵢₜ for the same unit i are typically correlated across time. Standard errors must be clustered at the unit level (or higher) to be valid [Bertrand et al., 2004].
• Using FE when only between-unit variation is available. If treatment never switches within a unit (each unit is either always treated or never treated), there is no within-unit variation and FE cannot estimate β.
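The clustering point can be illustrated by computing both kinds of standard error by hand. The sketch below makes several assumptions for the demonstration: errors follow an AR(1) process within each unit with persistence 0.8, there is a single regressor, and the cluster-robust formula is the basic sandwich form with no small-sample correction.

```r
# Illustrative design; all parameter values are assumptions.
set.seed(3)
n_units <- 200
n_periods <- 6
id    <- rep(1:n_units, each = n_periods)
time  <- rep(1:n_periods, times = n_units)
alpha <- rep(rnorm(n_units, sd = 2), each = n_periods)
D <- as.numeric(time >= 4 & id <= 100)

# Serially correlated errors: one AR(1) series per unit
eps <- unlist(lapply(1:n_units, function(i) {
  as.numeric(arima.sim(model = list(ar = 0.8), n = n_periods))
}))
Y <- alpha + 2 * D + eps

# Within transformation, point estimate, and residuals
Y_dm <- Y - ave(Y, id)
D_dm <- D - ave(D, id)
beta_hat <- sum(D_dm * Y_dm) / sum(D_dm^2)
e_hat <- Y_dm - beta_hat * D_dm

# Conventional SE: treats every residual as independent
dof <- n_units * n_periods - n_units - 1
se_iid <- sqrt(sum(e_hat^2) / dof / sum(D_dm^2))

# Cluster-robust SE: sum the scores within each unit before squaring
score_by_unit <- tapply(D_dm * e_hat, id, sum)
se_cluster <- sqrt(sum(score_by_unit^2)) / sum(D_dm^2)

c(iid = se_iid, clustered = se_cluster)
```

In this design the conventional standard error understates the uncertainty, because it ignores the fact that a unit's residuals move together over time. Packaged estimators typically add finite-sample adjustments to the clustered formula, so their output will differ slightly from this hand calculation.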

9 Where to Learn More

For a thorough treatment of panel data methods, Wooldridge [2010] is the standard graduate-level reference. Angrist and Pischke [2009] Chapter 5 provides an accessible overview oriented toward applied researchers. For the recent literature on TWFE in staggered designs, Roth et al. [2023] is an excellent survey.

10 Conclusion

The fixed effects estimator is one of the most widely used tools in applied economics precisely because it handles a pervasive problem—time-invariant unobserved heterogeneity—in a simple, transparent way. By comparing each unit to itself over time, it effectively controls for stable but unobservable characteristics.

The within transformation is intuitive, the implementation is fast in modern software, and the identifying assumption is explicit. But it solves only part of the endogeneity problem: time-varying confounders, reverse causality, and anticipation effects all require additional tools. Understanding both what fixed effects can do and what they cannot is foundational for any applied researcher working with panel data.

References

Angrist, J. D. and Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press, Princeton, NJ.
Bergé, L. (2018). Efficient estimation of maximum likelihood models with multiple fixed-effects: The R package FENmlm. CREA Discussion Paper 2018-06.
Bertrand, M., Duflo, E., and Mullainathan, S. (2004). How much should we trust differences-in-differences estimates? Quarterly Journal of Economics, 119(1):249-275.
Callaway, B. and Sant'Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200-230.
Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics, 225(2):254-277.
Guimarães, P. and Portugal, P. (2010). A simple feasible alternative procedure to estimate models with high-dimensional fixed effects. Stata Journal, 10(4):628-649.
Roth, J., Sant'Anna, P. H. C., Bilinski, A., and Poe, J. (2023). What's trending in difference-in-differences? A synthesis of the recent econometrics literature. Journal of Econometrics, 235(2):2218-2244.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. 2nd ed. MIT Press, Cambridge, MA.

Fixed Effects and Panel Data: Controlling for What You Cannot Observe
