The Central Question
Suppose you want to know whether a job training programme increases participants' earnings. You observe that programme participants earn, on average, 12,600 more than non-participants. Does this mean the programme raised their earnings by 12,600?
Not necessarily. People who choose to enrol in a job training programme may already be more motivated, more educated, or in better economic circumstances than those who do not enrol. Their higher earnings might reflect those pre-existing advantages, not the effect of the programme. To know the programme's causal effect, you would need to compare each participant's actual earnings to what they would have earned had they not participated. That hypothetical — what would have happened in a world that did not occur — is a counterfactual.
The Potential Outcomes Framework
The potential outcomes framework, introduced by Rubin (1974) and drawing on earlier work by Neyman (1923), provides a formal language for counterfactuals. For each individual \(i\) and each possible treatment value, we define a potential outcome: the value of the outcome that would be observed if the individual received that treatment.
In the simplest binary treatment case:
- \(Y_i(1)\): the earnings of person \(i\) if they participate in the training programme.
- \(Y_i(0)\): the earnings of person \(i\) if they do not participate in the training programme.
The causal effect of the programme for person \(i\) is: \[ \tau_i = Y_i(1) - Y_i(0) \] This is the difference between what would happen in the treated world and what would happen in the untreated world, for the same person. It is a comparison of two potential outcomes.
The Fundamental Problem
Here is the central difficulty: at any given point in time, each person either participates in the programme or does not. We observe \(Y_i(1)\) for participants and \(Y_i(0)\) for non-participants, but never both for the same person at the same time. The individual treatment effect \(\tau_i\) is therefore unobservable.
This is the fundamental problem of causal inference (Holland, 1986): we can never directly observe a counterfactual, because it refers to a world that did not happen.
The observed outcome for person \(i\) with treatment status \(D_i \in \{0,1\}\) is: \[ Y_i = D_i \cdot Y_i(1) + (1 - D_i) \cdot Y_i(0) \] We observe either \(Y_i(1)\) (if \(D_i = 1\)) or \(Y_i(0)\) (if \(D_i = 0\)), but not both.
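The switching equation above can be written as a one-line function (a minimal sketch; the numbers are taken from the table in the next section):

```python
# Switching equation: Y_i = D_i * Y_i(1) + (1 - D_i) * Y_i(0).
# Given both potential outcomes, treatment status selects which one we observe.
def observed_outcome(d, y0, y1):
    """Return the observed outcome for treatment status d in {0, 1}."""
    return d * y1 + (1 - d) * y0

# A participant reveals Y(1); a non-participant reveals Y(0).
print(observed_outcome(1, 28000, 33000))  # 33000
print(observed_outcome(0, 20000, 24000))  # 20000
```

The function makes the fundamental problem concrete: for any one person, only one branch of the equation is ever realised.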
A Numerical Example
Consider a small job training programme with five participants and five non-participants. The table below shows both potential outcomes for each person (in reality, only one column is observed per person).
| Person | \(D_i\) | \(Y_i(0)\) | \(Y_i(1)\) | \(\tau_i = Y_i(1)-Y_i(0)\) |
|---|---|---|---|---|
| Alice | 1 | 28,000 | 33,000 | +5,000 |
| Bob | 1 | 32,000 | 35,000 | +3,000 |
| Carol | 1 | 25,000 | 31,000 | +6,000 |
| Dave | 1 | 30,000 | 34,000 | +4,000 |
| Eve | 1 | 26,000 | 30,000 | +4,000 |
| Frank | 0 | 20,000 | 24,000 | +4,000 |
| Grace | 0 | 18,000 | 23,000 | +5,000 |
| Henry | 0 | 22,000 | 26,000 | +4,000 |
| Iris | 0 | 19,000 | 24,000 | +5,000 |
| Jack | 0 | 21,000 | 25,000 | +4,000 |
The true average treatment effect (ATE) — averaged over all ten people — is: \[ \text{ATE} = \frac{1}{10}\sum_{i=1}^{10} \tau_i = \frac{5000 + 3000 + 6000 + 4000 + 4000 + 4000 + 5000 + 4000 + 5000 + 4000}{10} = \frac{44{,}000}{10} = 4{,}400 \]
The true average treatment effect on the treated (ATT) — averaged over participants only — is: \[ \text{ATT} = \frac{1}{5}\sum_{i: D_i=1} \tau_i = \frac{5000 + 3000 + 6000 + 4000 + 4000}{5} = \frac{22{,}000}{5} = 4{,}400 \]
In this example, ATE = ATT = 4,400: the average effect among participants happens to equal the average over everyone.
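The ATE and ATT calculations can be reproduced directly from the table (a sketch; in real data only one potential-outcome column per person would exist):

```python
# (name, D, Y0, Y1) for each of the ten people in the table above.
data = [
    ("Alice", 1, 28000, 33000), ("Bob", 1, 32000, 35000),
    ("Carol", 1, 25000, 31000), ("Dave", 1, 30000, 34000),
    ("Eve", 1, 26000, 30000), ("Frank", 0, 20000, 24000),
    ("Grace", 0, 18000, 23000), ("Henry", 0, 22000, 26000),
    ("Iris", 0, 19000, 24000), ("Jack", 0, 21000, 25000),
]

# Individual effects tau_i = Y_i(1) - Y_i(0).
taus = [y1 - y0 for _, _, y0, y1 in data]
ate = sum(taus) / len(taus)                                   # average over all ten
att = sum(y1 - y0 for _, d, y0, y1 in data if d == 1) / 5     # participants only
print(ate, att)  # 4400.0 4400.0
```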
What We Actually Observe
In practice, we observe \(Y_i(1)\) for participants (Alice through Eve) and \(Y_i(0)\) for non-participants (Frank through Jack). The naive comparison of means is: \[ \begin{aligned} \bar{Y}_{\text{treated}} &= \frac{33{,}000 + 35{,}000 + 31{,}000 + 34{,}000 + 30{,}000}{5} = 32{,}600 \\ \bar{Y}_{\text{control}} &= \frac{20{,}000 + 18{,}000 + 22{,}000 + 19{,}000 + 21{,}000}{5} = 20{,}000 \\ \hat\tau_{\text{naive}} &= 32{,}600 - 20{,}000 = 12{,}600 \end{aligned} \]
The naive estimate is 12,600, nearly three times the true effect of 4,400. The reason is that participants' \(Y_i(0)\) values (28--32k) are much higher than non-participants' \(Y_i(0)\) values (18--22k). This is selection bias.
Formally, the naive comparison estimates: \[ \hat\tau_{\text{naive}} = \underbrace{\bar{Y}(1) - \bar{Y}(0)}_{\text{observed difference}} = \underbrace{\text{ATT}}_{\text{true effect}} + \underbrace{(\bar{Y}_{\text{treated}}(0) - \bar{Y}_{\text{control}}(0))}_{\text{selection bias}} \] In our example, selection bias \(= (28{,}000+32{,}000+25{,}000+30{,}000+26{,}000)/5 - 20{,}000 = 28{,}200 - 20{,}000 = 8{,}200\). The naive estimate of 12,600 is exactly the true ATT of 4,400 plus selection bias of 8,200: \(4{,}400 + 8{,}200 = 12{,}600\).
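The decomposition can be checked numerically from the table's values (a sketch; the variable names are illustrative):

```python
# (Y0, Y1) pairs for participants and non-participants from the table.
treated = [(28000, 33000), (32000, 35000), (25000, 31000),
           (30000, 34000), (26000, 30000)]
control = [(20000, 24000), (18000, 23000), (22000, 26000),
           (19000, 24000), (21000, 25000)]

def mean(xs):
    return sum(xs) / len(xs)

# Observed difference in means: treated Y(1) vs control Y(0).
naive = mean([y1 for _, y1 in treated]) - mean([y0 for y0, _ in control])
# True effect on the treated, and the bias from unequal baselines Y(0).
att = mean([y1 - y0 for y0, y1 in treated])
bias = mean([y0 for y0, _ in treated]) - mean([y0 for y0, _ in control])

print(naive, att, bias)  # 12600.0 4400.0 8200.0
```

The identity `naive == att + bias` holds exactly, as the decomposition requires.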
How Do We Solve the Problem?
The fundamental problem of causal inference means that \(\tau_i\) can never be directly observed. But population-level summaries like the ATE or ATT can be identified — that is, expressed as functions of the observable data distribution — under additional assumptions. Three main approaches exist:
Randomisation. If treatment is randomly assigned, then \(D_i\) is independent of \((Y_i(0), Y_i(1))\). This means the control group provides a valid counterfactual for the treatment group: \(\mathbb{E}[Y_i(0) \mid D_i = 1] = \mathbb{E}[Y_i(0) \mid D_i = 0]\), so selection bias is zero. Randomised controlled trials (RCTs) are the "gold standard" for this reason.
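Randomisation can be illustrated with a small simulation on the table's ten people (an illustrative sketch, not part of the original example): repeatedly randomising which five receive treatment and averaging the resulting difference in means recovers the ATE of 4,400, because selection bias averages out to zero.

```python
# Simulation: under random assignment, the difference in observed means
# is an unbiased estimate of the ATE. (Illustrative sketch.)
import random

random.seed(0)
y0 = [28000, 32000, 25000, 30000, 26000, 20000, 18000, 22000, 19000, 21000]
y1 = [33000, 35000, 31000, 34000, 30000, 24000, 23000, 26000, 24000, 25000]

estimates = []
for _ in range(10000):
    treated = set(random.sample(range(10), 5))  # randomise 5 of 10 into treatment
    t_mean = sum(y1[i] for i in treated) / 5
    c_mean = sum(y0[i] for i in range(10) if i not in treated) / 5
    estimates.append(t_mean - c_mean)

print(sum(estimates) / len(estimates))  # close to 4400
```

Any single random assignment still gives a noisy estimate; it is the expectation over assignments that equals the ATE.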
Conditional independence (selection on observables). If treatment assignment is "as good as random" conditional on observed covariates \(X_i\) — that is, \((Y_i(0), Y_i(1)) \perp D_i \mid X_i\) — then within cells defined by \(X_i\), there is no selection bias. This is the "unconfoundedness" assumption (Rosenbaum and Rubin, 1983). The propensity score \(p(X_i) = \Pr(D_i = 1 \mid X_i)\) can be used to reweight the sample and recover the ATE.
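A minimal inverse-propensity-weighting sketch on synthetic data may make this concrete (all numbers and the data-generating process here are illustrative assumptions, not from the text): with a binary covariate \(X\) and a known propensity score \(p(X)\), weighting treated outcomes by \(1/p(X)\) and control outcomes by \(1/(1-p(X))\) recovers the ATE even though treatment is far from unconditionally random.

```python
# IPW under unconfoundedness: treatment depends only on an observed
# binary covariate X, which also shifts baseline earnings. (Sketch on
# synthetic data; the true effect is set to 4,400 by construction.)
import random

random.seed(1)
n = 50000
units = []
for _ in range(n):
    x = random.random() < 0.5      # observed binary covariate
    p = 0.8 if x else 0.2          # propensity score p(X) = Pr(D=1 | X)
    d = random.random() < p        # treatment depends on X only
    y0 = 30000 if x else 20000     # baseline earnings depend on X
    y1 = y0 + 4400                 # constant true effect
    y = y1 if d else y0            # only one potential outcome observed
    units.append((d, y, p))

# Horvitz-Thompson style IPW estimate of the ATE.
ipw = sum(d * y / p - (1 - d) * y / (1 - p) for d, y, p in units) / n
print(round(ipw))  # roughly 4400
```

Note that a naive difference in means on this data would be badly biased, since treated units disproportionately have the high-baseline covariate value.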
Natural experiments. When an external factor (an instrument, a policy discontinuity, a lottery) creates quasi-random variation in treatment, we can exploit this variation to estimate causal effects without directly observing the counterfactual. This is the logic of instrumental variables, regression discontinuity, and difference-in-differences.
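As a toy illustration of the difference-in-differences logic mentioned above (the numbers are invented for illustration): subtracting each group's pre-period mean removes fixed group differences, so under a parallel-trends assumption the remaining gap is the treatment effect.

```python
# Difference-in-differences on made-up group means (illustrative sketch).
treated_pre, treated_post = 27000, 33000   # treated group's mean earnings
control_pre, control_post = 20000, 21600   # control group's mean earnings

# Change among treated minus change among controls.
did = (treated_post - treated_pre) - (control_post - control_pre)
print(did)  # 4400
```

The control group's change (1,600) stands in for what the treated group's change would have been without treatment; that substitution is exactly the parallel-trends assumption.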
Why This Matters
The potential outcomes framework is not just a mathematical formalism — it is a way of thinking about causality that clarifies what questions are answerable and what assumptions are required. Before asking "does X cause Y?" you should ask: what is the counterfactual? For whom? Over what time horizon? Under what conditions?
These questions are often glossed over in casual empirical reasoning, leading to confused claims about causation. The framework forces precision: a causal claim is a statement about a comparison between two potential outcomes, and any evidence for that claim must somehow address the fundamental problem that only one of those outcomes is observed.
Conclusion
The counterfactual is the central object in causal inference. We want to know what would have happened in a world that did not occur. The fundamental problem is that this world is unobservable, so we must identify the causal effect from what we can observe by making assumptions. The most important assumption is that we have a valid comparison group — a group whose observed outcomes, after suitable adjustment, tell us what the treated group's outcomes would have been without treatment. All methods of causal inference — experiments, matching, IV, DiD, regression discontinuity — are strategies for finding or constructing such a comparison group.
References
- Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688--701.
- Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396):945--960.
- Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41--55.
- Imbens, G. W. and Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press, Cambridge.
- Angrist, J. D. and Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press, Princeton, NJ.
- Pearl, J. (2009). Causality: Models, Reasoning, and Inference. 2nd edition. Cambridge University Press, Cambridge.