1 The Question
Consider a simple, important question: does going to university increase earnings? We observe that university graduates earn about 60% more per hour than workers with only a high-school diploma. Can we conclude that university education causes a 60% earnings premium?
Most economists would say no, and the reason is selection bias. People who go to university are not a random sample of the population. They tend to be more academically able, come from wealthier families, and have stronger networks—all of which would boost their earnings even without the degree. The raw earnings gap conflates the causal effect of education with these pre-existing advantages. This article explains precisely what goes wrong and introduces the omitted variable bias formula.
2 The Population Model
Suppose the true structural equation for earnings Yᵢ is:
Yᵢ = β₀ + β₁Sᵢ + β₂Aᵢ + uᵢ    (1)
where Sᵢ is years of schooling, Aᵢ is ability (intelligence, work ethic, family connections—all the unobserved traits that affect earnings), and uᵢ is a mean-zero residual independent of both Sᵢ and Aᵢ.
The parameter β₁ is the causal return to schooling: the increase in earnings from one additional year of education, holding ability fixed. This is what we want to estimate.
The problem: we do not observe Aᵢ, so we fit the short regression instead:
Yᵢ = β̃₀ + β̃₁Sᵢ + εᵢ    (2)
The OLS estimator β̃₁ converges in probability to:
plim(β̃₁) = β₁ + β₂ × Cov(Sᵢ, Aᵢ) / Var(Sᵢ)    (3)
This is the omitted variable bias (OVB) formula. The short regression estimate is biased by an amount equal to the coefficient on the omitted variable (β₂) multiplied by the coefficient from an auxiliary regression of the omitted variable on the included variable (Cov(Sᵢ, Aᵢ) / Var(Sᵢ) ≡ δ).
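The formula follows in a few lines from the definitions above. The derivation below is a sketch, assuming the structural equation has an intercept (written β₀ here, which is notation introduced for illustration) and using only the stated assumption that uᵢ is independent of Sᵢ:

```latex
\begin{align*}
\operatorname{plim}\,\tilde\beta_1
  &= \frac{\operatorname{Cov}(Y_i, S_i)}{\operatorname{Var}(S_i)} \\
  &= \frac{\operatorname{Cov}(\beta_0 + \beta_1 S_i + \beta_2 A_i + u_i,\; S_i)}{\operatorname{Var}(S_i)} \\
  &= \beta_1
     + \beta_2\,\frac{\operatorname{Cov}(S_i, A_i)}{\operatorname{Var}(S_i)}
     + \frac{\operatorname{Cov}(u_i, S_i)}{\operatorname{Var}(S_i)} \\
  &= \beta_1 + \beta_2\,\delta ,
  \qquad \text{since } \operatorname{Cov}(u_i, S_i) = 0 .
\end{align*}
```

The first line is the usual probability limit of a bivariate OLS slope. The rest is linearity of covariance.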
3 Decomposing the Bias
The OVB formula in equation (3) can be written as:
OVB ≡ plim(β̃₁) − β₁ = β₂ × δ    (4)
Two conditions must hold simultaneously for OVB to be non-zero:
- The omitted variable must affect the outcome: β₂ ≠ 0.
- The omitted variable must be correlated with the included variable: δ = Cov(Sᵢ, Aᵢ) / Var(Sᵢ) ≠ 0.
If either condition fails, there is no bias. For the education-earnings example:
- Ability clearly affects earnings: β₂ > 0.
- Ability is positively correlated with education: δ > 0 (able individuals study longer).
Therefore the bias is positive: the raw OLS estimate β̃₁ overstates the causal return to education. This is the "ability bias" in returns-to-schooling estimates.
A Numerical Illustration
Suppose the true return to education is β₁ = 0.06 (6% per year), ability has effect β₂ = 0.10 on log earnings per unit of ability, and the coefficient from regressing ability on schooling is δ = 0.5 units of ability per year of schooling. Then the short-regression bias is 0.10 × 0.5 = 0.05, and:
plim(β̃₁) = 0.06 + 0.05 = 0.11
The raw return appears to be 11% per year—nearly double the true causal effect of 6%.
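The same numbers can be checked by simulation. The sketch below is illustrative only: the distributions, the intercept, the noise scales, and the function name `simulate_returns` are assumptions chosen for the example, and only β₁, β₂, and δ come from the text. Ability is constructed so that its regression slope on schooling equals δ.

```python
import numpy as np


def simulate_returns(beta1=0.06, beta2=0.10, delta=0.5, n=200_000, seed=0):
    """Return (short-regression slope, long-regression slope) on simulated data."""
    rng = np.random.default_rng(seed)

    # Years of schooling (distribution is an arbitrary assumption).
    schooling = rng.normal(12.0, 2.0, size=n)
    # Ability built so that regressing ability on schooling has slope delta.
    ability = delta * schooling + rng.normal(0.0, 1.0, size=n)
    # Structural equation for log earnings; intercept 1.0 is arbitrary.
    log_earnings = (1.0 + beta1 * schooling + beta2 * ability
                    + rng.normal(0.0, 0.3, size=n))

    # Short regression (ability omitted): slope = Cov(Y, S) / Var(S).
    cov = np.cov(schooling, log_earnings)
    short_slope = cov[0, 1] / cov[0, 0]

    # Long regression (ability included) via least squares.
    X = np.column_stack([np.ones(n), schooling, ability])
    coef, *_ = np.linalg.lstsq(X, log_earnings, rcond=None)
    long_slope = coef[1]
    return short_slope, long_slope


short_slope, long_slope = simulate_returns()
print(f"short regression: {short_slope:.3f}")
print(f"long regression:  {long_slope:.3f}")
```

With a sample this large, the short-regression slope should land close to 0.11 and the long-regression slope close to 0.06, matching the arithmetic above.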
4 Selection Bias as a Special Case of OVB
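Selection bias is the OVB that arises when individuals self-select into treatment in a way that is correlated with the outcome. Let Dᵢ ∈ {0, 1} be a binary treatment indicator (for example, enrolment in a job-training programme), and let Y₁ᵢ and Y₀ᵢ denote individual i's outcome with and without treatment. The naïve OLS estimator of the treatment effect, which is the difference in mean outcomes between the treated and the untreated, converges to:
E[Yᵢ | Dᵢ = 1] − E[Yᵢ | Dᵢ = 0] = E[Y₁ᵢ − Y₀ᵢ | Dᵢ = 1] + (E[Y₀ᵢ | Dᵢ = 1] − E[Y₀ᵢ | Dᵢ = 0])    (5)
The first term is the average treatment effect on the treated, which coincides with the average treatment effect when the effect is the same for everyone. This is what we want. The second term is selection bias: the difference in counterfactual (untreated) outcomes between those who select into treatment and those who do not. If those who choose training are more employable even without training (positive selection), OLS overstates the programme effect. If they are "hard cases", least likely to find employment on their own (negative selection, as in some remediation programmes), OLS understates it.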
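A small simulation makes the decomposition concrete. This is a sketch under assumed values: the potential outcomes `y0` and `y1`, the constant effect of 1.0, the unobserved `employability` trait, and the selection rule are all hypothetical and chosen to produce positive selection.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Unobserved trait that raises outcomes and the chance of enrolling (assumed).
employability = rng.normal(0.0, 1.0, size=n)

# Potential outcomes: y0 without training, y1 with training.
effect = 1.0  # constant treatment effect, an assumption for this sketch
y0 = 10.0 + 2.0 * employability + rng.normal(0.0, 1.0, size=n)
y1 = y0 + effect

# Positive selection: more employable people are more likely to enrol.
d = (employability + rng.normal(0.0, 1.0, size=n)) > 0
y = np.where(d, y1, y0)  # only one potential outcome is ever observed

naive = y[d].mean() - y[~d].mean()              # difference in observed means
att = (y1[d] - y0[d]).mean()                    # effect on the treated
selection_bias = y0[d].mean() - y0[~d].mean()   # gap in untreated outcomes

print(f"naive difference: {naive:.3f}")
print(f"ATT:              {att:.3f}")
print(f"selection bias:   {selection_bias:.3f}")
```

In the simulation both potential outcomes are visible, so the selection term can be computed directly. In real data `y0` is never observed for the treated, which is why the bias cannot simply be subtracted off.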
5 Graphical Illustration: The DAG Perspective
Figure 1 shows the bias using a directed acyclic graph. Ability (A) is a confounder: it causes both the treatment (education) and the outcome (earnings). To estimate the causal effect of S → Y, we must close the back-door path S ← A → Y—either by controlling for A or by using an instrument for S.
Figure 1: DAG for the education-earnings problem. The causal path runs from schooling (S) to earnings (Y) with coefficient β₁. Ability (A) creates a back-door path S ← A → Y. OLS conflates the causal path with the back-door path, overstating β₁.
6 Solutions to the OVB Problem
The OVB formula points to three broad strategies for eliminating the bias:
- Control for the confounder. If Aᵢ were observed, including it in the regression gives an unbiased estimate of β₁ (the "long regression"). For this to work, Aᵢ must fully capture all confounding. In practice, unmeasured confounders remain.
- Use an instrument. An instrumental variable Zᵢ that shifts schooling Sᵢ, is unrelated to ability Aᵢ, and affects earnings only through schooling isolates variation in Sᵢ that is free of ability bias. Card (1995) used proximity to a college as an instrument for education, on the assumption that living near a college increases schooling but does not directly affect earnings.
- Exploit natural experiments or randomisation. Difference-in-differences, regression discontinuity, and randomised experiments are all strategies that generate variation in treatment that is plausibly uncorrelated with potential confounders.
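The instrument strategy can be sketched on simulated data. Everything below is an illustrative assumption: the data-generating process, the binary `near_college` variable, and the parameter values are made up, and the instrument is valid by construction, which is precisely what cannot be verified in real data. The IV estimate uses the simple ratio Cov(Z, Y) / Cov(Z, S).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta1, beta2 = 0.06, 0.10  # assumed true return and ability effect

ability = rng.normal(0.0, 1.0, size=n)        # unobserved confounder
near_college = rng.binomial(1, 0.5, size=n)   # instrument, independent of ability
# Schooling depends on both the instrument and ability.
schooling = (12.0 + 1.0 * near_college + 1.0 * ability
             + rng.normal(0.0, 1.0, size=n))
# Earnings depend on schooling and ability, but not directly on the instrument.
log_earnings = (1.0 + beta1 * schooling + beta2 * ability
                + rng.normal(0.0, 0.3, size=n))

# OLS slope of earnings on schooling (ability omitted).
cov_sy = np.cov(schooling, log_earnings)
ols_slope = cov_sy[0, 1] / cov_sy[0, 0]

# IV estimate: ratio of reduced-form covariance to first-stage covariance.
iv_slope = (np.cov(near_college, log_earnings)[0, 1]
            / np.cov(near_college, schooling)[0, 1])

print(f"OLS: {ols_slope:.3f}")
print(f"IV:  {iv_slope:.3f}")
```

Under these assumptions OLS should come out well above 0.06, while the IV estimate should sit near 0.06, though with a visibly larger sampling error than OLS.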
7 The Direction and Magnitude of Bias
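The OVB formula allows us to reason about the direction of bias even when we cannot estimate it precisely. Because the bias is the product β₂ × δ, its sign follows from the signs of the two terms. Some useful cases:
- β₂ > 0 and δ > 0: positive bias, so plim(β̃₁) > β₁.
- β₂ > 0 and δ < 0: negative bias, so plim(β̃₁) < β₁.
- β₂ < 0 and δ > 0: negative bias, so plim(β̃₁) < β₁.
- β₂ < 0 and δ < 0: positive bias, so plim(β̃₁) > β₁.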
8 Common Mistakes
- "I controlled for everything important." This claim is almost never defensible. Measurement error in controls and genuine omitted variables are ubiquitous.
- "My R-squared is high, so omitted variable bias is small." Wrong. A high R-squared means the included variables explain a lot of variance in the outcome. It says nothing about whether the treatment is correlated with omitted variables.
- Controlling for a mediator. If variable Mᵢ is on the causal path from Dᵢ to Yᵢ (a mediator), controlling for it removes part of the causal effect rather than correcting bias. Control for confounders (variables causing both treatment and outcome), not for variables that the treatment itself affects.
- Collider bias. Controlling for a variable that is caused by both treatment and outcome (a collider) opens a spurious association. This is a less obvious but equally dangerous form of bias [Pearl, 2009].
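The mediator and collider mistakes can be illustrated with a short simulation. This is a sketch with made-up coefficients: the treatment `d` is drawn independently of everything else, so the regression of `y` on `d` alone is unconfounded, and the helper `ols` is a hypothetical convenience wrapper around least squares.

```python
import numpy as np


def ols(y, *regressors):
    """Least-squares coefficients of y on an intercept plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(3)
n = 200_000

d = rng.normal(0.0, 1.0, size=n)                        # treatment, no confounding
m = 0.5 * d + rng.normal(0.0, 1.0, size=n)              # mediator: D -> M -> Y
y = 1.0 * d + 1.0 * m + rng.normal(0.0, 1.0, size=n)    # total effect = 1.0 + 0.5
c = d + y + rng.normal(0.0, 1.0, size=n)                # collider: D -> C <- Y

total_effect = ols(y, d)[1]       # no bad controls: recovers the total effect
with_mediator = ols(y, d, m)[1]   # mediator held fixed: direct effect only
with_collider = ols(y, d, c)[1]   # collider held fixed: spurious association

print(f"no controls:           {total_effect:.3f}")
print(f"controlling mediator:  {with_mediator:.3f}")
print(f"controlling collider:  {with_collider:.3f}")
```

In this setup the uncontrolled regression should return roughly 1.5, controlling for the mediator should cut the estimate to roughly 1.0, and controlling for the collider should push it below zero. Adding a control can therefore create bias where none existed.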
9 Where to Learn More
- Angrist and Pischke [2009] Chapter 3 derives the OVB formula clearly and applies it to returns to schooling.
- Pearl [2009] develops the DAG framework for understanding confounding and collider bias.
- Imbens [2015] discusses how to estimate average causal effects under unconfoundedness in practice, working through three examples of matching methods.
10 Conclusion
Ordinary least squares gives a consistent estimate of the causal effect only when the treatment variable is uncorrelated with the error term, that is, when there are no relevant omitted variables. In observational data, this condition almost always fails. The omitted variable bias formula makes the problem concrete and quantifiable: the bias equals the effect of the omitted variable on the outcome (β₂) multiplied by the coefficient from regressing the omitted variable on the treatment (δ). Correcting for OVB, whether through randomisation, instruments, or regression discontinuity, is a central goal of modern empirical economics.
References
- Angrist, J. D. and Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press.
- Imbens, G. W. (2015). Matching methods in practice: three examples. Journal of Human Resources, 50(2):373-419.
- Pearl, J. (2009). Causality: Models, Reasoning and Inference (2nd ed.). Cambridge University Press.