The Causal Review

1 The Causal Question

How much does an additional year of schooling raise lifetime earnings? The private return to education is one of the most studied quantities in labour economics. Yet estimating it is notoriously difficult: individuals who choose more education differ from those who choose less in ways that are correlated with earnings regardless of schooling. Raw OLS regressions of earnings on years of schooling are contaminated by ability bias, family background differences, and innate earnings potential. Identifying the causal effect requires a source of variation in schooling that is unrelated to these unobservables.

Angrist and Krueger [1991] proposed a remarkably elegant solution: use the quarter of birth as an instrument for educational attainment. Their paper is among the most cited in the IV literature and sparked a decade of debate about instrument validity, weak instruments, and the proper interpretation of local average treatment effects.

2 The Identification Strategy

2.1 The Compulsory Schooling Mechanism

U.S. compulsory schooling laws require children to remain in school until a specified age, typically 16 or 17, regardless of how many years of schooling they have completed. School entry rules in most U.S. states require children to be six years old by a specific date (often December 31 or September 1) at the start of the school year.

These two policies interact in a way that generates variation in educational attainment by season of birth:

A child born in the fourth quarter (October-December) enters school younger than a child born in the first quarter (January-March) of the same year, because the fourth-quarter child is among the youngest in the entering class.
Because compulsory schooling laws bind by age rather than by grade completed, first-quarter children who hit the minimum leaving age during the school year can legally drop out with less schooling than fourth-quarter children of the same cohort.

The upshot: first-quarter-born children are systematically more likely to reach the compulsory leaving age while still in lower grades, giving them a higher rate of early departure from school. Fourth-quarter-born children enter school younger and, when they hit the leaving age, have completed slightly more schooling.
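The arithmetic behind this mechanism can be sketched in a few lines. The rules below are stylised assumptions chosen for illustration (entry on 1 September of the calendar year a child turns six, leaving permitted on the 16th birthday), not the statutes of any particular state, and the function names are hypothetical.

```python
# Stylised rules (assumptions, not the statutes of any particular state):
# children start school on 1 September of the calendar year in which they
# turn six, and may leave on their 16th birthday.
ENTRY_MONTH = 9
LEAVING_AGE = 16.0

# Representative birth month for each quarter (the middle month).
MID_QUARTER_MONTH = {1: 2, 2: 5, 3: 8, 4: 11}


def age_at_entry(birth_month: int) -> float:
    """Age in years on the first day of school for a child born on the
    first day of `birth_month`."""
    return 6.0 + (ENTRY_MONTH - birth_month) / 12.0


def years_enrolled_at_leaving_age(birth_month: int) -> float:
    """Calendar years spent enrolled by the time the leaving age is reached."""
    return LEAVING_AGE - age_at_entry(birth_month)


for quarter, month in MID_QUARTER_MONTH.items():
    print(
        f"Q{quarter}: enters at age {age_at_entry(month):.2f}, "
        f"enrolled {years_enrolled_at_leaving_age(month):.2f} years at age 16"
    )
```

Under these assumptions a mid-Q4 child has been enrolled nine months longer than a mid-Q1 child when both turn 16. The gap in average completed schooling would be far smaller, since only students who actually leave at the minimum age are affected.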

Figure 1 illustrates the first stage: the positive relationship between birth quarter and mean years of schooling in U.S. Census data.

[Figure 1: Mean years of schooling (vertical axis) by birth cohort and quarter (horizontal axis). Sawtooth pattern: Q1 lowest within each cohort year.]

2.2 IV Validity Conditions

For quarter of birth to be a valid instrument for years of schooling, three conditions must hold:

(i) Relevance: Birth quarter must predict years of schooling. This is satisfied by the compulsory schooling mechanism and confirmed empirically by significant first-stage regressions (F-statistics well above 10 in the main specifications).

(ii) Exclusion: Birth quarter must affect earnings only through its effect on schooling. This rules out direct biological effects of season on earnings capacity (e.g., vitamin D, prenatal conditions) or sorting effects (seasonal patterns in parental characteristics correlated with child outcomes).

(iii) Independence: Birth quarter must be uncorrelated with unobservable determinants of earnings. Angrist and Krueger argue this holds because parents cannot precisely control birth timing in ways correlated with child ability.
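These conditions can be stated compactly in a constant-effects model with a single instrument. The notation is introduced here for illustration and is not taken from the paper: $Y_i$ is log earnings, $S_i$ years of schooling, $Z_i$ a quarter-of-birth indicator, and $u_i$ the unobserved determinants of earnings.

$$
Y_i = \alpha + \beta S_i + u_i, \qquad S_i = \pi_0 + \pi_1 Z_i + v_i
$$

$$
\hat{\beta}_{IV} = \frac{\widehat{\mathrm{Cov}}(Y_i, Z_i)}{\widehat{\mathrm{Cov}}(S_i, Z_i)} \;\xrightarrow{p}\; \beta + \frac{\mathrm{Cov}(u_i, Z_i)}{\mathrm{Cov}(S_i, Z_i)}
$$

Relevance requires $\mathrm{Cov}(S_i, Z_i) \neq 0$ (equivalently $\pi_1 \neq 0$), so that the ratio is well defined. Exclusion is built into the first equation, where $Z_i$ has no direct effect on $Y_i$, and together with independence it delivers $\mathrm{Cov}(u_i, Z_i) = 0$, so that the second term vanishes. When $\mathrm{Cov}(S_i, Z_i)$ is small, even a slight violation of $\mathrm{Cov}(u_i, Z_i) = 0$ is magnified in the estimate.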

3 Data and Setting

Angrist and Krueger [1991] use data from the 1970 and 1980 U.S. Censuses of Population. The earnings analysis covers men born between 1920 and 1949: the 1920-1929 cohort is drawn from the 1970 Census, and the 1930-1939 and 1940-1949 cohorts from the 1980 Census. Census data provide information on earnings, years of schooling completed, and quarter of birth. The 1930-1939 sample alone covers over 300,000 men, providing substantial statistical power.

Years of schooling are self-reported in the Census and vary from 0 to 18 or more. The earnings measure is log weekly earnings, which partially controls for hours variation. The main samples pool white and black men, with race entering as a control variable in some specifications.

4 Key Findings

4.1 First Stage

The first stage is clear: men born in the first quarter of the year complete, on average, about 0.1 fewer years of schooling than men born in the fourth quarter of the same year. This difference is small in absolute terms but precisely estimated in the large Census sample. The first-stage F-statistic on the three quarter-of-birth dummies exceeds 10 in the main specifications, above the rule-of-thumb threshold for weak instruments that Staiger and Stock [1997] later proposed.
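A first-stage F-statistic of this kind can be computed directly from restricted and unrestricted residual sums of squares. The sketch below uses simulated data, not the Census extract: the quarter-specific shifts, the noise level, and the helper name `first_stage_f` are all assumptions made for illustration.

```python
import numpy as np


def first_stage_f(schooling: np.ndarray, instruments: np.ndarray) -> float:
    """F-statistic for joint significance of the instruments in a regression
    of schooling on a constant and the instruments."""
    n, k = instruments.shape
    design = np.column_stack([np.ones(n), instruments])
    coef, *_ = np.linalg.lstsq(design, schooling, rcond=None)
    resid = schooling - design @ coef
    rss_unrestricted = resid @ resid
    # Restricted model: constant only.
    rss_restricted = np.sum((schooling - schooling.mean()) ** 2)
    return ((rss_restricted - rss_unrestricted) / k) / (rss_unrestricted / (n - k - 1))


rng = np.random.default_rng(0)
n = 300_000
quarter = rng.integers(1, 5, size=n)  # quarter of birth, 1..4

# Hypothetical mean shifts by quarter: Q1 lowest, Q4 about 0.1 years higher.
quarter_shift = np.array([0.00, 0.05, 0.08, 0.10])
schooling = 12.5 + quarter_shift[quarter - 1] + rng.normal(0.0, 2.5, size=n)

# Three quarter-of-birth dummies (Q1-Q3), with Q4 as the omitted category.
dummies = np.column_stack([(quarter == q).astype(float) for q in (1, 2, 3)])
f_stat = first_stage_f(schooling, dummies)
print(f"First-stage F on three quarter dummies: {f_stat:.1f}")
```

The point of the exercise is that a difference of a tenth of a year can still produce a sizeable F-statistic when the sample runs to hundreds of thousands of observations.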

4.2 OLS and IV Returns

The OLS estimate of the return to schooling—simply regressing log earnings on years of schooling—yields a coefficient of approximately 0.07, implying a 7% wage premium per additional year of education. This estimate is almost certainly upward biased by ability sorting: more able individuals both complete more schooling and earn more for reasons unrelated to their education.
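The ability-bias argument can be illustrated with a small simulation. Everything here is assumed for illustration: a true return of 0.08, a single binary instrument with a first stage far stronger than quarter of birth, and homogeneous returns. It is a sketch of the textbook case, not a replication of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000
TRUE_RETURN = 0.08  # hypothetical causal return to a year of schooling

ability = rng.normal(size=n)                    # unobserved by the researcher
z = rng.integers(0, 2, size=n).astype(float)    # instrument, independent of ability
schooling = 12.0 + 0.5 * z + 1.0 * ability + rng.normal(size=n)
log_wage = 1.0 + TRUE_RETURN * schooling + 0.3 * ability + rng.normal(0.0, 0.5, size=n)


def cov(a: np.ndarray, b: np.ndarray) -> float:
    """Sample covariance of two equal-length vectors."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))


# OLS slope: picks up the ability term because ability moves both variables.
ols = cov(log_wage, schooling) / cov(schooling, schooling)
# IV (Wald) slope: uses only the variation in schooling induced by z.
iv = cov(log_wage, z) / cov(schooling, z)

print(f"true return {TRUE_RETURN:.3f}, OLS {ols:.3f}, IV {iv:.3f}")
```

With homogeneous returns and a valid instrument, OLS overshoots the true return while the IV ratio sits close to it, which is the pattern the ability-bias story predicts.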

The 2SLS estimate using quarter of birth as an instrument yields a coefficient of approximately 0.10—substantially higher than OLS, not lower. This result was surprising. If OLS is upward biased by ability, the IV estimate should be lower. Instead, it is higher.

The now-standard interpretation, formalised after the paper appeared, runs through the LATE theorem [Imbens and Angrist, 1994]: the instrument identifies the return to schooling for compliers, the men whose schooling was changed by the instrument. Here these are the men who stayed in school longer only because the compulsory attendance law prevented them from leaving. These compliers are likely individuals from disadvantaged backgrounds who were on the margin of dropping out. On this reading, the IV estimate of roughly 10% reflects the high returns to education for this marginal group.

4.3 Heterogeneity and Policy Relevance

An IV estimate above the OLS estimate also fits a model in which the returns to education are heterogeneous: high for those at the margin of dropout (compliers with the compulsory schooling law), lower on average for higher-ability individuals who would have obtained education regardless. This heterogeneity means the LATE recovered by the quarter-of-birth (QOB) instrument may not represent the policy-relevant average treatment effect for education policies targeting different populations.
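The gap between a LATE and a population average effect can be made concrete with a simulated example. The setup is deliberately simplified and entirely hypothetical: a binary "extra year of schooling", three latent types with assumed population shares, and assumed returns of 0.12 for compliers and 0.06 for everyone else.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Latent types (hypothetical shares): compliers take the extra year only
# when the law binds; always-takers and never-takers ignore the instrument.
types = rng.choice(["complier", "always", "never"], size=n, p=[0.2, 0.6, 0.2])
z = rng.integers(0, 2, size=n)  # 1 = compulsory schooling law binds

extra_year = np.where(types == "always", 1, np.where(types == "never", 0, z))

# Heterogeneous returns (assumed): higher for the marginal, complier group.
individual_return = np.where(types == "complier", 0.12, 0.06)
log_wage = 5.0 + individual_return * extra_year + rng.normal(0.0, 0.2, size=n)

# Wald estimator: reduced form divided by first stage.
reduced_form = log_wage[z == 1].mean() - log_wage[z == 0].mean()
first_stage = extra_year[z == 1].mean() - extra_year[z == 0].mean()
wald = reduced_form / first_stage

ate = individual_return.mean()  # average return across the whole population
print(f"Wald (LATE) {wald:.3f} vs population average return {ate:.3f}")
```

The Wald ratio lands near the complier return rather than the population average, because only compliers contribute to both the numerator and the denominator.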

5 The Subsequent Debate: Weak Instruments and Validity

The Angrist-Krueger paper ignited one of the most important methodological debates in applied econometrics.

5.1 Bound, Jaeger, and Baker (1995)

Bound et al. [1995] raised two serious concerns. First, when Angrist and Krueger extend the instrument set from 3 quarter-of-birth dummies to interactions of birth quarter with year of birth (yielding 30 instruments), the first stage becomes weak relative to the instrument count. Weak instruments can leave IV estimates severely biased toward OLS in finite samples. Bound, Jaeger and Baker showed that randomly generated instruments (with zero population correlation with schooling) produced IV estimates similar to those from QOB when the instrument set was similarly large, a damning critique of the many-instrument specifications.
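The random-instrument point is easy to reproduce in a simulation. The sketch below is not the Bound, Jaeger and Baker exercise itself: the data-generating process, the true return of 0.08, and the sample size are assumptions, and the 30 instruments are pure noise by construction.

```python
import numpy as np


def ols_slope(y: np.ndarray, s: np.ndarray) -> float:
    """OLS slope of y on s (with a constant, via demeaning)."""
    y_c, s_c = y - y.mean(), s - s.mean()
    return float((s_c @ y_c) / (s_c @ s_c))


def two_sls_slope(y: np.ndarray, s: np.ndarray, instruments: np.ndarray) -> float:
    """2SLS slope of y on s, instrumenting s (constant handled by demeaning)."""
    y_c, s_c = y - y.mean(), s - s.mean()
    z_c = instruments - instruments.mean(axis=0)
    coef, *_ = np.linalg.lstsq(z_c, s_c, rcond=None)
    s_hat = z_c @ coef  # first-stage fitted values
    return float((s_hat @ y_c) / (s_hat @ s_c))


rng = np.random.default_rng(3)
TRUE_RETURN = 0.08  # hypothetical
n, n_instruments, n_reps = 2_000, 30, 50

ols_estimates, tsls_estimates = [], []
for _ in range(n_reps):
    ability = rng.normal(size=n)
    schooling = 12.0 + ability + rng.normal(size=n)
    log_wage = 1.0 + TRUE_RETURN * schooling + 0.3 * ability + rng.normal(0.0, 0.5, size=n)
    junk = rng.normal(size=(n, n_instruments))  # unrelated to schooling
    ols_estimates.append(ols_slope(log_wage, schooling))
    tsls_estimates.append(two_sls_slope(log_wage, schooling, junk))

mean_ols = float(np.mean(ols_estimates))
mean_tsls = float(np.mean(tsls_estimates))
print(f"true {TRUE_RETURN:.2f}, mean OLS {mean_ols:.3f}, mean 2SLS {mean_tsls:.3f}")
```

With instruments that carry no information, the first-stage fitted values are just overfitted noise from schooling itself, so 2SLS inherits the OLS bias rather than removing it.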

Second, they raised concerns about the exclusion restriction: season of birth is correlated with maternal characteristics (older and more educated mothers are less likely to have winter births in some datasets), which would violate the assumption that birth quarter is uncorrelated with earnings potential.

5.2 Staiger and Stock (1997)

Staiger and Stock [1997] formalised the theory of weak instruments, showing that when the first-stage F-statistic is low, 2SLS is severely biased and conventional inference based on the normal approximation is unreliable. Their rule of thumb, $F \ge 10$ as a minimum threshold for reliable 2SLS inference, has become standard practice. Applied to the Angrist-Krueger setting, the large 30-instrument specification fails this threshold.
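A commonly quoted approximation helps explain why the first-stage F-statistic is the natural diagnostic. The notation is introduced here for illustration: $K$ is the number of instruments and $\mu^2$ the concentration parameter, which measures the strength of the first stage. The expression is a rough guide under simplifying assumptions, not an exact result.

$$
\frac{E[\hat{\beta}_{2SLS}] - \beta}{\operatorname{plim}\hat{\beta}_{OLS} - \beta} \;\approx\; \frac{1}{1 + \mu^2 / K}, \qquad E[F] \;\approx\; 1 + \frac{\mu^2}{K}
$$

Read together, the two expressions say that the bias of 2SLS relative to OLS is roughly $1/F$. A first-stage $F$ of 10 leaves about 10% of the OLS bias in the 2SLS estimate, while an $F$ of 2 leaves about half of it. Adding instruments that contribute little raises $K$ faster than $\mu^2$, which lowers $F$ and pushes 2SLS back toward OLS.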

5.3 The Defence and the Canonical Status

Angrist and Krueger acknowledged the weak instrument concerns and emphasised that their preferred specification, with three quarter-of-birth dummies, has a strong first stage. The paper's contribution, introducing the idea of using institutional timing rules as instruments, has proven enormously generative. The general strategy of exploiting birthday cutoffs in educational systems has been replicated in dozens of countries and contexts, often with cleaner first stages and more convincing exclusion restrictions than the original US setting.

6 Limitations and What We Learn

The Angrist-Krueger paper illustrates several durable lessons about IV:

(1) Exclusion restrictions are untestable and must be argued on economic grounds. Birth quarter's exclusion can be questioned (maternal health, school readiness differences by birth month), and no amount of data directly tests whether QOB affects earnings only through schooling.

(2) LATE, not ATE. The IV estimate recovers returns for compulsory-schooling compliers—marginal dropouts from disadvantaged backgrounds. This is a meaningful parameter, but it does not speak to the average return to schooling across the population.

(3) Weak instruments matter. Adding instruments in the hope of increasing first-stage power biases 2SLS toward OLS and invalidates conventional inference when the instruments are jointly weak. The 30-instrument specification is a cautionary tale that has shaped how applied economists think about instrument proliferation.

(4) Effect heterogeneity can reverse the naive prediction. The finding that IV exceeds OLS, while puzzling at first, is consistent with high returns for marginal students and heterogeneous returns more broadly. It previews the debates about treatment effect heterogeneity that would dominate the following two decades.

References

Angrist, J. D. and Krueger, A. B. (1991). Does compulsory school attendance affect schooling and earnings? Quarterly Journal of Economics, 106(4), 979-1014.

Bound, J., Jaeger, D. A., and Baker, R. M. (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association, 90(430), 443-450.

Card, D. (1995). Using geographic variation in college proximity to estimate the return to schooling. In L. N. Christofides, E. K. Grant, and R. Swidinsky (Eds.), Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp. University of Toronto Press, pp. 201-222.

Imbens, G. W. and Angrist, J. D. (1994). Identification and estimation of local average treatment effects. Econometrica, 62(2), 467-475.

Staiger, D. and Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica, 65(3), 557-586.

Angrist and Krueger (1991): Season of Birth, Compulsory Schooling, and Returns to Education
