Introduction
Causal inference is not one discipline but at least three, if you count frameworks. A researcher trained in the Rubin tradition reaches for potential outcomes notation, randomisation arguments, and the language of estimands. A researcher schooled in Pearl’s programme draws directed acyclic graphs (DAGs), invokes d-separation, and applies the do-calculus. A structural econometrician writes systems of simultaneous equations, worries about the rank condition, and draws on the Cowles Commission tradition. For most of the twentieth century, these communities talked past each other. Recent work, culminating in the synthesis of Imbens (2020), has clarified where the frameworks agree, where they genuinely differ, and which questions each is best suited to answer.
This article surveys all three, identifies their deep equivalences, and offers a practical guide to choosing a framework for your research question.
1 The Potential Outcomes Framework
The potential outcomes (PO) framework traces to Rubin (1974) and was popularised in statistics by Holland (1986). Its primitive is the potential outcome: Yᵢ(d) denotes what unit i's outcome would be if it received treatment d. The causal effect for unit i is simply Yᵢ(1) − Yᵢ(0)—a quantity that is never fully observed because each unit is observed in only one treatment state. This is the fundamental problem of causal inference (Holland, 1986).
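Aggregate causal parameters are defined as averages over the distribution of potential outcomes. The two most common are the average treatment effect and the average treatment effect on the treated:
ATE = E[Yᵢ(1) − Yᵢ(0)],   ATT = E[Yᵢ(1) − Yᵢ(0) | Dᵢ = 1].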
Identification requires assumptions that link the observable distribution P(Y,D,X) to the unobservable potential outcome distribution. The key identifying conditions are:
- SUTVA (Stable Unit Treatment Value Assumption): no interference between units, no hidden versions of treatment (Imbens and Rubin, 2015).
- Strong ignorability: (Y(0), Y(1)) ⟂ D | X (unconfoundedness) and 0 < P(D=1 | X) < 1 (overlap).
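To make the role of strong ignorability concrete, the following is a minimal simulation sketch, not a recommended estimator. It assumes a single binary covariate X, a constant unit-level effect, and made-up assignment probabilities; every number and name in it (TRUE_ATE, mean_outcome, the 0.7 and 0.2 propensities) is hypothetical. It compares a naive difference in means with a contrast that stratifies on X and averages over P(X).

```python
import random
from statistics import fmean

random.seed(0)
TRUE_ATE = 1.0  # assumed constant unit-level effect Y(1) - Y(0)

# Simulate (X, D, Y): X affects both treatment D and outcome Y.
rows = []
for _ in range(50_000):
    x = random.random() < 0.5                  # binary covariate
    d = random.random() < (0.7 if x else 0.2)  # overlap: 0 < P(D=1 | X) < 1
    y0 = 2.0 * x + random.gauss(0.0, 1.0)      # potential outcome under control
    y1 = y0 + TRUE_ATE                         # potential outcome under treatment
    rows.append((x, d, y1 if d else y0))       # only one potential outcome is observed


def mean_outcome(d, x=None):
    """Mean observed outcome among units with treatment d (and covariate x, if given)."""
    return fmean(y for xi, di, y in rows if di == d and (x is None or xi == x))


# The naive contrast ignores X and is biased by confounding.
naive = mean_outcome(True) - mean_outcome(False)

# Stratify on X, then average the within-stratum contrasts over P(X).
adjusted = 0.0
for x in (False, True):
    share = sum(1 for xi, _, _ in rows if xi == x) / len(rows)
    adjusted += share * (mean_outcome(True, x) - mean_outcome(False, x))

print(f"naive: {naive:.2f}  adjusted: {adjusted:.2f}  truth: {TRUE_ATE:.2f}")
```

Under these assumptions the stratified contrast should land close to the true effect, while the naive contrast should overstate it, because treated units are more likely to have X = 1.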
The PO framework’s strength is design transparency: it forces the researcher to specify a target estimand before touching the data, and connects randomisation to unconfoundedness via the Fisher randomisation argument. Its weakness is that it offers limited guidance on which variables to condition on—the framework is agnostic about the structural data-generating process.
2 Directed Acyclic Graphs and the Do-Calculus
Pearl (2009) developed a language for causal reasoning based on structural causal models (SCMs) and their graphical representation. A DAG G = (V,E) has nodes V (variables) and directed edges E (an edge X → Y reads "X causes Y directly"). The graph encodes conditional independence relations via the d-separation criterion: variables X and Y are d-separated by Z in G if and only if X ⟂ Y | Z holds in every distribution compatible with the graph.
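As an illustration of d-separation, here is a small sketch that tests it with the standard moralised-ancestral-graph construction: keep only the ancestors of the variables involved, connect parents that share a child, drop edge directions, delete the conditioning set, and check whether the two variables are still connected. The helper names (ancestors, d_separated) and the four-node example graph are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, x, y, given):
    """True if x and y are d-separated by the set `given` in the DAG `parents`."""
    keep = ancestors(parents, {x, y} | set(given))
    # Build the moral graph on the ancestral set: undirected parent-child
    # edges, plus an edge between every pair of parents sharing a child.
    adj = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(pa, 2):
            adj[a].add(b)
            adj[b].add(a)
    # Search from x to y without passing through the conditioning set.
    blocked, seen, stack = set(given), {x}, [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        for w in adj[v]:
            if w not in seen and w not in blocked:
                seen.add(w)
                stack.append(w)
    return True


# Hypothetical graph: W is a common cause of X and Y; C is a common effect.
dag = {"X": ["W"], "Y": ["W"], "C": ["X", "Y"]}  # child -> list of parents

print(d_separated(dag, "X", "Y", set()))       # fork through W is open
print(d_separated(dag, "X", "Y", {"W"}))       # conditioning on W blocks it
print(d_separated(dag, "X", "Y", {"W", "C"}))  # conditioning on collider C reopens a path
```

In this example graph, X and Y are d-separated only when W is in the conditioning set and C is not.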
The key tool is the do-operator: P(Y | do(X=x)) represents the distribution of Y when X is externally set to x (not just observed to equal x). The backdoor criterion gives a simple graphical condition for when conditioning on a set Z suffices to identify the causal effect of X on Y:
P(Y | do(X=x)) = Σ_z P(Y | X=x, Z=z) P(Z=z),
provided Z blocks all backdoor paths from X to Y and contains no descendant of X. When no such Z exists, the front-door criterion may still identify the effect via an observed mediator.
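The backdoor adjustment sum, Σ_z P(Y | X=x, Z=z) P(Z=z), can be checked by hand on a small discrete example. The sketch below assumes a three-variable binary model in which Z is the only common cause of X and Y; the probabilities are invented for illustration, and the helper names (prob, naive, backdoor) are hypothetical. It builds the joint distribution once, then computes both the observational contrast and the adjusted contrast from that joint alone.

```python
from itertools import product

# Assumed (hypothetical) structural probabilities: Z -> X, Z -> Y, X -> Y.
p_z = {0: 0.5, 1: 0.5}
p_x1_given_z = {0: 0.2, 1: 0.8}


def p_y1_given_xz(x, z):
    return 0.1 + 0.3 * x + 0.4 * z


# Observational joint distribution P(Z, X, Y).
joint = {}
for z, x, y in product((0, 1), repeat=3):
    px = p_x1_given_z[z] if x else 1 - p_x1_given_z[z]
    py = p_y1_given_xz(x, z) if y else 1 - p_y1_given_xz(x, z)
    joint[(z, x, y)] = p_z[z] * px * py


def prob(**fixed):
    """Probability that the named variables (z, x, y) take the given values."""
    total = 0.0
    for (z, x, y), p in joint.items():
        values = {"z": z, "x": x, "y": y}
        if all(values[name] == v for name, v in fixed.items()):
            total += p
    return total


def naive(x):
    """P(Y=1 | X=x): conditioning only, no adjustment."""
    return prob(x=x, y=1) / prob(x=x)


def backdoor(x):
    """Sum over z of P(Y=1 | X=x, Z=z) * P(Z=z)."""
    return sum(prob(z=z, x=x, y=1) / prob(z=z, x=x) * prob(z=z) for z in (0, 1))


print("naive contrast:   ", round(naive(1) - naive(0), 3))
print("backdoor contrast:", round(backdoor(1) - backdoor(0), 3))
```

In this model the effect of setting X from 0 to 1 is 0.3 by construction. The backdoor contrast recovers it, while the naive contrast is 0.54 because X = 1 is observed mostly when Z = 1.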
The DAG framework’s strength is assumption transparency: by drawing the graph, the researcher commits to specific causal claims (which variables are connected, what the direction of causality is) that can be scrutinised by subject-matter experts. It also handles collider bias naturally, a common pitfall that is easy to miss when reasoning in PO notation alone.
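Collider bias is easy to reproduce in a simulation. The sketch below assumes D and Y are independent standard normal variables with no causal link, and that a third variable C is caused by both; selecting on C then induces an association. The data-generating numbers and the helper name corr are hypothetical.

```python
import random
from statistics import fmean


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = fmean(a), fmean(b)
    cov = fmean((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = fmean((ai - ma) ** 2 for ai in a)
    var_b = fmean((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(3)
sample = []
for _ in range(50_000):
    d = rng.gauss(0.0, 1.0)          # "treatment", independent of y
    y = rng.gauss(0.0, 1.0)          # "outcome", not caused by d
    c = d + y + rng.gauss(0.0, 1.0)  # collider: D -> C <- Y
    sample.append((d, y, c))

everyone = corr([d for d, _, _ in sample], [y for _, y, _ in sample])

# Conditioning on the collider, here by keeping only units with C > 0.
kept = [(d, y) for d, y, c in sample if c > 0]
among_selected = corr([d for d, _ in kept], [y for _, y in kept])

print(f"full sample: {everyone:.3f}  selected on C > 0: {among_selected:.3f}")
```

The full-sample correlation should be near zero, while the selected subsample should show a clearly negative correlation even though D has no effect on Y.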
3 Structural Equation Models
The structural econometrics tradition originates with Haavelmo’s (1944) probability approach and was institutionalised by the Cowles Commission. A structural equation model (SEM) specifies a system of equations describing the data-generating process:
Yⱼ = fⱼ(Yₚₐ₍ⱼ₎, Uⱼ),   j = 1, …, J,
where Yₚₐ₍ⱼ₎ are the direct causes of Yⱼ and Uⱼ is an unobserved disturbance. Identification requires restrictions: exclusion restrictions (some variables appear in some equations but not others) and distributional assumptions on U.
Heckman (1997) championed the SEM approach precisely because it supports policy counterfactuals: once the structural parameters are identified, one can simulate the effect of any hypothetical policy that changes the parameters of specific equations. By contrast, reduced-form estimates from the PO tradition identify only the effect of the specific policy variation embodied in the natural experiment.
The framework’s weakness is fragility: misspecification of any equation contaminates the identified parameter, and the Lucas critique (Lucas, 1976) warns that structural parameters estimated under one policy regime may not be stable under a different one.
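The idea of a policy counterfactual can be sketched as replacing one structural equation while leaving the others and the disturbances unchanged. The toy model below is entirely hypothetical: a schooling equation and a wage equation with invented parameters, taken as already identified. The counterfactual policy, a minimum of 12 years of schooling, never appears in the baseline data, which is the kind of extrapolation the structural approach is meant to support; its validity rests on the wage equation staying invariant under the policy.

```python
import random
from statistics import fmean

# Hypothetical structural parameters, treated here as already identified.
WAGE_INTERCEPT = 1.0
RETURN_TO_SCHOOLING = 0.5


def simulate(schooling_eq, n=20_000, seed=0):
    """Mean wage when schooling is generated by `schooling_eq`.

    Reusing the seed holds the disturbances (u_s, u_y) fixed across policies,
    so differences in the result come only from the changed equation.
    """
    rng = random.Random(seed)
    wages = []
    for _ in range(n):
        u_s, u_y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        s = schooling_eq(u_s)                                   # schooling equation
        wages.append(WAGE_INTERCEPT + RETURN_TO_SCHOOLING * s + u_y)  # wage equation
    return fmean(wages)


def baseline(u_s):
    return 11.0 + u_s


def mandate(u_s):
    # Counterfactual policy: schooling cannot fall below 12 years.
    return max(12.0, 11.0 + u_s)


effect = simulate(mandate) - simulate(baseline)
print(f"simulated effect of the mandate on mean wages: {effect:.3f}")
```

If the wage equation itself shifted under the mandate, as the Lucas critique warns, the simulated effect would no longer be trustworthy.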
4 Equivalences and Bridges
The three frameworks are more similar than their proponents often acknowledge. Imbens (2020) provides the definitive comparison.
- PO and DAGs. A DAG with independent errors (Uⱼ ⟂ Uₖ for j ≠ k) implies a set of conditional independence restrictions via d-separation. These are exactly the independence restrictions embedded in the potential outcome model under the corresponding no-confounding assumptions. Pearl’s do-calculus and Rubin’s PO framework produce the same identified estimand whenever the unconfoundedness assumption holds (Pearl, 2009). The main difference is language and emphasis: PO researchers tend to focus on design (how was treatment assigned?), while DAG researchers focus on structure (what is the data-generating mechanism?).
- DAGs and SEMs. A recursive (acyclic) structural equation model is a DAG with specific functional forms for each equation. The d-separation criterion applied to the corresponding DAG gives the same conditional independence implications as the model’s reduced form. Peters et al. (2017) provide a comprehensive treatment of this equivalence in their textbook on causal inference.
- IV in all three frameworks. Consider an instrument Z for treatment D on outcome Y. In the PO framework, with potential outcomes indexed as Yᵢ(d, z), Z is valid if Z ⟂ (Y(0,0), Y(0,1), Y(1,0), Y(1,1)) (independence) and the exclusion restriction Yᵢ(d, z) = Yᵢ(d) holds. In the DAG, Z is valid if there is no direct edge Z → Y, Z and Y share no unblocked common cause, and Z is not a descendant of D or Y. In the SEM, Z enters the D-equation but is excluded from the Y-equation and is independent of its disturbance. In all three, Z must also affect D (relevance). Under the corresponding assumptions, these conditions are formally equivalent (Angrist et al., 1996).
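The instrumental-variable conditions in the list above can be illustrated with a linear simulation. The sketch assumes a constant structural effect BETA, an unobserved confounder U that enters both the D-equation and the Y-equation, and an instrument Z that enters only the D-equation; all coefficients and the helper name cov are hypothetical. It compares the OLS slope with the simple IV ratio cov(Z, Y) / cov(Z, D).

```python
import random
from statistics import fmean


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = fmean(a), fmean(b)
    return fmean((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


rng = random.Random(1)
BETA = 2.0  # assumed structural effect of D on Y

z, d, y = [], [], []
for _ in range(50_000):
    zi = rng.gauss(0.0, 1.0)                         # instrument
    ui = rng.gauss(0.0, 1.0)                         # unobserved confounder
    di = 0.8 * zi + ui + rng.gauss(0.0, 1.0)         # D-equation: Z enters
    yi = BETA * di + 3.0 * ui + rng.gauss(0.0, 1.0)  # Y-equation: Z excluded
    z.append(zi)
    d.append(di)
    y.append(yi)

ols = cov(d, y) / cov(d, d)  # biased, because U drives both D and Y
iv = cov(z, y) / cov(z, d)   # uses only the variation in D induced by Z

print(f"OLS: {ols:.2f}  IV: {iv:.2f}  structural effect: {BETA:.2f}")
```

With one instrument and one treatment, this ratio coincides with two-stage least squares. Under the stated assumptions it should be close to BETA, while OLS should be noticeably larger.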
5 Where the Frameworks Diverge
Despite deep equivalences, the frameworks give genuinely different guidance in three settings.
- Mediation. Defining direct and indirect effects requires cross-world potential outcomes: Yᵢ(1, Mᵢ(0)), the outcome under D = 1 with the mediator set to its value under D = 0. These quantities exist naturally in the DAG/SEM world but are deeply problematic in the PO world. Robins (2003) showed that natural direct and indirect effects are generally non-identified without untestable cross-world independence assumptions.
- Treatment effect heterogeneity. The PO LATE theorem (Angrist et al., 1996) identifies the average effect for compliers, the subpopulation induced to take treatment by the instrument. In the SEM, the same IV moment identifies a different weighted average of structural parameters. The two need not coincide when effects are heterogeneous and the model allows complex selection (Heckman, 1997). The marginal treatment effect (MTE) framework of Heckman and Vytlacil (2005) provides a bridge: the MTE is the derivative of the conditional mean outcome E[Y | P(Z) = p] with respect to the propensity score p, and LATE equals the average MTE over the complier range.
- Handling cycles. Standard DAGs are acyclic by definition. Economic models often have simultaneous causation (supply and demand). The SEM handles this via simultaneous equations and the rank condition. For DAGs, ancestral graph Markov models (Richardson and Spirtes, 2002) allow for bidirected edges representing latent common causes. This is a partial solution: structural cycles remain outside the DAG framework.
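The heterogeneity point in the list above can be made concrete with a small simulation. The sketch assumes a binary instrument, three compliance types with invented shares and type-specific effects, and no defiers (monotonicity); the names SHARE, EFFECT and mean_by_z are hypothetical. It computes the Wald ratio and compares it with the complier effect and the population average effect.

```python
import random
from statistics import fmean

rng = random.Random(2)

# Hypothetical compliance types, population shares, and treatment effects.
SHARE = {"complier": 0.5, "always": 0.2, "never": 0.3}
EFFECT = {"complier": 1.0, "always": 3.0, "never": 0.0}

data = []
for _ in range(100_000):
    kind = rng.choices(list(SHARE), weights=list(SHARE.values()))[0]
    z = rng.random() < 0.5                                  # randomised instrument
    d = kind == "always" or (kind == "complier" and z)      # take-up; no defiers
    y = rng.gauss(0.0, 1.0) + (EFFECT[kind] if d else 0.0)  # observed outcome
    data.append((z, d, y))


def mean_by_z(index, z):
    """Mean of column `index` (1 = D, 2 = Y) among units with instrument value z."""
    return fmean(row[index] for row in data if row[0] == z)


# Wald estimator: reduced-form contrast divided by first-stage contrast.
wald = (mean_by_z(2, True) - mean_by_z(2, False)) / (
    mean_by_z(1, True) - mean_by_z(1, False)
)
ate = sum(SHARE[k] * EFFECT[k] for k in SHARE)

print(f"Wald: {wald:.2f}  complier effect: {EFFECT['complier']:.2f}  ATE: {ate:.2f}")
```

Under these assumptions the Wald ratio should track the complier effect rather than the population average, which is 1.1 here because always-takers have a larger effect.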
6 A Practical Guide
How should an applied researcher choose a framework?
- Use PO when the focus is on a specific, well-defined treatment with a plausible identification strategy (RCT, IV, DiD, RD). The estimand-first approach forces clarity before modelling decisions are made.
- Use DAGs when the confounding structure is complex and you need to reason about valid controls. DAGs are especially valuable for identifying bad controls, mediators, and colliders—mistakes that PO reasoning alone may not catch.
- Use SEMs when the policy question requires out-of-sample extrapolation, general equilibrium reasoning, or structural parameters invariant to interventions. The framework is also natural when the data-generating process is well understood from theory.
In practice, the most productive approach combines all three: specify the target estimand in PO notation; draw a DAG to determine valid controls; and write a structural equation if extrapolation is required.
7 Conclusion
The three frameworks are different languages for the same underlying reality: the causal structure of the world. The PO framework excels at clarity of estimands. The DAG framework excels at encoding assumptions transparently. The SEM excels at supporting structural policy analysis. Researchers fluent in all three are better equipped to choose the right tool for each scientific problem.
References
- Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434):444-455.
- Haavelmo, T. (1944). The probability approach in econometrics. Econometrica, 12(Supplement):1-118.
- Heckman, J. J. (1997). Instrumental variables: A study of implicit behavioral assumptions used in making program evaluations. Journal of Human Resources, 32(3):441-462.
- Heckman, J. J. and Vytlacil, E. (2005). Structural equations, treatment effects, and econometric policy evaluation. Econometrica, 73(3):669-738.
- Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396):945-960.
- Imbens, G. W. (2020). Potential outcome and directed acyclic graph approaches to causality: Relevance for empirical practice in economics. Journal of Economic Literature, 58(4):1129-1179.
- Imbens, G. W. and Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press.
- Lucas, R. E. (1976). Econometric policy evaluation: A critique. Carnegie-Rochester Conference Series on Public Policy, 1:19-46.
- Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed. Cambridge University Press.
- Peters, J., Janzing, D., and Schölkopf, B. (2017). Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press.
- Richardson, T. and Spirtes, P. (2002). Ancestral graph Markov models. Annals of Statistics, 30(4):962-1030.
- Robins, J. M. (2003). Semantics of causal DAG models and the identification of direct and indirect effects. In Highly Structured Stochastic Systems, pages 70-81. Oxford University Press.
- Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688-701.