The Causal Review

1 Introduction

The Stable Unit Treatment Value Assumption (SUTVA) is so foundational to the potential outcomes framework that it is easy to forget it is an assumption at all. Most introductory treatments state it in passing and move on. Yet for a growing share of the settings researchers actually care about (social programmes with peer effects, platform experiments with network externalities, public health interventions with herd immunity, place-based policies with agglomeration effects), SUTVA is violated by design. The units interact. What happens to your neighbour shapes what happens to you, regardless of your own treatment status.

This article examines what SUTVA violation means for identification, surveys the frameworks that have been developed to handle interference, and reviews the practical strategies researchers use when they cannot assume units are isolated from one another. The econometrics of interference is no longer a speciality corner; it is increasingly central to credible causal inference.

2 SUTVA and Why It Can Fail

Rubin [1980] formalised SUTVA as requiring two things: (1) no multiple versions of treatment (the treatment received by unit i is well-defined and does not vary with who administers it), and (2) no interference between units (the potential outcome of unit i depends only on i's own treatment status, not on the treatment status of any other unit j ≠ i). The second condition, the no-interference clause, is the focus of this article.

Under no interference, we write unit i's potential outcomes as Y_i(0) and Y_i(1), indexed solely by i's own treatment D_i ∈ {0, 1}. The observed outcome is Y_i = Y_i(D_i). This is clean and tractable.

When units interact, the potential outcomes must instead be indexed by the entire treatment vector D = (D_1, ..., D_n). Unit i's outcome is Y_i(d) for any assignment vector d ∈ {0, 1}^n. With n units, there are 2^n possible assignment vectors, a combinatorial explosion that makes non-parametric identification essentially impossible at any useful sample size [Hudgens and Halloran, 2008].
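To make the combinatorial explosion concrete, the following minimal Python sketch counts how many potential outcomes each unit carries with and without the no-interference restriction. The function names are illustrative only and do not come from any of the cited papers.

```python
from itertools import product


def n_potential_outcomes(n_units: int, interference: bool) -> int:
    """Number of potential outcomes per unit under a binary treatment."""
    # Without interference only own treatment matters: Y_i(0) and Y_i(1).
    # With unrestricted interference every vector d in {0, 1}^n matters.
    return 2 ** n_units if interference else 2


def assignment_vectors(n_units: int):
    """Enumerate every assignment vector d in {0, 1}^n (small n only)."""
    return list(product((0, 1), repeat=n_units))


if __name__ == "__main__":
    for n in (3, 10, 30):
        print(n, n_potential_outcomes(n, False), n_potential_outcomes(n, True))
```

Even at n = 30 each unit already has more than a billion potential outcomes, of which exactly one is observed.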

The practical question is therefore not whether SUTVA holds in an absolute sense, but whether the interference is structured enough that we can still learn something useful. Several frameworks impose tractable structure.

3 The Partial Interference Framework

Hudgens and Halloran [2008] proposed the partial interference assumption: units are partitioned into clusters (households, villages, classrooms) such that interference occurs freely within clusters but not across clusters. Within cluster g, unit i's potential outcome is Y_ig(d_g), depending on the full treatment vector of cluster g but not on the treatment of units in other clusters.

This is enormously useful for vaccine trials, deworming programmes, and conditional cash transfer evaluations, where a household is affected by what happens to its neighbours but not by what happens in a distant village. Under partial interference, Hudgens and Halloran [2008] define four estimands:

Direct effect: the effect of treating unit i, holding fixed the cluster treatment allocation α (the proportion treated in the cluster).
Indirect (spillover) effect: the effect on unit i of changing the cluster allocation from α to α', holding i's own treatment fixed.
Total effect: the combined direct and indirect effect, comparing a treated unit under allocation α with an untreated unit under α'.
Overall effect: the population-average effect of the policy α relative to an alternative allocation α', such as no treatment.

A two-level randomised experiment (first randomise clusters into treatment arms, then randomise individuals within treated clusters) separately identifies all four estimands.
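The two-level design can be sketched in a small simulation. Everything numerical here is an assumption made for illustration: the comparison allocation is α' = 0 (pure control clusters), clusters are equal-sized, and outcomes follow a hypothetical linear model with a direct effect of 2 and a spillover of 3 times the cluster's treated share. The estimators are simple differences in means, which is a simplification rather than the exact estimators of Hudgens and Halloran [2008].

```python
import numpy as np


def simulate_two_stage(n_clusters=400, m=20, alpha=0.5, seed=0):
    """Two-stage randomisation with a hypothetical linear outcome model."""
    rng = np.random.default_rng(seed)
    # Stage 1: exactly half of the clusters receive allocation alpha,
    # the remainder receive allocation 0 (no one treated).
    treated_cluster = rng.permutation(n_clusters) < n_clusters // 2
    d = np.zeros((n_clusters, m), dtype=int)
    k = int(round(alpha * m))
    # Stage 2: within each treated cluster, treat exactly k units at random.
    for g in np.flatnonzero(treated_cluster):
        d[g, rng.choice(m, size=k, replace=False)] = 1
    share = d.mean(axis=1, keepdims=True)  # realised treated share per cluster
    # Assumed model: direct effect 2, spillover 3 * treated share, unit noise.
    y = 1.0 + 2.0 * d + 3.0 * share + rng.normal(size=(n_clusters, m))
    return treated_cluster, d, y


def estimate_effects(treated_cluster, d, y):
    """Difference-in-means estimates of the four estimands (alpha vs 0)."""
    in_alpha = np.repeat(treated_cluster[:, None], d.shape[1], axis=1)
    y1_alpha = y[in_alpha & (d == 1)].mean()  # treated units, allocation alpha
    y0_alpha = y[in_alpha & (d == 0)].mean()  # untreated units, allocation alpha
    y0_ctrl = y[~in_alpha].mean()             # units in pure control clusters
    y_alpha = y[in_alpha].mean()              # all units under allocation alpha
    return {
        "direct": y1_alpha - y0_alpha,
        "indirect": y0_alpha - y0_ctrl,
        "total": y1_alpha - y0_ctrl,
        "overall": y_alpha - y0_ctrl,
    }


if __name__ == "__main__":
    print(estimate_effects(*simulate_two_stage()))
```

Under the assumed model the targets are 2 (direct), 1.5 (indirect), 3.5 (total) and 2.5 (overall), and the total estimate equals the sum of the direct and indirect estimates by construction.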

A Worked Example: Deworming in Kenya

A canonical application is the Miguel and Kremer [2004] deworming study in Kenya, later reanalysed extensively for spillover effects. Deworming reduces worm burden in treated children but also, by reducing transmission, in untreated children in the same school. A naive intention-to-treat estimate that ignores spillovers underestimates the total social benefit of the programme. Bjorkman and Svensson [2009] study a related community-level design: a randomised field experiment on community-based monitoring in Uganda.

4 Exposure Mappings

Aronow and Samii [2017] generalised partial interference by introducing exposure mappings. Rather than requiring an arbitrary cluster partition, an exposure mapping f_i(D) summarises the features of the treatment assignment that are relevant for unit i's outcome. For instance, in a social network experiment, f_i(D) might be the fraction of i's friends who are treated:

$$f_i(\mathbf{D}) = \frac{1}{|N_i|} \sum_{j \in N_i} D_j \tag{1}$$

where N_i is the set of i's network neighbours. Once an exposure mapping is specified, the potential outcomes are written Y_i(d, f) for own treatment d and exposure level f. Identification proceeds via inverse probability weighting under a known treatment assignment mechanism [Aronow and Samii, 2017].
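As a minimal sketch, the exposure mapping in equation (1) can be computed from an adjacency matrix as follows. The function name and the convention of assigning zero exposure to units with no neighbours are assumptions made for this example.

```python
import numpy as np


def fraction_treated_neighbours(adj, d):
    """f_i(D): share of unit i's network neighbours who are treated.

    adj is an n x n 0/1 adjacency matrix with no self-loops and d is the
    0/1 treatment vector. Units with no neighbours get exposure 0, which
    is a convention chosen for this sketch.
    """
    adj = np.asarray(adj, dtype=float)
    d = np.asarray(d, dtype=float)
    degree = adj.sum(axis=1)   # |N_i|
    treated_nbrs = adj @ d     # sum of D_j over j in N_i
    return np.divide(
        treated_nbrs,
        degree,
        out=np.zeros_like(treated_nbrs),
        where=degree > 0,
    )


if __name__ == "__main__":
    # Path network 0 - 1 - 2 - 3 with the two end units treated.
    adj = np.array([
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ])
    print(fraction_treated_neighbours(adj, [1, 0, 0, 1]))
```

In this four-unit path the two interior units each have one treated neighbour out of two, so their exposure is 0.5, while the two end units have exposure 0.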

The power of this approach is its flexibility: the researcher specifies, and hence makes transparent, exactly which channel of interference is being modelled. The limitation is equally clear: if the exposure mapping is misspecified, the estimand is not well defined.

5 Approximate Neighbourhood Interference

Leung [2022] proposed a different approach suited to large networks. Rather than specifying an exposure mapping, the approach assumes that interference decays with network distance: units far apart in the network have negligible effect on one another. Formally, unit i's potential outcome Y_i(D) depends non-trivially on the treatment of units within K steps of i in the network, but the influence of units at distance > K is approximately zero. As K grows, more interference is accommodated; as the network becomes sparse, the approximation becomes sharper.

Under approximate neighbourhood interference, central limit theorems hold for appropriately constructed estimators, enabling inference even without a clean cluster structure [Leung, 2022]. This is particularly relevant for observational social network data where the analyst has no control over the assignment mechanism.
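The notion of "units within K steps of i" can be made concrete with a breadth-first search. This is a minimal sketch of the K-neighbourhood only, not of the estimator or variance calculations in Leung [2022]; the adjacency-list representation and function name are assumptions.

```python
from collections import deque


def k_neighbourhood(adj_list, i, k):
    """Return the set of units within k steps of unit i (including i)."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond distance k
        for v in adj_list[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


if __name__ == "__main__":
    # Path network 0 - 1 - 2 - 3 - 4.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(k_neighbourhood(path, 0, 2))
```

Under approximate neighbourhood interference, treatments assigned outside this set are assumed to have only a negligible influence on unit i's outcome.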

6 Experimental Designs Under Interference

When the researcher has control over randomisation, several designs are better suited to settings with interference than a simple Bernoulli trial.

Clustered Randomisation

The most common approach is to randomise treatment at a higher level of aggregation than the unit of analysis: randomise villages, not households. If interference is confined within villages (partial interference), this eliminates cross-cluster contamination. The cost is a reduction in effective sample size and hence statistical power.
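The size of the power cost can be approximated with the standard design-effect formula, 1 + (m − 1)ρ, where m is the cluster size and ρ is the intra-cluster correlation of outcomes. Neither symbol appears in the text above; both are introduced here for illustration, and the sketch assumes equal-sized clusters.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from cluster randomisation (equal-sized clusters)."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_units: int, cluster_size: int, icc: float) -> float:
    """Number of independent units carrying the same information."""
    return n_units / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical study: 50 villages of 20 households, assumed ICC of 0.05.
    print(round(effective_sample_size(1000, 20, 0.05), 1))
```

With these assumed values, 1,000 households randomised by village carry roughly the information of 513 independently randomised households.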

Bipartite Experiments

In platform settings (e.g., two-sided markets), interference runs between buyers and sellers rather than among buyers. Johari et al. [2022] formalised bipartite experiments: randomise on one side of the market and measure outcomes on the other, using a design that limits interference across the experimental boundary. Large technology companies, including LinkedIn, have reported using variants of this approach.

The Bernoulli Design and Horvitz-Thompson Estimation

When the full treatment vector D is randomised as independent Bernoullis with known probability p, the Horvitz-Thompson estimator:

$$\hat{\tau} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{D_i Y_i}{p} - \frac{(1 - D_i) Y_i}{1 - p} \right] \tag{2}$$

remains unbiased for the average treatment effect if no interference is present. Under exposure mapping interference, a modified Horvitz-Thompson estimator:

$$\hat{\tau}(d, f) = \frac{1}{n} \sum_{i=1}^{n} \frac{\mathbf{1}[D_i = d, f_i(\mathbf{D}) = f]}{\Pr(D_i = d, f_i(\mathbf{D}) = f)} Y_i \tag{3}$$

identifies the mean potential outcome at exposure (d, f) [Aronow and Samii, 2017]. Variance estimation is, however, challenging because outcomes of units that share network neighbours are correlated.
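A minimal sketch of equation (3) under the Bernoulli design follows, using the fraction-of-treated-neighbours mapping from equation (1). It assumes a network with no self-loops, so that own treatment and neighbours' treatments are independent and the exposure probability factorises into a Bernoulli term and a binomial term. The ring network, the function names and the outcome model in the demonstration are all hypothetical.

```python
from math import comb

import numpy as np


def ring_adjacency(n):
    """Adjacency matrix of a ring in which each unit has two neighbours."""
    adj = np.zeros((n, n), dtype=int)
    idx = np.arange(n)
    adj[idx, (idx + 1) % n] = 1
    adj[idx, (idx - 1) % n] = 1
    return adj


def exposure_prob(degree, d, f, p):
    """Pr(D_i = d, f_i(D) = f) under independent Bernoulli(p) assignment."""
    k = f * degree  # number of treated neighbours implied by f
    if degree == 0 or abs(k - round(k)) > 1e-9:
        return 0.0  # this exposure level is unattainable for the unit
    k = int(round(k))
    own = p if d == 1 else 1 - p
    return own * comb(degree, k) * p ** k * (1 - p) ** (degree - k)


def ht_mean(y, d_obs, adj, d, f, p):
    """Horvitz-Thompson estimate of the mean potential outcome at (d, f)."""
    adj = np.asarray(adj)
    degree = adj.sum(axis=1)
    f_obs = (adj @ d_obs) / np.maximum(degree, 1)
    probs = np.array([exposure_prob(int(n_i), d, f, p) for n_i in degree])
    if np.any(probs == 0):
        raise ValueError("some units can never receive this exposure")
    hit = (d_obs == d) & np.isclose(f_obs, f)
    return np.mean(hit * y / probs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 0.5
    adj = ring_adjacency(n)
    d_obs = rng.binomial(1, p, size=n)
    f_obs = (adj @ d_obs) / 2
    # Hypothetical outcome model used only for this demonstration.
    y = 1.0 + 2.0 * d_obs + 3.0 * f_obs + rng.normal(size=n)
    print(ht_mean(y, d_obs, adj, d=1, f=0.5, p=p))
```

The sketch raises an error when some unit has zero probability of receiving the requested exposure, because the estimator is then undefined for that unit. In irregular networks this is one reason exposure levels are often coarsened into a few categories.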

7 Difference-in-Differences with Spillovers

Interference creates particular difficulties for DiD designs, which typically assume that the control group provides a valid counterfactual for the treated group. If a policy in region A generates spillovers to neighbouring region B, using B as a control group will produce a biased estimate of the direct policy effect. The spillover contaminates the control.

This is not merely a theoretical concern. Autor [2003], for instance, uses state-level variation in unjust dismissal doctrine to study the growth of employment outsourcing; in designs of this kind, effects that cross state borders would contaminate the comparison states. Goldsmith-Pinkham et al. [2020] discuss how spillovers across geographic units can invalidate shift-share instruments that treat geographic variation as independent.

A practical remedy is to exclude "border" control units (those most likely to be affected by spillovers from treated units) from the control group. Alternatively, researchers can construct a spillover treatment variable (e.g., fraction of neighbouring counties that received treatment) and include it explicitly in the regression, estimating a joint model of direct and spillover effects.
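The second remedy can be sketched with a simulated two-period panel. All details here are assumptions made for illustration: units sit on a ring, treatment is assigned at random, spillovers reach untreated units only, and the outcome model is linear with a direct effect of 2 and a spillover coefficient of 1. The regression is run on first differences, which removes unit fixed effects.

```python
import numpy as np


def simulate_did(n=20000, p=0.3, tau=2.0, gamma=1.0, seed=1):
    """Two-period panel on a ring with spillovers onto untreated units."""
    rng = np.random.default_rng(seed)
    d = rng.binomial(1, p, size=n)
    # Fraction of the two ring neighbours that are treated.
    frac = (np.roll(d, 1) + np.roll(d, -1)) / 2
    s = (1 - d) * frac  # spillover exposure, defined for controls only
    unit_fe = rng.normal(size=n)
    y_pre = unit_fe + rng.normal(size=n)
    y_post = unit_fe + 0.5 + tau * d + gamma * s + rng.normal(size=n)
    return d, s, y_post - y_pre


def naive_did(d, dy):
    """Treated-minus-control difference in outcome changes."""
    return dy[d == 1].mean() - dy[d == 0].mean()


def fit_joint(d, s, dy):
    """OLS of the outcome change on own treatment and spillover exposure."""
    X = np.column_stack([np.ones_like(dy), d, s])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    return {"trend": coef[0], "direct": coef[1], "spillover": coef[2]}


if __name__ == "__main__":
    d, s, dy = simulate_did()
    print("naive DiD:", naive_did(d, dy))
    print("joint model:", fit_joint(d, s, dy))
```

Because the controls absorb a positive spillover in this simulation, the naive contrast understates the direct effect (its target is about 1.7 rather than 2), while the joint model targets both the direct effect and the spillover coefficient.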

8 A Diagram of Interference Structures

Figure 1 illustrates three structures of interference: no interference (SUTVA), partial interference (cluster-contained), and general network interference.

Figure 1: Three interference structures. Dashed arrows indicate spillover pathways. Panel (A): SUTVA holds, with no cross-unit influence. Panel (B): Partial interference, within clusters only. Panel (C): Network interference, with arbitrary spillover patterns.

9 Current Debates and Frontiers

What is the right estimand under interference?

When units interact, the "average treatment effect" ceases to be well defined without specifying the counterfactual treatment allocation. Manski [2013] showed that with social interactions, the relevant policy comparison is between full-population treatment policies, not individual-level treatment assignments. This raises deep questions about what applied researchers should report.

Equilibrium effects.

Large-scale policies that change prices or alter labour market equilibria produce spillovers that no finite sample of clusters can circumvent. The general equilibrium critique (that evaluation in a small pilot cannot speak to the effects of full-scale rollout) applies whenever treatment operates through market-clearing mechanisms [Heckman, 1998]. Structural modelling is one response; Acemoglu et al. [2022] provide a recent treatment of how to identify equilibrium effects in production networks.

Estimating spillovers in observational data.

Most of the identification results above assume a known, researcher-controlled assignment mechanism. In observational settings, identifying spillover effects requires an instrument that generates variation in neighbourhood treatment independent of own treatment, a demanding requirement that few natural experiments satisfy cleanly.

10 Implications for Practice

Applied researchers should take the following steps when spillovers are plausible:

State the interference assumption explicitly. SUTVA, partial interference, and exposure mapping are all defensible in different settings; make clear which one you are invoking.
Design around interference. When randomising, use clusters large enough that cross-cluster spillovers are negligible, and pre-specify the cluster level.
Report both direct and indirect effects. A direct effect estimate alone misses social multipliers that may be the dominant channel of policy impact.
Conduct spillover robustness checks. Estimate the effect on "near-control" units separately from "far-control" units; the difference is informative about contamination.
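The last of these checks can be sketched as a simple contrast between near and far controls. The distance measure, the cutoff and the function name are assumptions chosen for illustration, and the standard error treats units as independent, which is only a rough guide when outcomes are correlated across neighbours.

```python
import numpy as np


def near_far_contrast(y, d, dist_to_treated, cutoff):
    """Mean outcome of near controls minus far controls, with a rough SE.

    dist_to_treated holds each unit's distance to its nearest treated unit;
    controls at or below the cutoff count as "near".
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d)
    dist = np.asarray(dist_to_treated, dtype=float)
    near = (d == 0) & (dist <= cutoff)
    far = (d == 0) & (dist > cutoff)
    if near.sum() < 2 or far.sum() < 2:
        raise ValueError("need at least two near and two far control units")
    diff = y[near].mean() - y[far].mean()
    se = np.sqrt(
        y[near].var(ddof=1) / near.sum() + y[far].var(ddof=1) / far.sum()
    )
    return diff, se


if __name__ == "__main__":
    # Toy data: one treated unit, two near controls, two far controls.
    y = [5.0, 3.0, 4.0, 1.0, 2.0]
    d = [1, 0, 0, 0, 0]
    dist = [0, 1, 1, 5, 6]
    print(near_far_contrast(y, d, dist, cutoff=2))
```

A contrast that is large relative to its standard error suggests that near controls are contaminated and should either be dropped or modelled explicitly.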

11 Conclusion

SUTVA is an assumption, not a law of nature. For a large class of economically interesting interventions (public health, information diffusion, labour market programmes, platform experiments), interference is the norm rather than the exception. The frameworks developed by Hudgens and Halloran [2008], Aronow and Samii [2017], and Leung [2022] show that credible causal inference is still possible when units interact, but it requires more careful experimental design, more transparent specification of the interference structure, and more thoughtful definition of the estimand. As datasets increasingly capture network structure and as policy interventions operate at scale, the econometrics of interference will only grow in importance.

References

Acemoglu, D., Akcigit, U., and Kerr, W. R. (2022). Networks and the macroeconomy: an empirical exploration. NBER Macroeconomics Annual, 30:276-335.
Aronow, P. M. and Samii, C. (2017). Estimating average causal effects under general interference, with application to a social network experiment. Annals of Applied Statistics, 11(4):1912-1947.
Autor, D. H. (2003). Outsourcing at will: the contribution of unjust dismissal doctrine to the growth of employment outsourcing. Journal of Labor Economics, 21(1):1-42.
Bjorkman, M. and Svensson, J. (2009). Power to the people: evidence from a randomized field experiment on community-based monitoring in Uganda. Quarterly Journal of Economics, 124(2):735-769.
Goldsmith-Pinkham, P., Sorkin, I., and Swift, H. (2020). Bartik instruments: what, when, why, and how. American Economic Review, 110(8):2586-2624.
Heckman, J. J. (1998). Detecting discrimination. Journal of Economic Perspectives, 12(2):101-116.
Hudgens, M. G. and Halloran, M. E. (2008). Toward causal inference with interference. Journal of the American Statistical Association, 103(482):832-842.
Johari, R., Li, H., Liskovich, I., and Weintraub, G. Y. (2022). Experimental design in two-sided platforms: an analysis of bias. Management Science, 68(10):7069-7089.
Leung, M. P. (2022). Causal inference under approximate neighborhood interference. Econometrica, 90(1):267-293.
Manski, C. F. (2013). Identification of treatment response with social interactions. Econometrics Journal, 16(1):S1-S23.
Miguel, E. and Kremer, M. (2004). Worms: identifying impacts on education and health in the presence of treatment externalities. Econometrica, 72(1):159-217.
Rubin, D. B. (1980). Comment on "Randomization analysis of experimental data: the Fisher randomization test" by D. Basu. Journal of the American Statistical Association, 75(371):591-593.

Interference and Spillovers: When SUTVA Fails and What To Do About It