1 Introduction
Randomised controlled trials (RCTs) are considered the gold standard for causal inference. When treatment is randomly assigned, a simple comparison of mean outcomes between treated and control groups yields an unbiased estimate of the average treatment effect. No covariates required.
Yet in practice, nearly every analysis of experimental data includes covariates. A regression of outcome on treatment and baseline characteristics is the norm, not the exception. Why? And is this always a good idea?
The question generates genuine disagreement among econometricians and statisticians. Proponents argue that covariate adjustment always reduces variance and should be included by default. Sceptics worry about model misspecification, data snooping, and the introduction of bias in finite samples. This article presents both sides and examines what the evidence says.
2 The Case For Including Covariates
2.1 Variance Reduction
The primary argument for covariate adjustment is statistical efficiency. In a randomised experiment, treatment is independent of all pre-treatment covariates by design. But this does not mean covariates are uninformative about the outcome. If a covariate X is strongly correlated with Y, including it in the regression removes variation in Y that is "explained" by X, reducing the residual variance and tightening the confidence interval for the treatment effect.
Formally, the OLS estimator from the regression $Y_{i}=\alpha+\beta D_{i}+\gamma X_{i}+\epsilon_{i}$ has variance: $Var(\hat{\beta}_{adj})=\frac{\sigma_{\epsilon}^{2}}{n\cdot Var(D)}\cdot\frac{1}{1-R_{D\sim X}^{2}}$ (1)
where $\sigma_{\epsilon}^{2}$ is the residual variance after controlling for X, and $R_{D\sim X}^{2}$ is the $R^{2}$ from regressing D on X. Since D is randomised and independent of X, $R_{D\sim X}^{2}\approx0$, so the factor $1/(1-R_{D\sim X}^{2})\approx1$: adjustment does not inflate the denominator.
The variance gain from covariate adjustment therefore comes entirely from the numerator: including X in the regression reduces the residual variance $\sigma_{\epsilon}^{2}$ by absorbing the portion of outcome variation explained by X. The reduction in $\sigma_{\epsilon}^{2}$ (not any change in the denominator) is what tightens the confidence interval for $\hat{\beta}_{adj}$ relative to the unadjusted estimator.
In a randomised experiment the unadjusted estimator is consistent, and in the classical analysis the adjusted estimator has lower asymptotic variance whenever $\gamma\ne0$, that is, whenever covariates predict the outcome. Freedman [2008] showed, however, that this gain is not guaranteed when treatment effects are heterogeneous and the interaction terms are omitted; adjustment can then even hurt asymptotic precision.
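A quick Monte Carlo sketch makes the variance comparison concrete. The data are simulated; the effect size $\tau=1$ and covariate coefficient 2 are illustrative assumptions, not values from the text.

```python
import numpy as np

# Monte Carlo sketch of the efficiency gain from covariate adjustment
# (simulated data; tau = 1 and the coefficient 2 on x are assumptions).
rng = np.random.default_rng(0)
n, reps, tau = 200, 2000, 1.0

est_unadj, est_adj = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    d = (rng.permutation(n) < n // 2).astype(float)  # half treated, at random
    y = tau * d + 2.0 * x + rng.normal(size=n)       # X strongly predicts Y

    # Unadjusted: simple difference in means.
    est_unadj.append(y[d == 1].mean() - y[d == 0].mean())

    # Adjusted: OLS of Y on (1, D, X); keep the treatment coefficient.
    Z = np.column_stack([np.ones(n), d, x])
    est_adj.append(np.linalg.lstsq(Z, y, rcond=None)[0][1])

print(f"SD unadjusted: {np.std(est_unadj):.3f}")
print(f"SD adjusted:   {np.std(est_adj):.3f}")
```

Both estimators centre on the true effect; the adjusted one has a markedly smaller spread because the $2x$ term is absorbed out of the residual.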
2.2 Lin's Estimator
Lin [2013] pushed this further with a robust result: if the regression includes treatment, covariates, and their interactions $D_{i}\cdot(X_{i}-\overline{X})$, where $\overline{X}$ is the sample mean of the covariates, the adjusted estimator is at least as efficient as the unadjusted estimator asymptotically, regardless of whether the linear model is correctly specified.
This "Lin estimator" is now widely recommended for experimental analysis: $Y_{i}=\alpha+\beta D_{i}+\gamma^{\prime}(X_{i}-\overline{X})+\delta^{\prime}D_{i}(X_{i}-\overline{X})+\epsilon_{i}$ (2)
The coefficient $\beta$ estimates the ATE. The interactions allow the regression to flexibly accommodate heterogeneous treatment effects, and the centring ensures that $\beta$ is estimated at the mean of X.
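Equation (2) is a single OLS regression, so it can be sketched in a few lines of numpy. The demo data are simulated, with a heterogeneous effect $1+x$ whose mean (the ATE) is 1 by construction.

```python
import numpy as np

def lin_estimator(y, d, X):
    """ATE via equation (2): regress Y on D, demeaned X, and D * demeaned X.

    y: (n,) outcomes; d: (n,) 0/1 treatment; X: (n, k) covariates.
    Returns the coefficient on D, which estimates the ATE.
    """
    y, d = np.asarray(y, float), np.asarray(d, float)
    X = np.asarray(X, float).reshape(len(y), -1)
    Xc = X - X.mean(axis=0)                        # centre at the sample mean
    Z = np.column_stack([np.ones(len(y)), d, Xc, d[:, None] * Xc])
    return np.linalg.lstsq(Z, y, rcond=None)[0][1]

# Demo on simulated data with heterogeneous effects (true ATE = 1).
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
d = (rng.permutation(n) < n // 2).astype(float)
y = d * (1.0 + x) + x + rng.normal(size=n)         # effect 1 + x, mean 1
print(lin_estimator(y, d, x))
```

The interaction columns let the fitted slopes differ by arm, which is exactly what delivers Lin's agnostic efficiency guarantee.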
2.3 Regulatory Endorsement
Pre-specified covariate adjustment is now standard in clinical trials and endorsed by regulatory agencies. The FDA and EMA guidance documents recommend covariate adjustment using pre-specified baseline characteristics, arguing it improves power without compromising Type I error rates [FDA, 2023].
3 The Case Against (or Caution)
3.1 Freedman's Critique
Freedman [2008] raised an important concern that is often underappreciated. In a finite sample, OLS coefficient estimates in a randomised experiment are subject to a small-sample bias that does not average out to zero for a fixed number of units.
The bias is of order $1/n$ and arises because the treatment indicator and covariates are not perfectly orthogonal in any finite sample, even though they are independent in expectation.
The practical implication is minor for large experiments but can matter for small ones (say, $n<50$ per arm). The unadjusted difference-in-means is always unbiased, even in finite samples.
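The exact finite-sample unbiasedness of the difference-in-means can be checked directly by enumerating every possible treatment assignment in a tiny finite population. The potential outcomes below are hypothetical numbers chosen only for illustration.

```python
import numpy as np
from itertools import combinations

# Tiny finite population with fixed potential outcomes (hypothetical numbers).
y0 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y1 = y0 + np.array([0.5, 1.5, -0.5, 2.0, 1.0, 0.5])  # heterogeneous effects
true_ate = (y1 - y0).mean()

n, m = len(y0), 3                  # complete randomisation: m of n treated
ests = []
for treated in combinations(range(n), m):
    d = np.zeros(n)
    d[list(treated)] = 1.0
    y = np.where(d == 1, y1, y0)   # only one potential outcome is observed
    ests.append(y[d == 1].mean() - y[d == 0].mean())

# Averaged over every possible assignment, the estimator hits the ATE exactly.
print(np.mean(ests), true_ate)
```

No single assignment recovers the ATE, but the average over the randomisation distribution does so exactly, with no sample-size requirement.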
3.2 Misspecification and Extrapolation
If the covariate-outcome relationship is nonlinear and the researcher uses a linear regression, covariate adjustment can introduce bias by extrapolating the linear fit outside the region of common support. In a well-powered experiment with good overlap, this is a minor concern. But in experiments with covariate imbalance (which can occur by chance even with randomisation), linear adjustment may correct for the wrong functional form.
3.3 Data Snooping and Multiple Testing
A practical concern is that researchers often do not pre-specify which covariates to include. If covariates are chosen after seeing the data, based on which specification produces the smallest p-value, the nominal Type I error rate is inflated. Including many covariates without pre-registration introduces researcher degrees of freedom [Simmons et al., 2011].
The solution is pre-registration of the analysis plan, including a pre-specified list of covariates. This maintains the nominal Type I error rate while retaining the efficiency gains from adjustment.
3.4 Post-Stratification and Alternative Methods
Some researchers argue that post-stratification or stratified randomisation is superior to regression adjustment. In stratified randomisation, units are grouped into strata defined by key covariates and treatment is randomly assigned within each stratum, creating balanced groups by design. The estimator then stratifies the comparison accordingly: $\hat{\tau}_{strat}=\sum_{s}\frac{N_{s}}{N}\hat{\tau}_{s}$ (3)
where s indexes strata, $\hat{\tau}_{s}$ is the difference in means within stratum s, and $N_{s}/N$ is the stratum's share of the sample. This avoids any misspecification concern with the covariate-outcome relationship.
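Equation (3) is a weighted average of within-stratum comparisons and takes only a few lines to compute. The toy data below are hypothetical, constructed so the two strata have effects 2 and 4 with equal sizes, giving an ATE of 3.

```python
import numpy as np

def stratified_ate(y, d, strata):
    """Equation (3): N_s/N-weighted average of within-stratum diff-in-means."""
    y, d, strata = (np.asarray(a) for a in (y, d, strata))
    est = 0.0
    for s in np.unique(strata):
        mask = strata == s
        tau_s = y[mask & (d == 1)].mean() - y[mask & (d == 0)].mean()
        est += mask.mean() * tau_s        # mask.mean() equals N_s / N
    return est

# Toy data: stratum A effect 2, stratum B effect 4, equal sizes -> ATE 3.
y = np.array([3.0, 3.0, 1.0, 1.0, 6.0, 6.0, 2.0, 2.0])
d = np.array([1, 1, 0, 0, 1, 1, 0, 0])
strata = np.array(list("AAAABBBB"))
print(stratified_ate(y, d, strata))
```

Because each stratum is compared only with itself, no functional form for the covariate-outcome relationship is ever assumed.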
4 What Does the Evidence Say?
The balance of the literature supports covariate adjustment as generally beneficial:
- Lin [2013] proves asymptotic efficiency gains under minimal assumptions using the Lin estimator.
- Simulation studies consistently find that covariate adjustment reduces MSE in moderate-to-large samples.
- The Freedman critique matters mainly for very small experiments (fewer than 30-50 per arm).
- Pre-specification of covariates addresses the data-snooping concern.
The consensus emerging from methodological work is: include pre-specified, pre-treatment covariates using the Lin estimator. This dominates unadjusted estimation in large samples and differs negligibly in small ones.
5 Unresolved Questions
Despite the consensus on the Lin estimator, several questions remain:
- How many covariates? In very high-dimensional settings, including all covariates risks overfitting. Double ML methods (Chernozhukov et al. [2018]) extend the partialling-out approach to many covariates via cross-fitting.
- Machine learning for covariate adjustment: Recent work by Wager et al. [2016] extends covariate-adjusted ATE estimation to high-dimensional X using ML, but inference is more complex.
- Cluster-randomised experiments: When treatment is assigned at the cluster level (schools, villages), covariate adjustment at the cluster level is appropriate, but individual-level covariates require careful treatment.
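The cross-fitting idea behind double ML can be sketched compactly. This is a minimal two-fold illustration in which a plain linear regression stands in for the ML nuisance learner; Chernozhukov et al. [2018] allow any sufficiently accurate method, and a real application would also cross-fit over more folds and compute proper standard errors.

```python
import numpy as np

def crossfit_ate(y, d, X, rng):
    """Two-fold cross-fitted partialling-out, in the spirit of double ML.

    A plain linear regression stands in for the nuisance learner here
    (an illustrative simplification); any flexible regressor could be
    swapped in for the two lstsq fits.
    """
    n = len(y)
    idx = rng.permutation(n)
    folds = (idx[: n // 2], idx[n // 2:])
    ry, rd = np.empty(n), np.empty(n)
    for fit, hold in ((folds[0], folds[1]), (folds[1], folds[0])):
        Z = np.column_stack([np.ones(len(fit)), X[fit]])
        Zh = np.column_stack([np.ones(len(hold)), X[hold]])
        by = np.linalg.lstsq(Z, y[fit], rcond=None)[0]
        bd = np.linalg.lstsq(Z, d[fit], rcond=None)[0]
        ry[hold] = y[hold] - Zh @ by      # residualise outcome out-of-fold
        rd[hold] = d[hold] - Zh @ bd      # residualise treatment out-of-fold
    return float(rd @ ry / (rd @ rd))     # final-stage regression

# Demo: randomised treatment, five covariates, true effect 1 (simulated).
rng = np.random.default_rng(2)
n, k = 4000, 5
X = rng.normal(size=(n, k))
d = (rng.permutation(n) < n // 2).astype(float)
y = d + X @ rng.normal(size=k) + rng.normal(size=n)
print(crossfit_ate(y, d, X, rng))
```

Fitting the nuisance regressions on one fold and residualising on the other is what removes the overfitting bias that motivates cross-fitting in the first place.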
6 Practical Recommendations
- Pre-specify the covariates and adjustment model before seeing the data.
- Use the Lin estimator (treatment × demeaned covariates) for robustness to misspecification.
- Report both unadjusted and adjusted estimates; if they differ substantially, investigate why.
- In small experiments ($n<100$ total), prefer stratified randomisation or post-stratification over regression adjustment.
- Avoid bad controls: Never adjust for variables measured after randomisation.
7 Conclusion
The question of whether to include covariates in RCT analysis is settled at the level of principle but involves nuance in practice. Asymptotically, pre-specified covariate adjustment via the Lin estimator weakly dominates unadjusted estimation. In small finite samples, the Freedman critique cautions against adjustment, and stratified randomisation is preferable.
The key safeguard, pre-registration of the analysis plan, is as important as the choice of whether to adjust. When the covariates are pre-specified, the model is robust, and the sample is of moderate size, including covariates is almost always the right choice.
References
- Lin, W. (2013). Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique. Annals of Applied Statistics, 7(1):295-318.
- Freedman, D.A. (2008). On regression adjustments to experimental data. Advances in Applied Mathematics, 40(2):180-193.
- Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. Econometrics Journal, 21(1):C1-C68.
- Simmons, J.P., Nelson, L.D., and Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11):1359-1366.
- U.S. Food and Drug Administration (2023). Covariate adjustment in randomized clinical trials for drugs and biological products: Guidance for industry. FDA Guidance Document. Available from the FDA website at https://www.fda.gov/regulatory-information/search-fda-guidance-documents.
- Wager, S., Du, W., Taylor, J., and Tibshirani, R.J. (2016). High-dimensional regression adjustments in randomized experiments. Proceedings of the National Academy of Sciences, 113(45):12673-12678.