Toolbox

Double Machine Learning in Practice: The DoubleML Package (R and Python)

1 What Problem Does DoubleML Solve?

Applied researchers frequently want to estimate the causal effect of a treatment or exposure on an outcome while controlling for a large number of covariates. When the covariate set is large relative to the sample size, standard OLS is unreliable: it overfits, and the resulting standard errors are invalid.

A common response is to use machine learning methods for model selection—LASSO, random forests, gradient boosting—to select relevant controls. But naively using ML predictions for causal inference introduces regularisation bias: LASSO, for example, deliberately shrinks coefficients toward zero, which biases the treatment effect estimate if the treatment variable is also regularised.

Double Machine Learning (DML) [Chernozhukov et al., 2018] solves this problem with two key ingredients: (1) partialling out the treatment from the outcome using residuals from ML predictions, and (2) cross-fitting to ensure the residuals are out-of-sample and free of overfitting bias.

The DoubleML package provides a production-quality implementation in both R and Python.

2 Installation

# R
install.packages("DoubleML")
library(DoubleML)

# Python (pip)
pip install doubleml

The R package requires the mlr3 ecosystem for machine learning. The Python package uses scikit-learn.

3 The DML Estimating Framework

3.1 The Partially Linear Model

The baseline DML model is the partially linear regression:

$$Y_{i}=\theta_{0}D_{i}+g_{0}(X_{i})+\epsilon_{i}$$

$$D_{i}=m_{0}(X_{i})+v_{i}$$

where $Y$ is the outcome, $D$ is the (scalar) treatment, $X$ are high-dimensional controls, $g_{0}$ and $m_{0}$ are unknown functions estimated by ML, and $\epsilon_{i}$, $v_{i}$ have zero mean conditional on $X_{i}$.

The target parameter is $\theta_{0}$, the causal effect of $D$ on $Y$ conditional on $X$.

3.2 The DML Estimator

The DML estimator proceeds in two steps:

  1. Partial out: Use ML to estimate $\hat{\ell}_{0}(x)=\hat{\mathbb{E}}[Y|X=x]$ and $\hat{m}_{0}(x)=\hat{\mathbb{E}}[D|X=x]$. (Note that $\mathbb{E}[Y|X]=\theta_{0}m_{0}(X)+g_{0}(X)$, so the outcome nuisance is $\ell_{0}$ rather than $g_{0}$ itself; the package accordingly calls this learner ml_l.) Form residuals:$$\tilde{Y}_{i}=Y_{i}-\hat{\ell}_{0}(X_{i}), \quad \tilde{D}_{i}=D_{i}-\hat{m}_{0}(X_{i})$$
  2. Regress residual on residual:$$\hat{\theta}_{0}=\Big(\sum_{i}\tilde{D}_{i}^{2}\Big)^{-1}\sum_{i}\tilde{D}_{i}\tilde{Y}_{i}$$

This is the Frisch-Waugh-Lovell theorem applied non-parametrically. By partialling out $X$ from both $Y$ and $D$, the remaining variation in $\tilde{D}$ is orthogonal to $X$, removing the confounding.
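The Frisch-Waugh-Lovell logic is easy to verify numerically in its linear special case: regressing OLS residuals of $Y$ on OLS residuals of $D$ reproduces the coefficient on $D$ from the full regression. A minimal numpy sketch (in DML the linear projections below are replaced by ML fits):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 5
X = rng.normal(size=(n, p))
D = 0.5 * X[:, 0] + rng.normal(size=n)
Y = 1.5 * D + X @ np.ones(p) + rng.normal(size=n)  # true theta_0 = 1.5

# Controls with an intercept column
Xc = np.column_stack([np.ones(n), X])

# Full regression: Y on [D, X] -> coefficient on D
beta_full = np.linalg.lstsq(np.column_stack([D, Xc]), Y, rcond=None)[0]
theta_full = beta_full[0]

# FWL: residualise Y and D on X, then regress residual on residual
resid = lambda v: v - Xc @ np.linalg.lstsq(Xc, v, rcond=None)[0]
Y_t, D_t = resid(Y), resid(D)
theta_fwl = (D_t @ Y_t) / (D_t @ D_t)

print(theta_full, theta_fwl)  # identical up to floating-point error
```

Both numbers agree to machine precision, which is exactly the FWL identity the DML estimator generalises.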

Cross-fitting: Orthogonalisation removes regularisation bias, but a second bias arises if the nuisance predictions are evaluated on the same observations used to fit them: the learner's overfitting leaks into the residuals. DML therefore splits the data into $K$ folds, fits the nuisance functions on $K-1$ folds, and predicts on the held-out fold, so that every residual is an out-of-sample prediction. The final estimate averages over the folds.
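The cross-fitting scheme can be sketched in a few lines of numpy. Ridge regression (written out directly) stands in here for an arbitrary ML learner; each observation's nuisance prediction comes from a model fit on the other folds:

```python
import numpy as np

def crossfit_predictions(X, y, n_folds=5, alpha=1.0, seed=0):
    """Out-of-sample predictions of E[y|X] via K-fold cross-fitting.
    Ridge regression is a stand-in for an arbitrary ML learner
    (no intercept; the simulated data below are centred)."""
    n, p = X.shape
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, n_folds)
    preds = np.empty(n)
    for hold in folds:
        train = np.setdiff1d(idx, hold)
        Xt, yt = X[train], y[train]
        # Ridge fit on the training folds: (X'X + alpha*I)^{-1} X'y
        beta = np.linalg.solve(Xt.T @ Xt + alpha * np.eye(p), Xt.T @ yt)
        preds[hold] = X[hold] @ beta  # predict on the held-out fold only
    return preds

# Demo: DML-PLR with cross-fitted nuisances
rng = np.random.default_rng(1)
n, p = 500, 10
X = rng.normal(size=(n, p))
D = 0.5 * X[:, 0] + rng.normal(size=n)
Y = 1.5 * D + X[:, 0] + X[:, 1] + rng.normal(size=n)  # true theta_0 = 1.5

Y_t = Y - crossfit_predictions(X, Y)
D_t = D - crossfit_predictions(X, D)
theta_hat = (D_t @ Y_t) / (D_t @ D_t)
print(theta_hat)  # close to 1.5
```

The key line is `preds[hold] = X[hold] @ beta`: predictions for a fold always come from a model that never saw that fold.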

3.3 Properties

Under regularity conditions (roughly, the ML nuisance estimators converge to the truth faster than the rate $n^{-1/4}$), $\hat{\theta}_{0}$ is $\sqrt{n}$-consistent and asymptotically normal:

$$\sqrt{n}(\hat{\theta}_{0} - \theta_{0}) \rightsquigarrow \mathcal{N}(0, V)$$

where $V$ is a variance that can be consistently estimated. This allows standard confidence intervals and t-tests.
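In the PLR case the plug-in variance estimator has a closed sandwich form. Given cross-fitted residuals, it is a few lines of numpy (a sketch of the standard formula, not the package's internal code):

```python
import numpy as np

def plr_inference(Y_t, D_t):
    """Point estimate, standard error, and 95% CI for the PLR coefficient,
    given cross-fitted residuals Y_t = Y - l_hat(X), D_t = D - m_hat(X)."""
    n = len(Y_t)
    theta = (D_t @ Y_t) / (D_t @ D_t)
    eps = Y_t - theta * D_t                   # score residuals
    J = np.mean(D_t ** 2)                     # Jacobian of the score
    sigma2 = np.mean((D_t ** 2) * (eps ** 2)) / J ** 2
    se = np.sqrt(sigma2 / n)
    return theta, se, (theta - 1.96 * se, theta + 1.96 * se)

# Toy check with oracle residuals: Y_t = 1.5 * D_t + noise
rng = np.random.default_rng(2)
D_t = rng.normal(size=2000)
Y_t = 1.5 * D_t + rng.normal(size=2000)
theta, se, ci = plr_inference(Y_t, D_t)
print(theta, se, ci)
```

This is the heteroskedasticity-robust variance for the residual-on-residual regression; `summary()` in the package reports the analogous quantities.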

4 A Minimal Working Example (R)

library(DoubleML)
library(mlr3)
library(mlr3learners)
set.seed(42)

# Simulate partially linear data
n <- 500; p <- 20
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("X", 1:p)
D <- 0.5 * X[,1] + rnorm(n)
Y <- 1.5 * D + X[,1] + X[,2] + rnorm(n)
# True theta_0 = 1.5

# Create DoubleML data object (the R package expects a data.table)
df <- data.table::as.data.table(cbind(Y = Y, D = D, X))
dml_data <- DoubleMLData$new(df,
   y_col = "Y", d_cols = "D",
   x_cols = paste0("X", 1:p))

# Specify learners
lasso <- lrn("regr.cv_glmnet", alpha = 1)  # cross-validated LASSO
rf <- lrn("regr.ranger")                   # random forest, as an alternative

# Fit DML-PLR (partially linear regression)
dml_plr <- DoubleMLPLR$new(dml_data,
                           ml_l = lasso,  # for Y ~ X (outcome model)
                           ml_m = lasso,  # for D ~ X (treatment model)
                           n_folds = 5)
dml_plr$fit()
dml_plr$summary()
# Coefficient should be close to 1.5

5 Beyond the Partially Linear Model

DoubleML supports several model classes:

  • PLR (DoubleMLPLR): Partially linear regression for continuous treatment.
  • PLIV (DoubleMLPLIV): Partially linear IV for endogenous continuous treatment with an instrument.
  • IRM (DoubleMLIRM): Interactive regression model for binary treatment, targeting ATE or ATT.
  • IIVM (DoubleMLIIVM): Interactive IV model for binary treatment with a binary instrument.

# Binary treatment: IRM for ATE
# (the data object passed here must have a binary treatment column,
# unlike the continuous D simulated in the example above)
# ml_m uses a classification learner (classif.ranger) because the
# propensity score P(D=1|X) is a probability, so it needs a classifier.
# ml_g uses a regression learner for the outcome model E[Y|D,X].
dml_irm <- DoubleMLIRM$new(dml_data,
                           ml_g = lrn("regr.ranger"),
                           ml_m = lrn("classif.ranger"),
                           score = "ATE",
                           n_folds = 5)
dml_irm$fit()
dml_irm$summary()
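For intuition, the ATE score that the IRM model averages is the doubly robust (AIPW) expression. A numpy sketch with known (oracle) nuisance functions, so the moving parts are visible without any ML fitting:

```python
import numpy as np

def aipw_ate(Y, D, g1, g0, m):
    """Doubly robust ATE score, averaged over the sample.
    g1, g0: predictions of E[Y|D=1,X] and E[Y|D=0,X]; m: P(D=1|X)."""
    psi = (g1 - g0
           + D * (Y - g1) / m
           - (1 - D) * (Y - g0) / (1 - m))
    return psi.mean()

# Toy check with oracle nuisances: Y = 2*D + X + noise, so ATE = 2
rng = np.random.default_rng(3)
n = 5000
X = rng.normal(size=n)
m = 1 / (1 + np.exp(-X))          # true propensity P(D=1|X)
D = rng.binomial(1, m)
Y = 2 * D + X + rng.normal(size=n)
ate = aipw_ate(Y, D, g1=2 + X, g0=X, m=m)
print(ate)  # close to 2
```

The two inverse-propensity correction terms are what make the score Neyman-orthogonal: small errors in the nuisance estimates perturb the average only at second order.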

6 Key Options and Pitfalls

6.1 Choosing Learners

The DML estimator is valid for any ML learner that converges fast enough. Practical guidance:

  • LASSO/Elastic net: Good when the true model is sparse. Use lrn("regr.cv_glmnet").
  • Random forest: Good for nonlinear, high-dimensional settings. Use lrn("regr.ranger").
  • Stacking (ensemble): Combining multiple learners often outperforms any single one. Use mlr3pipelines for stacking.

6.2 Common Pitfalls

  1. Too few folds: Use at least 5-fold cross-fitting ($K=5$). With $K=2$, each nuisance model is trained on only half the data, which inflates variance.
  2. Using the same learner for $g$ and $m$: If $Y$ and $D$ have different functional forms (e.g., $Y$ is binary, $D$ is continuous), use different learner types.
  3. Ignoring clustering: If observations are clustered, set cluster_cols in the data object and use clustered standard errors.

7 Comparison to Alternatives

| Method | Strength | Limitation |
| --- | --- | --- |
| DoubleML | Valid inference, flexible ML | Assumes unconfoundedness |
| grf (causal forest) | CATE estimation | Same assumption |
| PDS LASSO (hdm) | Sparse settings | Parametric |
| econml (Python) | Many CATE estimators | Less focus on ATE inference |

8 Conclusion

The DoubleML package makes the Chernozhukov et al. (2018) DML framework accessible to applied researchers. By combining flexible ML for nuisance function estimation with cross-fitting and Neyman-orthogonal score functions, it achieves valid $\sqrt{n}$ inference for treatment effects even in high-dimensional settings. For researchers working with many controls but a single or small number of treatments, DoubleML is the recommended tool.

References

  1. Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. Econometrics Journal, 21(1):C1-C68.
  2. Bach, P., Chernozhukov, V., Kurz, M.S., and Spindler, M. (2022). DoubleML—an object-oriented implementation of double machine learning in R. Journal of Statistical Software, 103(3):1-45.
  3. Belloni, A., Chernozhukov, V., and Hansen, C. (2014). Inference on treatment effects after selection among high-dimensional controls. Review of Economic Studies, 81(2):608-650.
  4. Robinson, P.M. (1988). Root-N-consistent semiparametric regression. Econometrica, 56(4):931-954.
