Toolbox

Microsoft's EconML: Causal Machine Learning in Python


April 4, 2026

1 What Problem Does EconML Solve?

The standard tools of econometrics (OLS, IV, DiD, matching) are designed to estimate a single summary number: the average treatment effect (ATE) or, at most, a few subgroup estimates. But modern data sets are often rich enough to support estimation of heterogeneous treatment effects: how does the effect of a policy or intervention vary across individuals with different characteristics?

EconML is a Python library developed by Microsoft Research that provides a unified interface for estimating heterogeneous treatment effects using methods from machine learning and econometrics [Battocchi et al., 2019]. It implements the Double Machine Learning (DML) estimator of Chernozhukov et al. [2018], the Causal Forest of Wager and Athey [2018], and a suite of metalearners (T-learner, X-learner, DR-learner), all in a consistent scikit-learn-compatible API.

2 Installation and Setup

# Install via pip:
# pip install econml
import numpy as np
import pandas as pd
from econml.dml import LinearDML, CausalForestDML
from econml.metalearners import TLearner, XLearner
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LassoCV, LogisticRegressionCV
import matplotlib.pyplot as plt

3 Generating Simulated Data

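Let us simulate data with a heterogeneous treatment effect. The true effect is τ(x) = 2 + x₁, where x₁ is the first covariate. Treatment is assigned with a probability that depends on x₁ and on an unobserved confounder U, so treatment is endogenous. Because U also enters the outcome and is not passed to the estimators, methods that adjust only for X should be expected to carry some residual confounding bias on this data.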

np.random.seed(42)
n = 2000
# Covariates
X = np.random.randn(n, 5)  # 5 features
# Unobserved confounder
U = np.random.randn(n)
# Propensity score (depends on X[:,0] and U)
propensity = 1 / (1 + np.exp(-X[:, 0] - 0.5 * U))
T = np.random.binomial(1, propensity)
# True CATE: tau(x) = 2 + X[:,0]
true_cate = 2 + X[:, 0]
# Outcome: treatment effect + confounding from U
Y = true_cate * T + X[:, 0] + 0.5 * U + np.random.randn(n)
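As a quick sanity check on the simulation, the following minimal sketch re-creates the same draws and compares the naive difference in means with the true average effect. The names `true_ate` and `naive_diff` are introduced here for illustration only and do not appear in the original code.

```python
import numpy as np

# Re-create the simulated data with the same seed and draw order
np.random.seed(42)
n = 2000
X = np.random.randn(n, 5)
U = np.random.randn(n)  # unobserved confounder
propensity = 1 / (1 + np.exp(-X[:, 0] - 0.5 * U))
T = np.random.binomial(1, propensity)
true_cate = 2 + X[:, 0]
Y = true_cate * T + X[:, 0] + 0.5 * U + np.random.randn(n)

# True average effect versus the naive treated-minus-control contrast
true_ate = true_cate.mean()
naive_diff = Y[T == 1].mean() - Y[T == 0].mean()

print("True ATE:", round(true_ate, 3))
print("Naive difference in means:", round(naive_diff, 3))
```

The naive contrast overstates the effect because treated units have systematically higher x₁ and U, both of which also raise the outcome.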

4 Double Machine Learning with LinearDML

The LinearDML estimator implements the partially linear model:

Y = θ(X) · T + g(X, W) + ε (1)

with θ(X) = β₀ + β₁X₁ + ..., using cross-fitting to remove regularisation bias. The nuisance functions E[Y|X, W] and E[T|X, W] are estimated using any scikit-learn estimator.

# LinearDML: assumes tau(x) is linear in X
est_dml = LinearDML(
    model_y=GradientBoostingRegressor(n_estimators=200),
    model_t=LogisticRegressionCV(cv=5),
    discrete_treatment=True,  # T is binary, so model_t is a classifier
    cv=5,  # cross-fitting folds
    random_state=42,
)
est_dml.fit(Y, T, X=X)
# Point estimate and inference for the CATE model
print(est_dml.coef_)  # coefficients on X
print(est_dml.coef__interval(alpha=0.05))  # 95% confidence interval
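To make the residualisation step concrete, here is a minimal NumPy-only sketch of the partialling-out idea with cross-fitting. It is not EconML's implementation. It assumes a continuous treatment, confounders that are all observed, and ordinary least squares for both nuisance models; the helper names `design` and `crossfit_residuals` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
X = rng.standard_normal((n, 2))
T = 0.5 * X[:, 1] + rng.standard_normal(n)   # continuous treatment, E[T|X] linear
theta = 2 + X[:, 0]                          # true CATE: theta(x) = 2 + x1
Y = theta * T + X[:, 0] - X[:, 1] + rng.standard_normal(n)

def design(X):
    # Nuisance features: intercept, x1, x2 and their interaction
    return np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])

def crossfit_residuals(target, X, folds):
    # For each fold, fit OLS on the other folds and residualise the held-out fold
    D = design(X)
    resid = np.empty_like(target)
    for test_idx in folds:
        train = np.ones(len(target), dtype=bool)
        train[test_idx] = False
        coef, *_ = np.linalg.lstsq(D[train], target[train], rcond=None)
        resid[test_idx] = target[test_idx] - D[test_idx] @ coef
    return resid

folds = np.array_split(rng.permutation(n), 5)
Y_res = crossfit_residuals(Y, X, folds)      # Y - E[Y|X]
T_res = crossfit_residuals(T, X, folds)      # T - E[T|X]

# Final stage: regress Y residuals on T residuals interacted with (1, x1)
Z = np.column_stack([T_res, T_res * X[:, 0]])
beta, *_ = np.linalg.lstsq(Z, Y_res, rcond=None)
print("Estimated (beta0, beta1):", beta.round(3))
```

The final-stage coefficients estimate β₀ and β₁ in θ(X) = β₀ + β₁X₁; in this simulation the true values are 2 and 1.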

5 Causal Forest with CausalForestDML

For non-parametric heterogeneous effects, CausalForestDML combines DML residualisation with a causal forest:

est_cf = CausalForestDML(
    model_y=GradientBoostingRegressor(n_estimators=200),
    model_t=LogisticRegressionCV(cv=5),
    discrete_treatment=True,  # T is binary, so model_t is a classifier
    n_estimators=500,
    cv=5,
    random_state=42,
)
est_cf.fit(Y, T, X=X)
# Predict CATE for each individual
cate_pred = est_cf.effect(X)
# Compare to true CATE
print("Correlation with true CATE:", np.corrcoef(cate_pred, true_cate)[0, 1].round(3))
# Confidence intervals
cate_lb, cate_ub = est_cf.effect_interval(X, alpha=0.05)
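Because the data are simulated, the point estimates and intervals can be checked against the known truth. The sketch below defines a small helper, `interval_summary` (a hypothetical name, not part of EconML), and runs it on stand-in arrays so that it is self-contained.

```python
import numpy as np

def interval_summary(truth, point, lower, upper):
    """Return RMSE of the point estimates, empirical coverage, and mean interval width."""
    truth, point, lower, upper = map(np.asarray, (truth, point, lower, upper))
    rmse = float(np.sqrt(np.mean((point - truth) ** 2)))
    coverage = float(np.mean((lower <= truth) & (truth <= upper)))
    width = float(np.mean(upper - lower))
    return rmse, coverage, width

# Stand-in arrays; in practice these would come from the fitted estimator
truth = np.array([1.0, 2.0, 3.0, 4.0])
point = np.array([1.1, 1.9, 3.5, 4.0])
lower = point - 0.3
upper = point + 0.3

rmse, coverage, width = interval_summary(truth, point, lower, upper)
print(f"RMSE={rmse:.3f}, coverage={coverage:.2f}, mean width={width:.2f}")
```

With the objects from this section, the call would be interval_summary(true_cate, cate_pred, cate_lb, cate_ub). Coverage well below the nominal 95% would suggest bias, for example from the unobserved confounder in the simulation.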

6 Metalearners: T-Learner and X-Learner

Metalearners estimate CATE by first estimating outcome models and then combining them. The T-learner fits separate response surfaces for treated and control units; the X-learner, developed by Künzel et al. [2019], is more efficient when treatment and control groups differ substantially in size.

# T-Learner
tl = TLearner(models=GradientBoostingRegressor(n_estimators=200))
tl.fit(Y, T, X=X)
cate_t = tl.effect(X)
# X-Learner
xl = XLearner(
    models=GradientBoostingRegressor(n_estimators=200),
    propensity_model=LogisticRegressionCV(cv=5),
)
xl.fit(Y, T, X=X)
cate_x = xl.effect(X)
# Compare methods
print("T-learner corr with truth:", np.corrcoef(cate_t, true_cate)[0, 1].round(3))
print("X-learner corr with truth:", np.corrcoef(cate_x, true_cate)[0, 1].round(3))
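The mechanics of the two metalearners can be sketched by hand. The example below is an illustration under simplifying assumptions, not EconML's implementation: treatment is randomised with a known propensity of 0.2, all models are linear and fitted by least squares, and the helpers `ols_fit` and `ols_predict` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3000
X = rng.standard_normal((n, 3))
e = 0.2                                    # known propensity: unbalanced groups
T = rng.binomial(1, e, size=n)
tau = 2 + X[:, 0]                          # true CATE
Y = X[:, 0] + tau * T + rng.standard_normal(n)

def add_const(X):
    return np.column_stack([np.ones(len(X)), X])

def ols_fit(X, y):
    coef, *_ = np.linalg.lstsq(add_const(X), y, rcond=None)
    return coef

def ols_predict(coef, X):
    return add_const(X) @ coef

treated, control = T == 1, T == 0

# T-learner: separate response surfaces, then take the difference
mu1 = ols_fit(X[treated], Y[treated])
mu0 = ols_fit(X[control], Y[control])
cate_t = ols_predict(mu1, X) - ols_predict(mu0, X)

# X-learner: impute individual effects using the other group's outcome model,
# model the imputed effects, then combine with propensity weights
d1 = Y[treated] - ols_predict(mu0, X[treated])
d0 = ols_predict(mu1, X[control]) - Y[control]
tau1 = ols_fit(X[treated], d1)
tau0 = ols_fit(X[control], d0)
cate_x = e * ols_predict(tau0, X) + (1 - e) * ols_predict(tau1, X)

print("T-learner corr:", np.corrcoef(cate_t, tau)[0, 1].round(3))
print("X-learner corr:", np.corrcoef(cate_x, tau)[0, 1].round(3))
```

The propensity weighting puts more weight on the effect model trained in the smaller (treated) group, whose imputed effects rely on the outcome model fitted in the larger control group.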

7 Heterogeneity Analysis and Policy Learning

EconML supports heterogeneity analysis and policy learning out of the box. The SingleTreeCateInterpreter fits a simple decision tree to the estimated CATE function, producing an interpretable summary of which subgroups benefit most:

from econml.cate_interpreter import SingleTreeCateInterpreter
interp = SingleTreeCateInterpreter(
    include_model_uncertainty=True,
    max_depth=2,
    min_samples_leaf=100,
)
interp.interpret(est_cf, X)
interp.plot(feature_names=[f"X{i}" for i in range(5)], fontsize=12)
plt.show()
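The idea behind a tree summary can be illustrated with a depth-one version: search for the single split on the covariates that best separates high and low predicted effects. This is a simplified sketch, not the interpreter's algorithm; `best_single_split` is a hypothetical helper, and the CATE predictions are simulated stand-ins.

```python
import numpy as np

def best_single_split(X, cate, n_grid=19, min_leaf=100):
    """Find the (feature, threshold) split that minimises within-leaf squared error of cate."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.quantile(X[:, j], np.linspace(0.05, 0.95, n_grid)):
            left = X[:, j] <= thr
            n_left = int(left.sum())
            if n_left < min_leaf or len(cate) - n_left < min_leaf:
                continue
            sse = (((cate[left] - cate[left].mean()) ** 2).sum()
                   + ((cate[~left] - cate[~left].mean()) ** 2).sum())
            if best is None or sse < best["sse"]:
                best = {"feature": j, "threshold": float(thr), "sse": float(sse),
                        "mean_left": float(cate[left].mean()),
                        "mean_right": float(cate[~left].mean())}
    return best

# Stand-in CATE predictions that depend on the first feature only
rng = np.random.default_rng(2)
X = rng.standard_normal((2000, 5))
cate = 2 + X[:, 0] + 0.1 * rng.standard_normal(2000)

split = best_single_split(X, cate)
print(f"Split on X{split['feature']} at {split['threshold']:.2f}: "
      f"mean effect {split['mean_left']:.2f} (left) vs {split['mean_right']:.2f} (right)")
```

A summary like this also suggests a simple targeting rule: treat the leaf whose mean estimated effect exceeds the cost of treatment.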

8 Key Options and Pitfalls

  • Endogenous treatment: EconML's DML estimators remove confounding that operates through the observed covariates (X and W) via residualisation. They do not correct for unobserved confounders and are not designed for IV settings; for those cases, including settings with binary instruments, use the DRIV or IntentToTreatDRIV estimators.
  • Cross-fitting folds: Use at least cv=5 so that each nuisance model is trained on most of the data (80% with five folds). For very small samples (n<500), consider cv=10 or leave-one-out.
  • Model selection: The nuisance models (for E[Y|X] and E[T|X]) should be chosen to fit the data well. Gradient boosting and random forests are safe defaults; use cross-validated tuning for the hyperparameters.
  • Calibration: Always check the fit of the CATE model using est.score(Y, T, X=X). For the DML estimators this reports the mean squared error of the final residual-on-residual regression, so lower is better; the RScorer class in econml.score provides an R-score (an analogue of R² for CATE estimation) for comparing models.
  • Inference: Confidence intervals from CausalForestDML use honest forests for valid coverage. Point estimates from metalearners do not have built-in inference; use bootstrapping.
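The bootstrapping advice for metalearners can be sketched as follows. This is a minimal percentile-bootstrap illustration using a hand-rolled linear T-learner on simulated, randomised data; the helper `t_learner_cate` and all variable names are assumptions made for the example, not EconML functions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
X = rng.standard_normal((n, 2))
T = rng.binomial(1, 0.5, size=n)           # randomised treatment
Y = X[:, 0] + (2 + X[:, 0]) * T + rng.standard_normal(n)

def t_learner_cate(X, T, Y, x_new):
    # Linear T-learner: difference of two OLS response surfaces
    D = np.column_stack([np.ones(len(X)), X])
    d_new = np.column_stack([np.ones(len(x_new)), x_new])
    b1, *_ = np.linalg.lstsq(D[T == 1], Y[T == 1], rcond=None)
    b0, *_ = np.linalg.lstsq(D[T == 0], Y[T == 0], rcond=None)
    return d_new @ (b1 - b0)

x_new = np.array([[1.0, 0.0]])             # true CATE at this point is 3
point = t_learner_cate(X, T, Y, x_new)[0]

# Percentile bootstrap: resample rows with replacement and refit each time
n_boot = 500
draws = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)
    draws[b] = t_learner_cate(X[idx], T[idx], Y[idx], x_new)[0]
lower, upper = np.percentile(draws, [2.5, 97.5])

print(f"CATE at x_new: {point:.3f}, 95% bootstrap CI: [{lower:.3f}, {upper:.3f}]")
```

Refitting flexible learners such as gradient boosting hundreds of times is considerably more expensive than this linear example, so the number of bootstrap replications is a practical trade-off.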

9 Comparison to R Alternatives

| Method | EconML (Python) | R equivalent |
| --- | --- | --- |
| DML | LinearDML, NonParamDML | DoubleML package |
| Causal Forest | CausalForestDML | grf package |
| T-/X-Learner | TLearner, XLearner | grf + custom code |
| IV with ML | DRIV, IntentToTreatDRIV | DoubleML PLIV model |

Table 1: Python EconML versus R alternatives

10 Conclusion

EconML provides a comprehensive, production-ready implementation of causal machine learning methods in Python. Its scikit-learn compatibility means that researchers can slot in any ML estimator for the nuisance functions, while EconML handles the DML-style debiasing and heterogeneous treatment effect estimation. For researchers working in Python who want to estimate CATEs with valid inference, EconML is the natural first choice.

References

  1. Battocchi, K., Dillon, E., Hei, M., Lewis, G., Oka, P., Oprescu, M., and Syrgkanis, V. EconML: A Python package for ML-based heterogeneous treatment effects estimation. Version 0.x, 2019. https://github.com/microsoft/EconML.
  2. Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1-C68, 2018.
  3. Künzel, S. R., Sekhon, J. S., Bickel, P. J., and Yu, B. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 116(10):4156-4165, 2019.
  4. Wager, S. and Athey, S. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523):1228-1242, 2018.
