Previous Masterclasses

Fall 2016

September 26-27, 2016
Victor Chernozhukov, MIT
Machine Learning for Treatment Effects and Structural Equation Models



The course provides a practical introduction to modern high-dimensional function fitting methods (a.k.a. machine learning methods) for efficient estimation and inference on treatment effects and structural parameters in empirical economic models. Participants will use R so that they can immediately internalize the techniques and use them in their own academic and industry work. All lectures, except the introductory one, will be accompanied by R code that can be used to reproduce the empirical examples. Thus, there will be no gap between theory and practice.


This is the 7th edition of the course, to be given at Georgetown University. Previous editions were given at GSERM St. Gallen, CEMMAP, MIT in 2015 and 2016 as part of the 14.387 course “Applied Econometrics”, and the Summer School of Econometrics of the Bank of Italy in Perugia.


1. Causal Inference in Approximately Sparse Linear Structural Equation Models.

  • Approximately sparse econometric models as generalizations of conventional econometric models
  • “Double lasso” or “double partialling out” methods for efficient estimation and inference of causal parameters in these models.
  • Various empirical examples.
  • References: 3, 4.

2. Understanding of the Inference Strategy via the Double Partialling Out and Adaptivity.

  • Theory: Frisch-Waugh Partialling Out.  Adaptivity.
  • Laying a strategy for the use of non-sparse and generic ML methods.
  • R Practicum:  Mincer Equations, Barro-Lee, and Acemoglu-Johnson-Robinson examples.
  • References: 3, 4, 6.
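
The Frisch-Waugh partialling-out idea named in this session can be illustrated with a minimal sketch. It is written in plain Python rather than the course's R, uses made-up numbers, and handles a single control variable only. The function names (`residuals`, `coef_full`, `coef_partialled_out`) are illustrative assumptions, not part of the course materials. The sketch compares the coefficient on a treatment `d` from the full regression of `y` on `d` and `w` with the coefficient obtained after first regressing both `y` and `d` on the control `w` and keeping the residuals.

```python
def mean(values):
    return sum(values) / len(values)

def centered_cross(u, v):
    """Sum over observations of (u - mean(u)) * (v - mean(v))."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def residuals(x, y):
    """Residuals from an OLS regression of y on x and an intercept."""
    slope = centered_cross(x, y) / centered_cross(x, x)
    intercept = mean(y) - slope * mean(x)
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def coef_full(d, w, y):
    """Coefficient on d from OLS of y on d, w and an intercept (closed form)."""
    s_dd = centered_cross(d, d)
    s_ww = centered_cross(w, w)
    s_dw = centered_cross(d, w)
    s_dy = centered_cross(d, y)
    s_wy = centered_cross(w, y)
    return (s_dy * s_ww - s_wy * s_dw) / (s_dd * s_ww - s_dw ** 2)

def coef_partialled_out(d, w, y):
    """Partial w out of both d and y, then regress residual on residual."""
    d_res = residuals(w, d)
    y_res = residuals(w, y)
    return centered_cross(d_res, y_res) / centered_cross(d_res, d_res)

# Toy data (assumed for illustration): w is a control, d a treatment, y an outcome.
w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d = [0.5, 1.9, 2.2, 4.1, 4.4, 6.3]
noise = [0.2, -0.1, 0.3, -0.3, 0.1, -0.2]
y = [1.0 + 2.0 * di + 3.0 * wi + e for di, wi, e in zip(d, w, noise)]

beta_full = coef_full(d, w, y)
beta_fwl = coef_partialled_out(d, w, y)
print(beta_full, beta_fwl)
```

The two printed values agree up to floating-point error. In the course's setting the controls are high-dimensional and the two auxiliary regressions would be fitted by lasso or another ML method; a single control is used here only to keep the arithmetic transparent.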

3. ML Methods for Prediction = Reduced Form Estimation.  Evaluation of ML Methods using Test Samples.

  • Penalization Regression Methods: Ridge, Lasso, Elastic Nets, etc.
  • Regression Trees, Random Forest, Boosted Trees.
  • Modern Nonlinear Regression via Neural Nets and Deep Learning
  • Aggregation and Cross-Breeding of the ML methods.
  • R Practicum: Simulated, Wage, and Pricing Examples.
  • References:  1, 2, 9-11.

4. ML Methods for Causal Parameters -- “Double” Machine Learning for Causal Parameters in Treatment Effect Models and Nonlinear Econometric Models

  • Using generic ML (beyond Lasso) to Estimate Coefficients in Partially Linear Models
  • Using generic ML to estimate ATE, ATT, LATE in Heterogeneous Treatment Effect Models
  • Using generic ML methods to estimate structural parameters in Moment Condition problems.
  • R Practicum: 401(k) Example.
  • References:  5, 6, 7, 8.

5. Scalability:  Working with Large Data. MapReduce, Hadoop and all that

  • MapReduce, Sufficient Statistics, Linear Estimators
  • MapReduce and Computation of Nonlinear Estimators via Distributed Gradient Descent
  • MapReduce in R. 
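
The link between MapReduce, sufficient statistics, and linear estimators listed above can be sketched as follows. This is a minimal illustration in plain Python rather than R, running on one machine with made-up data; the function names (`map_chunk`, `combine`, `solve_ols`) are assumptions introduced for the example and do not come from any MapReduce library. Each chunk is summarized by a few sums (the map step), the summaries are added together (the reduce step), and the least-squares line is recovered from the totals alone.

```python
from functools import reduce

def map_chunk(chunk):
    """Map step: reduce one chunk of (x, y) pairs to its sufficient statistics."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)

def combine(a, b):
    """Reduce step: sufficient statistics from two chunks simply add up."""
    return tuple(u + v for u, v in zip(a, b))

def solve_ols(stats):
    """Intercept and slope of the least-squares line from the summed statistics."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return intercept, slope

# Toy data (assumed for illustration), split into three chunks as if stored on three nodes.
data = [(float(i), 1.0 + 2.0 * i + (-1) ** i * 0.1) for i in range(12)]
chunks = [data[0:4], data[4:8], data[8:12]]

intercept, slope = solve_ols(reduce(combine, map(map_chunk, chunks)))
full_intercept, full_slope = solve_ols(map_chunk(data))
print(intercept, slope)
```

The chunked estimate matches the one computed from the full data set, because the estimator depends on the data only through sums that can be accumulated chunk by chunk.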


Please bring your computer to class. Install R and RStudio. Install the packages “hdm”, “glmnet”, “nnet”, “randomForest”, “rpart”, “rpart.plot”, and “gbm” from CRAN (e.g., type install.packages("gbm")). If you are not familiar with R, try out several of the introductory tutorials that are available online. Please read and understand the idea of cross-validation (k-fold cross-validation) to prevent overfitting, and the bias-variance tradeoff in nonparametric estimation. I will be mentioning these briefly in class, but I will count on you understanding these background concepts. A good reference is “Elements of Statistical Learning”, which is available from Tibshirani’s website.


  1. The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, and J. Friedman. The book can be downloaded for free!
  2. An Introduction to Statistical Learning with Applications in R, by G. James, D. Witten, T. Hastie, and R. Tibshirani. The website has a lot of handy resources.
  3. "High-Dimensional Methods and Inference on Treatment and Structural Effects in Economics," J. Economic Perspectives 2014, Belloni et al. Stata replication code is here. R code implementation is in package “hdm”.
  4. "Inference on Treatment Effects After Selection Amongst High-Dimensional Controls (with an Application to Abortion and Crime)," ArXiv 2011, The Review of Economic Studies, 2013, Belloni et al. Stata and Matlab programs are here; replication files here. R code implementation in package “hdm”.
  5. “Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations”, ArXiv 2013, Journal of Econometrics, 2015, M. Farrell.
  6. "Post-Selection and Post-Regularization Inference: An Elementary, General Approach," Annual Review of Economics 2015, V. Chernozhukov, C. Hansen, and M. Spindler. R code implementation in package “hdm”.
  7. "Program Evaluation and Causal Inference with High-Dimensional Data," ArXiv 2013, Econometrica, 2016+, A. Belloni et al. R code implementation in package “hdm”. Replication files via Econometrica website.
  8. “Double Machine Learning for Causal and Treatment Effects”, MIT Working Paper, V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey.
  9. "Big Data: New Tricks for Econometrics," Journal of Economic Perspectives 2014, H. Varian.
  10. “Economics in the age of big data,” Science 2014, L. Einav, J. Levin.
  11. “Prediction Policy Problems,” American Economic Review P&P 2015, J. Kleinberg, J. Ludwig, S. Mullainathan, Z. Obermeyer.

Spring 2016

March 31-April 1, 2016
Charles Manski, Northwestern University
Kenneth Wolpin, Rice University
The Role of Theory and Uncertainty in Policy Evaluation

This masterclass brings together two highly distinguished economists to examine the role of theory and uncertainty in policy evaluation. It is organized in six sessions. The first session focuses on the use of descriptive statistics. The second session discusses partial identification of the treatment response model using the right-to-carry laws as a case study. The third session considers partial identification of a structural model. The fourth session, complementary to the third, examines ex ante policy evaluation using both parametric and nonparametric approaches. The fifth session discusses the policy use of discrete choice dynamic programming models and will cover both methodological issues and empirical applications. The final session is devoted to a discussion of the importance of (simple) theory in inferential empirical work. 

Fall 2015

December 7-8, 2015
Steven T. Berry, Yale University
Empirical Models of Differentiated Products

This course considers the identification and estimation of models of market equilibrium with differentiated products, including applications to various policy-relevant markets. Most real-world markets feature differentiated products, in reputation and service quality if not in explicit product characteristics. Examples include private markets for physically differentiated goods (automobiles) and markets for various media products (newspapers), as well as for partially privatized and highly regulated goods (such as education in many countries). In the course, we consider how data can reveal demand and cost parameters, including recent results on formal identification. We go on to discuss theoretical, practical, and computational aspects of estimation. While many of the models condition on the set of products being offered, we also consider models with endogenous product characteristics, such as location, type, and quality of the product. Empirical applications feature policy-relevant markets like health, media, and education, as well as classic applications to antitrust analysis.