Machine Learning for Impact Variation
Project type
Journal publication
Link
Overview
I was the data scientist on this project and was involved in nearly every stage of the workflow. I started by designing the database structure to manage outputs from thousands of simulation scenarios, so that results were easy to store, organize, and track. I then built the simulation code itself: functions that generate realistic treatment and outcome patterns and flex across different sample sizes, covariate sets, and heterogeneity structures. To handle the scale, I added parallel processing so that hundreds of replications could run efficiently instead of taking days.
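To give a flavor of that setup, here is a minimal sketch in Python, not the project's actual code. The function names (`simulate_dataset`, `run_replication`, `run_all`), the randomized 50/50 treatment assignment, the linear form of the heterogeneous effect, and the difference-in-means estimator are all assumptions chosen for illustration; the real study compared 18 machine learning methods rather than this placeholder estimator.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def simulate_dataset(n, n_covariates, heterogeneity, seed):
    """Generate one synthetic dataset with a known, unit-level treatment effect."""
    rng = random.Random(seed)  # per-replication seed keeps runs reproducible
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(n_covariates)]
        treated = 1 if rng.random() < 0.5 else 0  # assumed randomized assignment
        tau = 1.0 + heterogeneity * x[0]  # assumed: effect varies with first covariate
        y = sum(x) + tau * treated + rng.gauss(0, 1)
        rows.append((x, treated, y, tau))
    return rows


def run_replication(scenario):
    """Simulate one scenario and compare a placeholder estimate to the truth."""
    n, n_covariates, heterogeneity, seed = scenario
    data = simulate_dataset(n, n_covariates, heterogeneity, seed)
    # Assumes n is large enough that both groups are non-empty.
    treated_y = [y for _, t, y, _ in data if t == 1]
    control_y = [y for _, t, y, _ in data if t == 0]
    estimate = statistics.mean(treated_y) - statistics.mean(control_y)
    truth = statistics.mean(tau for _, _, _, tau in data)
    return {"seed": seed, "estimate": estimate, "truth": truth}


def run_all(scenarios, workers=None):
    """Run replications in parallel; workers=0 falls back to a sequential loop."""
    if workers == 0:
        return [run_replication(s) for s in scenarios]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_replication, scenarios))
```

In a script, something like `run_all([(500, 5, 0.5, seed) for seed in range(100)])` would be called from inside an `if __name__ == "__main__":` guard so that worker processes can start cleanly. Passing an explicit seed with each scenario is one way to keep parallel runs reproducible regardless of the order in which workers finish.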
I also worked on how the results were presented. I wrote visualization code to turn the raw simulation output into clear, comparable graphs that showed how different methods behaved across scenarios. Watching those patterns emerge from what first looked like piles of numbers was the most rewarding part. This work was submitted to a peer-reviewed journal in 2025.
We ran one of the largest simulation studies to date, testing 18 machine learning methods for estimating conditional treatment effects. BART-S performed best on average, but no method consistently dominated: results shifted with the data, showing there’s no one-size-fits-all solution.

