

Machine Learning for Impact Variation

Project type

Journal publication

Overview

I was the data scientist on this project, involved in nearly every stage of the workflow. I started by designing the database structure to manage outputs from thousands of simulation scenarios, so that results were easy to store, organize, and track. Then I built the simulation code itself: functions that generate realistic treatment and outcome patterns and flex across different sample sizes, covariate sets, and heterogeneity structures. To handle the scale, I added parallel processing so that hundreds of replications could run efficiently instead of taking days.
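As a rough illustration of the pattern described above, here is a minimal Python sketch of a data-generating function and a parallel replication runner. It is not the project's code. The function names (`simulate_dataset`, `run_replication`, `run_scenario`), the linear outcome model, the 50/50 treatment assignment, and the difference-in-means estimator are all assumptions chosen to keep the example short; the real study compared 18 machine learning methods.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def simulate_dataset(n, n_covariates, heterogeneity, seed):
    """Generate one hypothetical dataset as a list of (x, z, y, tau) rows."""
    rng = random.Random(seed)  # a per-replication seed keeps runs reproducible
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(n_covariates)]
        z = 1 if rng.random() < 0.5 else 0       # assumed 50/50 random assignment
        tau = 1.0 + heterogeneity * x[0]         # effect varies with the first covariate
        y = sum(x) + tau * z + rng.gauss(0, 1)   # assumed linear outcome plus noise
        rows.append((x, z, y, tau))
    return rows


def run_replication(seed, n, n_covariates, heterogeneity):
    """Simulate one dataset and apply a placeholder estimator to it."""
    data = simulate_dataset(n, n_covariates, heterogeneity, seed)
    treated = [y for _, z, y, _ in data if z == 1]
    control = [y for _, z, y, _ in data if z == 0]
    # Difference in means stands in for a real treatment-effect method.
    estimate = statistics.mean(treated) - statistics.mean(control)
    truth = statistics.mean(tau for _, _, _, tau in data)
    return {"seed": seed, "estimate": estimate, "truth": truth}


def run_scenario(n_reps, n, n_covariates, heterogeneity, max_workers=None):
    """Run n_reps independent replications of one scenario across processes."""
    worker = partial(
        run_replication,
        n=n,
        n_covariates=n_covariates,
        heterogeneity=heterogeneity,
    )
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # Each replication gets its own seed, so results do not depend on scheduling.
        return list(pool.map(worker, range(n_reps)))
```

Calling `run_scenario(n_reps=200, n=500, n_covariates=5, heterogeneity=0.5)` from a script's `if __name__ == "__main__":` block would return one result dictionary per replication, ready to be written to a results table. Seeding each replication separately is one simple way to make parallel runs repeatable.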

I also worked on how the results were presented. I wrote visualization code to turn the raw simulation output into clear, comparable graphs that showed how different methods behaved across scenarios. Watching those patterns emerge from what first looked like piles of numbers was the most rewarding part. This work was submitted to a peer-reviewed journal in 2025.

We ran one of the largest simulation studies to date, testing 18 machine learning methods for estimating conditional treatment effects. BART-S performed best on average, but no method consistently dominated: results shifted with the data, showing there’s no one-size-fits-all solution.

© 2025 by Polina Polskaia
