MDplus_

Introduction

Welcome to the 2023 MDplus Datathon! The datathon is a chance for you to work with medical students, graduate students, and healthcare workers from all backgrounds to derive data-driven insights and innovate solutions that advance patient care. In the process, you'll have the chance to think and learn about complex healthcare problems and what you can do to tackle them.

The theme of this year's datathon is value-based care. Your goal is to work in teams of 3-5 people to explore a common dataset with the purpose of answering a question or innovating a solution in alignment with this theme. The top submissions will receive exciting prizes (to be announced)!

Logistics

The 2023 MDplus Datathon runs from October 25, 2023 at 6 pm EDT to November 15, 2023 at 11:59 pm AoE (Anywhere on Earth). Participating teams will use quantitative analyses (e.g., visualization, statistics, and other computational tools) to form clinical insights and contextualize them into actionable proposals for relevant stakeholders. As part of the datathon, participants will be invited to attend optional workshops and private events with sponsors (e.g., Python/R bootcamps, oral presentation workshops, and fireside chats).
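Because the closing time is given in AoE (Anywhere on Earth), it can help to convert it to your own clock. The sketch below assumes the usual convention that AoE means UTC-12 and that Eastern time is on standard time (UTC-5) in mid-November; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets. Assumption: AoE ("Anywhere on Earth") is UTC-12.
AOE = timezone(timedelta(hours=-12), "AoE")
# Assumption: Eastern time is on standard time (UTC-5) on this date.
EST = timezone(timedelta(hours=-5), "EST")

# Deadline as stated in the logistics: November 15, 2023 at 11:59 pm AoE.
deadline_aoe = datetime(2023, 11, 15, 23, 59, tzinfo=AOE)

# Convert the same instant to UTC and to Eastern Standard Time.
deadline_utc = deadline_aoe.astimezone(timezone.utc)
deadline_est = deadline_aoe.astimezone(EST)

print(deadline_utc.isoformat())  # 2023-11-16T11:59:00+00:00
print(deadline_est.isoformat())  # 2023-11-16T06:59:00-05:00
```

Under these assumptions, the window closes at 6:59 am Eastern on November 16, not at midnight Eastern on November 15.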

Sign Up with a Team or Individually

Sign-ups for the 2023 datathon are now closed. Please contact our team with any additional questions or concerns.

Tutorials

The link to the Datathon's GitHub repository can be found here.

Downloading and Overview of the MIMIC-IV Dataset

Written Tutorial

Introduction to Python

Written Tutorial  |  Example Code

Introduction to R

Written Tutorial  |  Example Code

Introduction to MIMIC-IV: Python

Written Tutorial  |  Example Code

Introduction to MIMIC-IV: R

Example Code

Events Schedule

Wed 10/25 @6p EDT | Datathon Kickoff Event (Recording)
 Join us for the launch of the 2nd annual MD+ datathon event! We'll cover event logistics, judging, and prizes.

Mon 10/30 @7p EDT | Introduction to Python (Recording)
 Perfect for beginner programmers and for those who have never programmed with Python before.

Wed 11/01 @7p EDT | Navigating MIMIC-IV with Python (Recording)
 Learn how to load, parse, and analyze the MIMIC-IV dataset with Python.

Tues 11/07 @7p EST | Office Hours #1 with Lathan: R, General
 Need help debugging an R program or want to bounce your ideas off us? Talk to our team at virtual office hours!

Wed 11/08 @7p EST | Office Hours #2 with Michael: Python, General
 Need help debugging a Python program or want to bounce your ideas off us? Talk to our team at virtual office hours!

Sat 11/11 @noon EST | Office Hours #3 with Michael: Python, General
 Need help debugging a Python program or want to bounce your ideas off us? Talk to our team at virtual office hours!

Mon 11/20 @5p EST | Finalist Pitch Competition (Zoom Link)
 Join us in hearing pitches from the seven finalist teams for the 2023 datathon as they vie for the grand prize of $3,000.

Asynchronous | Best Practices in Data Science (Recording)
 Listen to this talk from last year's datathon by Olivier Humblet, Head of HEOR at Regeneron.

Meet the Judges

Reza Alavi, MD, MHS, MBA

Johns Hopkins University, Quintuple Aim Solutions

Caroline Berchuck, MD, MPH

Brigham & Women's Hospital, McKinsey & Company

Amit Phull, MD

Northwestern University, Doximity

Sid Salvi

ML/AI Product Leader & Advisor, prev. Cerebral

Kathryn Teng, MD, MBA, FACP

Progressive Insurance, prev. Cleveland Clinic

Finalists

Congratulations to our finalist teams and all datathon participants! Finalist teams are invited to the final pitch event on Monday 11/20 @5pm EST. This is a public event open to all. Finalists are listed below in alphabetical order. Many thanks to Dr. Julia Bondar for her help in selecting the finalist teams.

Adrenergic Data-1 Receptor Agonists

Minimizing Chronic Kidney Disease (CKD) Underdiagnosis Using Machine Learning

Dany Alkurdi (Mt. Sinai), Felipe Giuste (Emory University), Lawrence Huang (Brown University), Keyvon Rashidi (Texas A&M), Sachin Shankar (University of Cincinnati)

ALEYA

Significant association of social work referral and 30-day unplanned hospital readmission for patients with alcohol-related disorders using MIMIC-IV data

Amy Oh (Brown University), Archita Goyal (Tufts University), Emily Leventhal (Mt. Sinai), Lei Zhou (Mendel, ML consultant)

Dodecahedron

Can We Curb Frequent ED Visits Due to Alcohol-Related Conditions?

Cailin Winston (University of Washington, Seattle), Caleb Winston (Stanford University), Chloe Winston (University of Pennsylvania), Claris Winston (University of Washington, Seattle), Cleah Winston (University of Washington, Seattle)

Rational Rockets

Clinigrapher: Automatic Knowledge Graph Extraction from Medical Discharge Notes for Clinical Decision Support

J.T. Bassett (University of Toledo), Kiran Boyinepally (University of Toledo), Lauren Fang (University of Toledo), John Vergis (University of Toledo)

SuperLearners

Contrast Overuse in Patients with Renal Disease: A Targeted Analysis

Soryan Kumar (Brown University), Ashwin Mahendra (Florida Atlantic University), Arnav Kumar (Princeton University)

Team Pivot

Analyzing Acuity as a Tool for Value-Based Care

Joao Arthur Kawase De Queiroz Goncalves (University of Miami), Ian Ong (University of Pennsylvania), Sophie Reznik (University of Minnesota), Nikola Susic (University of Miami)

Until It Compiles

Machine Learning-driven forecasting and characterization of the ICU-admitted Heart Failure Patient Population in the MIMIC-IV v.04 database

Nikith Erukulla (University of Illinois), Jeff Kim (University of Illinois), Simon Liu (University of Illinois), Lucas Myint (University of Illinois), Shashank Sandu (University of Illinois)

FAQs

I have little/no data science or computational experience. Can I still participate?

Absolutely! Learning the computational tools is half the fun of the event. Participants will have access to tutorials that walk through the basics of Python and R and show how to analyze the dataset. Datathon submissions are also judged on more than just technical complexity. In fact, datathon judges care more about the insights derived and the data analysis than the computational novelty or complexity of the project!

What will we actually be doing?

Students will be provided with a dataset (e.g., claims and hospital data) and asked to identify an addressable problem (e.g., understanding the impact of hospital quality metrics on spending). They then explore that problem by analyzing the dataset (e.g., an observational study comparing high- vs. low-quality hospitals, or an interpretable ML model predicting spending rates from hospital attributes and outcomes) and use the results to create an actionable recommendation (e.g., quality metrics should be indexed by spending, and efforts to deliver high-quality care will result in more value-based spending habits).
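As a rough illustration of the kind of observational comparison described above, the sketch below splits hospitals into high- and low-quality groups and compares mean spending. Everything in it is hypothetical: the records are toy values invented for the example (not drawn from MIMIC-IV or any real source), and the field names, the `compare_spending` helper, and the 0.8 threshold are assumptions chosen only to show the shape of the analysis.

```python
from statistics import mean

# Toy records invented purely for illustration; not real hospital data.
hospitals = [
    {"name": "A", "quality_score": 0.91, "spend_per_patient": 10000.0},
    {"name": "B", "quality_score": 0.85, "spend_per_patient": 11000.0},
    {"name": "C", "quality_score": 0.88, "spend_per_patient": 9000.0},
    {"name": "D", "quality_score": 0.62, "spend_per_patient": 12000.0},
    {"name": "E", "quality_score": 0.70, "spend_per_patient": 14000.0},
]


def compare_spending(records, threshold):
    """Split records at a quality threshold and return mean spending per group.

    Assumes both groups are non-empty; statistics.mean raises
    StatisticsError if either group has no records.
    """
    high = [r["spend_per_patient"] for r in records if r["quality_score"] >= threshold]
    low = [r["spend_per_patient"] for r in records if r["quality_score"] < threshold]
    return {"high_quality": mean(high), "low_quality": mean(low)}


# Hypothetical cutoff separating "high" from "low" quality.
result = compare_spending(hospitals, threshold=0.8)
difference = result["high_quality"] - result["low_quality"]

print(result)
print(f"High-quality minus low-quality mean spend: {difference:.2f}")
```

A real submission would go further than a difference in means, for example by adjusting for patient mix and checking whether the gap is statistically meaningful, before turning it into a recommendation.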

I don't have the best laptop for data analysis... Can I still participate?

Yes! We've partnered with Hugging Face 🤗 to bring you free access to powerful computational resources dedicated to the event. To get started, join our MDplus Hugging Face community here.

What's the time commitment look like?

It's flexible and depends on your group and project!

I have additional questions. Who can I reach out to?

Send us an email or DM us via Slack! The best folks to reach out to are the co-directors of data science and AI, Eric Shan (email) and Michael Yao (email). We're happy to answer anything from questions about signing up for the datathon to technical questions during the event.