Healthcare AI & Electronic Health Records
Overview
This learning path teaches you how to apply deep learning to electronic health records (EHR), combining structured medical codes, clinical text, and temporal patient trajectories. You will master healthcare foundation models like ETHOS, understand clinical NLP with ClinicalBERT, and learn the interpretability requirements for medical AI systems.
Thesis Relevance: This path is central to the EmergAI thesis, where you will extend ETHOS with multimodal patient-reported data (symptom text + 3D sketches) to improve emergency department outcome prediction.
Prerequisites
Before starting this path, you should have:
- Strong transformer understanding: Attention mechanisms, multi-head attention
- Language modeling knowledge: GPT architecture, LM training
- Python/PyTorch proficiency: Ability to implement transformer models
- Healthcare motivation: Interest in applying AI to medical data
Learning Objectives
By completing this path, you will be able to:
- Navigate EHR databases (MIMIC-III/IV) and understand medical coding systems (ICD-10, ATC, CPT)
- Tokenize medical events for transformer models using code-level, hierarchical, or BPE approaches
- Process clinical text with ClinicalBERT and perform medical named entity recognition
- Explain ETHOS architecture and masked event modeling pre-training in detail
- Implement multimodal extensions combining EHR + text + images for healthcare prediction
- Validate medical AI using attention visualization, SHAP, clinical validation protocols, and fairness audits
- Design thesis experiments comparing multimodal vs. EHR-only approaches
Path Structure
Week 1: EHR Fundamentals & Clinical NLP
Part 1: EHR Structure and Medical Coding (4 hours)
Concept: Electronic Health Records: Structure and Medical Coding
Topics:
- EHR data components (demographics, diagnoses, procedures, medications, labs, vitals)
- Temporal patient trajectories and event sequences
- Medical coding systems: ICD-10 (diagnoses), ATC (medications), CPT (procedures)
- MIMIC-III/IV database structure and access
- Challenges: missing data, temporal sparsity, high dimensionality, hierarchies
Learning Activities:
- Explore MIMIC-III/IV: Complete CITI training and access the MIMIC-III/IV demo datasets
- Query EHR data: Write SQL queries to extract patient trajectories (a query sketch follows this list)
- Understand coding: Look up ICD-10 codes for common diagnoses (MI, diabetes, pneumonia)
- Visualize trajectories: Plot temporal event sequences for sample patients
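A minimal query sketch for the extraction activity, assuming a local PostgreSQL copy of MIMIC-IV: the schema and column names (mimiciv_hosp.admissions, mimiciv_hosp.diagnoses_icd) follow the public MIMIC-IV layout but should be verified against your installation, and the credentials and subject_id are illustrative placeholders.

```python
# Minimal sketch: one patient's diagnosis trajectory from a local
# PostgreSQL copy of MIMIC-IV. Connection details, schema names, and
# the subject_id are assumptions -- adapt them to your setup.
import pandas as pd
import psycopg2

conn = psycopg2.connect(dbname="mimiciv", user="user", password="password", host="localhost")

query = """
SELECT a.subject_id, a.hadm_id, a.admittime, d.seq_num, d.icd_code, d.icd_version
FROM mimiciv_hosp.admissions a
JOIN mimiciv_hosp.diagnoses_icd d ON a.hadm_id = d.hadm_id
WHERE a.subject_id = %(sid)s
ORDER BY a.admittime, d.seq_num;
"""

# Sorting by admission time and then diagnosis priority (seq_num) yields
# a simple temporal event sequence for one patient.
trajectory = pd.read_sql(query, conn, params={"sid": 10000032})  # placeholder subject_id
print(trajectory.head())
```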
Checkpoint: Can you explain the temporal structure of EHR data and query MIMIC-III/IV for patient trajectories?
Part 2: Medical Event Tokenization and Clinical NLP (6 hours)
Concept: Medical Event Tokenization and Clinical NLP
Topics:
- Code-level tokenization: each medical code as a token (sketched together with the hierarchical variant after this list)
- Hierarchical tokenization: leveraging ICD-10/ATC structure
- BPE tokenization: handling rare codes with subwords
- Temporal encoding: sinusoidal and learned time embeddings
- ClinicalBERT: BERT pre-trained on MIMIC clinical notes
- Medical named entity recognition (NER)
- Combining structured codes and clinical text
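A minimal sketch of the first two strategies. The hierarchical variant exploits the prefix structure of ICD-10 (chapter, category, subcategory), so rare codes back off to their parents instead of collapsing to a single unknown token; the vocabulary and codes below are illustrative.

```python
# Sketch: code-level vs. hierarchical tokenization for ICD-10 codes.

def code_level_tokens(codes, vocab):
    """One token per exact code; unseen codes become <UNK>."""
    return [code if code in vocab else "<UNK>" for code in codes]

def hierarchical_tokens(code):
    """Decompose e.g. 'I21.4' into ['I', 'I21', 'I21.4']."""
    root = code.split(".")[0]
    return [root[0], root, code] if "." in code else [root[0], root]

vocab = {"I21.4", "E11.9"}
trajectory = ["I21.4", "E11.9", "J18.1"]            # J18.1 unseen in vocab
print(code_level_tokens(trajectory, vocab))          # ['I21.4', 'E11.9', '<UNK>']
print([hierarchical_tokens(c) for c in trajectory])  # rare code keeps 'J' and 'J18'
```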
Learning Activities:
- Implement tokenization: Code-level and hierarchical tokenizers for medical events
- Use ClinicalBERT: Load the model from HuggingFace, encode clinical notes, compare to general BERT (see the sketch below)
- Medical NER: Extract diagnoses from clinical text using BioBERT NER
- Temporal encoding: Implement time embeddings for patient trajectories
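A short encoding sketch using the HuggingFace transformers library; emilyalsentzer/Bio_ClinicalBERT is the checkpoint released with Alsentzer et al. (2019), and the note text is a made-up example.

```python
# Encode a clinical note with ClinicalBERT and take the [CLS] vector
# as a note-level representation.
import torch
from transformers import AutoModel, AutoTokenizer

name = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)
model.eval()

note = "Pt presents with crushing substernal chest pain radiating to left arm."
inputs = tokenizer(note, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

cls_embedding = outputs.last_hidden_state[:, 0, :]  # [CLS] token embedding
print(cls_embedding.shape)                          # torch.Size([1, 768])
```

Swapping `name` for `bert-base-uncased` and comparing nearest neighbors of the resulting embeddings is a quick way to see how domain pre-training shifts the representation of clinical shorthand like "Pt" or "substernal".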
Checkpoint: Can you tokenize a patient trajectory and encode clinical notes with ClinicalBERT?
Week 2: Healthcare Foundation Models & Validation
Part 3: Healthcare Foundation Models (6 hours)
Concept: Healthcare Foundation Models: ETHOS, BEHRT, Med-BERT, GatorTron
Topics:
- ETHOS: Zero-shot health trajectory prediction via masked event modeling
- Architecture: encoder-only transformer over EHR event sequences (a minimal sketch follows this list)
- Pre-training: masked event modeling (15% masking)
- Zero-shot transfer: new prediction tasks without task-specific fine-tuning
- Performance: AUROC 0.87 for mortality, 0.72 for readmission
- BEHRT: Bidirectional encoding of diagnosis code sequences
- Med-BERT: Hierarchical embeddings (code-level + visit-level)
- GatorTron: 8.9B parameter clinical language model
- Multimodal extension for EmergAI: EHR + symptom text + 3D sketches
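The sketch below is a simplified ETHOS-style encoder, not the official implementation: event-token embeddings summed with a learned time-bucket embedding, fed through an encoder-only transformer. The dimensions and the time-bucketing scheme are illustrative choices.

```python
# Simplified ETHOS-style encoder: event embeddings + learned time
# embeddings + transformer encoder. All hyperparameters are placeholders.
import torch
import torch.nn as nn

class EHREncoder(nn.Module):
    def __init__(self, vocab_size, n_time_buckets=128, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        self.event_emb = nn.Embedding(vocab_size, d_model, padding_idx=0)
        self.time_emb = nn.Embedding(n_time_buckets, d_model)  # discretized time since admission
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, event_ids, time_buckets, pad_mask):
        # event_ids, time_buckets: (batch, seq); pad_mask: True at padding positions
        x = self.event_emb(event_ids) + self.time_emb(time_buckets)
        return self.encoder(x, src_key_padding_mask=pad_mask)  # (batch, seq, d_model)

enc = EHREncoder(vocab_size=5000)
ids = torch.randint(1, 5000, (2, 64))
t = torch.randint(0, 128, (2, 64))
mask = torch.zeros(2, 64, dtype=torch.bool)
print(enc(ids, t, mask).shape)  # torch.Size([2, 64, 256])
```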
Learning Activities:
- Read ETHOS paper: Zero-shot Health Trajectory Prediction (Renc et al., 2024)
- Implement ETHOS architecture: Simplified version with event embeddings + transformer encoder
- Masked event modeling: Pre-training loop with 15% event masking (see the sketch after this list)
- Compare models: Understand differences between ETHOS, BEHRT, Med-BERT, GatorTron
- Design multimodal extension: Sketch architecture for EmergAI (ETHOS + ClinicalBERT + sketch encoder)
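A minimal masked-event-modeling training step, which assumes the EHREncoder class from the previous sketch is in scope. It masks 15% of non-padding events BERT-style and computes cross-entropy only at masked positions; MASK_ID and all hyperparameters are placeholder choices.

```python
# Masked event modeling (BERT-style) over EHR event sequences.
# Assumes EHREncoder from the sketch above; MASK_ID is a placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

MASK_ID, VOCAB = 1, 5000
encoder = EHREncoder(vocab_size=VOCAB)
head = nn.Linear(256, VOCAB)  # predicts the original event id at each position
opt = torch.optim.AdamW(list(encoder.parameters()) + list(head.parameters()), lr=1e-4)

def mem_step(event_ids, time_buckets, pad_mask):
    # Choose ~15% of non-padding positions and replace them with MASK_ID.
    mask = (torch.rand_like(event_ids, dtype=torch.float) < 0.15) & ~pad_mask
    corrupted = event_ids.masked_fill(mask, MASK_ID)
    hidden = encoder(corrupted, time_buckets, pad_mask)
    logits = head(hidden)                                   # (batch, seq, vocab)
    loss = F.cross_entropy(logits[mask], event_ids[mask])   # loss on masked positions only
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

ids = torch.randint(2, VOCAB, (8, 64))
t = torch.randint(0, 128, (8, 64))
pad = torch.zeros(8, 64, dtype=torch.bool)
print(mem_step(ids, t, pad))
```

Because the model is trained to reconstruct any masked event from its context, downstream outcomes can be framed as predicting a future event token, which is what makes zero-shot transfer possible.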
Checkpoint: Can you explain ETHOS architecture and implement masked event modeling pre-training?
Part 4: Interpretability and Clinical Validation (4 hours)
Concept: Interpretability and Validation for Healthcare AI
Topics:
- Why interpretability is critical in healthcare (safety, trust, regulation)
- Attention visualization: which events the model focuses on (a plotting sketch follows this list)
- SHAP values: quantifying feature contributions
- Clinical validation protocol: retrospective, prospective, clinician review
- Failure analysis: identifying systematic errors
- Fairness audits: checking for demographic bias
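A plotting sketch for attention heatmaps. It uses a HuggingFace encoder with output_attentions=True for convenience; the same pattern applies to a custom ETHOS-style model once it exposes its attention weights.

```python
# Extract and plot last-layer, head-averaged attention weights.
import matplotlib.pyplot as plt
import torch
from transformers import AutoModel, AutoTokenizer

name = "emilyalsentzer/Bio_ClinicalBERT"  # checkpoint assumed, as above
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_attentions=True)
model.eval()

inputs = tokenizer("chest pain, troponin elevated, admitted to CCU", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions is a tuple with one (batch, heads, seq, seq) tensor per layer.
attn = out.attentions[-1][0].mean(dim=0)  # last layer, averaged over heads
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
plt.imshow(attn.numpy(), cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.colorbar(); plt.tight_layout(); plt.show()
```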
Learning Activities:
- Visualize attention: Extract and plot attention weights from ETHOS
- Compute SHAP: Use the SHAP library to explain predictions (see the sketch after this list)
- Failure analysis: Identify patterns in false positives vs. false negatives
- Fairness audit: Evaluate performance across age, sex, race groups
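A SHAP sketch using KernelExplainer on a stand-in tabular risk model (synthetic data, scikit-learn classifier). KernelExplainer is model-agnostic, so for sequence transformers you would typically explain an aggregated feature view like this (e.g., code counts and summary labs) rather than raw token sequences.

```python
# SHAP values for a toy risk model over tabular patient features.
# The data and model are synthetic stand-ins for illustration.
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))  # stand-in features (e.g., code counts, lab summaries)
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)
clf = LogisticRegression().fit(X, y)

background = shap.sample(X, 50)  # background distribution for the explainer
explainer = shap.KernelExplainer(clf.predict_proba, background)
shap_values = explainer.shap_values(X[:5])  # explain 5 patients
print(np.array(shap_values).shape)  # exact shape varies by shap version
```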
Checkpoint: Can you visualize attention maps and compute SHAP values for EHR predictions?
Key Papers
Primary Reference (Must Read)
- ETHOS: Zero-shot Health Trajectory Prediction Using Transformers (Renc et al., 2024) - Your thesis baseline
Foundation Models
- BEHRT: Transformer for Electronic Health Records (Li et al., 2020)
- Med-BERT: Pre-trained Contextualized Embeddings (Rasmy et al., 2020)
- GatorTron: Large Language Model for Medical Research (Yang et al., 2022)
Clinical NLP
- ClinicalBERT: Publicly Available Clinical BERT Embeddings (Alsentzer et al., 2019)
- BioBERT: Pre-trained Biomedical Language Model (Lee et al., 2019)
Interpretability & Fairness
- SHAP: Unified Approach to Interpreting Predictions (Lundberg & Lee, 2017)
- Healthcare Fairness: Fairness in ML for Healthcare (Feng et al., 2022)
Resources
Datasets
- MIMIC-III - 53,423 hospital admissions to critical care units (2001-2012; free access after CITI training)
- MIMIC-IV v3.1 - Over 65,000 ICU patients and over 200,000 ED patients (2008-2022; includes both ICD-9 and ICD-10 coding, improved data quality, provider tracking, better mortality data; available on BigQuery)
- eICU - Multi-center ICU database
Code & Tools
- ETHOS GitHub - Official ETHOS implementation
- MIMIC Code Repository - SQL queries and preprocessing
- ClinicalBERT on HuggingFace
- SHAP Library - Model interpretability
Tutorials
- MIMIC Database Tutorial - Getting started guide for MIMIC-III/IV
- Clinical NLP with Transformers - HuggingFace blog
Project Ideas
1. Reproduce ETHOS Baselines
- Implement ETHOS architecture from paper
- Pre-train on MIMIC-III or MIMIC-IV with masked event modeling
- Evaluate on mortality, readmission, length-of-stay
- Compare to published results
2. Multimodal EHR Extension
- Combine EHR codes + clinical notes using ETHOS + ClinicalBERT
- Cross-attention fusion between modalities
- Evaluate whether text improves predictions over codes alone
3. Interpretability Study
- Visualize attention patterns in ETHOS
- Compute SHAP values for high-risk predictions
- Analyze what events drive mortality predictions
- Conduct fairness audit across demographics
4. Clinical Validation
- Failure analysis: categorize prediction errors
- Identify systematic biases (age, sex, insurance)
- Propose model improvements based on failure patterns
EmergAI Thesis Connection
This path directly supports your thesis research:
Baseline: ETHOS (EHR-Only)
- Pre-train ETHOS on 8M ED visits from the EmergAI dataset
- Evaluate on ED outcomes (admission, ICU transfer, mortality, readmission)
- Establish baseline performance using structured EHR codes only
Extension: Multimodal ETHOS
- Text modality: Patient-reported symptoms from Symptoms.se (encode with ClinicalBERT)
- Visual modality: 3D symptom body sketches (encode with ResNet/ViT)
- Fusion: Cross-attention between EHR, text, and sketch representations (see the sketch after this list)
- Hypothesis: Multimodal data improves predictions over EHR-only baseline
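A cross-attention fusion sketch in which EHR event representations query the text (or sketch) embeddings. The per-modality encoders are assumed to exist, and all dimensions are illustrative.

```python
# Cross-attention fusion: EHR tokens attend over another modality's tokens.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, ehr_tokens, other_tokens):
        # ehr_tokens: (batch, n_events, d); other_tokens: (batch, n_other, d)
        fused, _ = self.attn(query=ehr_tokens, key=other_tokens, value=other_tokens)
        return self.norm(ehr_tokens + fused)  # residual connection + layer norm

fusion = CrossModalFusion()
ehr = torch.randn(2, 64, 256)   # ETHOS-style event representations
text = torch.randn(2, 32, 256)  # note embeddings, projected to the shared width
print(fusion(ehr, text).shape)  # torch.Size([2, 64, 256])
```

Stacking one such block per auxiliary modality (text, then sketches) keeps the EHR sequence as the backbone, which makes the EHR-only baseline a clean ablation.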
Validation
- Interpretability: Attention visualization showing which modalities matter
- SHAP analysis: Quantify contribution of each modality
- Fairness audit: Ensure equitable performance across demographics
- Clinical validation: Failure analysis comparing multimodal vs. baseline errors
Research Questions
- Does adding patient-reported multimodal data improve ED outcome prediction over structured EHR alone?
- Which modality contributes most: EHR codes, symptom text, or symptom sketches?
- Are multimodal models more interpretable than EHR-only models?
- Do multimodal models reduce demographic biases present in EHR-only models?
Assessment
Knowledge Check
After completing this path, you should be able to:
- Explain the structure of EHR data (demographics, diagnoses, procedures, medications, labs, temporal trajectories)
- Describe medical coding systems (ICD-10, ATC, CPT) and their hierarchical structure
- Query MIMIC-III/IV database to extract patient trajectories
- Implement code-level and hierarchical tokenization for medical events
- Encode clinical text using ClinicalBERT and compare to general BERT performance
- Explain ETHOS architecture in detail (event embeddings, temporal encoding, transformer encoder, task heads)
- Describe masked event modeling pre-training and why it enables zero-shot transfer
- Compare ETHOS, BEHRT, Med-BERT, and GatorTron (architecture, pre-training, strengths)
- Design a multimodal extension combining EHR + text + images
- Visualize attention weights to interpret which events matter for predictions
- Compute SHAP values to quantify feature contributions
- Conduct clinical validation (retrospective, prospective, failure analysis, fairness audit)
- Explain why interpretability is critical in healthcare AI (safety, trust, regulation)
Practical Skills
- Load and query MIMIC-III/IV database
- Tokenize patient trajectories (code-level, hierarchical, BPE)
- Use ClinicalBERT to encode clinical notes
- Implement ETHOS architecture in PyTorch
- Pre-train with masked event modeling
- Extract and visualize attention weights
- Compute SHAP values for predictions
- Conduct fairness audit across demographics
Next Steps
After mastering healthcare AI and EHR analysis, consider:
- Advanced Clinical NLP: Med-PaLM, GPT-4 medical applications, clinical question answering
- Medical Imaging: Combine EHR with radiology images (X-ray, CT, MRI)
- Multimodal Healthcare: Vision-language models for medical reports and images
- Reinforcement Learning: Treatment recommendation and clinical decision support
- Federated Learning: Privacy-preserving multi-hospital model training
Related Paths
- Foundation path: Attention & Transformers - Prerequisites for this path
- Advanced path: Multimodal AI for combining EHR, text, and images
Duration: 2 weeks (~20 hours) Difficulty: Advanced Prerequisites: Transformers, language modeling, Python/PyTorch