
Healthcare AI & Electronic Health Records

Overview

This learning path teaches you how to apply deep learning to electronic health records (EHR), combining structured medical codes, clinical text, and temporal patient trajectories. You will master healthcare foundation models like ETHOS, understand clinical NLP with ClinicalBERT, and learn the interpretability requirements for medical AI systems.

Thesis Relevance: This path is directly critical for the EmergAI thesis, where you will extend ETHOS with multimodal patient-reported data (symptom text + 3D sketches) to improve emergency department outcome prediction.

Prerequisites

Before starting this path, you should have:

  • A solid grasp of transformers and attention mechanisms (covered in the Attention & Transformers foundation path)
  • Familiarity with language modeling and pre-training objectives
  • Working proficiency in Python and PyTorch

Learning Objectives

By completing this path, you will be able to:

  1. Navigate EHR databases (MIMIC-III/IV) and understand medical coding systems (ICD-10, ATC, CPT)
  2. Tokenize medical events for transformer models using code-level, hierarchical, or BPE approaches
  3. Process clinical text with ClinicalBERT and perform medical named entity recognition
  4. Explain ETHOS architecture and masked event modeling pre-training in detail
  5. Implement multimodal extensions combining EHR + text + images for healthcare prediction
  6. Validate medical AI using attention visualization, SHAP, clinical validation protocols, and fairness audits
  7. Design thesis experiments comparing multimodal vs. EHR-only approaches

Path Structure

Week 1: EHR Fundamentals & Clinical NLP

Part 1: EHR Structure and Medical Coding (4 hours)

Concept: Electronic Health Records: Structure and Medical Coding

Topics:

  • EHR data components (demographics, diagnoses, procedures, medications, labs, vitals)
  • Temporal patient trajectories and event sequences
  • Medical coding systems: ICD-10 (diagnoses), ATC (medications), CPT (procedures)
  • MIMIC-III/IV database structure and access
  • Challenges: missing data, temporal sparsity, high dimensionality, hierarchies

Learning Activities:

  1. Explore MIMIC-III/IV: Complete CITI training and access MIMIC-III/IV demo
  2. Query EHR data: Write SQL queries to extract patient trajectories
  3. Understand coding: Look up ICD-10 codes for common diagnoses (MI, diabetes, pneumonia)
  4. Visualize trajectories: Plot temporal event sequences for sample patients
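The querying step above can be sketched with a tiny in-memory database. The table and column names below imitate MIMIC's `ADMISSIONS` and `DIAGNOSES_ICD` schemas but are simplified toy assumptions; real MIMIC access requires completing CITI training first.

```python
import sqlite3

# Miniature stand-ins for MIMIC's ADMISSIONS and DIAGNOSES_ICD tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (subject_id INT, hadm_id INT, admittime TEXT);
CREATE TABLE diagnoses_icd (subject_id INT, hadm_id INT, icd_code TEXT);
INSERT INTO admissions VALUES (1, 100, '2019-01-03'), (1, 101, '2019-06-20');
INSERT INTO diagnoses_icd VALUES (1, 100, 'I21.9'), (1, 101, 'E11.9');
""")

def patient_trajectory(subject_id):
    """Return one patient's diagnoses ordered by admission time."""
    return conn.execute("""
        SELECT a.admittime, d.icd_code
        FROM admissions a
        JOIN diagnoses_icd d ON a.hadm_id = d.hadm_id
        WHERE a.subject_id = ?
        ORDER BY a.admittime
    """, (subject_id,)).fetchall()

print(patient_trajectory(1))
# [('2019-01-03', 'I21.9'), ('2019-06-20', 'E11.9')]
```

Ordering by admission time is what turns a flat diagnosis table into the temporal trajectory that sequence models consume.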

Checkpoint: Can you explain the temporal structure of EHR data and query MIMIC-III/IV for patient trajectories?

Part 2: Medical Event Tokenization and Clinical NLP (6 hours)

Concept: Medical Event Tokenization and Clinical NLP

Topics:

  • Code-level tokenization: each medical code as a token
  • Hierarchical tokenization: leveraging ICD-10/ATC structure
  • BPE tokenization: handling rare codes with subwords
  • Temporal encoding: sinusoidal and learned time embeddings
  • ClinicalBERT: BERT pre-trained on MIMIC clinical notes
  • Medical named entity recognition (NER)
  • Combining structured codes and clinical text
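The first two tokenization strategies can be contrasted in a few lines. This is a hedged sketch: it assumes ICD-10-style codes of the form `letter + category + '.' + subcategory` (e.g. `I21.9`) and emits chapter/category/full-code prefixes so that rare codes share tokens with common ones.

```python
def code_level_tokens(codes):
    # Each full medical code becomes exactly one token.
    return list(codes)

def hierarchical_tokens(codes):
    # Split ICD-10-style codes into chapter / category / full-code
    # tokens, e.g. 'I21.9' -> ['I', 'I21', 'I21.9'], so rare codes
    # share vocabulary with their more common ancestors.
    tokens = []
    for code in codes:
        category = code.split(".")[0]
        tokens.extend([category[0], category, code])
    return tokens

print(code_level_tokens(["I21.9", "E11.9"]))  # ['I21.9', 'E11.9']
print(hierarchical_tokens(["I21.9"]))         # ['I', 'I21', 'I21.9']
```

Hierarchical tokens triple the sequence length but shrink the effective vocabulary, which is the trade-off the topics above refer to.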

Learning Activities:

  1. Implement tokenization: Code-level and hierarchical tokenizers for medical events
  2. Use ClinicalBERT: Load model from HuggingFace, encode clinical notes, compare to general BERT
  3. Medical NER: Extract diagnoses from clinical text using BioBERT NER
  4. Temporal encoding: Implement time embeddings for patient trajectories
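Activity 4 can be sketched with the sinusoidal recipe from the Transformer positional encoding, applied to elapsed time instead of token position. The dimension and `max_period` values are illustrative assumptions, not values from any particular paper.

```python
import math

def time_embedding(t_days, dim=8, max_period=10000.0):
    """Sinusoidal embedding of elapsed time (in days) between events,
    following the Transformer positional-encoding recipe: each
    frequency contributes a (sin, cos) pair."""
    emb = []
    for i in range(dim // 2):
        freq = 1.0 / (max_period ** (2 * i / dim))
        emb.append(math.sin(t_days * freq))
        emb.append(math.cos(t_days * freq))
    return emb

vec = time_embedding(30.0)
print(len(vec))  # 8
```

Unlike discrete position indices, real-valued elapsed time handles the irregular gaps between clinical events directly.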

Checkpoint: Can you tokenize a patient trajectory and encode clinical notes with ClinicalBERT?

Week 2: Healthcare Foundation Models & Validation

Part 3: Healthcare Foundation Models (6 hours)

Concept: Healthcare Foundation Models: ETHOS, BEHRT, Med-BERT, GatorTron

Topics:

  • ETHOS: Zero-shot health trajectory prediction via masked event modeling
    • Architecture: encoder-only transformer for EHR sequences
    • Pre-training: masked event modeling (15% masking)
    • Zero-shot transfer: predict new tasks without fine-tuning
    • Performance: AUROC 0.87 for mortality, 0.72 for readmission
  • BEHRT: Bidirectional encoding of diagnosis code sequences
  • Med-BERT: Hierarchical embeddings (code-level + visit-level)
  • GatorTron: 8.9B parameter clinical language model
  • Multimodal extension for EmergAI: EHR + symptom text + 3D sketches

Learning Activities:

  1. Read the ETHOS paper: "Zero-shot Health Trajectory Prediction" (Renc et al., 2024)
  2. Implement ETHOS architecture: Simplified version with event embeddings + transformer encoder
  3. Masked event modeling: Pre-training loop with 15% event masking
  4. Compare models: Understand differences between ETHOS, BEHRT, Med-BERT, GatorTron
  5. Design multimodal extension: Sketch architecture for EmergAI (ETHOS + ClinicalBERT + sketch encoder)
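The masking step at the heart of activity 3 can be sketched as below. This is a simplified illustration of BERT-style masked event modeling with the 15% rate mentioned above, not ETHOS's actual code; the event tokens are made-up examples.

```python
import random

MASK, MASK_RATE = "[MASK]", 0.15

def mask_events(trajectory, rng):
    """Replace ~15% of events with a [MASK] token and record the
    original events as prediction targets, keyed by position."""
    masked, targets = [], {}
    for i, event in enumerate(trajectory):
        if rng.random() < MASK_RATE:
            masked.append(MASK)
            targets[i] = event
        else:
            masked.append(event)
    return masked, targets

rng = random.Random(0)
traj = ["I21.9", "N02BE01", "LAB:troponin_high", "CPT:92950"] * 5
masked, targets = mask_events(traj, rng)
print(sum(tok == MASK for tok in masked), "of", len(traj), "events masked")
```

The pre-training loss is then cross-entropy between the model's predictions at the masked positions and the recorded targets; because the objective needs no labels, the same pre-trained model can be queried zero-shot on new prediction tasks.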

Checkpoint: Can you explain ETHOS architecture and implement masked event modeling pre-training?

Part 4: Interpretability and Clinical Validation (4 hours)

Concept: Interpretability and Validation for Healthcare AI

Topics:

  • Why interpretability is critical in healthcare (safety, trust, regulation)
  • Attention visualization: which events the model focuses on
  • SHAP values: quantifying feature contributions
  • Clinical validation protocol: retrospective, prospective, clinician review
  • Failure analysis: identifying systematic errors
  • Fairness audits: checking for demographic bias

Learning Activities:

  1. Visualize attention: Extract and plot attention weights from ETHOS
  2. Compute SHAP: Use SHAP library to explain predictions
  3. Failure analysis: Identify patterns in false positives vs. false negatives
  4. Fairness audit: Evaluate performance across age, sex, race groups
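Activity 4 reduces to computing a metric per demographic group and comparing. A minimal dependency-free sketch, using a rank-based AUROC and a hypothetical binary `sex` attribute with toy data:

```python
from collections import defaultdict

def auroc(labels, scores):
    """Probability that a random positive is scored above a random
    negative (ties count half) -- a rank-based AUROC."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fairness_audit(labels, scores, groups):
    """AUROC per demographic group; large gaps flag potential bias."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auroc(ys, ss) for g, (ys, ss) in by_group.items()}

labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.4, 0.6, 0.7, 0.9, 0.1]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
print(fairness_audit(labels, scores, groups))
# {'F': 1.0, 'M': 0.75}
```

A real audit would repeat this across age, sex, and race strata and report confidence intervals, since small subgroups make point estimates noisy.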

Checkpoint: Can you visualize attention maps and compute SHAP values for EHR predictions?

Key Papers

Primary Reference (Must Read)

  • Renc et al. (2024), "Zero-shot Health Trajectory Prediction" — the ETHOS paper

Foundation Models

Clinical NLP

Interpretability & Fairness

Resources

Datasets

  • MIMIC-III  - 53,423 ICU admissions (2001-2012, free with CITI training)
  • MIMIC-IV v3.1  - Over 65,000 ICU patients + over 200,000 ED patients (2008-2022, enhanced with ICD-9/10, better data quality, provider tracking, improved mortality data; available on BigQuery)
  • eICU  - Multi-center ICU database

Code & Tools

Tutorials

Project Ideas

1. Reproduce ETHOS Baselines

  • Implement ETHOS architecture from paper
  • Pre-train on MIMIC-III or MIMIC-IV with masked event modeling
  • Evaluate on mortality, readmission, length-of-stay
  • Compare to published results

2. Multimodal EHR Extension

  • Combine EHR codes + clinical notes using ETHOS + ClinicalBERT
  • Cross-attention fusion between modalities
  • Evaluate whether text improves predictions over codes alone
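The cross-attention fusion in project 2 can be illustrated with one dependency-free attention step: the EHR representation acts as the query and attends over the other modalities' key/value vectors. A real model would use learned projections, multiple heads, and PyTorch; the 2-d vectors here are toy assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(query, keys, values):
    """One scaled dot-product attention step: the EHR query attends
    over the text/sketch keys and mixes their values accordingly."""
    d = len(query)
    weights = softmax([dot(query, k) / math.sqrt(d) for k in keys])
    fused = [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
    return fused, weights

ehr_q = [1.0, 0.0]                  # query from the EHR encoder
keys = [[1.0, 0.0], [0.0, 1.0]]     # text key, sketch key
values = [[0.2, 0.8], [0.9, 0.1]]   # text value, sketch value
fused, weights = cross_attention(ehr_q, keys, values)
print([round(w, 3) for w in weights])
```

The attention weights double as a first interpretability signal: they show how much each modality contributed to the fused representation.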

3. Interpretability Study

  • Visualize attention patterns in ETHOS
  • Compute SHAP values for high-risk predictions
  • Analyze what events drive mortality predictions
  • Conduct fairness audit across demographics

4. Clinical Validation

  • Failure analysis: categorize prediction errors
  • Identify systematic biases (age, sex, insurance)
  • Propose model improvements based on failure patterns

EmergAI Thesis Connection

This path directly supports your thesis research:

Baseline: ETHOS (EHR-Only)

  • Pre-train ETHOS on 8M ED visits from the EmergAI dataset
  • Evaluate on ED outcomes (admission, ICU transfer, mortality, readmission)
  • Establish baseline performance using structured EHR codes only

Extension: Multimodal ETHOS

  • Text modality: Patient-reported symptoms from Symptoms.se (encode with ClinicalBERT)
  • Visual modality: 3D symptom body sketches (encode with ResNet/ViT)
  • Fusion: Cross-attention between EHR, text, and sketch representations
  • Hypothesis: Multimodal data improves predictions over EHR-only baseline

Validation

  • Interpretability: Attention visualization showing which modalities matter
  • SHAP analysis: Quantify contribution of each modality
  • Fairness audit: Ensure equitable performance across demographics
  • Clinical validation: Failure analysis comparing multimodal vs. baseline errors

Research Questions

  1. Does adding patient-reported multimodal data improve ED outcome prediction over structured EHR alone?
  2. Which modality contributes most: EHR codes, symptom text, or symptom sketches?
  3. Are multimodal models more interpretable than EHR-only models?
  4. Do multimodal models reduce demographic biases present in EHR-only models?

Assessment

Knowledge Check

After completing this path, you should be able to:

  • Explain the structure of EHR data (demographics, diagnoses, procedures, medications, labs, temporal trajectories)
  • Describe medical coding systems (ICD-10, ATC, CPT) and their hierarchical structure
  • Query MIMIC-III/IV database to extract patient trajectories
  • Implement code-level and hierarchical tokenization for medical events
  • Encode clinical text using ClinicalBERT and compare to general BERT performance
  • Explain ETHOS architecture in detail (event embeddings, temporal encoding, transformer encoder, task heads)
  • Describe masked event modeling pre-training and why it enables zero-shot transfer
  • Compare ETHOS, BEHRT, Med-BERT, and GatorTron (architecture, pre-training, strengths)
  • Design a multimodal extension combining EHR + text + images
  • Visualize attention weights to interpret which events matter for predictions
  • Compute SHAP values to quantify feature contributions
  • Conduct clinical validation (retrospective, prospective, failure analysis, fairness audit)
  • Explain why interpretability is critical in healthcare AI (safety, trust, regulation)

Practical Skills

  • Load and query MIMIC-III/IV database
  • Tokenize patient trajectories (code-level, hierarchical, BPE)
  • Use ClinicalBERT to encode clinical notes
  • Implement ETHOS architecture in PyTorch
  • Pre-train with masked event modeling
  • Extract and visualize attention weights
  • Compute SHAP values for predictions
  • Conduct fairness audit across demographics

Next Steps

After mastering healthcare AI and EHR analysis, consider:

  1. Advanced Clinical NLP: Med-PaLM, GPT-4 medical applications, clinical question answering
  2. Medical Imaging: Combine EHR with radiology images (X-ray, CT, MRI)
  3. Multimodal Healthcare: Vision-language models for medical reports and images
  4. Reinforcement Learning: Treatment recommendation and clinical decision support
  5. Federated Learning: Privacy-preserving multi-hospital model training
Related paths:

  • Foundation path: Attention & Transformers - prerequisite for this path
  • Advanced path: Multimodal AI - combining EHR, text, and images

Duration: 2 weeks (~16 hours)
Difficulty: Advanced
Prerequisites: Transformers, language modeling, Python/PyTorch