Research Methodology & Academic Writing

This learning path transitions you from student to researcher. Learn how to critically read papers, formulate research questions, design rigorous experiments, structure academic papers, and publish your work.

Learning Objectives

By completing this path, you will be able to:

  • Read papers efficiently using the three-pass method
  • Identify research gaps and formulate testable hypotheses
  • Design rigorous experiments with proper baselines and ablations
  • Write academic papers using IMRaD structure
  • Conduct statistical validation with confidence intervals and significance tests
  • Plan publication strategy for conferences and journals

Prerequisites

Technical Knowledge

  • Understanding of machine learning fundamentals
  • Familiarity with at least one domain (computer vision, NLP, healthcare AI)
  • Experience implementing models and running experiments

Path Overview

Week 1: Research Fundamentals (6-10 hours)

This intensive week covers the essential skills for ML research.

Day 1-2: Reading and Understanding Papers (2-3 hours)

Content: Reading Research Papers

Learn:

  • Three-pass method (Quick scan → Detailed reading → Deep understanding)
  • Note-taking strategies for paper reading
  • Building a literature map of your field
  • Critical reading questions (validity, assumptions, limitations)
  • Organizing papers by theme and contribution

Practice:

  1. Read 5 papers using Pass 1 (5-10 min each)
  2. Read 2 papers using Pass 2 (30-60 min each)
  3. Deep-read 1 key paper using Pass 3 (2-4 hours)
  4. Create a literature map for your research area
  5. Write summaries for each paper in your own words

Papers to Start With (if doing healthcare AI):

Day 3: Research Questions and Hypotheses (2 hours)

Content: Formulating Research Questions

Learn:

  • Four types of research gaps (methodological, empirical, theoretical, application)
  • PICOT framework for structuring questions (Population, Intervention, Comparison, Outcome, Time)
  • Null vs alternative hypotheses
  • Scoping research appropriately (a narrow scope explored in depth)
  • Validating your research question

Exercise:

  1. Identify 3 research gaps in your area
  2. Use PICOT to structure one research question
  3. Write null and alternative hypotheses
  4. List 3 specific contributions you would make
  5. Validate feasibility (data, compute, timeline)
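
One way to keep a PICOT question honest is to store each element separately and check that none is blank. The sketch below is illustrative only: the class name `PicotQuestion` and its methods are hypothetical, and the field values are taken from this section's example question about ED outcome prediction, which states no time frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicotQuestion:
    """One research question broken into its five PICOT elements."""
    population: str    # P: who or what is studied
    intervention: str  # I: the proposed method
    comparison: str    # C: the baseline it is compared against
    outcome: str       # O: the metric that decides success
    time: str          # T: the time frame of the study or outcome

    def missing_elements(self) -> list[str]:
        """Return the names of elements that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def as_question(self) -> str:
        """Render the elements as a single question sentence."""
        return (
            f"In {self.population}, does {self.intervention} improve "
            f"{self.outcome} compared to {self.comparison} {self.time}?"
        )


question = PicotQuestion(
    population="emergency department (ED) visits",
    intervention="a multimodal transformer fusing structured EHR with patient symptoms",
    comparison="an EHR-only model (ETHOS baseline)",
    outcome="outcome prediction AUROC",
    time="",  # left blank on purpose: the example question states no time frame
)
print(question.missing_elements())  # ['time']
```

A blank element is a prompt to go back and tighten the question before designing experiments.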

Example Output:

“Can multimodal transformers fusing structured electronic health record (EHR) data with patient symptoms improve emergency department (ED) outcome prediction (AUROC) compared to EHR-only models (ETHOS baseline), evaluated on 50k visits with a temporal split?”
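
For the example question above, a matching pair of hypotheses could be written as follows. This is a sketch under one assumption: the test is one-sided, because the question asks only whether the multimodal model improves on the baseline. A two-sided version would replace the inequalities with equality and inequality of the two AUROC values.

```latex
\begin{align*}
H_0 &: \mathrm{AUROC}_{\text{multimodal}} \le \mathrm{AUROC}_{\text{EHR-only}} \\
H_1 &: \mathrm{AUROC}_{\text{multimodal}} > \mathrm{AUROC}_{\text{EHR-only}}
\end{align*}
```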

Day 4-5: Experimental Design (2-3 hours)

Content: Experimental Design

Learn:

  • Choosing strong baselines (simple, state-of-the-art, ablated)
  • Ablation studies to test component contributions
  • Data splits (temporal vs random, avoiding leakage)
  • Statistical significance testing (t-tests, confidence intervals)
  • Evaluation metrics for ML (accuracy, AUROC, AUPRC, F1)
  • Reproducibility requirements
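
To make the temporal-versus-random distinction concrete, here is a minimal sketch of a temporal split using NumPy. The function name `temporal_split`, the split fractions, and the visit dates are all hypothetical. The idea is only that records are ordered by time before being cut, so the test set is strictly later than the training set.

```python
import numpy as np


def temporal_split(timestamps, train_frac=0.7, val_frac=0.15):
    """Return index arrays (train, val, test) ordered by time.

    The earliest records go to train, the next block to validation, and
    the most recent block to test, so no future record informs training.
    """
    order = np.argsort(np.asarray(timestamps), kind="stable")
    n = len(order)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


# Hypothetical visit dates, deliberately out of order.
dates = np.array(
    ["2021-03-01", "2020-01-15", "2022-07-09", "2020-06-30", "2021-11-20",
     "2019-12-01", "2022-01-05", "2020-09-12", "2021-06-18", "2022-10-30"],
    dtype="datetime64[D]",
)
train_idx, val_idx, test_idx = temporal_split(dates, train_frac=0.6, val_frac=0.2)

# Leakage check: every training record precedes every validation and test record.
assert dates[train_idx].max() <= dates[val_idx].min() <= dates[test_idx].min()
```

A temporal split on its own does not stop the same patient from appearing on both sides of the cut, so a real pipeline may also need to split by patient identifier.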

Practice:

  1. Design experiment suite for your research question:
    • Define 3+ baselines (simple, SOTA, ablations)
    • Specify data splits (train/val/test percentages)
    • Choose evaluation metrics (primary + secondary)
    • Plan statistical tests (which comparisons matter?)
  2. Create experiment tracking template
  3. Write reproducibility checklist for your project
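
An experiment tracking template can be as simple as one structured record per run. The sketch below uses only the Python standard library. The class `ExperimentRecord` and its fields are hypothetical suggestions, not a required schema, and a tool such as Weights & Biases or MLflow could fill the same role.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ExperimentRecord:
    """One row of a hypothetical experiment log."""
    name: str          # short label, e.g. "baseline-logreg"
    git_commit: str    # code version that produced the run
    seed: int          # random seed, needed to rerun
    data_split: str    # e.g. "temporal 70/15/15"
    hyperparameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    notes: str = ""
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        """Serialize to one JSON line, suitable for appending to a log file."""
        return json.dumps(asdict(self), sort_keys=True)


record = ExperimentRecord(
    name="baseline-logreg",
    git_commit="<commit hash>",  # placeholder, fill in from version control
    seed=0,
    data_split="temporal 70/15/15",
    hyperparameters={"C": 1.0},
    metrics={},  # fill in after the run finishes
)
line = record.to_json_line()
```

Recording the commit, seed, and split alongside the metrics doubles as the start of a reproducibility checklist.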

Deliverable: Complete experimental design document with baselines, metrics, splits, and statistical tests.
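
When two models are evaluated on the same seeds or folds, a paired t-test is one reasonable choice for the planned statistical test. The sketch below uses `scipy.stats.ttest_rel` and a t-based confidence interval for the mean difference. The helper name `paired_comparison` is hypothetical, and the scores are placeholders for illustration, not real results.

```python
import numpy as np
from scipy import stats


def paired_comparison(scores_a, scores_b, confidence=0.95):
    """Paired t-test and CI for the mean difference (a minus b).

    Each position must come from the same seed or fold for both models.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    diff = a - b
    t_stat, p_value = stats.ttest_rel(a, b)
    # CI for the mean difference, t distribution with n - 1 degrees of freedom.
    low, high = stats.t.interval(
        confidence, df=len(diff) - 1, loc=diff.mean(), scale=stats.sem(diff)
    )
    return {"mean_diff": diff.mean(), "ci": (low, high), "p_value": p_value}


# Placeholder scores for illustration only, not real results.
proposed = [0.81, 0.83, 0.80, 0.84, 0.82]
baseline = [0.78, 0.80, 0.79, 0.80, 0.79]
result = paired_comparison(proposed, baseline)
```

With only a handful of seeds the t-test's normality assumption is hard to check, so a non-parametric alternative such as `scipy.stats.wilcoxon` may be worth considering.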

Day 6-7: Academic Writing (2-4 hours)

Content: Structuring Research Papers

Learn:

  • IMRaD structure (Introduction, Methods, Results, Discussion)
  • Writing each section effectively
  • Creating clear figures and tables
  • Presenting results with confidence intervals
  • Discussing limitations honestly
  • Abstract and conclusion best practices

Practice:

  1. Write Introduction for your research (1 page):
    • Motivation (why this matters)
    • Gap (what’s missing)
    • Contributions (what you’ll provide)
    • Results preview (tease findings)
  2. Outline Methods section with all required details
  3. Design main results table with baselines and metrics
  4. Draft limitations section (be specific and honest)
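
A main results table is easier to keep consistent when it is generated from the numbers rather than typed by hand. The following sketch formats each metric as a point estimate with its confidence interval and emits a LaTeX `tabular`. The function names are hypothetical and every value shown is a placeholder, not a result.

```python
def format_ci(mean, low, high, digits=3):
    """Format a point estimate with its confidence interval."""
    return f"{mean:.{digits}f} ({low:.{digits}f}-{high:.{digits}f})"


def latex_results_table(rows, metric_names):
    """Build a LaTeX tabular from rows of (model, {metric: (mean, low, high)})."""
    header = " & ".join(["Model"] + list(metric_names)) + r" \\"
    lines = [
        r"\begin{tabular}{l" + "c" * len(metric_names) + "}",
        r"\hline",
        header,
        r"\hline",
    ]
    for model, metrics in rows:
        cells = [format_ci(*metrics[name]) for name in metric_names]
        lines.append(" & ".join([model] + cells) + r" \\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


# Placeholder numbers only; replace with measured values and their 95% CIs.
rows = [
    ("Logistic regression", {"AUROC": (0.70, 0.68, 0.72), "AUPRC": (0.30, 0.27, 0.33)}),
    ("Proposed model", {"AUROC": (0.75, 0.73, 0.77), "AUPRC": (0.35, 0.32, 0.38)}),
]
table = latex_results_table(rows, ["AUROC", "AUPRC"])
print(table)
```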

Template: Use LaTeX template for ICML, NeurIPS, or other target venue.

Research Workflow

Phase 1: Planning (Week 1 of research)

  1. Literature review: Read 30-40 papers (Pass 1-2)
  2. Identify gap: Find what’s missing
  3. Formulate question: Specific, testable, novel
  4. Design experiments: Baselines, metrics, splits
  5. Set up tools: Reference manager, experiment tracking, LaTeX

Phase 2: Implementation (Weeks 2-8 of research)

  1. Start with a simple baseline: Logistic regression or another simple model
  2. Implement state-of-the-art baseline: Reproduce existing work
  3. Build your model: Incremental complexity
  4. Log all experiments: Keep detailed notes
  5. Write Methods section: Document as you go

Phase 3: Experimentation (Weeks 9-16 of research)

  1. Run baseline comparisons: Establish performance floor
  2. Ablation studies: Test each component
  3. Statistical validation: Confidence intervals, significance tests
  4. Error analysis: Understand failure modes
  5. Write Results section: Tables, figures, text
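
For the statistical validation step, a percentile bootstrap is one common way to attach a confidence interval to a metric such as AUROC. The sketch below is a minimal version under stated assumptions: AUROC is computed from ranks (the Mann-Whitney identity), the function names are hypothetical, and the labels and scores are synthetic.

```python
import numpy as np
from scipy.stats import rankdata


def auroc(y_true, y_score):
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    y_true = np.asarray(y_true, dtype=bool)
    ranks = rankdata(y_score)
    n_pos = y_true.sum()
    n_neg = y_true.size - n_pos
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc_ci(y_true, y_score, n_boot=2000, confidence=0.95, seed=0):
    """Percentile bootstrap CI: resample test cases with replacement."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    rng = np.random.default_rng(seed)
    estimates = []
    while len(estimates) < n_boot:
        idx = rng.integers(0, y_true.size, size=y_true.size)
        if y_true[idx].all() or not y_true[idx].any():
            continue  # AUROC is undefined without both classes; redraw
        estimates.append(auroc(y_true[idx], y_score[idx]))
    tail = (1 - confidence) / 2 * 100
    low, high = np.percentile(estimates, [tail, 100 - tail])
    return auroc(y_true, y_score), (low, high)


# Synthetic labels and weakly informative scores, for illustration only.
rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=200)
scores = labels * 0.5 + rng.normal(size=200)
point, (low, high) = bootstrap_auroc_ci(labels, scores, n_boot=500)
```

The same resampling loop works for AUPRC or any other metric by swapping the metric function.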

Phase 4: Writing (Weeks 17-20 of research)

  1. Complete all sections: Introduction, Methods, Results, Discussion
  2. Create figures: High-quality, clear, self-contained
  3. Write Abstract: Do this last; aim for about 250 words, or the venue's limit
  4. Revise thoroughly: Multiple passes, get feedback
  5. Format for venue: Follow submission guidelines

Phase 5: Submission & Revision (Weeks 21-24 of research)

  1. Submit to conference/journal: Follow deadlines
  2. Handle reviews: Read carefully, respond professionally
  3. Revise paper: Address all reviewer comments
  4. Resubmit: Complete revision within deadline

Practical Examples

Example 1: Healthcare AI Research

Question: Can multimodal EHR + symptoms improve ED outcome prediction?

Experiments:

  • Baseline 1: Logistic regression (age, sex, chief complaint)
  • Baseline 2: ETHOS (state-of-the-art EHR model)
  • Ablation 1: EHR + text only
  • Ablation 2: EHR + sketch only
  • Full model: EHR + text + sketch
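
The ablation rows above follow a pattern: keep the EHR input fixed and add each optional modality alone and then together. A short sketch can enumerate these configurations so none is forgotten. The function name is hypothetical, and the EHR-only configuration is included here as an assumption so that each added modality has a reference point.

```python
from itertools import combinations

# Modality names follow the experiment list above; EHR is always included.
BASE = "EHR"
OPTIONAL = ("text", "sketch")


def ablation_configs(base, optional):
    """Enumerate every subset of optional inputs added to the base input."""
    configs = []
    for k in range(len(optional) + 1):
        for subset in combinations(optional, k):
            configs.append((base,) + subset)
    return configs


configs = ablation_configs(BASE, OPTIONAL)
for config in configs:
    print(" + ".join(config))
# EHR
# EHR + text
# EHR + sketch
# EHR + text + sketch
```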

Results: Report AUROC, AUPRC, calibration with 95% CI
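
The source does not say which calibration measure to report. Expected calibration error (ECE) is one common summary, sketched below for binary predictions with equal-width bins. The function name and the default bin count are assumptions. A 95% CI could be obtained by recomputing the value on resampled test sets.

```python
import numpy as np


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE for binary predictions with equal-width probability bins."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Assign each prediction to a bin; a probability of 1.0 goes in the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = bin_ids == b
        if not in_bin.any():
            continue
        # Gap between the observed positive rate and the mean predicted probability.
        gap = abs(y_true[in_bin].mean() - y_prob[in_bin].mean())
        ece += in_bin.mean() * gap  # weight by the share of samples in the bin
    return ece


# Toy check: predictions of 0.75 with a 75% positive rate are perfectly calibrated.
well_calibrated = expected_calibration_error([0, 1, 1, 1], [0.75, 0.75, 0.75, 0.75])
overconfident = expected_calibration_error([0, 0, 1, 1], [1.0, 1.0, 1.0, 1.0])
```

ECE depends on the number of bins, so the bin count should be reported along with the value.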

Paper: 8-page conference paper (NeurIPS ML4H, CHIL, or similar)

Example 2: Multimodal VLM Research

Question: Can vision-language pre-training enable zero-shot medical image classification?

Experiments:

  • Baseline 1: ResNet-50 supervised (ImageNet pre-training)
  • Baseline 2: ResNet-50 supervised (medical data only)
  • Baseline 3: CLIP zero-shot (no medical fine-tuning)
  • Proposed: CLIP fine-tuned on medical image-text pairs

Results: Report zero-shot accuracy, few-shot accuracy (1, 5, 10 examples)
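
Zero-shot accuracy with a CLIP-style model is usually computed by comparing each image embedding with one text embedding per class and picking the most similar. The sketch below shows only that comparison step in NumPy. The function names are hypothetical, and the toy 2-D vectors stand in for real encoder outputs.

```python
import numpy as np


def zero_shot_predict(image_embeddings, text_embeddings):
    """Assign each image to the class whose text embedding is most similar.

    image_embeddings: (n_images, dim); text_embeddings: (n_classes, dim),
    one row per class prompt. Similarity is cosine similarity.
    """
    img = np.asarray(image_embeddings, dtype=float)
    txt = np.asarray(text_embeddings, dtype=float)
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    similarity = img @ txt.T  # shape (n_images, n_classes)
    return similarity.argmax(axis=1)


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return float(np.mean(np.asarray(predictions) == np.asarray(labels)))


# Toy 2-D embeddings standing in for real encoder outputs.
text_emb = np.array([[1.0, 0.0], [0.0, 1.0]])  # class 0, class 1
image_emb = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
preds = zero_shot_predict(image_emb, text_emb)
acc = accuracy(preds, [0, 1, 1])
```

Few-shot accuracy uses the same evaluation, after fitting a classifier on 1, 5, or 10 labeled examples per class.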

Paper: 8-page conference paper (MICCAI, CVPR, or similar)

Tools and Resources

Reference Management

  • Zotero (recommended): Free, open-source, browser integration
  • Mendeley: PDF management, citation plugin
  • Papers (ReadCube Papers): Visual library, desktop and web apps

Writing and LaTeX

  • Overleaf (recommended): Online LaTeX editor, collaboration
  • TeXShop (Mac) or TeXworks (Windows, macOS, Linux): Local LaTeX editors
  • Conference templates: ICML, NeurIPS, ICLR, ACL, CVPR

Experiment Tracking

  • Weights & Biases (recommended): ML experiment tracking, visualization
  • TensorBoard: PyTorch/TensorFlow logging
  • MLflow: Experiment and model management
  • Git: Version control for code

Statistical Testing

  • SciPy: Python statistics library (scipy.stats)
  • Statsmodels: Advanced statistical modeling
  • Seaborn/Matplotlib: Plotting confidence intervals

Success Criteria

You are ready to execute your research when you can:

  • Read and critically evaluate research papers efficiently
  • Identify clear research gaps in your chosen area
  • Formulate specific, testable research questions
  • Design experiments with proper baselines and ablations
  • Choose appropriate evaluation metrics for your problem
  • Plan statistical validation (confidence intervals, p-values)
  • Write academic papers following IMRaD structure
  • Create clear figures and tables for results
  • Discuss limitations honestly and specifically
  • Use research tools (LaTeX, reference managers, experiment tracking)

Checkpoint Assessment

After completing this path, you should be able to complete each of the following timed exercises:

Research Design (30 min)

Given a research problem:

  1. Identify the research gap (which type?)
  2. Formulate a testable hypothesis
  3. Design an experiment with 3+ baselines
  4. Specify evaluation metrics and statistical tests
  5. Estimate timeline and resource requirements

Paper Critique (30 min)

Given a research paper:

  1. Summarize the core contribution in 2 sentences
  2. Identify the research gap they address
  3. Evaluate their baseline comparisons (strong or weak?)
  4. Critique their evaluation (metrics, statistical tests)
  5. List 2-3 limitations they should have mentioned

Academic Writing (1 hour)

Write an Introduction section (1 page) for your research:

  1. Motivation: Why does this problem matter?
  2. Gap: What’s missing in current work?
  3. Contributions: What will your work provide?
  4. Results preview: Tease main findings

Peer review: Have a colleague evaluate your Introduction.

Technical Foundations

Domain Applications

Next Steps

After mastering research methodology:

For Healthcare AI Research

  1. Complete Healthcare AI path
  2. Study healthcare foundation models
  3. Design multimodal EHR + symptom prediction experiments
  4. Implement and compare against ETHOS baseline
  5. Write paper for CHIL, NeurIPS ML4H, or JAMIA

For Computer Vision Research

  1. Complete CNN foundations and VLM path
  2. Study CLIP and ViT
  3. Design zero-shot or few-shot learning experiments
  4. Implement vision-language models for your domain
  5. Write paper for CVPR, ICCV, or ECCV

For NLP Research

  1. Complete Transformers and GPT path
  2. Study language model training
  3. Design domain adaptation or fine-tuning experiments
  4. Implement specialized language models
  5. Write paper for ACL, EMNLP, or NAACL

Timeline Summary

Week | Focus | Hours | Deliverable
Week 1 | Research Methodology | 6-10 | Experiment design document

Total: 1 week intensive study, then apply throughout your research project (3-6 months).

Key Takeaways

  1. Three-pass reading: Quick scan → Detailed read → Deep understanding
  2. Research gaps: Identify what’s missing (methodological, empirical, theoretical, application)
  3. Strong baselines: Compare against simple, state-of-the-art, and ablated models
  4. Statistical rigor: Always report confidence intervals and p-values
  5. IMRaD structure: Standard format for most empirical ML research papers
  6. Honest limitations: Acknowledging weaknesses shows research maturity
  7. Reproducibility: Document everything for replication

Ready to start your research? You have the methodology. Now apply it to your chosen problem.