Formulating Research Questions

Good research begins with a well-formulated question. This guide teaches you how to identify gaps in the literature and develop testable hypotheses.

Identifying Research Gaps

Good research fills a gap. There are four main types of gaps:

  1. Methodological gap: No suitable technique exists yet for the problem
  2. Empirical gap: A method or claim hasn’t been evaluated on a particular dataset or population
  3. Theoretical gap: A phenomenon is not yet explained
  4. Application gap: An existing method hasn’t been applied to a specific domain

Example: Healthcare AI Thesis

A healthcare AI thesis might address multiple gap types:

  1. Methodological: Multimodal fusion of structured EHR + text + sketches
  2. Empirical: Patient-reported symptoms for ED outcome prediction
  3. Application: 3D symptom drawings in clinical prediction models

The PICOT Framework

Adapted from medical research, PICOT helps structure research questions:

  • Population: Who are you studying? (e.g., Emergency department patients)
  • Intervention: What are you testing? (e.g., Multimodal AI prediction model)
  • Comparison: What’s the baseline? (e.g., Structured EHR-only models)
  • Outcome: What are you measuring? (e.g., Prediction accuracy for mortality, admission)
  • Time: When/how long? (e.g., Prospective validation over 6 months)

Example PICOT

  • P: Emergency department patients
  • I: Multimodal transformer (EHR + symptoms + sketches)
  • C: ETHOS (structured EHR only)
  • O: Outcome prediction accuracy (AUROC, AUPRC)
  • T: Retrospective analysis of 50,000 visits (2020-2024)
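As a rough illustration, the five PICOT elements can be kept in a small record and rendered into a one-sentence question. This is only a sketch: the `Picot` class, its field names, and the sentence template are hypothetical and not part of the PICOT framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picot:
    """Hypothetical container for the five PICOT elements."""
    population: str
    intervention: str
    comparison: str
    outcome: str
    time: str

    def as_question(self) -> str:
        # Fixed sentence template (an assumption); adapt the wording to your field.
        return (
            f"In {self.population}, does {self.intervention} "
            f"improve {self.outcome} compared with {self.comparison}, "
            f"based on {self.time}?"
        )


# Values copied from the example PICOT above.
example = Picot(
    population="emergency department patients",
    intervention="a multimodal transformer (EHR + symptoms + sketches)",
    comparison="ETHOS (structured EHR only)",
    outcome="outcome prediction accuracy (AUROC, AUPRC)",
    time="a retrospective analysis of 50,000 visits (2020-2024)",
)
print(example.as_question())
```

Writing the question this way makes a missing element obvious: if you cannot fill in one of the five fields, the question is not yet fully specified.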

Hypothesis Formation

Types of Hypotheses

Type | Example
Null Hypothesis | Adding patient-reported symptoms does NOT improve predictions over structured EHR alone
Alternative Hypothesis | Adding patient-reported symptoms improves predictions over structured EHR alone (tested at a significance level of 0.05)
Research Question | To what extent do multimodal patient-reported symptoms improve ED outcome prediction?
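One possible way to put the null and alternative hypotheses to a test is a paired percentile bootstrap on the AUROC difference between the two models. The sketch below uses only the standard library and entirely synthetic scores; the function names, sample size, and noise levels are assumptions chosen for illustration, not results from any real study.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_gain(labels, base_scores, new_scores, n_boot=500, seed=0):
    """Paired bootstrap: resample cases, recompute AUROC(new) - AUROC(base).

    Returns an approximate 95% percentile interval for the gain.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        gain = (auroc(ys, [new_scores[i] for i in idx])
                - auroc(ys, [base_scores[i] for i in idx]))
        diffs.append(gain)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Synthetic data (an assumption): 30 positive and 70 negative cases,
# with the "new" model given a stronger signal than the baseline.
rng = random.Random(42)
labels = [1] * 30 + [0] * 70
base_scores = [0.5 * y + rng.gauss(0, 1) for y in labels]
new_scores = [1.5 * y + rng.gauss(0, 1) for y in labels]

lo, hi = bootstrap_auroc_gain(labels, base_scores, new_scores)
print(f"Approximate 95% interval for AUROC gain: [{lo:.3f}, {hi:.3f}]")
print("Reject H0" if lo > 0 else "Fail to reject H0")
```

If the whole interval lies above zero, the data are hard to reconcile with "no improvement" at roughly the 5% level. Other tests, such as DeLong's test for correlated AUROCs, are also commonly used for this comparison.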

Characteristics of Good Hypotheses

A good hypothesis should be:

  • Testable: Can be evaluated empirically
  • Specific: Clear about what is being measured
  • Falsifiable: Could potentially be proven wrong
  • Relevant: Addresses an important question
  • Novel: Adds something new to the field

Scoping Your Research

The Research Triangle

Balance three competing factors:

  1. Scope: How broad is the question?
  2. Depth: How thoroughly will you investigate?
  3. Time: How long do you have?

For a thesis: a narrow scope, investigated in good depth over about 6 months, is a realistic route to publishable work.

Avoiding Common Pitfalls

Too Broad: “How can AI improve healthcare?”

  • Problem: Impossible to answer in one thesis

Too Narrow: “Does batch size 32 work better than 31 for this specific dataset?”

  • Problem: Not interesting or generalizable

Just Right: “Can multimodal transformers fusing structured EHR with patient-reported symptoms improve ED outcome prediction compared to EHR-only models?”

  • Specific, testable, and impactful

Example Thesis Statement

“This thesis investigates whether multimodal transformers that fuse structured EHR data with patient-reported symptom text and anatomical sketches can improve emergency department outcome prediction compared to models using structured data alone.”

Breaking Down the Statement

  • Problem: ED outcome prediction
  • Method: Multimodal transformers
  • Data: EHR + text + sketches
  • Comparison: Structured data alone
  • Evaluation: Improvement in prediction accuracy

From Question to Contribution

Your research question should lead to clear contributions:

Methodological Contributions

  • Novel architecture for multimodal fusion
  • Techniques for handling sparse patient-reported data
  • Methods for incorporating anatomical sketches

Empirical Contributions

  • First evaluation of patient symptoms for ED prediction
  • Comparison of fusion strategies
  • Ablation studies showing component importance
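An ablation study over input modalities can be planned by listing every combination to train and evaluate. The following is a minimal sketch: the modality names and the choice to always keep structured EHR as the baseline input are assumptions for illustration.

```python
from itertools import combinations

MODALITIES = ("ehr", "symptom_text", "sketches")  # hypothetical modality names


def ablation_configs(modalities, always_on=("ehr",)):
    """List every modality subset that contains the always-on modalities."""
    optional = [m for m in modalities if m not in always_on]
    configs = []
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            configs.append(tuple(always_on) + extra)
    return configs


for config in ablation_configs(MODALITIES):
    print(" + ".join(config))
```

With three modalities and EHR always included, this gives four configurations: the EHR-only baseline, two single-addition variants, and the full model. Comparing them shows how much each added modality contributes.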

Practical Contributions

  • Demonstrated improvement over clinical baselines
  • Interpretable predictions for clinicians
  • Deployable system architecture

Validating Your Research Question

Before committing to a research question, check:

  1. Literature check: Has this been done before?
  2. Feasibility check: Can you get the data and compute?
  3. Timeline check: Can you complete this in 6 months?
  4. Impact check: Will this be publishable and useful?
  5. Supervisor check: Does your advisor agree?
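The timeline check can be done as simple arithmetic: add up estimated phase durations, add a contingency buffer, and compare against the time available. The phase names, week estimates, and the 20% buffer below are hypothetical placeholders; replace them with your own plan.

```python
WEEKS_AVAILABLE = 26  # roughly 6 months

# Hypothetical phase estimates in weeks; replace with your own.
phases = {
    "literature review": 4,
    "data access and preprocessing": 6,
    "baseline reproduction": 3,
    "model development": 6,
    "experiments and ablations": 4,
    "writing": 4,
}


def timeline_slack(phases, weeks_available, buffer_fraction=0.2):
    """Return the weeks left over after adding a contingency buffer to the plan."""
    planned = sum(phases.values())
    return weeks_available - planned * (1 + buffer_fraction)


slack = timeline_slack(phases, WEEKS_AVAILABLE)
print(f"Slack: {slack:.1f} weeks")
if slack < 0:
    print("Plan exceeds the time available: narrow the scope or cut a phase.")
```

With these placeholder numbers the plan totals 27 weeks before the buffer, so the slack is negative and the scope would need to shrink.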

Connection to Literature Review

Your research question emerges from reading papers:

  1. Read broadly: Understand the field (30-40 papers, Pass 1-2)
  2. Identify patterns: What’s been done? What’s missing?
  3. Find gaps: Where can you contribute?
  4. Formulate question: Specific, testable, novel
  5. Design experiments: Plan the experiments that will answer the question

Refining Your Question

Iterate on your research question:

Initial (too broad): “Multimodal AI for healthcare”

Refined: “Multimodal AI for ED outcome prediction”

More specific: “Fusing EHR and patient symptoms for ED outcomes”

Final: “Multimodal transformers combining structured EHR, symptom text, and anatomical sketches for ED outcome prediction vs. EHR-only baselines”

Exercise: Write Your Research Question

Using the PICOT framework, write:

  1. Your research question in one sentence
  2. Your null hypothesis
  3. Your alternative hypothesis
  4. Three specific contributions your work will make

Examples from Machine Learning Research

Computer Vision

Question: “Can a pure transformer applied to image patches match or exceed CNN performance on image classification when pre-trained on large datasets?”

Gap: Transformers were successful in NLP but not yet proven in vision

Contribution: ViT paper showing transformers are competitive with CNNs

Multimodal Learning

Question: “Can natural language supervision enable zero-shot visual recognition?”

Gap: Fixed-class supervised learning doesn’t scale

Contribution: CLIP enabling zero-shot transfer via language

Healthcare AI

Question: “Can autoregressive modeling of tokenized EHR timelines enable zero-shot clinical prediction?”

Gap: Healthcare models need labeled data for each task

Contribution: ETHOS showing zero-shot generalization

Key Takeaways

  1. Identify gaps: Four types - methodological, empirical, theoretical, application
  2. Use PICOT: Structure your question systematically
  3. Scope appropriately: Narrow scope + good depth + realistic timeline
  4. Make it testable: Formulate clear null and alternative hypotheses
  5. Plan contributions: Know what you’ll contribute before starting
  6. Validate early: Check feasibility, novelty, and impact before committing