Formulating Research Questions
Good research begins with a well-formulated question. This guide teaches you how to identify gaps in the literature and develop testable hypotheses.
Identifying Research Gaps
Good research fills a gap. There are four main types of gaps:
- Methodological gap: No suitable technique yet exists for the problem
- Empirical gap: A method or claim hasn’t been evaluated on a particular dataset or population
- Theoretical gap: A phenomenon is not yet explained
- Application gap: An existing method hasn’t been applied to a specific domain
Example: Healthcare AI Thesis
A healthcare AI thesis might address multiple gap types:
- Methodological: Multimodal fusion of structured EHR + text + sketches
- Empirical: Patient-reported symptoms for emergency department (ED) outcome prediction
- Application: 3D symptom drawings in clinical prediction models
The PICOT Framework
Adapted from medical research, PICOT helps structure research questions:
- Population: Who are you studying? (e.g., Emergency department patients)
- Intervention: What are you testing? (e.g., Multimodal AI prediction model)
- Comparison: What’s the baseline? (e.g., Structured EHR-only models)
- Outcome: What are you measuring? (e.g., Prediction accuracy for mortality, admission)
- Time: When/how long? (e.g., Prospective validation over 6 months)
Example PICOT
- P: Emergency department patients
- I: Multimodal transformer (EHR + symptoms + sketches)
- C: ETHOS (structured EHR only)
- O: Outcome prediction accuracy (AUROC, AUPRC)
- T: Retrospective analysis of 50,000 visits (2020-2024)
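The PICOT elements above can also be kept as a small record, which makes it easy to spot a missing element and to draft a one-sentence question. This is a minimal sketch: the `Picot` class, its field names, and the sentence template are illustrative assumptions, not part of the PICOT framework itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Picot:
    """One research question split into its five PICOT elements (hypothetical helper)."""
    population: str
    intervention: str
    comparison: str
    outcome: str
    time: str

    def missing_elements(self) -> list[str]:
        """Return the names of elements that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def as_question(self) -> str:
        """Assemble the elements into a one-sentence draft question."""
        return (
            f"In {self.population}, does {self.intervention} improve "
            f"{self.outcome} compared with {self.comparison}, "
            f"based on {self.time}?"
        )


# The example PICOT from this guide, written as a record.
ed_question = Picot(
    population="emergency department patients",
    intervention="a multimodal transformer (EHR + symptoms + sketches)",
    comparison="ETHOS (structured EHR only)",
    outcome="outcome prediction accuracy (AUROC, AUPRC)",
    time="a retrospective analysis of 50,000 visits (2020-2024)",
)
print(ed_question.as_question())
```

The generated sentence is only a starting draft; it usually needs rewording by hand before it reads well as a research question.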
Hypothesis Formation
Types of Hypotheses
| Type | Example |
|---|---|
| Null Hypothesis | Adding patient-reported symptoms does NOT improve predictions over structured EHR alone |
| Alternative Hypothesis | Adding patient-reported symptoms improves predictions over structured EHR alone (tested at significance level α = 0.05) |
| Research Question | To what extent do multimodal patient-reported symptoms improve ED outcome prediction? |
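One way to test the null hypothesis in the table is a paired bootstrap on the AUROC difference between the two models, evaluated on the same patients. The sketch below is an assumption about how such a test could look, not the method prescribed by this guide (a DeLong test is another common choice). The function names `auroc` and `bootstrap_auroc_gain` and the toy scores are hypothetical, and only the standard library is used.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_gain(labels, baseline_scores, multimodal_scores,
                         n_resamples=2000, seed=0):
    """Paired bootstrap of AUROC(multimodal) - AUROC(baseline).

    Returns the observed gain and a 95% percentile confidence interval.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = auroc(labels, multimodal_scores) - auroc(labels, baseline_scores)
    gains = []
    while len(gains) < n_resamples:
        # Resample patients with replacement, keeping both models' scores paired.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUROC is undefined
        gains.append(
            auroc(ys, [multimodal_scores[i] for i in idx])
            - auroc(ys, [baseline_scores[i] for i in idx])
        )
    gains.sort()
    lower = gains[int(0.025 * n_resamples)]
    upper = gains[int(0.975 * n_resamples) - 1]
    return observed, (lower, upper)


# Toy data (made up for illustration): 60 visits, one third with the outcome.
toy_rng = random.Random(42)
toy_labels = [1 if i % 3 == 0 else 0 for i in range(60)]
toy_baseline = [0.3 * y + toy_rng.random() for y in toy_labels]
toy_multimodal = [0.8 * y + toy_rng.random() for y in toy_labels]

gain, (ci_low, ci_high) = bootstrap_auroc_gain(
    toy_labels, toy_baseline, toy_multimodal, n_resamples=500
)
print(f"AUROC gain: {gain:.3f}, 95% CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

If the interval lies entirely above zero, the data are inconsistent with the null hypothesis at roughly the 5% level. An interval that includes zero means the null hypothesis cannot be rejected, which is not the same as showing that the added modality has no effect.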
Characteristics of Good Hypotheses
A good hypothesis should be:
- Testable: Can be evaluated empirically
- Specific: Clear about what is being measured
- Falsifiable: Could potentially be proven wrong
- Relevant: Addresses an important question
- Novel: Adds something new to the field
Scoping Your Research
The Research Triangle
Balance three competing factors:
- Scope: How broad is the question?
- Depth: How thoroughly will you investigate?
- Time: How long do you have?
For a thesis, a narrow scope investigated in good depth over about 6 months is a realistic route to publishable work.
Avoiding Common Pitfalls
Too Broad: “How can AI improve healthcare?”
- Problem: Impossible to answer in one thesis
Too Narrow: “Does batch size 32 work better than 31 for this specific dataset?”
- Problem: Not interesting or generalizable
Just Right: “Can multimodal transformers fusing structured EHR with patient-reported symptoms improve ED outcome prediction compared to EHR-only models?”
- Why it works: Specific, testable, and impactful
Example Thesis Statement
“This thesis investigates whether multimodal transformers that fuse structured EHR data with patient-reported symptom text and anatomical sketches can improve emergency department outcome prediction compared to models using structured data alone.”
Breaking Down the Statement
- Problem: ED outcome prediction
- Method: Multimodal transformers
- Data: EHR + text + sketches
- Comparison: Structured data alone
- Evaluation: Improvement in prediction accuracy
From Question to Contribution
Your research question should lead to clear contributions:
Methodological Contributions
- Novel architecture for multimodal fusion
- Techniques for handling sparse patient-reported data
- Methods for incorporating anatomical sketches
Empirical Contributions
- First evaluation of patient-reported symptoms for ED outcome prediction
- Comparison of fusion strategies
- Ablation studies showing component importance
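An ablation study over input modalities can be planned by listing every combination of inputs to train and compare. The sketch below assumes the structured EHR is always kept as the baseline input and that each optional modality is either included or dropped. The modality names and the `ablation_grid` helper are hypothetical.

```python
from itertools import combinations

# Hypothetical modality names for the healthcare AI example.
MODALITIES = ("structured_ehr", "symptom_text", "anatomical_sketches")


def ablation_grid(modalities=MODALITIES, required=("structured_ehr",)):
    """List every modality subset that contains the required baseline inputs."""
    optional = [m for m in modalities if m not in required]
    grid = []
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            grid.append(tuple(required) + extra)
    return grid


for config in ablation_grid():
    print(" + ".join(config))
```

With three modalities and one required input, this yields four configurations: the EHR-only baseline, each optional modality added on its own, and the full model.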
Practical Contributions
- Demonstrated improvement over clinical baselines
- Interpretable predictions for clinicians
- Deployable system architecture
Validating Your Research Question
Before committing to a research question, check:
- Literature check: Has this been done before?
- Feasibility check: Can you get the data and compute?
- Timeline check: Can you complete this in 6 months?
- Impact check: Will this be publishable and useful?
- Supervisor check: Does your advisor agree?
Connection to Literature Review
Your research question emerges from reading papers:
- Read broadly: Understand the field (30-40 papers, Pass 1-2)
- Identify patterns: What’s been done? What’s missing?
- Find gaps: Where can you contribute?
- Formulate question: Specific, testable, novel
- Design experiments: Plan the experiments that will answer the question
Refining Your Question
Iterate on your research question:
Initial (too broad): “Multimodal AI for healthcare”
Refined: “Multimodal AI for ED outcome prediction”
More specific: “Fusing EHR and patient symptoms for ED outcomes”
Final: “Multimodal transformers combining structured EHR, symptom text, and anatomical sketches for ED outcome prediction vs. EHR-only baselines”
Exercise: Write Your Research Question
Using the PICOT framework, write:
- Your research question in one sentence
- Your null hypothesis
- Your alternative hypothesis
- Three specific contributions your work will make
Examples from Machine Learning Research
Computer Vision
Question: “Can vision transformers match or exceed CNN performance on ImageNet when pre-trained on large datasets?”
- Gap: Transformers successful in NLP but not proven in vision
- Contribution: ViT paper showing transformers competitive with CNNs
Multimodal Learning
Question: “Can natural language supervision enable zero-shot visual recognition?”
- Gap: Fixed-class supervised learning doesn’t scale
- Contribution: CLIP enabling zero-shot transfer via language
Healthcare AI
Question: “Can generative next-token modeling of tokenized patient health timelines enable zero-shot clinical prediction?”
- Gap: Healthcare models need labeled data for each task
- Contribution: ETHOS showing zero-shot generalization
Related Resources
- Reading Research Papers - Identifying gaps through literature
- Experimental Design - Testing your hypotheses
- Structuring Papers - Communicating your findings
Key Takeaways
- Identify gaps: Four types (methodological, empirical, theoretical, application)
- Use PICOT: Structure your question systematically
- Scope appropriately: Narrow scope + good depth + realistic timeline
- Make it testable: Formulate clear null and alternative hypotheses
- Plan contributions: Know what you’ll contribute before starting
- Validate early: Check feasibility, novelty, and impact before committing