
Library Statistics

Current size and growth of the mlnotes knowledge base.

Last Updated: November 11, 2025

Overview

Total Content Items: 90

The mlnotes library has grown to 90 interconnected content items spanning concepts, papers, examples, learning paths, and blog posts.

By Content Type

| Type           | Count | Description                                  |
| -------------- | ----- | -------------------------------------------- |
| Concepts       | 40    | Core ML concepts, architectures, and methods |
| Blog Posts     | 13    | Applications, research guides, and overviews |
| Learning Paths | 11    | Structured learning sequences                |
| Papers         | 9     | Research paper analyses                      |
| Other          | 17    | Examples, resources, and domain overviews    |

Content Type Breakdown

  • 40 Concepts: Foundation of the knowledge base - atomic, reusable concept pages
  • 13 Blog Posts: Practical applications (CNNs, transformers, LLMs, VLMs, diffusion) and research methodology
  • 11 Learning Paths: Structured curricula from foundations to advanced topics
  • 9 Papers: In-depth analyses of seminal research papers
  • 17 Other: Examples, healthcare resources, datasets, and domain-specific content

By Difficulty

| Difficulty     | Count | Percentage |
| -------------- | ----- | ---------- |
| Beginner       | 15    | 17%        |
| Intermediate   | 48    | 53%        |
| Advanced       | 19    | 21%        |
| Not Classified | 8     | 9%         |

The majority of content (53%) targets intermediate learners, with solid coverage of foundational (17%) and advanced (21%) topics.

By Topic (Top 10 Tags)

| Rank | Tag              | Count | Description                              |
| ---- | ---------------- | ----- | ---------------------------------------- |
| 1    | deep-learning    | 16    | General deep learning concepts           |
| 2    | transformers     | 16    | Attention and transformer architectures  |
| 3    | neural-networks  | 14    | Foundational neural network concepts     |
| 4    | healthcare       | 13    | Healthcare AI applications               |
| 5    | computer-vision  | 12    | Vision and image processing              |
| 6    | attention        | 11    | Attention mechanisms                     |
| 7    | diffusion-models | 10    | Generative diffusion models              |
| 8    | multimodal       | 10    | Multimodal learning (VLMs, fusion)       |
| 9    | cnns             | 9     | Convolutional neural networks            |
| 10   | nlp              | 9     | Natural language processing              |

Topic Coverage Highlights

  • Transformers & Attention: 27 items (16 transformers + 11 attention) - comprehensive coverage
  • Computer Vision: 21 items (12 computer-vision + 9 cnns) - from foundations to modern architectures
  • Healthcare: 13 items - most comprehensive domain-specific coverage
  • Generative Models: 10 items (diffusion-models) - modern generative AI

Library Sections

Core Library (content/library/)

  • Concepts (concepts/): 40 concept pages organized by topic
  • Papers (papers/): 9 paper analyses from 2012-2022
  • Blog (blog/): 13 posts covering applications and research methods

Applications (content/applications/)

  • Healthcare: 17 pages (concepts, paths, resources, datasets)
    • Most comprehensive domain coverage
    • Complete learning path for EHR analysis
    • Research methodology and validation guidelines

Learning Paths (content/paths/)

  • 11 structured learning paths:
    • Foundation: Neural Networks, CNNs, Transformers, GPT
    • Advanced: VLMs, Diffusion, Advanced Training
    • Specialized: Healthcare EHR, Research Methods
    • Meta: Advanced concepts overview

Cross-References

The content-addressable architecture enables rich cross-referencing:

  • Every concept links to prerequisites and related concepts
  • Papers link to concept implementations and related work
  • Paths reference library content via stable IDs
  • Blog posts connect applications to underlying concepts

Estimated total cross-references: 500+ ContentLinks across 90 items
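To make the linking model concrete, here is a minimal TypeScript sketch of how stable IDs and ContentLinks could be represented. The type and field names below are illustrative assumptions, not the actual mlnotes registry schema.

```typescript
// Illustrative sketch only: type and field names are assumptions,
// not the actual mlnotes registry schema.

type ContentType = "concept" | "paper" | "blog" | "path" | "example" | "resource";

interface ContentItem {
  id: string;              // stable ID, e.g. "concepts/attention" (hypothetical)
  type: ContentType;
  title: string;
  tags: string[];
  prerequisites: string[]; // stable IDs of prerequisite items
  related: string[];       // stable IDs of related items
}

interface ContentLink {
  from: string;            // stable ID of the linking item
  to: string;              // stable ID of the target item
  kind: "prerequisite" | "related";
}

// Flatten each item's outgoing references into a list of ContentLinks.
function collectLinks(registry: ContentItem[]): ContentLink[] {
  return registry.flatMap((item) => [
    ...item.prerequisites.map((to): ContentLink => ({ from: item.id, to, kind: "prerequisite" })),
    ...item.related.map((to): ContentLink => ({ from: item.id, to, kind: "related" })),
  ]);
}
```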

Growth Metrics

Migration Progress

  • Source: 109 legacy module files
  • Migrated: 100 files (92%)
    • High-priority: 64/63 (101% - exceeded target)
    • Medium-priority: 30/31 (97%)
    • Low-priority: 6/15 (40% - remaining are low-value)
  • Result: 90 content items in new structure
  • Consolidation ratio: 100 source files → 90 content items (10% reduction through deduplication)

Quality Improvements

  • Atomicity: Split multi-concept pages into focused, reusable pages
  • Deduplication: Eliminated redundant content across modules
  • Rich linking: Added 500+ cross-references using ContentLink
  • Validation: Build-time checking prevents broken links (see the sketch after this list)
  • Stable IDs: Content can be safely reorganized without breaking links
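A minimal sketch of what build-time link checking could look like, assuming the ContentItem and collectLinks shapes sketched in the Cross-References section; this is an illustration, not the project's actual validator.

```typescript
// Illustrative build-time link check: fail the build if any ContentLink
// points at a stable ID that is missing from the registry.

function validateLinks(registry: ContentItem[]): string[] {
  const knownIds = new Set(registry.map((item) => item.id));
  const errors: string[] = [];
  for (const link of collectLinks(registry)) {
    if (!knownIds.has(link.to)) {
      errors.push(`${link.from}: broken ${link.kind} link -> ${link.to}`);
    }
  }
  return errors;
}

// A build script could then abort on any error:
//   const errors = validateLinks(registry);
//   if (errors.length > 0) { console.error(errors.join("\n")); process.exit(1); }
```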

Learning Time Estimates

Based on learning path metadata:

| Level            | Paths    | Total Time    |
| ---------------- | -------- | ------------- |
| Foundation       | 4 paths  | 54-73 hours   |
| Advanced         | 4 paths  | 40-55 hours   |
| Specialized      | 3 paths  | 25-35 hours   |
| Total Curriculum | 11 paths | 120-160 hours |

Approximately 3-4 months of dedicated study for the complete curriculum.
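The totals above come from summing per-path estimates in the learning path metadata; the aggregation could look like the following sketch, where the metadata shape (level, hours.min/max) is an assumption for illustration.

```typescript
// Illustrative aggregation of per-path time estimates; the metadata shape
// is an assumption for this sketch, not the actual path frontmatter.

interface PathMeta {
  id: string;
  level: "foundation" | "advanced" | "specialized";
  hours: { min: number; max: number };
}

function totalHours(paths: PathMeta[]): { min: number; max: number } {
  return paths.reduce(
    (acc, p) => ({ min: acc.min + p.hours.min, max: acc.max + p.hours.max }),
    { min: 0, max: 0 }
  );
}

// e.g. totalHours(allPaths) over the 11 paths yields roughly the
// 120-160 hour range shown in the table above.
```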

Content Distribution

By Module Origin

  • Module 01 (Foundations): 10 concepts + 1 example + 1 path = 12 items
  • Module 02 (CNNs): 5 concepts + 3 papers + 1 path + 1 blog = 10 items
  • Module 03 (Transformers): 6 concepts + 1 paper + 1 path + 1 blog = 9 items
  • Module 04 (GPT/LLMs): 6 concepts + 1 path + 1 blog = 8 items
  • Module 05 (VLMs): 4 concepts + 2 papers + 1 path + 1 blog + 1 overview = 9 items
  • Module 06 (Diffusion): 4 concepts + 3 papers + 1 path + 1 blog + 1 overview = 10 items
  • Module 07 (Advanced): 3 concepts + 1 path + 1 overview = 5 items
  • Module 09 (Research): 6 blog posts + 1 path = 7 items
  • Healthcare: 10 concepts + 3 overviews + 1 path + 2 resources + 1 index = 17 items
  • Section overviews: 3 items (foundation, advanced, paths)

Next Milestones

Short Term (Next 3 Months)

  • Expand to 120+ content items
  • Add 10+ code examples
  • Create 5+ additional learning paths
  • Expand computer vision and NLP domain coverage

Medium Term (6-12 Months)

  • Reach 200+ content items
  • Complete all major ML subdomain coverage
  • Add interactive demos and visualizations
  • Launch a community contributions framework

Long Term (1-2 Years)

  • Comprehensive ML encyclopedia (500+ items)
  • Multi-domain expertise (healthcare, robotics, finance)
  • Integration with external learning platforms
  • Multilingual support

This statistics page is automatically generated from the content registry. Run pnpm build:registry to update.
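As a rough illustration of the kind of tallying such a registry build step might perform to produce the counts above (the actual build:registry implementation is not shown here):

```typescript
// Hypothetical sketch of tallying registry items by an arbitrary key;
// not the actual build:registry implementation.

function countBy<T>(items: T[], key: (item: T) => string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const item of items) {
    const k = key(item);
    counts[k] = (counts[k] ?? 0) + 1;
  }
  return counts;
}

// e.g. countBy(registry, (item) => item.type) -> { concept: 40, blog: 13, ... }
// Percentages in the difficulty table follow from counts[k] / registry.length.
```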