Library Statistics
Current size and growth of the mlnotes knowledge base.
Last Updated: November 11, 2025
Overview
Total Content Items: 90
The mlnotes library has grown to 90 interconnected content items spanning concepts, papers, examples, learning paths, and blog posts.
By Content Type
| Type | Count | Description |
|---|---|---|
| Concepts | 40 | Core ML concepts, architectures, and methods |
| Blog Posts | 13 | Applications, research guides, and overviews |
| Learning Paths | 11 | Structured learning sequences |
| Papers | 9 | Research paper analyses |
| Other | 17 | Examples, resources, and domain overviews |
Content Type Breakdown
- 40 Concepts: Foundation of the knowledge base - atomic, reusable concept pages
- 13 Blog Posts: Practical applications (CNNs, transformers, LLMs, VLMs, diffusion) and research methodology
- 11 Learning Paths: Structured curricula from foundations to advanced topics
- 9 Papers: In-depth analyses of seminal research papers
- 17 Other: Examples, healthcare resources, datasets, and domain-specific content
By Difficulty
| Difficulty | Count | Percentage |
|---|---|---|
| Beginner | 15 | 17% |
| Intermediate | 48 | 53% |
| Advanced | 19 | 21% |
| Not Classified | 8 | 9% |
The majority of content (53%) targets intermediate learners, with solid coverage of foundational (17%) and advanced (21%) topics.
By Topic (Top 10 Tags)
| Rank | Tag | Count | Description |
|---|---|---|---|
| 1 | deep-learning | 16 | General deep learning concepts |
| 2 | transformers | 16 | Attention and transformer architectures |
| 3 | neural-networks | 14 | Foundational neural network concepts |
| 4 | healthcare | 13 | Healthcare AI applications |
| 5 | computer-vision | 12 | Vision and image processing |
| 6 | attention | 11 | Attention mechanisms |
| 7 | diffusion-models | 10 | Generative diffusion models |
| 8 | multimodal | 10 | Multimodal learning (VLMs, fusion) |
| 9 | cnns | 9 | Convolutional neural networks |
| 10 | nlp | 9 | Natural language processing |
Topic Coverage Highlights
- Transformers & Attention: 27 tagged items (transformers + attention; items carrying both tags are counted once per tag) - comprehensive coverage
- Computer Vision: 21 items (computer-vision + cnns) - from foundations to modern architectures
- Healthcare: 13 items - most comprehensive domain-specific coverage
- Generative Models: 10 items (diffusion-models) - modern generative AI
Library Sections
Core Library (content/library/)
- Concepts (concepts/): 40 concept pages organized by topic
- Papers (papers/): 9 paper analyses from 2012-2022
- Blog (blog/): 13 posts covering applications and research methods
Applications (content/applications/)
- Healthcare: 17 pages (concepts, paths, resources, datasets)
- Most comprehensive domain coverage
- Complete learning path for EHR analysis
- Research methodology and validation guidelines
Learning Paths (content/paths/)
- 11 structured learning paths:
- Foundation: Neural Networks, CNNs, Transformers, GPT
- Advanced: VLMs, Diffusion, Advanced Training
- Specialized: Healthcare EHR, Research Methods
- Meta: Advanced concepts overview
Cross-References
The content-addressable architecture enables rich cross-referencing:
- Every concept links to prerequisites and related concepts
- Papers link to concept implementations and related work
- Paths reference library content via stable IDs
- Blog posts connect applications to underlying concepts
Estimated total cross-references: 500+ ContentLinks across 90 items
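The stable-ID linking described above can be sketched as follows. This is a hypothetical illustration, not the actual implementation: "ContentLink" and "stable IDs" come from this page, but every type, function, and file name below is invented.

```typescript
// Hypothetical sketch of stable-ID linking: items are registered and
// resolved by ID, so file paths can change without breaking links.
interface ContentItem {
  id: string;       // stable ID: never changes once assigned
  path: string;     // current file location: free to change
  links: string[];  // cross-references stored as stable IDs, not paths
}

const registry = new Map<string, ContentItem>();

function register(item: ContentItem): void {
  registry.set(item.id, item);
}

// Resolving by stable ID means reorganizing files never breaks links
function resolve(id: string): ContentItem | undefined {
  return registry.get(id);
}

register({
  id: "concept-attention",
  path: "library/concepts/attention.md",
  links: [],
});
register({
  id: "paper-transformer",
  path: "library/papers/attention-is-all-you-need.md",
  links: ["concept-attention"],
});

const paper = resolve("paper-transformer");
console.log(paper && resolve(paper.links[0])?.path);
// library/concepts/attention.md
```

Because links carry IDs rather than paths, moving `attention.md` to a new directory only updates that item's `path` field; every inbound reference keeps resolving.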
Growth Metrics
Migration Progress
- Source: 109 legacy module files
- Migrated: 100 files (92%)
- High-priority: 64/63 (101% - exceeded target)
- Medium-priority: 30/31 (97%)
- Low-priority: 6/15 (40% - remaining are low-value)
- Result: 90 content items in new structure
- Consolidation ratio: 100 source files → 90 content items (10% reduction through deduplication)
Quality Improvements
- Atomicity: Split multi-concept pages into focused, reusable pages
- Deduplication: Eliminated redundant content across modules
- Rich linking: Added 500+ cross-references using ContentLink
- Validation: Build-time checking prevents broken links
- Stable IDs: Content can be safely reorganized without breaking links
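The build-time validation bullet above could work along these lines. This is a minimal sketch under the assumption that the registry exposes each item with its outgoing link IDs; the function and field names are invented, not the project's actual API.

```typescript
// Minimal sketch of build-time link checking: every outgoing link must
// resolve to a known stable ID, otherwise it is reported as broken.
interface Item {
  id: string;
  links: string[]; // stable IDs this item references
}

function findBrokenLinks(items: Item[]): string[] {
  const known = new Set(items.map((item) => item.id));
  const broken: string[] = [];
  for (const item of items) {
    for (const target of item.links) {
      if (!known.has(target)) broken.push(`${item.id} -> ${target}`);
    }
  }
  return broken;
}

// Example: one dangling link that a build-time check would catch
const items: Item[] = [
  { id: "concept-cnn", links: ["concept-convolution"] },
  { id: "concept-convolution", links: [] },
  { id: "path-vision", links: ["concept-cnn", "concept-missing"] },
];

console.log(findBrokenLinks(items)); // [ 'path-vision -> concept-missing' ]
```

In a real build, a non-empty result would fail the build, which is what makes reorganization safe: a rename that breaks a reference is caught before it ships.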
Learning Time Estimates
Based on learning path metadata:
| Level | Paths | Total Time |
|---|---|---|
| Foundation | 4 paths | 54-73 hours |
| Advanced | 4 paths | 40-55 hours |
| Specialized | 3 paths | 25-35 hours |
| Total Curriculum | 11 paths | 119-163 hours |
Approximately 3-4 months of dedicated study for the complete curriculum.
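The months estimate can be sanity-checked by summing the per-level rows of the table. The ~10 study hours per week pace below is an assumption, not something stated on this page.

```typescript
// Sum the per-level hour ranges from the table above
const ranges: [number, number][] = [
  [54, 73], // Foundation
  [40, 55], // Advanced
  [25, 35], // Specialized
];

const totalLow = ranges.reduce((sum, [lo]) => sum + lo, 0);   // 119 hours
const totalHigh = ranges.reduce((sum, [, hi]) => sum + hi, 0); // 163 hours

// Assumed pace: ~10 study hours per week
const hoursPerWeek = 10;
console.log(`${totalLow / hoursPerWeek}-${totalHigh / hoursPerWeek} weeks`);
// 11.9-16.3 weeks, i.e. roughly 3-4 months
```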
Content Distribution
By Module Origin
- Module 01 (Foundations): 10 concepts + 1 example + 1 path = 12 items
- Module 02 (CNNs): 5 concepts + 3 papers + 1 path + 1 blog = 10 items
- Module 03 (Transformers): 6 concepts + 1 paper + 1 path + 1 blog = 9 items
- Module 04 (GPT/LLMs): 6 concepts + 1 path + 1 blog = 8 items
- Module 05 (VLMs): 4 concepts + 2 papers + 1 path + 1 blog + 1 overview = 9 items
- Module 06 (Diffusion): 4 concepts + 3 papers + 1 path + 1 blog + 1 overview = 10 items
- Module 07 (Advanced): 3 concepts + 1 path + 1 overview = 5 items
- Module 09 (Research): 6 blog posts + 1 path = 7 items
- Healthcare: 10 concepts + 3 overviews + 1 path + 2 resources + 1 index = 17 items
- Section overviews: 3 items (foundation, advanced, paths)
Next Milestones
Short Term (Next 3 Months)
- Expand to 120+ content items
- Add 10+ code examples
- Create 5+ additional learning paths
- Expand computer vision and NLP domain coverage
Medium Term (6-12 Months)
- Reach 200+ content items
- Complete all major ML subdomain coverage
- Add interactive demos and visualizations
- Community contributions framework
Long Term (1-2 Years)
- Comprehensive ML encyclopedia (500+ items)
- Multi-domain expertise (healthcare, robotics, finance)
- Integration with external learning platforms
- Multilingual support
This statistics page is automatically generated from the content registry. Run pnpm build:registry to update.
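A registry-driven stats build might aggregate the counts on this page roughly like this. Only the pnpm build:registry command comes from this page; the registry's shape and all names below are assumed for illustration.

```typescript
// Hypothetical sketch of deriving the "By Content Type" table from a
// registry of entries; the real registry's shape is assumed here.
type ContentType = "concept" | "paper" | "blog" | "path" | "other";

interface RegistryEntry {
  id: string;
  type: ContentType;
}

function countByType(entries: RegistryEntry[]): Map<ContentType, number> {
  const counts = new Map<ContentType, number>();
  for (const entry of entries) {
    counts.set(entry.type, (counts.get(entry.type) ?? 0) + 1);
  }
  return counts;
}

// Tiny sample; the real registry holds all 90 items
const sample: RegistryEntry[] = [
  { id: "concept-attention", type: "concept" },
  { id: "concept-backprop", type: "concept" },
  { id: "paper-alexnet", type: "paper" },
  { id: "path-foundations", type: "path" },
];

console.log(countByType(sample).get("concept")); // 2
```

Generating the tables from the registry rather than by hand is what keeps counts like 90 items and 40 concepts consistent with the actual content as it grows.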