AI / ML / AI Engineering — Learning Path

A structured, opinionated learning path that takes you from linear algebra to building production AI systems. Each stage is a folder; read in order or jump in where you fit.

Note on grounding. Earlier wiki content in ../wiki/ is strictly grounded in ../raw/ source PDFs. This organized/ tree is pedagogy-first — it draws from the curriculum, canonical books, and modern practice (through 2026) without forcing every claim back to a citation. Where a specific source matters (e.g. a paper, a book chapter), it’s cited inline. Treat this as a textbook you can navigate, not a citation index.

How to use this path

  • Full path or vertical slice. The 15 stages are sequential, but you can take a vertical slice: read the foundation of each stage (its README.md), then drill into one application area (e.g. RAG → production → applications).
  • Read the stage README first. Every stage opens with prerequisites, a learning ladder, key concepts, and a “minimum viable understanding” checklist. Don’t skip it.
  • Build, then read more. Most stages have an “exercises” section. The fastest way through this material is to implement a small thing in each stage before moving on.
  • Backtrack freely. Hit a wall in stage 6 because vector spaces feel hand-wavy? Go back to stage 1. The path is a graph, not a staircase.

The 15 stages

| # | Stage | Why it’s here | Time |
|---|-------|---------------|------|
| 01 | Math foundations | Linear algebra, probability, calculus, info theory — the language ML is written in | 2–4 weeks |
| 02 | ML fundamentals | Supervised/unsupervised, loss & optimization, evaluation, classical algorithms | 2–4 weeks |
| 03 | Neural networks | Perceptrons → MLPs → backprop → optimizers — how networks actually learn | 2–3 weeks |
| 04 | Language modeling | n-grams → RNNs → why transformers won | 1–2 weeks |
| 05 | Tokens & embeddings | How text becomes vectors; static vs contextual embeddings | 1 week |
| 06 | Transformers | Self-attention (Q/K/V), multi-head, positional encoding, GPT from scratch | 2–3 weeks |
| 07 | Modern LLMs | Scaling laws, MoE, reasoning models, long-context, frontier architectures | 1–2 weeks |
| 08 | Prompting | Zero/few-shot, CoT, structured output, sampling, prompt patterns | 1 week |
| 09 | RAG | Fundamentals → chunking → vector DBs → hybrid search → reranking → eval | 2 weeks |
| 10 | Fine-tuning | When to FT, SFT, LoRA/QLoRA, RLHF/DPO/GRPO, embedding FT, datasets | 2–3 weeks |
| 11 | Agents | Agent loops, tools, memory, planning, multi-agent, browser/vision agents | 2 weeks |
| 12 | Multimodal | CLIP, VLMs, diffusion, video gen, TTS, synthetic data | 1–2 weeks |
| 13 | Production | Evals, guardrails, observability, scaling, hallucinations, enterprise | 2–3 weeks |
| 14 | Applications | Text-to-SQL, code gen, browser agents, financial reasoning, case studies | 1–2 weeks |
| 15 | Engineering & career | Roles, roadmap, staying current, what to actually build | ongoing |

Total: 24–36 weeks if you build alongside reading. Less if you’re already partway up the stack.

Three reading tracks

Not everyone needs all 15 stages. Pick a track. Each one has a week-by-week cheat sheet:

| Track | For | Time | Cheat sheet |
|-------|-----|------|-------------|
| A — SWE → AI Product Engineer | You can ship code; you want to ship AI features | 8–12 weeks | TRACK_A_SWE_TO_AI.md |
| B — ML Engineer → LLM Specialist | You know ML; you want LLM depth | 12–18 weeks | TRACK_B_ML_TO_LLM.md |
| C — Complete from scratch | Starting fresh; want the full foundation | 24–36 weeks | TRACK_C_FROM_SCRATCH.md |

Each cheat sheet has weekly reading + building targets, milestones, and pivot signals.

Exercise solutions

Worked solutions to the most concrete exercises live in EXERCISE_SOLUTIONS/. Try the exercise yourself first; check the solution after.

Prerequisites

  • Python. All examples are Python. If you’re new, do one Python crash course (e.g. Automate the Boring Stuff) first.
  • High-school math. Algebra, basic geometry. Stage 01 covers everything beyond that.
  • A GPU is helpful but optional. Most exercises run on CPU or on hosted notebooks (Colab/Modal/Lightning). Heavy training (GPT from scratch, fine-tuning) is the only place a GPU is required; a quick way to check what you have is sketched after this list.
  • Curiosity over credentials. None of this requires a CS degree.
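
If you’re not sure what hardware you have, here is a minimal sketch of a check, assuming PyTorch is installed (PyTorch itself is introduced properly in stage 03):

```python
# Quick check: can PyTorch see a CUDA GPU, or the Apple-silicon MPS backend?
import torch

print("CUDA available:", torch.cuda.is_available())
print("MPS available:", torch.backends.mps.is_available())  # Apple silicon

# Pick the best available device; "cpu" is always a valid fallback.
device = "cuda" if torch.cuda.is_available() else (
    "mps" if torch.backends.mps.is_available() else "cpu"
)
print("Using device:", device)
```

If the answer is "cpu", you’re still fine for almost everything here.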

Tooling you’ll meet

| Layer | Tools |
|-------|-------|
| Numerical | NumPy, PyTorch, JAX |
| Tokenization | tiktoken, SentencePiece, HuggingFace tokenizers |
| LLM access | Anthropic SDK, OpenAI SDK, LiteLLM, Ollama |
| Embeddings | sentence-transformers, OpenAI/Voyage/Cohere APIs |
| Vector DBs | pgvector, Qdrant, Weaviate, LanceDB, Pinecone |
| Fine-tuning | TRL, PEFT, Axolotl, LlamaFactory, Unsloth |
| Agents | Anthropic Agent SDK, LangGraph, smolagents, raw loops |
| Eval | promptfoo, Inspect, Braintrust, Langfuse |
| Observability | Langfuse, LangSmith, Phoenix, OpenTelemetry |

We’ll introduce them as needed. No pre-reading required.
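
As a small taste of what’s ahead, here is a minimal sketch of the kind of snippet you’ll write in stages 05 and 09. It assumes sentence-transformers is installed; the model name below is just one small example:

```python
# Embed two sentences and compare them with cosine similarity.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model, not a recommendation
vecs = model.encode(["How do transformers work?", "Explain self-attention."])

# Cosine similarity: dot product of the two vectors divided by their norms.
a, b = vecs[0], vecs[1]
similarity = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {similarity:.3f}")
```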

Conventions

  • Code blocks use triple-backtick fences with language tags. Run them in order; they accumulate (see the example after this list).
  • Callouts use blockquotes with prefixes: > Why:, > Pitfall:, > Aside:.
  • “See also” sections at the bottom of each article link forward and backward in the path.
  • Year markers appear when something is recent: e.g. (2025). The path is current as of early 2026.
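
“They accumulate” means a later block assumes earlier blocks in the same article have already run. A hypothetical pair (not from any specific stage):

```python
# Earlier block in an article: sets up state that later blocks reuse.
import numpy as np

x = np.array([1.0, 2.0, 3.0])
```

```python
# Later block in the same article: assumes the block above has already run.
print(x.mean())  # 2.0
```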

What this path is not

  • Not a research curriculum. We don’t derive proofs. Goodfellow et al.’s Deep Learning is the textbook for that.
  • Not a benchmark race. We talk about state-of-the-art conceptually; specific leaderboard numbers move every month.
  • Not vendor-locked. Models from Anthropic, OpenAI, Google, Meta, and open-source all show up. Pick what works for you.

Updating this path

The path is a living document. When something fundamental shifts (e.g. a new training paradigm, a new architecture class), the relevant stage gets updated and the change is noted at the top of that stage’s README.


Ready? Start with 01-math-foundations or jump to your track.