Track A — Software Engineer → AI Product Engineer

For someone who can already write code at a senior-engineer level and wants to ship AI-powered features in production. You’ll skip the math/ML internals on the first pass and circle back when something breaks that requires it.

Time: 8–12 weeks at ~10 hours/week. Endpoint: you can scope, build, evaluate, and operate an AI feature — RAG, agent, or both — at production quality.


What you skip (for now)

  • Stages 1–4 — math, ML fundamentals, neural network internals, language modeling history. You don’t need them to ship; come back when you hit a debugging wall.

You’ll also skim, rather than read closely, large parts of:

  • Stage 6’s GPT-from-scratch.
  • Stage 10’s RLHF/DPO/GRPO mechanics.

You can build a great AI product without ever training a model. But you do need to understand enough of what’s happening under the hood to debug it.


Week-by-week

Week 1 — Mental model + first calls

Read:

Build:

  • 100 API calls to Claude or GPT through the official SDK.
  • A CLI tool: stdin → model → stdout, with streaming.
  • Vary temperature and top-p; observe how the outputs change.
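To see what temperature and top-p actually do, you can reproduce them locally on a toy logit vector — no API call needed. This is a sketch of the standard definitions (temperature rescales logits before softmax; top-p keeps the smallest nucleus of tokens reaching cumulative probability p); providers may differ in small details like tie-breaking.

```python
import math

def next_token_distribution(logits, temperature=1.0, top_p=1.0):
    # Temperature rescales logits before softmax: <1 sharpens, >1 flattens.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p (nucleus) sampling: keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    z = sum(probs[i] for i in kept)
    return {i: probs[i] / z for i in kept}

# Lower temperature concentrates mass on the top token; top-p then
# drops the long tail entirely.
dist = next_token_distribution([2.0, 1.0, 0.1], temperature=0.5, top_p=0.9)
```

Run it with temperature 1.0 vs 0.5 and top_p 1.0 vs 0.9 and watch the distribution shift — that’s the whole intuition behind “why does my output get repetitive at low temperature.”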

Goal at end of week: “I can call an LLM and explain what every parameter does.”

Week 2 — Prompting and structured output

Read:

Build:

  • An email classifier that takes a subject + body and returns one of 8 categories with reasons, as JSON.
  • Run it on 100 test emails. Calculate accuracy.
  • Make it output JSON 100/100 times reliably (use strict mode / tool calls).
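Strict mode / tool calls are the sturdy way to guarantee structure; a validate-and-retry loop is the fallback pattern when you can’t use them. A minimal sketch — `call_model` is a hypothetical stand-in for your SDK call, and the category set is illustrative:

```python
import json

CATEGORIES = {"billing", "bug", "feature", "spam", "sales",
              "support", "legal", "other"}  # example 8 categories

def classify(email, call_model, max_retries=2):
    prompt = (
        f"Classify this email into one of {sorted(CATEGORIES)}.\n"
        'Return only JSON: {"category": "...", "reasons": ["..."]}\n\n'
        f"{email}"
    )
    for attempt in range(max_retries + 1):
        raw = call_model(prompt)
        try:
            out = json.loads(raw)
            # Validate the shape, not just the syntax.
            if out.get("category") not in CATEGORIES:
                raise ValueError("category not in allowed set")
            if not isinstance(out.get("reasons"), list):
                raise ValueError("reasons must be a list")
            return out
        except (json.JSONDecodeError, ValueError) as e:
            # Feed the error back so the retry can self-correct.
            prompt += f"\n\nYour last output was invalid ({e}). Return only valid JSON."
    raise ValueError("model never produced valid JSON")
```

The same validate-or-retry skeleton is what you’ll reuse for every structured-output call; with strict mode or tool calls the retry branch should almost never fire.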

Goal: “I trust my LLM calls to give me the structure I asked for.”

Week 3 — Embeddings and semantic search

Read:

Build:

  • Take 1k documents (your notes, a Wikipedia subset, arXiv abstracts).
  • Embed with text-embedding-3-small or bge-large-en-v1.5.
  • Store in pgvector or ChromaDB.
  • Query CLI: text in → top-5 nearest neighbors out.
  • Evaluate: write 30 query/expected-doc pairs; measure recall@5.
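The recall@5 metric from the last step is a few lines of plain Python. A sketch, where `search` is whatever function you built (here a hypothetical stand-in returning ranked doc ids):

```python
def recall_at_k(pairs, search, k=5):
    """Fraction of (query, expected_doc) pairs where the expected
    doc appears in the top-k results."""
    hits = 0
    for query, expected_doc in pairs:
        if expected_doc in search(query)[:k]:
            hits += 1
    return hits / len(pairs)
```

Thirty pairs is enough to notice a broken index or a bad embedding model; it is not enough to compare two models that differ by a few points.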

Goal: “I can build semantic search and tell you when it’s broken.”

Week 4 — Full RAG

Read:

Build:

  • Wrap your search from week 3 with a generation step: retrieve → prompt → answer with citations.
  • Add hybrid search (BM25 + dense).
  • Add a reranker (Cohere rerank-3.5, bge-reranker-v2-m3, or LLM-as-judge).
  • Build a 50-query eval set with expected sources.
  • Measure: recall@10, faithfulness via LLM judge.

Goal: “My RAG works, and I can prove it.”

Week 5 — Advanced RAG and agents

Read:

Build:

  • An agent loop in <100 lines, no framework.
  • Five tools: search the web, read URL, search your KB, calculator, finalize.
  • Make it answer a multi-hop question correctly: “Who was the CEO of Apple when the iPhone 7 launched?”
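The no-framework agent loop is genuinely this small. A sketch, assuming `call_model` returns a parsed action dict (in practice you’d get this from native tool-call support); `tools` maps names to plain functions, with `finalize` as the exit:

```python
import json

def run_agent(question, call_model, tools, max_steps=10):
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        # Expected shape: {"tool": "<name>", "args": {...}}
        action = call_model(messages)
        if action["tool"] == "finalize":
            return action["args"]["answer"]
        # Execute the tool and feed the result back into the transcript.
        result = tools[action["tool"]](**action["args"])
        messages.append({"role": "assistant", "content": json.dumps(action)})
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    raise RuntimeError("step budget exhausted without finalize")
```

Everything a framework adds — parallel tool calls, memory, tracing — is layered on top of exactly this loop, which is why writing it once by hand pays off.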

Goal: “I can write an agent loop from scratch and explain every step.”

Week 6 — Agent depth

Read:

Build:

  • Add: budget caps, retry-on-tool-failure, conversation summarization for long sessions, and a basic input filter (length cap, profanity check).
  • Try the same agent with a reasoning model (Claude with extended thinking, o-series). Compare quality, cost, latency.
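Two of the hardening steps above sketched as standalone helpers — a per-session budget cap and retry-with-backoff around tool calls. Names and the exception type are illustrative, not from any framework:

```python
import time

class BudgetExceeded(Exception):
    pass

def make_charger(tracker, cap_usd):
    """Returns a charge(amount) function that raises once the
    session's cumulative spend crosses cap_usd."""
    def charge(amount_usd):
        tracker["spent"] += amount_usd
        if tracker["spent"] > cap_usd:
            raise BudgetExceeded(
                f"spent ${tracker['spent']:.4f} > cap ${cap_usd:.4f}")
    return charge

def call_tool_with_retry(tool, args, retries=2, backoff=0.5):
    """Retry transient tool failures with exponential backoff;
    re-raise on the final attempt so the agent loop can surface it."""
    for attempt in range(retries + 1):
        try:
            return tool(**args)
        except Exception:
            if attempt == retries:
                raise
            time.sleep(backoff * (2 ** attempt))
```

Call `charge()` after every model and tool invocation inside the loop; a runaway agent then fails fast with a budget error instead of a surprise invoice.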

Goal: “I know when to use a reasoning model and when it’s overkill.”

Week 7 — Production discipline

Read:

Build:

  • Add tracing to your RAG and your agent (Langfuse, Phoenix, or LangSmith).
  • Add cost monitoring per request.
  • Add prompt caching for static prefixes.
  • Set up two-tier routing (cheap model first, fall back to a frontier model).
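The two-tier router is a small function once you have a confidence check. A sketch with stand-ins: `cheap`, `frontier`, and `is_confident` are placeholders for your own model calls and heuristic (self-reported confidence, a logprob threshold, or a cheap classifier):

```python
def route(prompt, cheap, frontier, is_confident):
    """Answer with the cheap model; escalate to the frontier model
    when the confidence check fails. Returns (answer, tier) so the
    trace records which tier served the request."""
    answer = cheap(prompt)
    if is_confident(answer):
        return answer, "cheap"
    return frontier(prompt), "frontier"
```

Log the tier per request alongside cost: the escalation rate tells you whether the cheap model is pulling its weight, and a rate near 100% means the router is pure overhead.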

Goal: “I can debug a production issue from the trace alone.”

Week 8 — Guardrails, hallucination, and evals

Read:

Build:

  • Add: input validation (PII detection, length caps, prompt injection scan), output validation (schema check, citation verification).
  • Build a regression eval: 50 cases, run on every prompt change.
  • Add an LLM-as-judge faithfulness check.
  • Wire all of this into a CI pipeline.
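The regression eval that gates CI can be a plain function over your 50 cases. A sketch — `system` is your RAG or agent entry point and `judge` is any pass/fail check (exact match, schema check, or an LLM judge), both stand-ins here:

```python
def run_regression(cases, system, judge):
    """Run every case through the system; return the inputs that fail
    the judge. CI goes red if this list is non-empty."""
    failures = []
    for case in cases:
        output = system(case["input"])
        if not judge(output, case["expected"]):
            failures.append(case["input"])
    return failures
```

In CI, run this on every prompt or retrieval change and fail the build on any non-empty result; the returned inputs are your debugging starting point.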

Goal: “I won’t ship a regression.”

Weeks 9–12 — Ship something real

Pick one of:

  • A vertical agent for a domain you know (legal contract review, recipe assistant, code reviewer for a specific framework).
  • A vertical RAG over a corpus you care about (your own notes, a public dataset, internal docs at work).
  • A real workflow (transcribe meetings → action items, summarize daily news, monitor a topic for changes).

Polish to “would show a stranger” quality. Write up the engineering decisions. Post on GitHub + your blog or LinkedIn.

Goal at end: something public with your name on it.


When to backtrack into the foundations

Skip Stages 1–4 until one of these happens:

You can always come back. The path is a graph, not a staircase.


What “done with Track A” looks like

You can:

  • Take a fuzzy product idea (“we should add AI to X”) and design a concrete system.
  • Estimate cost and latency before writing code.
  • Build a RAG or agent end-to-end with proper evals and guardrails.
  • Debug a production failure from a trace in <30 minutes.
  • Articulate when fine-tuning would help and when it wouldn’t.
  • Ship without breaking the bank.

That’s most of what an AI product engineer does day-to-day. From here, the next investments are:

  • Stage 14 case studies to lift patterns from real products.
  • Stage 15 career to think about specialization.
  • Stages 1–7 when curiosity or job needs pull you there.

See also