Stage 11 — Agents
An agent is an LLM in a loop with tools. The model takes an action, observes the result, decides what to do next, and repeats until done. This loop is the basic unit of every “AI agent” you’ve seen — Claude Code, ChatGPT browsing, code-fixing bots, customer support agents, browser agents.
The architecture is simple. Making it reliable in production is not.
Prerequisites
- Stage 08 (prompting, structured outputs)
- Stage 09 (RAG, for retrieval-augmented agents)
Learning ladder
- Agent loop & architecture — the core pattern
- Tool use & function calling — how the model invokes external capabilities
- Memory systems — working, episodic, semantic
- Planning & reflection — ReAct, plan-and-execute, reflexion
- Multi-agent orchestration — supervisor, swarm, debate
- Guardrails & safety — keeping agents inside their lane
- Browser & vision agents — the embodied frontier
MVU
You can:
- Build a single-agent loop in <100 lines of code without a framework
- Define tools with clean schemas and good error semantics
- Articulate when to add a second agent vs more tools to one
- Prevent the most common failure modes (loops, drift, runaway cost)
Exercise
Build an agent that can search the web, read pages, and answer multi-hop questions. No agent framework allowed for the first version. Then add: a tool registry, basic memory (summarize old turns), retry logic, a budget cap. Then ask: would a framework actually help me here?
Why this stage matters
In 2026, “agents” is what most product teams want to ship. Most of them ship something that works in demos but fails in production. The difference is in this stage’s content: tool design, memory management, error handling, evaluation.
Hands-on companions
This stage has the most code-side companion content on the site. After the theory:
Ship the agent stack:
- /ship/09 — tools and function calling — tool registry, JSON-schema from type hints, OSS-model adapters
- /ship/10 — build the agent loop — three-axis budgets, history pruning, named failure modes (thrashing, premature giving-up, format drift)
- /ship/11 — multi-agent orchestration — supervisor / workers / critic, plus an honest “skip the orchestrator” flowchart
See it as a real product — three case studies, increasing in complexity:
- /case-studies/02 — code-review agent — propose-then-act tools, action-rate as the metric, when not to comment
- /case-studies/03 — research assistant — multi-agent fan-out, real cost/latency benchmark, synthesis-not-concatenation
- /case-studies/04 — customer-support bot — RAG + tools + escalation logic; the product that composes everything
See also
- Stage 09 — RAG — agentic RAG is one common pattern
- Stage 13 — Production
- Stage 14 — Applications
