Stage 13 — Production

Shipping AI is different from shipping a feature. Models are stochastic. Inputs are open-ended. Costs scale unpredictably. This stage covers the operational layer: deploying, evaluating, monitoring, and trusting AI in production.

Prerequisites

  • Stages 08–11 (you’ve built something to ship)

Learning ladder

  1. Deployment architectures — API providers, self-hosted, edge
  2. Evaluation & benchmarks — offline, online, LLM-as-judge
  3. Guardrails — input/output filters, schema validation, jailbreak defense
  4. Observability & tracing
  5. Cost & latency — caching, batching, speculative decoding, model routing
  6. Hallucination mitigation
  7. Data systems for AI — ingestion, skew, drift, feedback loops, lineage
  8. Enterprise considerations — security, compliance, data boundaries

MVU

You can:

  • Take a working LLM feature from prototype to production with confidence
  • Set up evals, observability, and guardrails before shipping
  • Estimate per-query cost and latency
  • Diagnose a regression in production within hours, not days
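"Estimate per-query cost" is usually back-of-envelope arithmetic: tokens in and out, priced per million. A sketch, with placeholder prices (substitute your provider's current rate card):

```python
# Placeholder prices — not any specific provider's rates.
PRICE_PER_M_INPUT = 3.00    # USD per 1M input tokens (assumed)
PRICE_PER_M_OUTPUT = 15.00  # USD per 1M output tokens (assumed)

def cost_per_query(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of one model call."""
    return (prompt_tokens * PRICE_PER_M_INPUT
            + completion_tokens * PRICE_PER_M_OUTPUT) / 1_000_000

# A RAG query with a 4k-token context and a 500-token answer:
c = cost_per_query(4000, 500)  # 0.012 + 0.0075 = 0.0195 USD
```

Multiply by expected daily query volume before you ship; output tokens often dominate because they are priced several times higher than input tokens.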

Exercise

Take a Stage 09 RAG or a Stage 11 agent. Add: end-to-end evals, production observability, two layers of guardrails, a cost cap. Run for a week with real or simulated traffic. Read the traces. Find one bug.
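One of the exercise's guardrail layers, the cost cap, can be a few lines of state in front of the model client. A sketch under assumed names (the class and cap value are illustrative):

```python
class CostCap:
    """Hard spending cap: refuse calls once the budget is exhausted."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent = 0.0

    def allow(self, estimated_cost: float) -> bool:
        """Reserve budget for one call; False means the cap is hit."""
        if self.spent + estimated_cost > self.cap_usd:
            return False
        self.spent += estimated_cost
        return True

cap = CostCap(cap_usd=1.00)
cap.allow(0.40)   # True — 0.40 spent
cap.allow(0.40)   # True — 0.80 spent
cap.allow(0.40)   # False — would exceed the $1.00 cap
```

Failing closed here is deliberate: a refused call is a visible, debuggable event in your traces, while an unbounded bill is not.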

Hands-on companions

The entire production-ops layer of this stage has a code-side companion in /ship’s Production release.

For end-to-end examples that put all of this into one running service, the /case-studies section ships four; pick the one closest to what you’re shipping.

See also