
Ship a production LLM stack from open source.

A code-along walkthrough for the 80% of LLM work that isn't pretraining: pick a base model, run it locally with vLLM, build an eval harness, wire RAG and tools, ship it with observability and cost tuning. Every concept cross-references the matching article and demo on this site, so the theory and the visualization are one click away.
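As a taste of the harness work on this path, here is a minimal sketch of an eval loop. The `call_model` stub and all names here are illustrative stand-ins, not the course's actual code; a real harness would point at a served model endpoint instead.

```python
# Minimal eval-harness sketch: score a model against expected answers.
# `call_model` is a stub standing in for a real model endpoint.

def call_model(prompt: str) -> str:
    # Stub: a real harness would call a served model here.
    return "4" if "2 + 2" in prompt else ""

def run_eval(cases: list[dict]) -> float:
    """Return exact-match accuracy over a list of {prompt, expected} cases."""
    hits = 0
    for case in cases:
        answer = call_model(case["prompt"]).strip()
        if answer == case["expected"]:
            hits += 1
    return hits / len(cases)

cases = [
    {"prompt": "What is 2 + 2?", "expected": "4"},
    {"prompt": "Name the capital of France.", "expected": "Paris"},
]
print(run_eval(cases))  # exact match is the simplest metric; real harnesses layer on more
```

Exact match is deliberately crude; the eval-harness step on the path builds toward richer metrics, but the loop shape (cases in, scores out) stays the same.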

18 steps · 18 ready to read · 6h 28m of reading · 9h 5m of hands-on
who this is for

Engineers who deploy, not pretrain

if you build apps on LLMs: you're the audience. The 80% of working AI engineers who build on OSS or API base models and need to ship reliable applications on top.

if you've finished /build: this is the next leg. /build taught you how a model works; /ship teaches you how to run one in production: serving, retrieval, tools, observability, cost.

if you're new to AI eng: start here, not /build. You don't need to write a transformer from scratch to ship an LLM application well. Start at step 00; we'll cover what you need from /articles as it comes up.
the path

The 18-step curriculum

Three release tranches: Foundations ships first; Building and Production follow. Step numbers are stable, so bookmark step 03 today and it will still be step 03 next month.

How this fits the rest of the site

Every step cross-references the matching theory article and demo. When you wire RAG in step 06, the article links to RAG Fundamentals and the RAG Visualizer. When you build the observability layer in step 12, the Observability Trace demo shows what the spans look like. The site becomes a multi-modal study environment: shell + editor in some tabs, math + visualizations in others.

Status: 18 live · 0 wip · 0 stubbed. Ships in three releases (foundations → building → production).