click · drag · type · learn
57 demos. 13 narrated walkthroughs. Zero theory.
Every demo is anchored to a single article and built around one mechanism. Click any card. The animated ones auto-play with voice narration; the interactive ones let you mess with the dials yourself.
The marquee demos
The four that earn the visit. Two are animated walkthroughs; all four are tier-A (real models or real algorithms running in your browser, not cartoons).
01
Inference Pipeline
The capstone. Question → tokens → embeddings → 12 layers → logits → sampling → next token → answer. Step through a real GPT-2 forward pass end to end.
02
Attention Inspector ▶ animated
Three modes: narrated walkthrough, interactive inspector, and a Composition view that drives one slider through pattern → diversity → entropy → semantic resolution. Real GPT-2 attention.
03
Diffusion Process ▶ animated
Voice-narrated forward + reverse diffusion on a 2D smiley, plus the full interactive sandbox with four shapes. The toy version of every image generator.
04
RAG Visualizer
Two modes: a 3-strategy comparison (dense / BM25 / hybrid) and a Composition view where top-k drives a wave through scores → retrieved → recall → cost-per-query.
All 57 demos
Foundations
The math and the building blocks every model rests on.
live
Linear Algebra Lab
Drag two vectors, watch dot product, projection, and angle update live. The math behind cosine similarity, attention scores, and embeddings.
live
Eigenvectors Lab
Spin a vector, apply a matrix, see which directions only scale instead of rotating. The math behind PCA, PageRank, and stability analysis.
live
Bayes Theorem Lab
Why a 99%-accurate medical test can be mostly wrong. Drag prevalence, sensitivity, and false-positive rate; see the posterior live on a 1000-person grid.
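The whole demo is one application of Bayes' rule. A minimal sketch with hypothetical numbers (1% prevalence, 99% sensitivity, 5% false-positive rate):

```python
prevalence = 0.01
sensitivity = 0.99       # P(test positive | sick)
false_positive = 0.05    # P(test positive | healthy)

# P(test positive) over the whole population
p_pos = sensitivity * prevalence + false_positive * (1 - prevalence)

# Bayes' rule: P(sick | test positive)
posterior = sensitivity * prevalence / p_pos
print(round(posterior, 3))   # ~0.167 — a positive result is wrong 5 times out of 6
```

Even a 99%-sensitive test yields a posterior of about 17% when the condition is rare, which is exactly the surprise the grid makes visible.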
live
▶ animated
Gradient Descent
The math that trains every neural network. Drop a ball on a loss curve, watch it find the minimum. Animated explainer + full interactive sandbox.
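The ball-on-a-curve loop is a few lines. A sketch on the toy loss f(x) = (x − 3)², with a made-up learning rate of 0.1:

```python
def grad(x):
    return 2 * (x - 3)       # derivative of (x - 3)^2

x = 0.0                      # drop the ball at x = 0
for _ in range(100):
    x -= 0.1 * grad(x)       # step opposite the gradient

print(round(x, 4))           # converges to the minimum at x = 3
```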
live
Entropy & KL Divergence
Two distributions. Drag the bars. Watch entropy, cross-entropy, and KL divergence update live. The math behind every neural-network loss function.
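The three quantities the demo updates live are one-liners over the bar heights. A sketch with two made-up distributions:

```python
import math

p = [0.5, 0.25, 0.25]   # "true" distribution
q = [0.7, 0.2, 0.1]     # model's distribution

entropy = -sum(pi * math.log2(pi) for pi in p)
cross_entropy = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
kl = cross_entropy - entropy   # D_KL(p || q), always >= 0
```

Dragging the bars so q matches p drives KL to zero; that gap is exactly what cross-entropy loss minimizes.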
live
Bias-Variance Tradeoff
Slide the polynomial degree from 1 to 15. Watch a straight line miss the curve, a degree-3 fit nail it, and a degree-15 fit chase noise. The most important plot in ML.
live
Confusion Matrix Lab
Slide the classifier threshold and watch precision, recall, F1, and the ROC curve update live. Why there is no single 'right' threshold.
live
▶ animated
K-Means Clustering
Voice-narrated walkthrough of one full convergence, plus the interactive sandbox. The classic unsupervised algorithm in motion.
live
Linear Regression Lab
Drag a line through noisy data. Watch MSE update live. Click 'fit' for the closed-form OLS solution. The simplest supervised model.
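The 'fit' button is the closed-form OLS solution: slope = cov(x, y) / var(x), intercept from the means. A sketch on toy data that lies exactly on y = 2x + 1:

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx   # the line that minimizes MSE, no iteration needed
```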
live
Activation Functions Lab
ReLU, Leaky ReLU, GELU, SiLU, tanh, and sigmoid overlaid on one plot — including their derivatives, which is what backprop multiplies through.
live
▶ animated
Backpropagation Walkthrough
A tiny computational graph. Animated explainer with voice narration + interactive step-through. Backprop is not magic — it's calculus, applied mechanically.
live
Neural Network Trainer
A 2-layer MLP trains live on moons / circles / XOR / blobs. Watch the decision boundary deform every frame as SGD adjusts the weights.
live
Optimizer Race
SGD, momentum, RMSprop, and Adam race down the same loss landscape. The visceral answer to 'why is Adam the default everywhere?'
live
n-gram Language Model
Predict the next word the way NLP did it for 20 years before transformers — by counting. Pick a corpus, pick an n, watch the limits emerge.
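"By counting" means literally that. A bigram sketch on a toy corpus:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count which word follows which: P(next | prev) by pure counting.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(prev):
    return counts[prev].most_common(1)[0][0]

print(predict("the"))   # 'cat' — it follows 'the' most often in this corpus
```

The limit emerges immediately: any context the corpus never contained gets no prediction at all.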
live
▶ animated
RNN Unrolling
Animated side-by-side: vanishing vs exploding gradients on the same RNN cell, narrated. Plus the interactive sandbox with a real W_h matrix.
live
Perplexity Calculator
The number every LM paper opens with — computed live. Score any text against three n-gram models trained on different corpora; see why 'lower is better' only works inside a tokenizer family.
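Perplexity is just the exponentiated average negative log-likelihood per token. A sketch with hypothetical per-token probabilities:

```python
import math

# Probabilities a model assigned to each token of a 4-token sentence (made up).
token_probs = [0.2, 0.5, 0.1, 0.4]

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)   # geometric-mean inverse probability
```

Because the average is per *token*, two models with different tokenizers score different token counts on the same text, which is why cross-tokenizer comparisons mislead.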
LLM Internals
How a transformer actually reads, attends, and predicts.
live
Tokenizer Surgery
Paste any text, see how BPE breaks it into tokens, compare GPT-2 / GPT-4 / GPT-4o, find out why your prompt is more expensive than you think.
live
Embedding Playground
Pick two phrases, see their cosine similarity. Watch synonyms cluster and antonyms drift in a 2D PCA scatter of real sentence embeddings.
live
Word Arithmetic
king − man + woman = ? Compose vector arithmetic over a curated 350-word vocabulary and watch the nearest neighbor shift live.
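The arithmetic plus a cosine-similarity nearest-neighbor search is the whole trick. A sketch with made-up 3-d vectors (toy numbers, not real embeddings):

```python
vecs = {
    "king":  [0.9, 0.8, 0.1],
    "man":   [0.5, 0.1, 0.1],
    "woman": [0.5, 0.1, 0.9],
    "queen": [0.9, 0.8, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

# king - man + woman, then nearest neighbor (excluding the query word itself)
target = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]
nearest = max((w for w in vecs if w != "king"), key=lambda w: cosine(vecs[w], target))
print(nearest)   # 'queen'
```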
live
▶ animated ◆ composition
Attention Inspector
Three modes: narrated walkthrough, interactive inspector, and a Composition view that drives one slider through pattern → diversity → entropy → semantic resolution. Real GPT-2 attention.
live
Head Specialization Gallery
All 12 heads of one layer at once, in small multiples. See which head learned coreference, which one tracks punctuation, which one just smears.
live
▶ animated
Positional Encoding Lab
Sinusoidal vs learned vs RoPE vs ALiBi. Same sequence, four encodings, animated cursor sweeping through positions — see how each tells the model 'this token is 3rd from the start'.
live
Scaling Laws Calculator
Slide model size and training tokens, watch Chinchilla's loss curve respond. See exactly where GPT-3 went wrong (undertrained) and why Llama 3 trains 70B on 15T tokens.
live
▶ animated
MoE Routing Visualizer
Watch a Mixture-of-Experts router dispatch tokens, one at a time, to specialist experts. Lights up which experts fire and accumulates load. The trick that powers Mixtral, Grok, GPT-4 (rumored), DeepSeek-V3.
live
Reasoning vs Non-Reasoning
Standard LLM versus a reasoning model on hard problems. Watch test-time compute earn its tokens — the trade behind o-series, R1, and extended thinking.
live
Long Context Attention
Sliding window, sparse, and global-local attention masks visualized. Why O(N²) is the long-context wall and how modern models work around it.
live
KV Cache
Toggle the cache on and off, watch per-token compute explode. The single biggest optimization in modern inference engines.
live
Layer Norm
Slide a messy input distribution. Watch it get centered, normalized to unit variance, then rescaled by learnable γ and β. The reason every transformer block trains stably.
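Center, normalize, rescale: that's the entire forward pass. A sketch (γ and β fixed to their identity values here; in a real model they're learned per feature):

```python
def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    return [gamma * (xi - mean) / (var + eps) ** 0.5 + beta for xi in x]

out = layer_norm([2.0, 4.0, 6.0, 8.0])   # mean ~0, variance ~1, whatever went in
```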
live · capstone
Inference Pipeline
The capstone. Question → tokens → embeddings → 12 layers → logits → sampling → next token → answer. Step through a real GPT-2 forward pass end to end.
Building with LLMs
Once you have a model: prompts, retrieval, tuning, agents.
live
Sampling Knobs
Drag temperature, top-p, top-k. Watch the next-token probability distribution reshape live on real GPT-2 logits.
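Temperature and top-k are small transforms on the logits before the softmax. A sketch with hypothetical next-token logits:

```python
import math

logits = {"the": 4.0, "a": 3.0, "cat": 1.0, "zebra": -2.0}   # made-up values

def sample_dist(logits, temperature=1.0, top_k=None):
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]          # drop everything outside the top k
    exps = [math.exp(v / temperature) for _, v in items]
    z = sum(exps)
    return {tok: e / z for (tok, _), e in zip(items, exps)}

sharp = sample_dist(logits, temperature=0.5)   # low T: sharper, more deterministic
flat = sample_dist(logits, temperature=2.0)    # high T: flatter, more random
```

Low temperature concentrates mass on the top token; top-k simply zeroes out the tail before renormalizing.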
live
Chain-of-Thought Effect
Same model, same problem, two prompts. Watch 'let's think step by step' rescue it from getting the answer wrong on five classic reasoning tasks.
live
Few-shot vs Zero-shot
Pick a task. Slide from 0 to 3 examples in the prompt. Watch the same model go from useless to nailing it without changing the model at all.
live
Structured Outputs
Schema-enforced JSON, side by side with freeform prose. Live JSON validation. Why production AI runs on structured outputs, not 'please respond in JSON'.
live
▶ animated
Beam Search vs Sampling
Three decoding strategies side by side: greedy, beam search, and stochastic sampling. Watch the beam tree expand and prune step by step.
live
◆ composition
RAG Visualizer
Two modes: a 3-strategy comparison (dense / BM25 / hybrid) and a Composition view where top-k drives a wave through scores → retrieved → recall → cost-per-query.
live
Chunking Strategies
Paste a long document, switch between fixed-char / fixed-token / sentence / paragraph chunking, see chunk sizes and overlap regions.
live
▶ animated
Reranker Lab
Watch a cross-encoder reranker re-order results from initial dense retrieval — animated reorder shows the shuffle in motion. The single biggest quality lever in production RAG.
live
LoRA Lab
Visualize Low-Rank Adaptation: slide the rank, watch a small B·A factorization approximate a full weight update at 800× fewer parameters.
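The parameter saving is simple arithmetic: a full d×d update versus the rank-r B·A factorization. A sketch with hypothetical sizes (the exact ratio in the demo depends on the layer shapes and rank it uses):

```python
d, r = 4096, 8                     # hypothetical hidden size and LoRA rank
full_params = d * d                # full-rank delta-W
lora_params = d * r + r * d        # B is (d x r), A is (r x d)
ratio = full_params / lora_params
print(ratio)                       # 256x fewer trainable parameters at these sizes
```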
live
Quantization Lab
Watch FP32 weights cast down to INT8, INT4, and below. See memory drop 8× while reconstruction error stays manageable. The trick that fits a 70B model on a single GPU.
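A minimal symmetric INT8 sketch: pick a scale, round, clamp, dequantize, measure the damage (toy weights, one scale for the whole tensor; real schemes quantize per channel or per group):

```python
weights = [0.42, -1.3, 0.07, 0.95, -0.6]   # made-up FP32 weights

scale = max(abs(w) for w in weights) / 127                 # map the largest to 127
q = [max(-127, min(127, round(w / scale))) for w in weights]
dequant = [qi * scale for qi in q]

max_err = max(abs(w - d) for w, d in zip(weights, dequant))
# max_err is bounded by scale / 2: each weight lands on the nearest INT8 step
```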
live
RLHF Preference Simulator
Pick A or B on paired completions. Watch a reward model emerge from your preferences. The pattern behind RLHF, DPO, and every aligned chat model.
live
◆ composition
Distillation Lab
Real GPT-2 teacher logits, a learnable student, KL divergence as the loss. Includes a Composition view that drives temperature through entropy → gradient field → trajectory → cost-per-token.
live
▶ animated
Agent Trace Viewer
Step through agent loops, frame by frame, or hit play and watch them unfold. User → reasoning → tool calls → tool results → answer. Four hand-crafted traces.
live
Tool Use Builder
The JSON-schema dance behind function calling. Pick a query, watch the model emit a tool call, see the runtime execute it, see the answer come back.
live
▶ animated
Multi-Agent Orchestrator
Narrated fan-out → parallel work → fan-in walkthrough on a curated scenario, plus the interactive sandbox with three patterns. The pattern behind Claude's research mode.
live
Memory Systems
Working, episodic, semantic — the three layers of agent memory. Walk a conversation through time, watch what each layer remembers.
live
ReAct & Reflection
The Reason-Act-Observe loop, plus self-reflection. Watch an agent walk through reasoning, take actions, observe results — and sometimes catch its own mistakes.
live
▶ animated
Diffusion Process
Voice-narrated forward + reverse diffusion on a 2D smiley, plus the full interactive sandbox with four shapes. The toy version of every image generator.
live
ViT Patches
See how an image becomes a sequence of tokens. Each patch flattens to a vector and feeds a transformer just like text — same self-attention, different shape.
live
CLIP Zero-Shot Classifier
One shared 512-d space for images and text. Pick an image, watch a real CLIP-ViT-B/32 model rank captions; pick a caption, see which images match. The trick behind every reverse-search and image-conditioning stack.
Production
Shipping AI that stays shipped.
live
Cost & Latency Calculator
Pick a model, plug in your workload, see monthly cost and p50/p99 latency. Compare prompt-caching, batching, and speculative decoding side by side.
live
LLM-as-Judge
The eval pattern behind every fast iteration loop. Pick a sample, tune rubric weights, watch the winner change. The technique modern AI dev cycles run on.
live
Guardrails Lab
See input + output safety filters in action. Type any prompt, watch which guards fire on each layer of the defense.
live
▶ animated
Observability Trace
The waterfall view every production AI app needs. Press play and watch spans draw in real-time order. Find bottlenecks, spot retries, allocate cost.
live
Hallucination Mitigation
Same question, three responses — ungrounded, RAG-grounded, abstention. See where each one earns trust and where it doesn't.
live
Calibration Lab
Reliability diagrams + ECE on four model types. See why most models are overconfident and how temperature scaling fixes it.
live
Text-to-SQL Playground
Pick a natural-language question. See the generated SQL, then run it on a real (tiny) database. The application that proved LLMs could ship.
live
Browser Agent
Watch an agent navigate a web page step by step — observe the DOM, click, type, observe again. The pattern behind Computer Use and Operator.