demo · strategies + composition
Three ways to find the right chunk
Dense embeddings catch paraphrase. BM25 catches rare keywords. Hybrid (Reciprocal Rank Fusion) catches both. Pick a query and see the same 12 chunks ranked three different ways. Switch to Composition to see the full RAG tradeoff — top-k drives a wave through scores, retrieval, recall, and the production bill.
The three strategies
- Dense: embed the query, embed each chunk, rank by cosine similarity. Captures paraphrase ("how does attention work" matches a chunk titled "Self-attention" even though the phrasing differs).
- Sparse (BM25): classic IR scoring over token overlap. Captures rare keywords (a query for "LoRA" nails the LoRA chunk because "LoRA" is rare in the corpus). Misses paraphrase entirely.
- Hybrid (RRF): Reciprocal Rank Fusion. For each chunk, sum 1/(60 + rank) across both strategies. The constant 60 is the standard pick from Cormack 2009; tune it if you like. RRF gives a robust merge without having to calibrate the very different score scales of dense and sparse.
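The fusion rule is small enough to sketch directly. The snippet below is a minimal illustration, not the demo's own code: the function name `rrf_fuse`, the chunk ids, and the two input rankings are made up, and ranks are assumed to start at 1.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of chunk ids with Reciprocal Rank Fusion.

    Each chunk's fused score is the sum of 1 / (k + rank) over every
    ranking it appears in, with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk id so the
    # output is deterministic.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical rankings from the two retrievers (best chunk first).
dense_ranking = ["self_attention", "transformers", "lora", "bm25"]
sparse_ranking = ["transformers", "lora", "self_attention", "bm25"]

fused = rrf_fuse([dense_ranking, sparse_ranking])
print(fused)  # ['transformers', 'self_attention', 'lora', 'bm25']
```

Only ranks go in, never raw scores, which is why the cosine similarities and BM25 scores never need to be put on a common scale.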
What to look for
- Paraphrase wins. Try "How does self-attention work in transformers?" Dense correctly ranks the "Self-attention" chunk first; BM25 gets distracted by chunks that mention "transformer" more often.
- Rare keyword wins. Try "What's the difference between LoRA and full fine-tuning?" BM25 absolutely nails the LoRA chunk because the keyword is a strong signal. Dense gets it too, but with less margin.
- Compare the rank table. When dense and sparse ranks disagree on a chunk, hybrid does the actual merging. When they agree, hybrid just confirms.
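The rare-keyword effect comes from BM25's inverse document frequency term: a word that appears in few chunks gets a much larger weight than one that appears in many. Here is a rough sketch under stated assumptions: whitespace tokenization, the Lucene-style IDF variant, the common defaults k1 = 1.5 and b = 0.75, and a four-chunk toy corpus invented for illustration (not the demo's 12 chunks).

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every doc against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: in how many docs does each term occur?
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms (small df) get a large IDF weight.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency, saturated by k1 and normalized by doc length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus: "lora" occurs in one chunk, "weights" in three.
chunks = [
    "lora trains small low rank adapter matrices",
    "full fine tuning updates all model weights",
    "quantization stores model weights in fewer bits",
    "attention weights mix information across tokens",
]
scores = bm25_scores("lora weights", chunks)
best = max(range(len(chunks)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

Every chunk matches exactly one query term, yet the chunk containing "lora" scores highest by a wide margin because that term is the rarest in the corpus.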
What this leaves out
A real RAG stack adds: a reranker (a small cross-encoder that reads query + chunk together; dramatically better top-1 quality), filtering by metadata (date, source, permission), query rewriting (HyDE, FLARE), and evaluation on a golden set with retrieval@k and faithfulness metrics. This demo shows the core retrieval step in isolation — the place where most teams get the most lift for the least effort.
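The evaluation piece can be sketched in a few lines. This assumes the "retrieval@k" metric above means recall@k, the fraction of a query's known-relevant chunks that show up in the top k results. The golden set, chunk ids, and function names are hypothetical.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant chunk ids found in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("each golden query needs at least one relevant chunk")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mean_recall_at_k(results, golden, k):
    """Average recall@k over every query in the golden set."""
    per_query = [recall_at_k(results[q], golden[q], k) for q in golden]
    return sum(per_query) / len(per_query)


# Hypothetical golden set: query id -> chunk ids a human marked relevant.
golden = {"q1": {"c1", "c4"}, "q2": {"c2"}}
# Hypothetical retriever output: query id -> ranked chunk ids, best first.
results = {"q1": ["c1", "c3", "c4"], "q2": ["c5", "c2", "c1"]}

print(mean_recall_at_k(results, golden, k=1))  # 0.25
print(mean_recall_at_k(results, golden, k=3))  # 1.0
```

Running the same golden set against dense, sparse, and hybrid rankings would give a like-for-like comparison of the three strategies.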
Anchored to 09-rag/rag-fundamentals and 09-rag/hybrid-search-and-reranking.