Every RAG system bets everything on a single chunk size. 500 tokens? 800? Pick wrong, and half your queries fail before they start. But here's what nobody tells you: every pick is wrong, because no single chunk size works for all queries. We ran oracle experiments across meeting transcripts, story chapters, and TV scripts. The result? Queries disagree violently on which chunk size works best, sometimes by 40 percentage points. Your "tuned" chunk size isn't a compromise; it's systematic underperformance. In this talk, we'll expose why fixed chunking fails and show you a dead-simple fix: index at multiple chunk sizes, then aggregate at retrieval time using Reciprocal Rank Fusion. No retraining. No LLM overhead. Just 1-37% better recall across benchmarks by letting queries vote with their ranks instead of forcing them into one-size-fits-all boxes. Walk away knowing exactly when your chunk size is sabotaging you, and how to stop leaving 20-40% of your retrieval performance on the table.
Speakers: Yuval Belfer (AI21 Labs) and Niv Granot.
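For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of how ranked results from several chunk-size indexes could be merged. It is not the speakers' implementation. It assumes each index returns a best-first list of parent document IDs (one plausible way to make chunks of different sizes comparable), and it uses the commonly cited constant k = 60. The function name `rrf_fuse` and the example IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with Reciprocal Rank Fusion (RRF).

    rankings: dict mapping a label (here, a chunk size in tokens) to a
    list of item IDs ordered best-first. Each list contributes
    1 / (k + rank) to an item's score, with rank starting at 1.
    k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        seen = set()
        rank = 0
        for item_id in ranked_ids:
            # Several chunks may map to the same parent document;
            # count only its best position in this list.
            if item_id in seen:
                continue
            seen.add(item_id)
            rank += 1
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical retrieval results from three indexes built at
# different chunk sizes (keys are chunk sizes in tokens).
example_rankings = {
    200: ["doc_b", "doc_a", "doc_c"],
    500: ["doc_a", "doc_b", "doc_d"],
    1000: ["doc_a", "doc_d", "doc_b"],
}
fused = rrf_fuse(example_rankings)
print(fused)
```

Because RRF uses only rank positions, the raw similarity scores from each index never need to be calibrated against one another, which is consistent with the abstract's claim that the approach needs no retraining and no LLM calls.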
Search & Retrieval sessions at AI Engineer World's Fair 2026 in San Francisco.
Tuesday, June 30, 2026
3:20 PM - 3:40 PM · 20m
Track 3 · Room 2003
Capacity: 250 attendees

Yuval Belfer
Senior Developer Advocate
AI21 Labs
Senior Developer Advocate at AI21 Labs; also involved with AI Tinkerers and YAAP (Yet Another AI Podcast).

Niv Granot
Niv Granot is speaking at AI Engineer World's Fair 2026.