AI Engineer World's Fair 2026

Stop Chunking Like It's 2022

Talk · Intermediate

Every RAG system bets everything on a single chunk size. 500 tokens? 800? Pick wrong, and half your queries fail before they start. But here's what nobody tells you: every pick is wrong, because no single chunk size works for all queries. We ran oracle experiments across meeting transcripts, story chapters, and TV scripts. The result? Queries disagree violently on which chunk size works best, sometimes by 40 percentage points. Your "tuned" chunk size isn't a compromise; it's systematic underperformance. In this talk, we'll expose why fixed chunking fails and show you a dead-simple fix: index at multiple chunk sizes, then aggregate at retrieval time using Reciprocal Rank Fusion. No retraining. No LLM overhead. Just 1-37% better recall across benchmarks by letting queries vote with their ranks instead of forcing them into one-size-fits-all boxes. Walk away knowing exactly when your chunk size is sabotaging you, and how to stop leaving 20-40% of your retrieval performance on the table.

Speakers: Yuval Belfer, AI21 Labs; Niv Granot.
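To make the idea in the abstract concrete, here is a minimal sketch of fusing results from several chunk-size indexes with Reciprocal Rank Fusion. It is an illustration under stated assumptions, not the speakers' implementation: the function names, the choice to fuse at the document level, the sample hits, and the constant k=60 (a commonly used RRF default) are all assumptions made for this example.

```python
from collections import defaultdict


def docs_from_chunk_hits(chunk_hits):
    """Collapse ranked (doc_id, chunk_index) hits into a ranked list of doc ids.

    Assumption: each document takes the rank of its best-ranked chunk.
    """
    seen = set()
    ranking = []
    for doc_id, _chunk_index in chunk_hits:
        if doc_id not in seen:
            seen.add(doc_id)
            ranking.append(doc_id)
    return ranking


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical retrieval results for one query, keyed by chunk size in tokens.
# In a real system each list would come from a separate index.
hits_by_chunk_size = {
    100: [("doc_a", 3), ("doc_a", 4), ("doc_b", 0), ("doc_c", 7)],
    200: [("doc_b", 0), ("doc_a", 1), ("doc_c", 3)],
    500: [("doc_b", 0), ("doc_c", 1), ("doc_a", 0)],
}

rankings = [docs_from_chunk_hits(hits) for hits in hits_by_chunk_size.values()]
fused = rrf_fuse(rankings)

for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

In this toy data the 100-token index prefers doc_a while the 200- and 500-token indexes prefer doc_b; fusion ranks doc_b first because it places well in every index. RRF uses only rank positions, so it needs no score calibration across indexes, no retraining, and no LLM call at query time.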

About the Search & Retrieval Track

Search & Retrieval sessions at AI Engineer World's Fair 2026 in San Francisco.
