There is no best model. There's only the right model for a given task, and the right model depends on your team's preferences, not a benchmark score. This talk makes the case for preference-aligned routing: choosing models by the constraints that actually matter (cost, latency, task type, model preference) instead of a single leaderboard number. We'll demo a sub-200ms routing decision running on a purpose-built 30B MoE model with no application code changes, walk through real coding workflows routing most traffic to open models without losing accuracy, and show where this goes next: evals, caching, and personalization.
Speakers: Archana Kamath; Tyler Gillam.
AI Architects: AI Factories sessions at AI Engineer World's Fair 2026 in San Francisco.
Thursday, July 2, 2026
12:05 PM - 12:25 PM · 20m
Leadership 2 · Room 3020
Capacity: 550 attendees
