AI Engineer WF 2026
Felipe Blanes

Amazon

Felipe Blanes is speaking at AI Engineer World's Fair 2026.

Sessions (1)

Designing Evals That Earn User Trust
2:50 PM·Expo Stage 2

Most teams measure their agent against a benchmark, ship it, and hope. But when your agent serves real users, a benchmark won't tell you whether it's actually working. This session is about building an eval suite that captures what success looks like in production, runs against real user workflows, and feeds back into product decisions. Here's the flywheel we use in practice: start with what success looks like from the user's perspective, instrument production workflows to capture those signals, diagnose where the agent falls short, and feed those insights into the next thing you build. You'll see how this flywheel shaped concrete product bets, turning eval results from a report card into a discovery tool.

Expo Stage 2·intermediate·talk