Human-agent collaboration is becoming more visual. The agents most teams ship today still wait for us to type a paragraph explaining what we're looking at. They cannot see a screen, navigate a UI that changes, or recover when an application throws an unexpected modal. That is the architectural gap between agents that demo well and agents that work alongside real teams in real software. Perception agents close it. They see and use computers the way people do, reason about what they see, and act with clicks and keystrokes.
Autoresearch sessions at AI Engineer World's Fair 2026 in San Francisco.
Wednesday, July 1, 2026
9:50 AM - 10:10 AM · 20m
Main Stage
Capacity: 4000 attendees

Antje Barth
Member of Technical Staff
Amazon AGI Lab
@anbarth
Member of Technical Staff at Amazon AGI, AI product leader, keynote speaker, and O'Reilly author. She also co-instructed Generative AI with Large Language Models with DeepLearning.AI.