At Sakana AI we build agents that run for hundreds of turns to read literature, run experiments, and draft papers. The model rarely breaks. The harness around it is the weak point: the agent contradicts a decision it made 80 turns ago, redoes finished work, or drifts from the question it started on. This is the binding-constraint thesis: for long-horizon tasks, reliability is set as much by the harness as by the model, as recent autoresearch efforts clearly illustrate. This talk is a field guide to the harness's memory layer. I'll trace a real research agent through its lifecycle, show exactly where context rot and drift set in, and cover the patterns that hold over 100+ turns: three-tier memory, progressive disclosure, recall-first compaction, sub-agent isolation, and architectural memory beyond the vector database. I'll also show how to measure, at the trajectory level, whether your memory harness actually helps, so you stop tuning prompts to fix what's really a state-management bug.
Memory & Continual Learning sessions at AI Engineer World's Fair 2026 in San Francisco.
Wednesday, July 1, 2026
11:40 AM - 12:00 PM·20m
Main Stage
Capacity: 4000 attendees

Stefania Druga
Research Scientist
Sakana.ai
@Stefania_druga
Hi! I am Stef. I am currently a Research Scientist at Sakana AI in Tokyo, Japan, working on novel architectures beyond the transformer. Previously I was a researcher at Google DeepMind working on novel multimodal AI applications. I graduated with a Ph.D. in Creative AI Literacies from the University of Washington Information School. I am a former AI Resident at X Moonshot Factory, product engineer at Fixie.ai, and Weizenbaum Research Fellow, and an awardee of the NSF Formal Verification in the Field Grant and the Jacobs Foundation Grant. I was previously a LEGO Papert Fellow during my time as a master's student at MIT, researching with Prof. Mitch Resnick and the Scratch team.