As demand for large-scale AI systems grows, the limiting factor is no longer model capability but compute availability, efficiency, and system design. Lambda cofounder/CEO Stephen Balaban and Gradient General Partner Zach Bratun-Glennon will examine how modern workloads interact with real-world compute constraints, and the software investments and developments under way to get the most out of cutting-edge GPUs. They'll unpack where training and inference workloads diverge in their infrastructure requirements, the practical limits of GPU utilization and what drives underperformance, and what to expect next in AI infrastructure as constraints continue to evolve. Speakers: Zach Bratun-Glennon (Gradient); Stephen Balaban (Lambda).
Inference sessions at AI Engineer World's Fair 2026 in San Francisco.
Thursday, July 2, 2026
2:25 PM - 2:45 PM · 20m
Leadership 2 · Room 3020
Capacity: 550 attendees

Zach Bratun-Glennon
General Partner
Gradient
@thezbg
General Partner at Gradient Ventures; invests in AI/ML, data science, vertical software, B2B marketplaces, fintech, and more. Prior to Gradient, led acquisitions and strategic investments for Google Cloud.

Stephen Balaban
Co-founder / CTO
Lambda
@stephenbalaban
Co-founder of Lambda, an AI infrastructure company focused on GPU servers, workstations, and cloud services for training neural networks. Former first engineering hire at Perceptio.