Inference

Vertical Mobility: Building an AI Inference Platform That Scales from MVP to Trillion-Parameter Workloads

Talk · Intermediate

The future of AI inference is not one-size-fits-all. This talk explores a multi-tiered architecture that supports the full AI lifecycle: rapid, pay-per-token experimentation; dedicated, SLO-bound production; and extreme-scale, self-managed deployments. Hear lessons learned from building CoreWeave’s inference stack as performance, cost, and control requirements evolve. Speakers: Rita Zhang (CoreWeave) and Sitanshu Gupta (CoreWeave).

About the Inference Track

Inference sessions at AI Engineer World's Fair 2026 in San Francisco.

When

Thursday, July 2, 2026

12:05 PM - 12:25 PM · 20m

Where

Track 9 · Room 2016

Capacity: 250 attendees

Speakers (2)

Rita Zhang

CoreWeave

Rita Zhang is speaking at AI Engineer World's Fair 2026.

Sitanshu Gupta

CoreWeave

Sitanshu Gupta is speaking at AI Engineer World's Fair 2026.
