Workshops Day 1

What is an Inference Engine, Anyway?

Workshop · Advanced

To run state-of-the-art inference yourself, you must master the inference engine: vLLM, SGLang, TRT-LLM, or your own jawn. The inference engine manages the lifecycle of an inference request, from input to output. In this workshop, we'll examine the architecture of modern high-performance inference engines, the key techniques they need to deliver that performance, and the traces and metrics they emit.

About the Workshops Day 1 Track

Workshops Day 1 sessions at AI Engineer World's Fair 2026 in San Francisco.

When

Monday, June 29, 2026

11:05 AM - 12:05 PM · 1h

Where

Track 8 · Room 2020

Capacity: 250 attendees

Speaker

Charles Frye

AI Engineer

Modal

AI Engineer at Modal Labs focused on AI infrastructure and inference workloads. Holds a PhD in neural network optimization from UC Berkeley, previously worked at Weights & Biases, and has contributed to the Full Stack Deep Learning / LLM Bootcamp educational initiatives.
