ShadowRay exposed over a billion dollars of data through a missing authentication check. It wasn't a zero-day. It wasn't a clever new attack class. It was a default config someone never flipped off. That story is not the exception in production ML; it's the rule.

We synthesized 139 peer-reviewed papers on production ML security across access control, runtime security, infrastructure, and operations. Five findings stood out, and one of them upends how most teams think about ML security:

- Misconfiguration, not missing features, is the dominant failure mode. The mechanisms exist. Teams aren't using them, or are using them wrong.
- Adversarial defenses impose 15–30% inference overhead, which is why almost no production system actually runs them.
- ML-specific security tooling lags general DevOps tooling by years.
- Security, data-science, and ops teams operate in expertise silos that create persistent gaps no single team can see.
- LLM and multi-tenant GPU threats are evolving faster than defenses (prompt injection, RAG poisoning, GPU side channels).

This talk walks through the four-pillar defense-in-depth framework, the six-category threat taxonomy that maps each attack to its primary and secondary defenses, and a four-level security maturity model that matches overhead budgets to deployment contexts. You will leave knowing where your stack actually sits and which three misconfigurations account for most of the risk.
Security sessions at AI Engineer World's Fair 2026 in San Francisco.
Tuesday, June 30, 2026
11:10 AM - 11:30 AM · 20m
Track 5 · Room 2005
Capacity: 250 attendees

Lovina Dmello
Senior Software Developer
NVIDIA
Senior Software Developer at NVIDIA specializing in TensorRT infrastructure, with over eight years of experience across cloud infrastructure, machine learning, and security; previously worked at Apple and Oracle.