Sumanyu Sharma

Founder & CEO

Hamming AI

Sumanyu Sharma is the founder and CEO of Hamming AI, the first publicly launched dedicated QA platform for voice agents and now a testing, monitoring, and red-teaming platform for production voice agents. Hamming monitors 10,000+ agents and has analyzed millions of calls across healthcare, financial services, hospitality, home services, support automation, and enterprise deployments. Before Hamming, Sumanyu was Head of Data at Citizen, where his team worked on real-time public-safety systems that turned messy audio and incident signals into alerts people acted on. Before Citizen, he was a Senior Staff Data Scientist at Tesla, where he built production ML systems tied to hundreds of millions in annual revenue. Hamming combines those two instincts: Tesla-style release discipline for autonomous systems, and Citizen-style audio forensics for high-stakes real-world events.

Sessions (1)

I Monitored Crime Audio. Voice Agents Scare Me More.

2:25 PM · Track 6 · Room 2014

Bad voice-agent calls are starting to look less like QA bugs and more like incident scenes. I learned that instinct at Citizen, where noisy radio, ambiguous speech, fast-moving incidents, and real-time alerts became information people might actually act on. That work was stressful for obvious reasons.

Voice agents scare me more. Not because they sound creepy. Because they sound good enough that people trust them. And now they are connected to calendars, CRMs, EHRs, reservation systems, refunds, transfers, account data, and support workflows.

At Hamming, we monitor more than 10,000 voice agents and have analyzed millions of calls. The weird thing you learn at that scale is that production voice agents do not usually fail like demos. They fail quietly. The agent sounds natural, but misses a two-word answer. It handles the happy path, but loses the plot when the caller interrupts. It says the address was updated, but no tool call happened. It supports six languages, but gets worse at the switch point between two of them.

This talk is about treating every bad voice-agent call like an incident scene. The evidence is there if you collect it: transcript, waveform, latency waterfall, interruption points, ASR uncertainty, tool trace, system-of-record state, and post-call outcome.

At Tesla, I learned that autonomous systems need release gates and regression loops before they hit the real world. At Citizen, I learned that messy audio becomes safety-critical when people act on it. Voice agents need both instincts.

The takeaway is a voice-agent forensics loop. What did the caller say? What did the agent think happened? What did the tool actually do? What does the system of record say? And how do we turn that weird production failure into a regression test before it happens 10,000 more times?
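The forensics loop in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not Hamming's implementation: the names `CallEvidence`, `ToolCall`, `audit_call`, and `to_regression_case`, the `update_address` tool, and the sample addresses are all hypothetical. It covers one failure mode from the abstract, where the agent says an action happened but the tool trace and the system of record disagree.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str        # hypothetical tool name, e.g. "update_address"
    args: dict
    succeeded: bool


@dataclass
class CallEvidence:
    call_id: str
    caller_said: str                 # what the caller said (from the transcript)
    agent_claims: list[str]          # actions the agent told the caller it completed
    tool_trace: list[ToolCall] = field(default_factory=list)
    record_state: dict = field(default_factory=dict)  # system-of-record snapshot after the call


def audit_call(evidence: CallEvidence, expected_state: dict) -> list[str]:
    """Compare what the agent claimed against the tool trace and the system of record."""
    findings = []
    succeeded = {call.name for call in evidence.tool_trace if call.succeeded}

    # What did the agent think happened vs. what did the tool actually do?
    for claim in evidence.agent_claims:
        if claim not in succeeded:
            findings.append(f"claimed '{claim}' but no successful tool call")

    # What does the system of record say?
    for key, want in expected_state.items():
        got = evidence.record_state.get(key)
        if got != want:
            findings.append(f"record mismatch on '{key}': expected {want!r}, found {got!r}")

    return findings


def to_regression_case(evidence: CallEvidence, findings: list[str]) -> dict:
    """Package a failed call as a replayable regression case (assumed shape)."""
    return {
        "source_call": evidence.call_id,
        "caller_utterance": evidence.caller_said,
        "must_not_report": findings,
    }


# Example: the agent says the address was updated, but no tool call happened.
call = CallEvidence(
    call_id="call-001",
    caller_said="Please change my address to 12 Oak Street.",
    agent_claims=["update_address"],
    tool_trace=[],
    record_state={"address": "9 Elm Road"},
)
findings = audit_call(call, expected_state={"address": "12 Oak Street"})
regression_case = to_regression_case(call, findings)
```

In this sketch the audit reports two findings for the sample call: a claimed action with no successful tool call, and a record that still holds the old address. A production version would presumably also draw on the other evidence the abstract lists, such as the waveform, latency waterfall, interruption points, and ASR uncertainty, which are omitted here for brevity.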

Voice & Realtime AI · Intermediate