How to know if your agents are correct with Dylan Williams

LimaCharlie

Apr 16, 2026

Join us for this week's Defender Fridays as we explore AI agent evaluation with Dylan Williams, Co-founder and Chief Research Officer of Spectrum Security.

At Defender Fridays, we delve into the dynamic world of information security, exploring its defensive side with seasoned professionals from across the industry. Our aim is simple yet ambitious: to foster a collaborative space where ideas flow freely, experiences are shared, and knowledge expands.

What We'll Discuss

In this episode, Dylan Williams breaks down one of the hardest problems in agentic AI: how do you actually know your agents are doing the right thing? From building expert rubrics to deploying agent judges in production, Dylan shares lessons from the front lines of building and evaluating AI-driven security workflows.

Key Topics:

Why human expert review is the gold standard for agent QA -- and why it doesn't scale
How to build and calibrate an agent judge using labeled production traces
Why deterministic validation should always come before vibes-based evaluation
How agent judges drift over time and why turning failures into tests is the fix
The role of trajectory analysis in diagnosing what agents actually did -- and why
What a self-improving agentic eval loop could look like in cybersecurity

About Our Guest

Dylan Williams is Co-founder and Chief Research Officer of Spectrum Security, a company building at the intersection of agentic AI and security. A longtime blue teamer with deep roots in detection engineering, Dylan has been working on the hard problem of AI agent correctness and evaluation since before most teams knew they needed to.

Register for Live Sessions

Join us every Friday at 10:30am PT for live, interactive discussions with industry experts. Whether you're a seasoned professional or just curious about the field, these sessions offer an engaging dialogue between our guests, hosts, and you -- our audience.

Register here: https://limacharlie.io/defender-fridays

Subscribe to our YouTube channel and hit the notification bell so you never miss a live session, or catch up on past episodes on our website!