Security Comprehension and Awareness Measure (SCAM) Demo
What happens when a state-of-the-art AI assistant can read your email, browse the web, and fill in your passwords — but can’t reliably tell a scam from the real thing?
In this video, you’ll see real examples of frontier AI agents:
- Summarizing phishing emails without recognizing the threat
- Logging into fake websites hosted on lookalike URLs
- Forwarding sensitive information without reading it
- Entering personal and credit card details into fraudulent storefronts
These aren’t edge cases.
They’re the findings of 1Password’s new benchmark: SCAM, the Security Comprehension and Awareness Measure.
Unlike traditional AI safety tests that ask a model directly whether something is malicious, SCAM evaluates AI agents in realistic, task-based scenarios. Instead of asking, “Is this phishing?”, we let the AI carry out everyday tasks in which it might encounter:
- Fake login pages
- Fraudulent storefronts
- Malicious emails
- Lookalike domains
The results? Even the best-performing frontier models failed critical security scenarios. Safety scores ranged from 38% to 92%, and even the top model averaged multiple critical security failures across 30 scenarios.
But there’s good news.
When we added a short, general cybersecurity training “skill” to put the AI in a more security-aware mindset, performance improved dramatically across every model.
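As a rough illustration of the idea (the actual skill text and tooling aren’t shown here, so the wording and function names below are hypothetical), such a skill amounts to prepending a short security briefing to the agent’s instructions:

```python
# Hypothetical sketch: prepending a short cybersecurity "skill" to an
# agent's system prompt. The skill text and names are illustrative,
# not the actual SCAM evaluation tooling.

SECURITY_SKILL = (
    "Before acting, verify that URLs match the legitimate domain, "
    "treat unexpected requests for credentials or payment details as "
    "suspicious, and never submit sensitive data to unverified sites."
)

def build_system_prompt(task_instructions: str, skill: str = SECURITY_SKILL) -> str:
    """Combine the security-awareness skill with the task instructions."""
    return f"{skill}\n\n{task_instructions}"

prompt = build_system_prompt("Log in to the site and check for new invoices.")
```

The same agent, with and without the skill prepended, can then be run over identical scenarios to measure the change in safety score.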
We’re open-sourcing:
- The SCAM benchmark
- The full results
- The evaluation tooling
Our goal is to help developers build safer AI assistants — and to support the work 1Password is doing to enable AI agents to act securely on your behalf.
🔎 Learn more, explore the leaderboard, or contribute:
1Password.github.io/scam
#1Password #AI #CyberSecurity #Phishing #AIAgents #SecurityResearch