Can You Trust AI Code? I Built a Scanner to Find Out

Feb 16, 2026

Can you trust the code AI generates? In this video, we build a custom AI security benchmarking tool to put models like Gemini, Mistral, and GLM 4.5 to the test. Using Windsurf, OpenRouter, and Snyk, we automate a pipeline that prompts multiple LLMs to write an application, then immediately scans each model's output for security vulnerabilities.
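The core of the pipeline shown in the video can be sketched in a few lines: send the same prompt to each model through OpenRouter's OpenAI-compatible chat endpoint, save the generated code, and hand it to the Snyk CLI. This is a minimal illustration, not the exact tool from the video; the model ID, prompt, and file paths are placeholders you'd swap for your own.

```python
import json
import subprocess
import urllib.request

# OpenRouter exposes an OpenAI-compatible chat completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a chat completion request for one model on OpenRouter."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        OPENROUTER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def generate_app(model: str, prompt: str, api_key: str, out_path: str) -> None:
    """Ask one model to write the app and save its reply to out_path."""
    with urllib.request.urlopen(build_request(model, prompt, api_key)) as resp:
        code = json.load(resp)["choices"][0]["message"]["content"]
    with open(out_path, "w") as f:
        f.write(code)

def scan_with_snyk(path: str) -> dict:
    """Run Snyk Code on the generated output and return the JSON report."""
    result = subprocess.run(
        ["snyk", "code", "test", path, "--json"],
        capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Loop `generate_app` plus `scan_with_snyk` over a list of model IDs (e.g. one per LLM you want to benchmark) and you have the raw vulnerability counts the dashboard in the video is built on.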

Use Snyk for free to find and fix security issues in your applications today! https://snyk.co/ugLYn

⏲️ Chapters ⏲️

00:00 The Big Question: Is AI code secure?

00:32 Identifying vulnerabilities

00:49 Setting up the stack: OpenRouter & Snyk API keys

01:32 Configuring your IDE (Windsurf & Cursor)

01:52 Designing the master security prompt

03:07 Automating the build with AI agents

04:14 Exploring the benchmarking dashboard

05:11 Testing different LLMs (GLM 4.5 & Trinity)

09:25 Analyzing the Snyk security report

10:03 Final Verdict: Can you trust AI-generated code?

⚒️ About Snyk ⚒️

Snyk helps you find and fix vulnerabilities in your code, open-source dependencies, containers, infrastructure-as-code, software pipelines, IDEs, and more! Move fast, stay secure.

Learn more about Snyk: https://snyk.co/ugLYl

📱 Connect with Us 📱

🖥️ Website: https://snyk.co/ugLYl
🐦 X: http://twitter.com/snyksec
💼 LinkedIn: https://www.linkedin.com/company/snyk
💬 Discord: https://discord.gg/devsecops-community-918181751526948884

🔗 Hashtags 🔗
#DevSecOps #ai #aicoding #cybersecurity