# AI Is the Best Thing to Happen to Security

> AI makes attackers more capable, developers faster, and attack surfaces bigger. The asymmetry between offense and defense grows - and that's why security as a domain is about to get a lot more investment.

Date: June 28, 2026
Tags: AI, Security


LLMs have been around for a while now. When Anthropic [released a statement](https://www.anthropic.com/news/disrupting-AI-espionage) that nation-state attackers were using Claude for attacks, I read it with a lot of skepticism.

<!--more-->

Back then, I had to beg models to invoke tools the right way. Passing a valid input to achieve ~~function calling~~ tool calling was painful. I couldn't see how attackers were getting any real value out of these models, let alone "autonomously" hacking the world.

But over the last six months, something changed. Tool calling became a solved problem for larger models. Models started producing structured output reliably. And AI labs are pushing the same thinking and [tool calling](https://gorilla.cs.berkeley.edu/leaderboard.html) capabilities down to smaller models that can run on your laptop.
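To make "valid input" concrete: a tool call is just structured data that the harness has to parse and check before it runs anything. Here is a minimal sketch, assuming a hypothetical `run_http_request` tool and a JSON-shaped call emitted by the model. The tool name, the fields, and the schema format are all made up for illustration and are not any vendor's API.

```python
import json

# Hypothetical tool registry: parameter name -> required Python type.
TOOL_SCHEMAS = {
    "run_http_request": {"method": str, "url": str},
}


def parse_tool_call(raw: str) -> dict:
    """Parse a model-emitted tool call and reject anything malformed."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")

    name = call.get("name")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {name!r}")

    args = call.get("arguments")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")

    # Every declared parameter must be present and have the right type.
    for param, expected_type in TOOL_SCHEMAS[name].items():
        if not isinstance(args.get(param), expected_type):
            raise ValueError(f"bad or missing argument: {param}")

    return {"name": name, "arguments": args}
```

When models were unreliable, a check like this failed constantly and the harness spent its time retrying. Once it passes nearly every time, the loop around it becomes usable.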

That shift was massive for truly autonomous agents on the offensive side. Agents can reliably use tools, edit raw HTTP requests, follow source to sink, read outputs, store memory, and decide what actions to take. Reliability improved massively, even though the models themselves are still non-deterministic.

### The monkey with a button

Let's take a small detour and do a thought experiment.

Do you think a monkey can hack websites?

Go back a decade and give a monkey a button. Every time the monkey presses it, SQLmap runs against random targets on the internet. The monkey could hack into many vulnerable targets, thanks to the tool.

<img src="monkey-with-button.webp" alt="Monkey with a button" class="img-inline-right" />

However, the bottleneck was the tool. SQLmap follows a fixed set of checks. Throw in some custom logic - endpoints behind authentication, multiple user roles, heavy JavaScript - and the tool misses it. The monkey keeps pressing the button, but nothing happens, even if there's a valid second-order SQL injection in the application.

Now replace SQLmap with an AI harness (OpenCode, Codex, etc.).

The harness reads the response from the target and the underlying LLM adapts the course of action. It switches tools. If SQLmap doesn't work, it tries something else. If it finds there are API endpoints to create users with different roles, it executes curl commands to create them. It chains multiple tools together based on what it sees. The AI doesn't replace tools like SQLmap - it orchestrates them on the fly.
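The observe-decide-act loop described above can be sketched in a few lines. This is a toy, not how any real harness is implemented: the three tools are stubs that return canned strings and never touch a network, and `choose_next_tool` is a hard-coded stand-in for the LLM. All the names here are hypothetical.

```python
from __future__ import annotations

from typing import Optional


# Stub tools: canned observations only, no network access.
def scan_for_sqli(target: str) -> str:
    return "no injection found; login required"


def create_test_user(target: str) -> str:
    return "user created with role=editor"


def probe_authenticated(target: str) -> str:
    return "finding: editor role can read admin endpoint"


TOOLS = {
    "scan_for_sqli": scan_for_sqli,
    "create_test_user": create_test_user,
    "probe_authenticated": probe_authenticated,
}


def choose_next_tool(history: list[tuple[str, str]]) -> Optional[str]:
    """Stand-in for the LLM: pick the next tool from the last observation."""
    if not history:
        return "scan_for_sqli"
    _, observation = history[-1]
    if "finding:" in observation:
        return None  # something was found, stop
    if "login required" in observation:
        return "create_test_user"
    if "user created" in observation:
        return "probe_authenticated"
    return None  # nothing left to try


def run_harness(target: str, max_steps: int = 5) -> list[tuple[str, str]]:
    """Run the loop: decide, call a tool, record what came back."""
    history: list[tuple[str, str]] = []
    for _ in range(max_steps):
        tool = choose_next_tool(history)
        if tool is None:
            break
        history.append((tool, TOOLS[tool](target)))
    return history
```

The fixed scanner would have stopped at "login required". The loop treats that output as input to the next decision, which is the whole difference.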

Yes, AI agents are non-deterministic. Same target, different results each time. Some runs find real issues, some produce garbage. But hit or miss is still better than 100% miss. Over enough button presses, the monkey with the harness finds more vulnerabilities than SQLmap ever could.
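The "enough button presses" argument is just arithmetic. A rough sketch, assuming every run is independent and has the same chance of finding something (both are simplifications, and the 5% below is an invented number, not a measurement):

```python
def chance_of_at_least_one_hit(p_per_run: float, runs: int) -> float:
    """Probability that at least one of `runs` independent attempts succeeds."""
    if not 0.0 <= p_per_run <= 1.0:
        raise ValueError("p_per_run must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    # Complement of "every single run misses".
    return 1.0 - (1.0 - p_per_run) ** runs
```

With an assumed 5% hit rate per run, `chance_of_at_least_one_hit(0.05, 50)` comes out at about 0.92. For a tool that can never find the bug, the per-run chance is 0 and no number of presses changes that.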

The ceiling of what the button could do went up.

### Guardrails are a feature, not a guarantee

You might be thinking - don't these popular LLMs have safety guardrails? Won't these models just reject your request if you tell them "Hack this domain"?

They do reject. Claude refuses harmful requests. So do GPT models and others. Jailbreaking is a [cat-and-mouse game](https://www.f5.com/labs/casi) that labs keep patching.

But that's only half the picture.

Capable open-weight models like DeepSeek and GLM can be self-hosted. And tools like [heretic](https://github.com/p-e-w/heretic) permanently strip safety and censorship from model weights.

Anyone can take these uncensored open-weight models and make them attack sites without wasting time convincing the agent that they are *authorized to attack* or that *they're the CISO of the target company*. The closed-weight models might be better than the open-weight alternatives, but uncensored, self-hostable models are good enough to get the job done.

### Code is cheap

Then there's the other side of the equation. The stuff being attacked. The stuff the security industry gets paid to "secure." The code itself.

When creating code is cheap, a lot gets created at a much faster pace. More APIs. More endpoints. More infrastructure. More attack surface.

It's not just volume that increases but also complexity. AI helps teams ship faster, but faster doesn't mean simpler. Teams pick frameworks and stacks based on LLM recommendations, especially when they're still learning how to code. Teams with less coding experience ship production systems using AI assistance - changing the underlying threat model without even realizing it. New tech gets adopted, existing systems get entangled, and assumptions pile up.

More surface. More programming languages. More dependencies. More shadow IT. More complexity. More things that can go wrong.

### Defenders have the same tools. Not the same game.

Defenders can use AI tools to attack and secure their own codebases.

AI-powered SAST, anomaly detection, automated triage, AI code review. Or just run Claude Code in a [loop](https://ghuntley.com/loop/) while also providing tooling to determine if a detected issue is actually exploitable or just a missing best practice that doesn't apply to your use case.

But there are two asymmetries that make this game fundamentally unfair.

**Asymmetry #1: attackers aim for wins, defenders aim to prevent losses.**

An attacker needs to find just one hole to get in. A defender has to cover almost every possible hole to reduce the chances of getting compromised. AI helps defenders cover more, but the attack surface is growing faster than coverage can scale. Bring in an open source tool that helps other AI tools perform well, and now it's part of the attack surface. You'll have to manage the CVEs, authz/authn, storage, etc. for that tool.
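Here is the defender's side of the same arithmetic as a back-of-the-envelope sketch. It assumes each hole is covered independently with the same probability, which real environments don't satisfy, and the numbers are illustrative only.

```python
def chance_of_open_hole(coverage: float, holes: int) -> float:
    """Probability that at least one of `holes` is left uncovered.

    `coverage` is the assumed chance that any single hole is covered.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if holes < 0:
        raise ValueError("holes must be non-negative")
    # Complement of "every hole happens to be covered".
    return 1.0 - coverage ** holes
```

At 99% coverage per hole, 100 holes leave roughly a 63% chance that something is open. Grow the surface to 500 holes at the same coverage and it is above 99%. Coverage has to improve faster than the surface grows just to stay level.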

Then there's the consequences angle. If an AI-powered WAF blocks legitimate traffic, the company loses money. If an AI-powered SOC auto-closes a real alert as a false positive, you *might* get breached. Attackers don't have this problem. If an LLM-driven attack fails, they just retry or move on to a different target. The monkey just presses the button again.

**Asymmetry #2: offense is technical, defense is political and organizational.**

Amazing [quote](https://youtu.be/PLJJY5UFtqY?t=51) from Halvar Flake's Black Hat keynote.

An attacker uses AI to attack sites and gets a binary result - pwned or not. Simple. If pwned, continue and exploit further. If not pwned, retry, pivot, or give up.

On the defender's side, finding bugs using AI tools is just one part. Detected security issues with no action items are just visibility. To get them fixed, you have to convince the VP of Engineering to prioritize the fix, negotiate a maintenance/patch window with ops, create process docs, and document the findings for compliance.

The bottleneck on defense was never just technical. *Was never just finding bugs*.

### AI is the best thing to happen to the security domain

Now, here's what I'm thinking.

AI helps attackers find more with less skill. AI helps developers ship more code faster. Both directly increase what defenders need to protect against. Defenders have AI too, but unlike attackers, they need it to be reliable, accountable, and organizationally approved. That gap continues to exist.

That imbalance leads to more breaches (or at least more breach attempts). More breaches eventually force stricter regulations, bigger security budgets, and more demand for security products and people. Companies that ignored security for years will no longer have that luxury.

I can't think of any other innovation that could have an impact on the security domain like AI does.

It increases the money that flows into the security domain, but will it increase the pay of the seasoned individuals defending organizations? I don't know. Time will tell.

