By Lyrie Threat Intelligence·5/2/2026

Claude Security Hits Public Beta: Anthropic Just Reframed What "Static Analysis" Means and What That Tells Us About the AI-Defense Stack

TL;DR

Anthropic launched Claude Security in public beta this week. It is an agentic application-security scanner powered by Claude Opus 4.7 that reads source code the way a human security researcher would: it traces data flow across files, models how components interact, and identifies vulnerability classes (injection, authentication bypass, memory corruption, multi-component logic bugs) that traditional rule-based SAST tools systematically miss. Initial availability is limited to Enterprise customers, with Team and Max tiers expected later.

This is not just another SAST tool with an AI sticker on the box. It's a category redefinition. Where SAST tools for two decades have been pattern-matchers (regex, AST queries, taint-tracking with hand-written rules), Claude Security operates as a reasoning agent over source code — closer to what Mitch Ashley at Futurum Group called "collapsing application security detection and remediation into one agent-driven workflow."

The strategic significance is bigger than the product itself: the foundation-model vendors are now expanding directly into the security-tooling market. Anthropic's launch follows OpenAI shipping Codex Security capabilities earlier this year and Google's Project Naptime becoming a productized scanner this past quarter. The major AI labs have all read the same room: the industry pays good money for security tooling and AI is uniquely good at security tooling, so why would they not build it themselves?

For the broader cyber-defense ecosystem this raises four important questions, all of which we'll work through below:

1. What does Claude Security actually do that traditional SAST doesn't?

2. What does it NOT do — what gap remains for downstream defensive tooling?

3. What does it mean strategically that the foundation-model vendors are now selling defense?

4. How should defenders, founders, and CISOs think about positioning their stack relative to this?

This is the seventh and final piece in our weekly series tracking the rapid expansion of AI capability into the cybersecurity stack. The earlier pieces covered identity, system, kernel, network, production agents, deception doctrine, and anonymity. This one covers the layer where AI vendors are now eating their own lunch: vulnerability discovery in source code itself.

What Anthropic Actually Shipped

From Anthropic's product announcement and Help Center documentation:

The pitch

"Claude Security understands context, traces data flows across files, and identifies complex, multi-component vulnerability patterns that traditional scanners might not detect. Then it suggests a fix."
"Claude reasons about code the way a security researcher does, tracing data flows, reading source code, and working out how components interact across files and modules."

What the workflow looks like

1. Connect a repo (GitHub / GitLab / private git on supported plans)

2. Trigger a scan — Claude Opus 4.7 reads the codebase, builds a mental model of the architecture, traces user-input → side-effect data flows

3. Receive a triaged finding list — each finding includes the vulnerability class, the file/line/function, the data-flow path, an explanation of why it's exploitable, and a suggested patch

4. Apply / reject / discuss the patch in Claude Security's UI or via the Anthropic API; integrates with the existing Claude Code workflow
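To make step 3 concrete, here is a minimal Python sketch of what one entry in such a triaged finding list could look like. The class names, field names, and the render helper are hypothetical, derived only from the fields listed above; this is not Anthropic's schema or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FlowStep:
    """One hop in the data-flow path from user input to side effect."""
    file: str
    line: int
    note: str


@dataclass
class Finding:
    """Hypothetical shape of a single triaged finding (illustrative only)."""
    vuln_class: str
    file: str
    line: int
    function: str
    data_flow: List[FlowStep]
    explanation: str
    suggested_patch: str
    status: str = "open"  # a reviewer would later set "applied" or "rejected"


def render(finding: Finding) -> str:
    """Format a finding as a one-line summary for a review queue."""
    path = " -> ".join(f"{step.file}:{step.line}" for step in finding.data_flow)
    return (
        f"[{finding.vuln_class}] {finding.file}:{finding.line} "
        f"in {finding.function}() via {path}"
    )


example = Finding(
    vuln_class="sql-injection",
    file="app/orders.py",
    line=42,
    function="list_orders",
    data_flow=[
        FlowStep("app/routes.py", 17, "request parameter read"),
        FlowStep("app/orders.py", 42, "concatenated into SQL string"),
    ],
    explanation="User-controlled input reaches a SQL string without escaping.",
    suggested_patch="Use a parameterized query instead of string concatenation.",
)

print(render(example))
```

The point of the structure is that the data-flow path travels with the finding, so a reviewer can check the reasoning rather than trust a bare line number.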

What categories it targets

The launch documentation and press coverage list the focus areas:

  • Injection — SQL injection, command injection, NoSQL injection, prompt injection, XSS, SSRF, XXE, deserialization
  • Authentication and authorization bypasses — privilege escalation, IDOR, JWT mishandling, session-fixation, OAuth misconfiguration
  • Memory corruption (in C/C++/Rust unsafe blocks) — buffer overflows, use-after-free, integer overflows, double-free, type confusion
  • Multi-component logic bugs — race conditions, TOCTOU, cross-service authorization gaps, distributed-state inconsistencies
  • Cryptographic misuse — weak primitives, broken protocols, hardcoded secrets, IV/nonce reuse

The category list is broad — wider than most SAST tools cover effectively — and notably includes multi-component logic bugs, the class that traditional rule-based scanners are structurally unable to find because they don't have a model of how components compose.

Pricing and access

  • Public beta — currently Enterprise plan customers
  • Team and Max plan customers gain access later
  • Pricing model: not publicly disclosed at launch; expected to be metered against Opus 4.7 usage given the reasoning-heavy workload

Why "Agentic SAST" Is a Real Category Shift

To understand why this matters, you have to understand why traditional SAST tools have plateaued. Vendors like Veracode, Checkmarx, Fortify, Snyk Code, Semgrep, and the dozens of others in this market all share the same fundamental architecture:

Traditional SAST architecture (and its limits)

1. Parse source code into an AST or IR

2. Run hand-written rules (or learned patterns) against the IR

3. Produce findings with line numbers and confidence scores

This works well for patterns that fit in a single file or function. SQL injection where the user input flows directly into a query is detectable. Hardcoded secrets are detectable. Use of known-broken cryptographic primitives is detectable.
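The three-step architecture above can be sketched in a few lines of Python using the standard-library ast module. This is a deliberately tiny toy rule, not how any named vendor implements its engine: it flags calls to a method named execute whose first argument is built by concatenation or an f-string. The rule name and the sample snippets are assumptions made for illustration.

```python
import ast
from typing import List, Tuple


def find_unsafe_execute(source: str, filename: str = "<string>") -> List[Tuple[str, int, str]]:
    """Toy rule: flag .execute(...) calls whose first argument is built dynamically."""
    findings = []
    tree = ast.parse(source, filename=filename)  # step 1: parse into an AST
    for node in ast.walk(tree):                  # step 2: run the rule over every node
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr == "execute"):
            continue
        if not node.args:
            continue
        first = node.args[0]
        # f-strings parse as JoinedStr; "+" and "%" string building parse as BinOp
        if isinstance(first, (ast.JoinedStr, ast.BinOp)):
            # step 3: emit a finding with a line number
            findings.append((filename, node.lineno, "possible SQL injection"))
    return findings


VULNERABLE = '''
def get_user(cursor, name):
    cursor.execute("SELECT * FROM users WHERE name = '" + name + "'")
'''

SAFE = '''
def get_user(cursor, name):
    cursor.execute("SELECT * FROM users WHERE name = ?", (name,))
'''

# Same bug, but the string is built elsewhere, so this single-call rule sees nothing.
INDIRECT = '''
def get_user(cursor, name):
    query = build_query(name)
    cursor.execute(query)
'''

print(find_unsafe_execute(VULNERABLE))  # one finding, on line 3
print(find_unsafe_execute(SAFE))        # []
print(find_unsafe_execute(INDIRECT))    # []
```

The INDIRECT case hints at the limit: once the tainted string is assembled in another function or file, a rule that inspects one call site has nothing to match. Production tools add taint tracking to close part of that gap, at the cost of rule complexity.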

The same architecture fails for:

  • Cross-file data flow that involves more than 3-4 hops (the rule engines combinatorially explode)
  • Logic bugs that depend on understanding what the code is trying to do (a rule engine doesn't know that this function is supposed to enforce authorization; it just sees a function that returns a boolean)
  • Multi-component interactions (the rule sees one service in isolation, not how four services compose)
  • Domain-specific vulnerabilities (a misconfigured webhook signing scheme that's only wrong because of how this particular product uses it)
  • Novel vulnerability classes that haven't been encoded into rules yet
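A small example of the logic-bug case may help. In the Python sketch below, nothing is syntactically suspicious: there is no string concatenation, no dangerous API, no known-bad primitive. The bug exists only relative to what the function is supposed to do. The invoice model, the user IDs, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    amount_cents: int


# Hypothetical in-memory store standing in for a database.
INVOICES: Dict[int, Invoice] = {
    1: Invoice(1, owner_id=100, amount_cents=2500),
    2: Invoice(2, owner_id=200, amount_cents=9900),
}


def is_authenticated(user_id: Optional[int]) -> bool:
    return user_id is not None


def get_invoice_buggy(user_id: Optional[int], invoice_id: int) -> Invoice:
    """Checks that the caller is logged in, but never checks ownership (IDOR)."""
    if not is_authenticated(user_id):
        raise PermissionError("login required")
    return INVOICES[invoice_id]


def get_invoice_fixed(user_id: Optional[int], invoice_id: int) -> Invoice:
    """Same lookup, with the ownership check the intent implies."""
    if not is_authenticated(user_id):
        raise PermissionError("login required")
    invoice = INVOICES[invoice_id]
    if invoice.owner_id != user_id:
        raise PermissionError("caller does not own this invoice")
    return invoice


# User 100 reads an invoice that belongs to user 200.
leaked = get_invoice_buggy(100, 2)
print(leaked.owner_id)  # 200
```

A pattern rule sees an authentication check followed by a dictionary lookup and has no reason to object. Spotting the flaw requires knowing that invoices are per-user resources, which is the kind of intent a reviewer infers from names, tests, and surrounding code.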

The result, after 20 years of incremental SAST improvement: enterprise teams routinely turn off most of their SAST findings as noise, the false positive rate is industry-notorious, and high-severity issues — especially logic bugs — slip through.

What Claude Security does differently

Claude Security replaces the rule engine with a reasoning agent. Specifically:

  • No fixed rule set. The "rules" are whatever Opus 4.7 has internalized about secure coding patterns from its training data. That presumably includes academic security literature, vendor advisories, CVE descriptions, post-mortems, and a large share of public security blog content.
  • Cross-file reasoning is native. Opus 4.7 can hold a 200K+ token context, which means a small-to-medium codebase fits in a single reasoning pass. For larger codebases, the tool chunks intelligently and maintains a model across chunks.
  • Intent inference. When the agent reads a function called verifyUserToken(), it infers what the function is supposed to do from the name, the surrounding code, and the tests — and then evaluates whether the implementation matches that intent. Rule-based scanners cannot do this.
  • Patch generation. Once a vulnerability is identified, the same model that found it can write a fix that respects the codebase's style, idioms, and existing patterns. Traditional SAST tools at best generate generic boilerplate fixes that engineers ignore.

This is closer to how a senior security engineer reads a pull request than to how Veracode reads a codebase. The unit of analysis shifts from "function" to "system."

The trade-offs

Agentic SAST is not strictly better than rule-based SAST. The real trade-offs:

Where Claude Security wins:

  • Finding multi-file, multi-component logic bugs that rule-based tools structurally cannot find
  • Lower false positive rates for the kinds of issues it surfaces — because the agent reads enough context to validate the finding before reporting it
  • Patch quality — generated patches are typically applicable rather than illustrative
  • Coverage of novel vulnerability patterns that haven't been encoded into rules

Where traditional SAST still wins (or matters):

  • Determinism and reproducibility — the same scan on the same code produces the same output, every time. Critical for compliance, audit, and CI/CD gating.
  • Speed — Semgrep can scan a million-line repo in 60 seconds. Claude Security takes orders of magnitude longer for the same coverage (and costs orders of magnitude more in compute).
  • Pattern coverage breadth — for the patterns rule-based tools encode, they catch them ~100% of the time. Agentic tools have non-zero miss rates even on patterns they "should" find.
  • Air-gap and on-prem deployment — required for many regulated environments. Claude Security is a hosted service.
  • No data-residency or source-leakage exposure: submitting source to a hosted LLM carries its own threat model. Veracode has held SOC 2 attestation for a decade; Claude Security is only now shipping public beta.

The smart enterprise stack going forward is both, not either: traditional SAST as the deterministic CI gate, agentic SAST as the deeper-analysis tool run on PRs that pass the basic checks.
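The two-stage arrangement can be sketched as a simple policy function. This is one possible reading, under stated assumptions: the deterministic scanner blocks the merge, and agentic findings are treated as advisory because their output is not guaranteed to be reproducible. The scanner stubs, type names, and finding labels are hypothetical placeholders, not real tool integrations.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ScanResult:
    tool: str
    findings: List[str] = field(default_factory=list)


@dataclass
class Verdict:
    blocked: bool
    blocking_findings: List[str]
    advisory_findings: List[str]


Scanner = Callable[[str], ScanResult]


def review_change(diff: str, rule_scan: Scanner, agentic_scan: Scanner) -> Verdict:
    """Run the deterministic gate first; run the deeper scan only if the gate passes."""
    gate = rule_scan(diff)
    if gate.findings:
        # Fail fast: no model time is spent on a change that already fails the gate.
        return Verdict(blocked=True, blocking_findings=gate.findings, advisory_findings=[])
    deep = agentic_scan(diff)
    # Assumption: agentic findings go to human review instead of blocking the merge.
    return Verdict(blocked=False, blocking_findings=[], advisory_findings=deep.findings)


def stub_rule_scan(diff: str) -> ScanResult:
    """Stand-in for a rule-based scanner (hypothetical rule)."""
    hits = ["hardcoded-secret"] if "API_KEY =" in diff else []
    return ScanResult("rule-based", hits)


def stub_agentic_scan(diff: str) -> ScanResult:
    """Stand-in for an agentic scanner (hypothetical finding)."""
    hits = ["missing-ownership-check"] if "get_invoice" in diff else []
    return ScanResult("agentic", hits)


print(review_change('API_KEY = "abc123"', stub_rule_scan, stub_agentic_scan))
print(review_change("def get_invoice(user, id): ...", stub_rule_scan, stub_agentic_scan))
```

Keeping the blocking decision on the deterministic side preserves reproducible CI results for audit purposes, while the slower and costlier scan runs only on changes that have already cleared the cheap checks.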

What Claude Security Doesn't Cover — Where the Defensive Stack Continues

This is where the strategic analysis matters most for anyone building, buying, or selling cyber tooling.

*Claude Security operates at write-time. It analyzes source code before it ships.* That's a critical layer of the defensive stack — but it's far from the whole stack.

The defensive ecosystem outside Claude Security's scope:

1. Runtime defense

Source code analysis catches vulnerabilities in the code you wrote. It does not catch:

  • Vulnerabilities in third-party dependencies that only manifest at execution time (a source scan can check declared versions, not runtime behavior)
  • Configuration errors that introduce vulnerability without a code change (an S3 bucket misconfiguration, an open Redis port, a too-permissive IAM role)
  • Compromise via supply-chain attacks that inject malicious code post-build (npm postinstall scripts, GitHub Actions cache poisoning)
  • Prompt-injection attacks against the deployed agent itself (a running Opus 4.7 agent is exactly this kind of target)
  • Behavioral attack patterns at runtime (the CopyFail kernel exploit chain, the PocketOS-style agent over-permission incident, the Cyberzap-style honeypot interactions, the bot-traffic patterns from Imperva's report — none of these are detectable from source code alone)

2. Operational defense

  • Incident response when something goes wrong despite the SAST. Claude Security finds the bug; a defender still has to triage real-time alerts, contain compromise, and rotate credentials when the bug is exploited.
  • Threat hunting across log data, telemetry, behavioral anomalies — this is fundamentally a runtime activity.
  • Deception, decoys, and adversarial misdirection — covered in our Cyberzap analysis. Claude Security does not deploy honeytokens or detect honeypot interactions.

3. Identity and session defense

The Telegram session-stealer is a perfect example: the malicious code is not in the user's codebase. It's in a PowerShell script delivered to the user. No write-time scanner protects against this.

4. Agent-runtime defense

The PocketOS incident from earlier this week — a Claude agent deleting a production database in 9 seconds — happened not because of a vulnerability in the codebase but because of architectural decisions (overpermissioned tokens, no isolation between staging and production, no confirmation gates for destructive operations). Claude Security cannot prevent the next PocketOS. That requires runtime monitoring of agent decisions, not static analysis of code.
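The three architectural controls named above (scoped tokens, environment isolation, confirmation gates for destructive operations) can be sketched as a guard that sits between an agent and its tools. This is an illustrative Python sketch, not a description of PocketOS's systems or of any Lyrie or Anthropic product; the verb list, class names, and the confirmed_by parameter are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Assumption: destructive actions are recognizable by a verb in their name.
DESTRUCTIVE_VERBS = {"delete", "drop", "truncate"}


class ConfirmationRequired(Exception):
    """Raised when a destructive production action lacks human sign-off."""


@dataclass(frozen=True)
class ToolCall:
    action: str       # e.g. "delete_database"
    environment: str  # e.g. "staging" or "production"


def is_destructive(action: str) -> bool:
    words = action.lower().split("_")
    return any(verb in words for verb in DESTRUCTIVE_VERBS)


def authorize(call: ToolCall, token_scope: Set[str], confirmed_by: Optional[str] = None) -> bool:
    """Decide whether an agent's tool call may proceed."""
    # Environment isolation: a token scoped to staging cannot touch production.
    if call.environment not in token_scope:
        raise PermissionError(f"token not scoped for {call.environment}")
    # Confirmation gate: destructive production actions need a named human.
    if call.environment == "production" and is_destructive(call.action) and confirmed_by is None:
        raise ConfirmationRequired(f"{call.action} needs human confirmation")
    return True


print(authorize(ToolCall("read_rows", "staging"), {"staging"}))  # True
print(authorize(ToolCall("delete_database", "production"), {"production"},
                confirmed_by="oncall-engineer"))                 # True
```

None of these checks depend on the application source being free of bugs, which is why this class of failure falls outside what a write-time scanner can address.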

5. The new threat surfaces

Stylometric deanonymization, OS-level AI integration risks, agentic-OS prompt injection, deception-as-policy, AI-bot intent verification — none of the seven threat surfaces we've covered this week are addressed by Claude Security. Claude Security finds bugs in code. The threat landscape is much wider than that.

Lyrie Assessment — The Strategic Read

Claude Security is excellent news for the cyber-defense industry, even though it expands the AI-vendor footprint into security tooling. Here's why, and what it means for the stack:

1. The AI vendors are validating the market by entering it

When Anthropic, OpenAI, and Google all decide to ship security products, the market signal is unambiguous: security is the application area where AI capability creates differentiated economic value. This isn't a head-fake or a pivot — it's a multi-year strategic commitment.

For founders building in cyber-defense: the rising tide is real. AI-powered cyber tooling is no longer "interesting niche" — it's where Anthropic itself is investing engineering resources. The dollars flowing into the space, the talent moving into it, the press attention — all of it accelerates from here.

2. The vendors will *not* fully own the security stack

Foundation-model vendors are constrained by structural factors that defensive specialists are not:

  • They can't deploy on-prem or air-gapped at most regulated customers. Defense, finance, healthcare, energy, government — vast markets that will never run their codebase through claude.com.
  • They have a single point of failure — if Anthropic's auth or API is down, the security tooling is down. Real customers need defense that runs even when the vendor's infrastructure is degraded.
  • They face customer-perception conflict-of-interest — using Claude to find vulnerabilities in the code Claude Code wrote. This is a real procurement objection that incumbent SAST vendors will hammer in every sales cycle.
  • They are constrained by their own safety / capability balance — Anthropic will not ship offensive-tooling capability, even when defenders need it. Penetration-testing teams, red teams, and incident responders need tools that are willing to do things foundation-model vendors won't.
  • Their pricing is metered against Opus tokens. For high-volume security tooling (continuous scanning, runtime analysis, behavioral detection), per-token pricing makes the unit economics ugly. Specialized defense vendors can do their own model fine-tuning and run inference on cheaper hardware.

3. The defensive layer Lyrie occupies is structurally different

We've been clear from day one: Lyrie is autonomous runtime defense, not static code analysis. The threat model we serve — kernel-level behavioral chains, agent-runtime monitoring, session-token theft detection, agent-decision auditing, deception assets, stylometric privacy — is fundamentally different from "find bugs in source before ship."

Concretely:

  • Claude Security catches a SQL injection vulnerability in the code. Lyrie catches the SQL injection attempt at runtime and blocks it before exfiltration completes.
  • Claude Security identifies an over-permissioned API token in source. Lyrie monitors the agent's use of that token in production and breaks the chain when it's misused (the PocketOS pattern).
  • Claude Security flags a missing authorization check in code. Lyrie watches the behavioral pattern of authorization bypasses in production and detects them even when the missing check was in a third-party dependency.

These are complementary layers, not competing products. The mature defensive stack will run both. A company that's serious about security will have Claude Security finding code bugs at write-time AND Lyrie watching production behavior at runtime.

4. The risk Lyrie cares about most: vendor lock-in and centralization

The legitimate strategic concern about AI-vendor expansion into security is not competitive — Lyrie is positioned correctly relative to Claude Security. The concern is what happens when the same vendor that sells you the AI tool that wrote your code also sells you the AI tool that audits your code, and also runs the runtime your code depends on.

That's not a market concern. That's a systemic risk concern. If Anthropic has visibility into:

  • Your code (via Claude Code)
  • Your security findings (via Claude Security)
  • Your production agent decisions (via Claude API runtime)
  • Your incident response (via Claude as an agent in your SOC)

…then Anthropic becomes the single most valuable target for any sophisticated adversary, and the single largest concentration of power over the world's defensive software stack. This is the same systemic worry that has driven the debate about cloud concentration, except now applied to security tooling specifically.

The defensive ecosystem needs diversity by design. Multiple AI vendors. Vendor-independent runtime layers (which is where Lyrie operates). Open-source defensive tools that don't depend on any single foundation-model vendor. Specialized defense providers with on-prem capability for regulated customers.

This is part of what motivates Lyrie's open-research output, our commitment to defensive models that run on local hardware and don't require API calls to a vendor for inference, and our deliberate avoidance of single-vendor dependency in our own stack.

5. What founders should take from this

If you're building in cyber-defense and your moat depends on "we have AI in our product": the moat just got smaller. Anthropic, OpenAI, and Google all have AI in their security products now too, and theirs is from the foundation-model vendor itself. You need a different moat: domain expertise, deployment model (on-prem, air-gap, cross-cloud), specialized data, regulatory positioning, integration depth, or a layer of the stack the foundation-model vendors won't address.

If you're a CISO evaluating Claude Security: evaluate it. It's a good tool. Don't replace your existing SAST with it; run them in parallel and compare findings for 90 days. Don't expect it to solve runtime defense, identity, or agent-monitoring — those are different layers. And do think hard about source-code data residency before sending your codebase to claude.com.

If you're a security engineer: the most-marketable skills going forward are the ones AI tools amplify rather than replace. Threat modeling, incident response, security architecture, regulatory navigation, the human judgment to evaluate AI tool output — all increase in value. Pure rule-writing, pattern-matching, and routine triage decrease in value.

Recommended Actions

For CISOs

1. Run Claude Security against a representative codebase as part of your AppSec evaluation. Don't replace existing tools yet; benchmark.

2. Evaluate the data-residency implications before broader rollout. Not every codebase belongs in a hosted LLM's training-eligible context, even with privacy guarantees.

3. Don't reduce your runtime defensive investment because you've added agentic SAST. The two are complementary, not substitutable.
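One way to run the benchmark in item 1 is to export findings from both tools and bucket them by overlap. The Python sketch below assumes each tool's output can be reduced to a file, a line, and a vulnerability class, and treats two findings as the same issue when they agree on file and class and sit within a few lines of each other. The matching tolerance and all sample data are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Finding(NamedTuple):
    file: str
    line: int
    vuln_class: str


def matches(a: Finding, b: Finding, tolerance: int = 3) -> bool:
    """Treat two findings as the same issue if file and class agree and lines are close."""
    return (
        a.file == b.file
        and a.vuln_class == b.vuln_class
        and abs(a.line - b.line) <= tolerance
    )


def compare(rule_based: List[Finding], agentic: List[Finding],
            tolerance: int = 3) -> Dict[str, List[Finding]]:
    """Split findings into those both tools reported and those unique to each."""
    both = [a for a in agentic if any(matches(a, r, tolerance) for r in rule_based)]
    agentic_only = [a for a in agentic if a not in both]
    rule_only = [r for r in rule_based
                 if not any(matches(r, a, tolerance) for a in agentic)]
    return {"both": both, "agentic_only": agentic_only, "rule_based_only": rule_only}


rule_findings = [
    Finding("app/db.py", 40, "sql-injection"),
    Finding("app/config.py", 3, "hardcoded-secret"),
]
agentic_findings = [
    Finding("app/db.py", 42, "sql-injection"),
    Finding("app/invoices.py", 17, "missing-authz"),
]

report = compare(rule_findings, agentic_findings)
for bucket, items in report.items():
    print(bucket, len(items))
```

The two "only" buckets are the interesting ones: each entry needs human triage to decide whether it is a real miss by the other tool or a false positive, and that ratio is the number worth tracking over the evaluation period.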

For founders

4. Re-evaluate your moat. "We use AI" is no longer differentiating. What's actually defensible about your specific product, deployment, dataset, or expertise?

5. Watch the foundation-model vendors' next moves — runtime, IR, threat hunting, deception. Plan accordingly.

For security engineers

6. Treat agentic SAST as a peer reviewer, not an oracle. The findings need human triage. The patches need human review. AI is fast, not infallible.

7. Use it for the bug classes traditional tools can't find — multi-component logic bugs, cross-service authorization gaps, novel vulnerability patterns. That's where it earns its keep.

For Lyrie users and prospective customers

8. Our autonomous-runtime defense is structurally complementary to Claude Security. We are not in a category competition. We are in the layer that watches what happens after code ships and during execution.

9. The seven-piece series we've published this week (identity, system, kernel, network, production agents, deception, anonymity) defines the threat surfaces Lyrie covers. None of them is addressed by static code analysis.

Closing the Week

This is the seventh and final piece in our weekly threat-landscape series. One consistent thesis runs through all seven:

AI capability is expanding faster than defensive primitives can adapt. The defensive layer that responds in real time, without waiting for foundation-model vendors to ship safety features, is no longer optional.

The Telegram session stealer, Ubuntu's AI-OS, CopyFail kernel LPE, Imperva's Bad Bot Report, the PocketOS incident, the Cyberzap honeypot, Opus 4.7's stylometric attribution — every one of these revealed a layer where the gap between offensive AI capability and defensive AI capability has become operationally critical.

Claude Security is one positive answer to that gap, in one layer (write-time AppSec). It's a real product solving a real problem and we welcome it.

But the gap is much larger than one product addresses. The defensive stack of 2027 will include a foundation-model vendor's AppSec tool, multiple specialized runtime defenders (Lyrie among them), behavioral identity continuity providers, deception infrastructure, and agentic-action monitors that don't yet have product-category names.

We're building for that future. Forward this newsletter to anyone in your network thinking about AI, defense, or where their security stack needs to evolve.

— Lyrie.ai Cyber Research Division

Sources

1. Anthropic. "Claude Security." https://claude.com/product/claude-security

2. Anthropic Help Center. "Use Claude Security." https://support.claude.com/en/articles/14661296-use-claude-security

3. The New Stack. "Anthropic's Claude Security emerges from closed preview to scan your codebases for vulnerabilities." https://thenewstack.io/anthropics-claude-security-beta/

4. DevOps.com. "Anthropic Brings AI-Powered Security Scanning to Enterprise Teams With Claude Security." https://devops.com/anthropic-brings-ai-powered-security-scanning-to-enterprise-teams-with-claude-security/

5. Infosecurity Magazine. "Anthropic Rolls Out Claude Security for AI Vulnerability Scanning." https://www.infosecurity-magazine.com/news/anthropic-claude-security-for-ai/

6. ZDNET. "Anthropic's new Claude Security tool scans your codebase for flaws." https://www.zdnet.com/article/anthropic-claude-security-ai-tool-scans-codebase-for-flaws/

7. OpenAI. "Codex Security." https://openai.com/index/codex-security

8. Google Project Zero. "Project Naptime." https://googleprojectzero.blogspot.com/2024/06/project-naptime.html

9. Lefaroll Telegram channel coverage (Hebrew). https://t.me/Lefaroll


