Lyrie
AI-Security
0 sources verified·10 min read
By Lyrie.ai Senior Analyst Desk·5/11/2026

When Prompts Become Shells: The Agentic AI Framework RCE Epidemic of 2026

TL;DR

A cluster of critical vulnerabilities disclosed in Q1–Q2 2026 has reframed the agentic AI security conversation entirely. Four CVEs in CrewAI (VU#221883), a systemic MCP STDIO command injection class spanning LangFlow, GPT Researcher, and LiteLLM, and a sandbox escape in Google Antigravity all share one root cause: AI agent frameworks shipping with unsafe defaults that transform prompt injection from a model-level annoyance into a direct, unauthenticated shell on the host machine. The threat model has moved. The adversary no longer needs to jailbreak the LLM — they just need to talk to an agent that has the wrong tool enabled.


Background: The Framework Layer Was Never the Security Layer

From 2023 through 2025, the dominant AI security conversation centred on model behaviour: jailbreaks, harmful outputs, and alignment failures. The frameworks sitting beneath those models — LangChain, CrewAI, AutoGen, LangGraph — were mostly treated as plumbing. Developers enabled the Code Interpreter tool because the README said it was sandboxed. They enabled the JSON loader because it was convenient. They left Docker fallback configured as default because nobody ever read the security section of the documentation.

2026 changed that. Researchers started asking a different question: not "can you get the model to say something bad" but "can you use what the model is already allowed to do to compromise the host?" The answer, it turns out, is yes — in almost every major agentic framework currently running in production.

The threat is not theoretical. CERT/CC published VU#221883 covering CrewAI in March 2026. OX Security published a sweeping advisory covering the MCP STDIO command injection class across the AI ecosystem in April 2026. Microsoft Security Blog published "When Prompts Become Shells" on May 7, 2026, synthesising the pattern across frameworks. Adversa AI's monthly tracker lists six confirmed agentic framework RCE/sandbox escape disclosures for May alone.

We are in a full-blown epidemic.


Technical Analysis

The CrewAI Chain: VU#221883 (CVE-2026-2275 / 2285 / 2286 / 2287)

CrewAI is one of the most widely deployed multi-agent orchestration frameworks in production. The CERT/CC advisory describes four distinct vulnerabilities, but what matters operationally is how they chain.

CVE-2026-2275 — The Trigger

The Code Interpreter Tool falls back to SandboxPython when it cannot reach Docker. SandboxPython does not block ctypes. An attacker who can influence an agent prompt — via a malicious document in a RAG pipeline, a crafted web page the agent browses, a poisoned tool response, or direct user input — can instruct the agent to call the Code Interpreter and pass Python that invokes ctypes.CDLL("libc.so.6").system("id"). The sandbox is bypassed. This fires whenever allow_code_execution=True is set in the agent config, or whenever a developer has manually wired the Code Interpreter Tool to an agent.

CVE-2026-2287 — The Amplifier

CrewAI does not monitor Docker availability at runtime. If Docker goes down mid-session (container restart, resource pressure, deliberate DoS), the agent silently falls back to the unsafe sandbox rather than failing closed. An attacker who can trigger even a momentary Docker disruption — including via the SSRF vector — gets a fresh window for CVE-2026-2275 exploitation without the developer ever receiving a warning.

CVE-2026-2286 — Lateral Movement

The RAG search tools do not validate URLs provided at runtime. An agent interacting with attacker-controlled content can be prompted to query internal cloud metadata services — AWS IMDSv1 at 169.254.169.254, GCP at metadata.google.internal, Azure at 169.254.169.254/metadata/instance — yielding instance roles, API keys, and service account tokens. This SSRF is the path from RCE on the agent host to full cloud account takeover.

CVE-2026-2285 — Data Theft

The JSON loader reads files without path validation. A prompt injection asking the agent to "load the config from ../../../../etc/passwd" or ~/.aws/credentials will succeed, returning the contents in the agent's response context. For frameworks connected to memory stores or downstream Slack/email integrations, this data immediately leaves the host environment.

The complete kill chain from attacker's perspective:

[Attacker-controlled content] → [RAG injection / tool response poisoning]
    ↓
[Prompt injection into Code Interpreter]
    ↓
[CVE-2026-2275: ctypes → libc.system() → host RCE]
    ↓
[CVE-2026-2286: SSRF → cloud metadata → credentials]
    ↓
[CVE-2026-2285: file read → exfil of secrets/keys]
    ↓
[Full account takeover]

Total steps from "attacker uploads a malicious PDF to a RAG corpus" to "attacker has your AWS keys": four. No CVE in this chain is rated below CVSS 7.5. CERT/CC notes that the vendor has not fully patched all four vulnerabilities as of the advisory date. Only CVE-2026-2275 and CVE-2026-2287 have vendor statements; CVE-2026-2285 and CVE-2026-2286 remain unresolved.


The MCP STDIO Class: A Protocol-Level Infection

OX Security's "Mother of All AI Supply Chains" advisory documents a different but structurally identical problem at the protocol layer. Anthropic's Model Context Protocol (MCP) uses a StdioServerParameters initialisation flow where tool servers pass a command and argument list that is forwarded to subprocess.run() with minimal sanitisation.

The consequences cascade across the entire MCP-connected ecosystem:

  • LangFlow: An unauthenticated endpoint at /api/v1/auto_login allows any attacker to obtain a bearer token, then use it to register a malicious MCP server config with an arbitrary command in the STDIO template field. The command executes as the LangFlow process. All versions affected.
  • GPT Researcher: Visiting a crafted HTML page causes the agent's browser tool to trigger the malicious MCP configuration, spawning a reverse shell. No authentication required.
  • LiteLLM (CVE-2026-30623): The MCP proxy does not validate argument arrays before constructing subprocess calls, enabling command injection in authenticated sessions.

What's notable here is the attack surface: the vulnerability is not in the LLM. It is not a jailbreak. It is not a model alignment failure. The model is doing exactly what it was designed to do — reading a tool manifest and following its instructions. The attack is against the trust boundary between the protocol specification and the framework's implementation of it.

Microsoft's security team framed it correctly in their May 7th blog post: "If an attacker can control the parameters passed into these plugins via prompt injection, the agent may be driven to perform actions beyond its intended use. The AI model itself isn't the issue — it's behaving exactly as designed." This is the paradigm shift that many enterprise teams have not yet absorbed.


Google Antigravity: Sandbox Escape at Scale

For completeness: three weeks ago, Google patched a critical RCE in Antigravity, its AI-based coding agent, via the find_by_name tool. A sanitisation flaw allowed prompt injection to construct a path argument that escaped the sandboxed filesystem context. Adversa AI's Adversarial A2A agent card PoC also demonstrated that a malicious Agent Card — the metadata object that describes an A2A-compatible agent's capabilities — can embed adversarial instructions that cause the host LLM to exfiltrate session data on behalf of the rogue agent. The A2A specification, like MCP, trusts the content of manifests by design.


IOCs / Detection Indicators

Since most exploitation in this class flows through legitimate agent behaviour rather than novel malware, traditional IOC-based detection is insufficient. Behavioural indicators to monitor:

| Indicator | Detection Method |

|-----------|-----------------|

| Agent process spawning unexpected child processes (e.g., sh, bash, python3 -c) | SIEM process creation events; EDR parent-child chain anomaly |

| Outbound requests to 169.254.169.254, metadata.google.internal, 169.254.169.254:80 | Network layer block + alert |

| LangFlow /api/v1/auto_login hit from external IPs | Web/API gateway log alerting |

| CrewAI agent spawning without Docker daemon running | Infrastructure health → agent state correlation |

| Unusual file reads in agent working directory traversing ../ patterns | File integrity monitoring, auditd |

| MCP server configs appearing with shell metacharacters (; | && $()) in command fields | Config validation on write |

CVEs to track in your vulnerability management:

  • CVE-2026-2275 (CrewAI Code Interpreter, CVSS 9.1)
  • CVE-2026-2286 (CrewAI SSRF, CVSS 8.2)
  • CVE-2026-2287 (CrewAI Docker fallback, CVSS 7.5)
  • CVE-2026-2285 (CrewAI file read, CVSS 7.8)
  • CVE-2025-65720 (GPT Researcher MCP RCE, CVSS Critical)
  • CVE-2026-30623 (LiteLLM MCP injection)

The Lyrie Take

This is not a collection of isolated bugs. This is a systemic design failure across the agentic AI stack, and it is repeating the mistakes of the early web API era at accelerated speed. In 2010, developers shipped REST endpoints without auth because "it's internal." In 2026, developers are shipping AI agents with Code Interpreter enabled, Docker fallback to unsafe modes, and no validation of tool inputs because "the LLM won't do anything dangerous."

The LLM is not the security boundary. It never was. The agent framework's tool layer is the security boundary — and almost every major framework shipped that layer as an afterthought.

Three structural problems are compounding this:

1. Fallback-to-unsafe defaults. Both CrewAI and multiple MCP implementations fail open when their sandboxing mechanisms are unavailable. This is the opposite of the security engineering principle of fail-closed. In a high-availability environment where Docker might briefly restart, or where MCP servers might time out, the framework silently downgrades security without signalling the operator.

2. Trust inheritance from tool manifests. A2A Agent Cards and MCP StdioServerParameters are both treated as authoritative by the host LLM. There is no equivalent to TLS certificate verification for agent capability claims. A rogue agent advertising itself as a trusted peer is, from the host's perspective, indistinguishable from a legitimate one unless the operator has implemented strict allowlisting — which almost nobody has, because the tooling for it barely exists yet.

3. Production deployments outpacing security posture. CrewAI has tens of thousands of production deployments. LangFlow processes enterprise RAG pipelines for Fortune 500 companies. The scale of exposure for vulnerabilities that require only a crafted document in a corpus — not a firewall bypass, not stolen credentials — is enormous. The attack surface grew faster than the security tooling.

Lyrie's autonomous assessment platform flags all four of these CrewAI CVEs in environment scan as CRITICAL when allow_code_execution=True is detected in agent configs, regardless of Docker status. For MCP-connected infrastructure, our agent traffic inspection layer now parses StdioServerParameters for shell metacharacter injection patterns. These are table-stakes detections that every enterprise deploying agentic AI needs in place now.


Defender Playbook

Immediate (this week):

1. Audit every CrewAI deployment for allow_code_execution=True and presence of the Code Interpreter Tool. Disable both unless they are actively required for a specific workflow.

2. Block SSRF targets at network level: 169.254.169.254, metadata.google.internal, and equivalent cloud metadata addresses should be unreachable from any AI agent process.

3. Restrict LangFlow /api/v1/auto_login to internal networks only. If external exposure is required, enforce authentication and audit all MCP server additions.

4. Pin Docker health to agent availability: If Docker is not running, CrewAI agents with code execution enabled should fail to start, not fall back to sandbox mode. Implement a health check in your deployment wrapper.

Short-term (this month):

5. Implement agent egress allowlisting: No agent process should be able to initiate outbound connections to arbitrary hosts. Define an explicit allowlist by agent role and enforce it at the network layer.

6. Audit all MCP server registrations in your environment for shell metacharacters in command and args fields. Automate this check in your CI/CD pipeline.

7. Apply OWASP Top 10 for LLM Applications 2026 as a baseline framework for your agent deployments. LLM01 (Prompt Injection), LLM06 (Sensitive Information Disclosure), and LLM08 (Vector and Embedding Weaknesses) are directly implicated in this vulnerability class.

Strategic (this quarter):

8. Adopt an agent identity framework: Every agent should have an ephemeral credential, a defined scope of permitted actions, and a cryptographically verifiable identity. Microsoft's Agent Governance Toolkit (once its authentication bypass is fixed) and Google's Agent Gateway are starting points.

9. Deploy agent-layer traffic inspection: MCP and A2A protocol traffic should be inspected for adversarial content, not just passed through. Google's Model Armor integration with Agent Gateway is the first production-grade implementation of this.

10. Treat every external input to a RAG pipeline as adversarial: Implement input sanitisation before content enters agent context. A PDF, web page, or API response that reaches an agent with Code Interpreter enabled is a potential attack vector.


Sources

1. CERT/CC VU#221883 — CrewAI contains multiple vulnerabilities including SSRF, RCE and local file read (March 2026): https://kb.cert.org/vuls/id/221883

2. OX Security — MCP Supply Chain Advisory: RCE Vulnerabilities Across the AI Ecosystem (April 2026): https://www.ox.security/blog/mcp-supply-chain-advisory-rce-vulnerabilities-across-the-ai-ecosystem/

3. Microsoft Security Blog — When prompts become shells: RCE vulnerabilities in AI agent frameworks (May 7, 2026): https://www.microsoft.com/en-us/security/blog/2026/05/07/prompts-become-shells-rce-vulnerabilities-ai-agent-frameworks/

4. Adversa AI — Top Agentic AI Security Resources May 2026: https://adversa.ai/blog/top-agentic-ai-security-resources-may-2026/

5. Dark Reading — Google Fixes Critical RCE Flaw in AI-Based 'Antigravity' Tool (April 2026): https://www.darkreading.com/vulnerabilities-threats/google-fixes-critical-rce-flaw-ai-based-antigravity-tool

6. Google Cloud Blog — Next '26: Redefining security for the AI era (April 2026): https://cloud.google.com/blog/products/identity-security/next26-redefining-security-for-the-ai-era-with-google-cloud-and-wiz


Lyrie.ai Cyber Research Division — Senior Analyst Desk

Lyrie Verdict

Lyrie's autonomous defense layer flags this class of exposure the moment it surfaces — no signature update required.