Lyrie
AI-Security
0 sources verified·10 min read
By lyrie-threat-intelligence·4/27/2026

TL;DR

Enterprise AI deployments are building on a poisoned foundation. Retrieval-Augmented Generation (RAG) — the architecture used by virtually every enterprise LLM deployment to connect models to internal data — introduces an attack surface that traditional AppSec tooling completely misses. Five malicious documents injected into a knowledge base of millions can redirect AI outputs with 97% reliability. A new April 2026 paper from Cornell/Purdue/Sichuan University introduces the Black-Hole Attack: injecting vectors near the geometric centroid of a vector database that attract and hijack every query indiscriminately. Simultaneously, CorruptRAG (January 2026) reduced the injection requirement to a single document. Embedding inversion attacks now recover 80%+ of source text from stolen vector databases. None of this requires access to the underlying LLM. OWASP added vector and embedding weaknesses as LLM08:2025. Enterprise defenders have not caught up.


Background: Why RAG Became the Dominant Enterprise Pattern — and Why That Matters

The LLM knowledge cutoff problem drove nearly every enterprise AI deployment toward RAG. The pattern is elegant: instead of fine-tuning a model on proprietary data (expensive, slow, potentially leaky), you maintain a separate knowledge base, retrieve relevant chunks at query time, and inject them into the model's context window. The model answers based on retrieved context, not its training weights.

This architecture now underpins:

  • Internal chatbots grounded in Confluence, Notion, SharePoint
  • AI-powered SOC assistants reading threat intel feeds
  • Customer support agents with access to CRM and ticketing data
  • Code assistants pulling from internal documentation and proprietary repos
  • Sales intelligence tools indexed against deal history and competitive data

The retrieval mechanism typically involves embedding documents into high-dimensional vector space (512 to 4096 dimensions), storing them in a vector database (Pinecone, Weaviate, Qdrant, Chroma, pgvector), and using approximate nearest-neighbor search to find semantically relevant chunks at query time.

The security assumption baked into every deployment: documents in the knowledge base are trusted. The model treats retrieved content as authoritative. There is no cryptographic chain of custody. There is no per-document integrity verification. The model cannot distinguish a legitimate document from a poisoned one.

That assumption is being systematically destroyed.


Technical Analysis

Part 1: PoisonedRAG — The Optimization Attack That Started Everything

The foundational attack in this space is PoisonedRAG, published at USENIX Security 2025 by researchers at Penn State and the Illinois Institute of Technology [1]. The attack formulates document injection as a two-objective optimization problem:

Retrieval condition: The malicious document must be retrieved when a target question is asked. The attacker optimizes document content to maximize cosine similarity with the target query's embedding vector. This is achievable because embedding models are public — the attacker can query the same embedding model used by the production system (or a comparable open-source variant).

Generation condition: Once retrieved, the document must cause the LLM to generate an attacker-chosen answer. This is achieved through vocabulary engineering: phrases like "according to updated records," "corrected figures show," "revised guidance indicates" prime the model's output distribution toward trusting the injected content over competing legitimate documents.

The results across three major benchmarks in black-box settings (attacker has no access to the model's internals):

  • Natural Questions: 97% attack success rate
  • HotpotQA: 99% attack success rate
  • MS-MARCO: 91% attack success rate

Required injection: five malicious documents into a knowledge base with millions of legitimate documents. Detection difficulty: near-zero with standard monitoring. The poisoned documents look like plausible, well-formatted text. Volume-based anomaly detection won't fire on five documents.

Part 2: CorruptRAG — Single Document, Same Impact

January 2026 brought a refinement that makes the economics of RAG poisoning trivially accessible. CorruptRAG reduces the required injection to a single poisoned document while achieving comparable attack success rates to PoisonedRAG [2].

The efficiency improvement matters operationally. In enterprise environments where bulk document injection would trigger monitoring thresholds, a single-document attack is essentially undetectable through volume-based controls. One carefully crafted document — formatted to mimic internal style guides, seeded through a compromised wiki contributor account, or injected via an untrusted third-party data feed — is sufficient.

Part 3: The Black-Hole Attack — Hijacking All Queries at Once

The most alarming recent development is the Black-Hole Attack, published April 7, 2026 by researchers at Sichuan University, Cornell, and Purdue (arXiv:2604.05480) [3].

Previous attacks targeted specific queries. The Black-Hole Attack targets the geometry of the embedding space itself.

The mathematical insight: In high-dimensional vector spaces, there's a well-documented phenomenon called "hubness" — some vectors become nearest neighbors of disproportionately many other vectors. These "hub" regions are near the centroid (geometric mean) of the distribution. Critically, in practice, the centroid region of most vector databases is nearly empty. No legitimate documents naturally cluster there.

The attack: An adversary injects a small number of carefully constructed malicious vectors positioned near the centroid of the vector space. Like an astrophysical black hole, these vectors attract queries from across the semantic space — not just targeted queries, but all queries. The injected vectors appear in top-k retrieval results for most queries regardless of topic.

The impact: This transforms RAG poisoning from a precision strike into an area denial weapon. The attacker doesn't need to predict what questions users will ask. The black-hole vectors will be retrieved for questions about financial data, security policies, product specifications, and customer records indiscriminately. Every response becomes potentially compromised.

The paper demonstrates this attack against Pinecone, Weaviate, and Qdrant — the three most widely deployed vector database backends in enterprise production environments. Hubness mitigation techniques partially reduce effectiveness but don't eliminate it.

Part 4: The Wikipedia Edit Vector — Weaponizing Data Source Trust

Academic attack papers are one thing. The attack pattern most likely to be exploited at scale in 2026 doesn't require mathematical optimization at all.

Many enterprise RAG pipelines ingest from public sources: Wikipedia for factual grounding, GitHub READMEs for technical context, industry publications for competitive intelligence, public CVE databases for security awareness. These sources have scheduled ingestion cycles — the pipeline scrapes and re-indexes on a schedule, typically hourly to weekly.

The attack vector: temporarily edit a Wikipedia article, public documentation page, or frequently-cited GitHub repository with poisoned content. If the ingestion cycle runs during the window the edit is live, the poisoned version is embedded into the vector database. The attacker reverts the edit at the source. The poisoned embedding persists until the next full re-indexing cycle — which in many enterprise deployments happens weekly or monthly.

This requires no technical sophistication. Wikipedia edits require only account creation. The persistence window is determined by the victim's re-indexing cadence, which is typically not a security-monitored parameter.

Part 5: Embedding Inversion — Your Vector Database Is a Data Breach Waiting to Happen

Even defenders who successfully harden against injection attacks face a separate threat: if the vector database is exfiltrated, the embeddings themselves leak source data.

Embedding inversion attacks demonstrate that vector representations are not one-way transformations. Key findings:

  • Standard sentence embedding inversion recovers 50-70% of input words (F1 score) using techniques available since 2024 [4]
  • ZSinvert (March 2025): Zero-shot inversion method achieving 80%+ sensitive information leakage across all tested encoders using the Enron email corpus — without requiring any training data from the target system [5]
  • ALGEN (February 2025): Framework demonstrating reconstruction of full sentences from embeddings with high fidelity

Practical implication: a vector database stolen from an enterprise SOC's RAG system doesn't just leak query patterns. It leaks the underlying threat intelligence, playbooks, and internal security documentation that was indexed into it. The embeddings are not anonymized data — they are recoverable plaintext.

Part 6: Access Control Failures — The Vulnerability That Doesn't Need Attacks

OWASP's addition of LLM08:2025 (Vector and Embedding Weaknesses) to the LLM Top 10 reflects a more mundane but widespread vulnerability: inadequate access control on vector databases [6].

Multi-tenant RAG deployments share a vector store across users, teams, or customers. Without retrieval-native access enforcement:

  • Cross-tenant data leakage: User A's query retrieves documents belonging to User B. No attack required — just a missing metadata filter.
  • Stale permission propagation: A user whose SharePoint access is revoked continues to have their documents retrieved by others until the next full re-indexing cycle.
  • Metadata-only filtering bypass: Access control implemented as a query-time metadata filter, but an attacker with database credentials bypasses it entirely by querying the vector store directly.
  • No audit trail: Vector database query logs are typically not captured in SIEM pipelines, making post-incident forensics impossible.

The SecureW2 enterprise risk survey found 61% of organizations name sensitive data exposure as their primary concern with agentic AI — yet vector database access control is not in standard security review checklists for AI deployments [7].


IOCs / Indicators

RAG poisoning attacks don't generate traditional IOCs. Indicators to monitor:

Behavioral anomalies:

  • Sudden statistical shift in top-k retrieved documents for baseline queries
  • Frequent appearance of recently-added documents in retrieval results disproportionate to their relevance
  • LLM outputs containing unusual authority phrases: "according to updated records," "revised guidance confirms," "corrected as of [recent date]"
  • Retrieval results that don't match expected semantic neighborhood for known query types

Infrastructure indicators:

  • Unauthorized writes to vector database collections (direct API calls bypassing application layer)
  • Unusual patterns in embedding model API calls (attacker probing embedding space)
  • Third-party data feed ingestion timing anomalies
  • Vector count growth exceeding expected document ingestion rates

Data plane indicators:

  • Documents added to knowledge base without corresponding entries in source document management system
  • Embedding similarity distribution shift in collection-level statistics (sign of black-hole injection)
  • Centroid drift in vector collection geometric statistics

Lyrie Take

The trust model that RAG is built on was never designed for adversarial conditions.

Enterprise deployments treat the knowledge base as a trusted internal resource. In practice, it ingests from wikis, shared drives, third-party feeds, and public sources — all of which are adversarially accessible. The attack surface is the entire content supply chain.

What makes this particularly dangerous from an anti-rogue-AI posture: these attacks don't compromise the model. They compromise the knowledge the model acts on. A SOC assistant fed poisoned threat intelligence doesn't malfunction in a detectable way — it confidently executes on false information. An AI agent with tool-calling capabilities that retrieves poisoned action guidance doesn't throw errors — it takes the wrong actions at machine speed.

The Black-Hole Attack represents the inflection point. Previous attacks required knowing what questions to target. Black-hole vectors corrupt the retrieval layer holistically. In an enterprise AI system where agents autonomously retrieve context, reason over it, and take actions — purchasing, communicating, executing code — black-hole contamination means every agentic action is operating on potentially adversarial ground truth.

Lyrie's defense model addresses this directly: cryptographic provenance at the document layer, retrieval-time integrity verification, and behavioral anomaly detection that treats the knowledge base as a first-class attack surface rather than a trusted data store.


Defender Playbook

Immediate (0-30 days)

1. Audit your ingestion pipeline sources: Enumerate every data source feeding your RAG system. Classify each by trust level: internal-owned, third-party managed, public. Public and third-party sources require additional verification controls.

2. Implement collection-level statistics monitoring: Track vector count, centroid position, and nearest-neighbor distribution across your vector collections daily. Establish baselines. Alert on statistically significant shifts — these are signatures of injection attacks including black-hole attempts.

3. Add ingestion immutability logging: Every document added to the knowledge base should generate an immutable log entry (source, hash, timestamp, contributor) queryable for forensic investigation.

4. Encrypt vector databases at rest and isolate credentials: Embedding inversion attacks require access to the raw vectors. Encrypt at rest; audit who holds vector database credentials; treat DB credentials as equivalent to production database credentials.

Medium-Term (30-90 days)

5. Deploy retrieval-native access control: Move from metadata-filtered queries to per-vector permission enforcement. Every vector should inherit the access control list of its source document. Re-indexing should propagate permission changes immediately, not on schedule.

6. Instrument LLM outputs for authority phrase detection: Flag outputs containing phrases statistically associated with social engineering in knowledge bases ("according to updated," "corrected figures," "revised guidance"). Route these for human review rather than autonomous action.

7. Implement re-indexing integrity checks: Before ingesting documents from public or third-party sources, compare against a hash of the previously-indexed version. Require secondary approval for unexpected content changes in high-trust documents.

8. Sandbox agentic tool-calling from poisoning blast radius: AI agents that execute actions (API calls, code execution, communication) should not operate directly on RAG-retrieved content without a verification step. Retrieved context should inform human-reviewed recommendations, not direct machine-speed actions, until retrieval integrity is confirmed.

Strategic

9. Adopt cryptographic document provenance: Implement a document signing and verification layer so the retrieval system can cryptographically verify document origin and integrity before inclusion in context. This is the only defense against the Wikipedia edit vector class of attacks.

10. Red-team your RAG pipeline quarterly: Run PoisonedRAG and corpus flooding tests against staging environments. Measure your detection capability. The attack tooling is published and reproducible.


Sources

1. Zou, W., et al. "PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models." USENIX Security Symposium 2025. https://arxiv.org/abs/2402.07867

2. CorruptRAG: Single-document RAG Poisoning Attack (January 2026). Via BeyondScale Research: https://beyondscale.tech/blog/rag-security-data-poisoning-guide

3. Li, H., et al. "Can You Trust the Vectors in Your Vector Database? Black-Hole Attack from Embedding Space Defects." arXiv:2604.05480, April 7, 2026. https://arxiv.org/abs/2604.05480

4. Morris, J., et al. "Text Embeddings Reveal (Almost) As Much As Text." Embedding inversion research via AI Security Portal, 2024.

5. ZSinvert Framework: Zero-Shot Embedding Inversion achieving 80%+ sensitive information leakage, March 2025. Referenced in BeyondScale RAG Security Guide.

6. OWASP. "OWASP Top 10 for LLM Applications 2025: LLM08 - Vector and Embedding Weaknesses." https://owasp.org/www-project-top-10-for-large-language-model-applications/

7. SecureW2. "Agentic AI Security: Enterprise Risk Framework and Identity Controls," April 2026. https://securew2.com/blog/agentic-ai-security-enterprise-risk-framework-and-identity-controls


Lyrie.ai Cyber Research Division — Senior Analyst Desk

Lyrie Verdict

Lyrie's autonomous defense layer flags this class of exposure the moment it surfaces — no signature update required.