Lyrie

AI Threats

AI cyber-warfare watch — agents, models, MCP, supply chain.

223 stories

AgenticMail: Unauthenticated inbound mail triggers bypassPermissions resume of the operator's Claude Code sess

1 min·1 sources·agent-threats-agenticmail-unauthenticated-inbound-mail-triggers-mqjrzb8j

PraisonAI: Server-Side Request Forgery (SSRF) in SearxNG / search_web tools via attacker-controlled searxng_ur

1 min·1 sources·agent-threats-praisonai-server-side-request-forgery-ssrf-in-s-mqjljtjg

npm PraisonAI MCPSecurity Basic/OAuth authentication policies accept invalid credentials without validation

1 min·1 sources·agent-threats-npm-praisonai-mcpsecurity-basic-oauth-authenticati-mqjljtjc

PraisonAI ToolsMCPServer legacy SSE transport accepts attacker Host/Origin and exposes registered tools

1 min·1 sources·agent-threats-praisonai-toolsmcpserver-legacy-sse-transport-acce-mqjkh8vv

SafeClawBench: Separating Semantic, Audit-Evidence, and Sandbox Harm in Tool-Using LLM Agents

1 min·1 sources·agent-threats-safeclawbench-separating-semantic-audit-evidence-mqj3byku

PhantomSkill: Malicious Code Injection in Agent Skill Ecosystems

1 min·1 sources·agent-threats-phantomskill-malicious-code-injection-in-agent-sk-mqiwwhs3

Code-Augur: Agentic Vulnerability Detection via Specification Inference

1 min·1 sources·agent-threats-code-augur-agentic-vulnerability-detection-via-sp-mqitoqpi

OpenAnt: LLM-Powered Vulnerability Discovery Through Code Decomposition, Adversarial Verification, and Dynamic

1 min·1 sources·agent-threats-openant-llm-powered-vulnerability-discovery-throu-mqitoqph

Image Prompt Reconstruction Attacks on Distributed MLLM Inference Frameworks

1 min·1 sources·agent-threats-image-prompt-reconstruction-attacks-on-distributed-mqitoqpg

LangChain4j: SQL injection via metadata filters in langchain4j-mariadb and langchain4j-pgvector

1 min·1 sources·agent-threats-langchain4j-sql-injection-via-metadata-filters-in-mqifra2u

Claude Code: Out-of-Band Data Exfiltration via Pre-Approved HuggingFace Domain in WebFetch

1 min·1 sources·agent-threats-claude-code-out-of-band-data-exfiltration-via-pre-mqieomca

OpenClaw: MCP Streamable HTTP redirects could forward configured custom headers to another origin

1 min·1 sources·agent-threats-openclaw-mcp-streamable-http-redirects-could-forw-mqidm0p9

OTRO: Oblivious Tokenization Path with Square-Root ORAM

1 min·1 sources·agent-threats-otro-oblivious-tokenization-path-with-square-root-mqhge1q7

SoK: AI-Augmented Binary Reversing

1 min·1 sources·agent-threats-sok-ai-augmented-binary-reversing-mqhge1q7

Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond

1 min·1 sources·agent-threats-security-and-privacy-prompts-in-the-wild-what-use-mqhge1q6

A Red-Team Study of Anthropic Fable 5 & Opus 4.8 Models

1 min·1 sources·agent-threats-a-red-team-study-of-anthropic-fable-5-amp-opus-4-mqhge1q5

An AI Security Agent for Banking: Multi-Vector Fraud and AML Detection Across Retail and Corporate Accounts

1 min·1 sources·agent-threats-an-ai-security-agent-for-banking-multi-vector-fra-mqhge1q4

An Evaluation of Data Leakage Risks in Tool-Using LLM Agents in Realistic Scenarios

1 min·1 sources·agent-threats-an-evaluation-of-data-leakage-risks-in-tool-using-mqhge1q3

Seeing Is Not Screening: Multimodal Hidden Instruction Attacks on Agent Skill Scanners

1 min·1 sources·agent-threats-seeing-is-not-screening-multimodal-hidden-instruc-mqhge1q2

Pi Agent: Potential XSS in HTML session exports via Markdown URL sanitization bypass

1 min·1 sources·agent-threats-pi-agent-potential-xss-in-html-session-exports-vi-mqhb1d5n

LangChain: Path traversal and sandbox escape in LangChain file-search middleware and loaders

1 min·1 sources·agent-threats-langchain-path-traversal-and-sandbox-escape-in-la-mqgst9nf

Dynamic Malicious Skills in Agentic AI

1 min·1 sources·agent-threats-dynamic-malicious-skills-in-agentic-ai-mqg0yfw2

Transferable Self-Evolving Playbooks for Agentic Security Auditing

1 min·1 sources·agent-threats-transferable-self-evolving-playbooks-for-agentic-s-mqg0yfw1

How Much Can We Trust LLM Search Agents? Measuring Endorsement Vulnerability to Web Content Manipulation

1 min·1 sources·agent-threats-how-much-can-we-trust-llm-search-agents-measuring-mqg0yfw0

SkillVetBench: LLM-as-Judge for Multi-Dimensional Security Risk Evaluation in Open-Source LLM Agent Skills

1 min·1 sources·agent-threats-skillvetbench-llm-as-judge-for-multi-dimensional-mqg0yfvz

The Proxy Knows Too Much: Sealing LLM API Routers with Attested TEEs

1 min·1 sources·agent-threats-the-proxy-knows-too-much-sealing-llm-api-routers-mqg0yfvz

CmdNeedle: Measuring the Incompleteness of Command Denylists for AI Agents

1 min·1 sources·agent-threats-cmdneedle-measuring-the-incompleteness-of-command-mqg0yfvy

FragFuse: Bypassing Access Control of Large Language Model Agents via Memory-Based Query Fragmentation and Fus

1 min·1 sources·agent-threats-fragfuse-bypassing-access-control-of-large-langua-mqfxqtes

From Prompts to Responses: Dual-Sided Data Leakage and Defense in Split Large Language Models

1 min·1 sources·agent-threats-from-prompts-to-responses-dual-sided-data-leakage-mqekfsb6

From Shield to Target: Denial-of-Service Attacks on LLM-Based Agent Guardrails

1 min·1 sources·agent-threats-from-shield-to-target-denial-of-service-attacks-o-mqekfsb5

SkillMutator: Benchmarking and Defending Language-and-Code Cross-modal Attacks on LLM Agent Skills

1 min·1 sources·agent-threats-skillmutator-benchmarking-and-defending-language-mqekfsb4

Smarter Saboteurs, Better Fixers: Scaling & Security in Linear Multi-Agent Workflows

1 min·1 sources·agent-threats-smarter-saboteurs-better-fixers-scaling-amp-se-mqaa49zk

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

1 min·1 sources·agent-threats-pi-hunter-automated-red-teaming-for-exposing-and-mqaa49zi

DIG: Oracle-Guided Directed Input Generation for One-Day Vulnerabilities

1 min·1 sources·agent-threats-dig-oracle-guided-directed-input-generation-for-o-mqaa49zh

SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems

1 min·1 sources·agent-threats-smsr-certified-defence-against-runtime-memory-poi-mqaa49zg

MAStrike: Shapley-Guided Collusive Red-Teaming on Multi-Agent Systems

1 min·1 sources·agent-threats-mastrike-shapley-guided-collusive-red-teaming-on-mqaa49zf

Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment

1 min·1 sources·agent-threats-can-open-source-llm-agents-replace-static-applicat-mq8vqxq9

Mind your key: An Empirical Study of LLM API Credential Leakage in iOS Apps

1 min·1 sources·agent-threats-mind-your-key-an-empirical-study-of-llm-api-crede-mq8vqxq8

Claude Code Action: Malicious MCP Server Configuration in PRs Enables Remote Code Execution and Secret Exfiltr

1 min·1 sources·agent-threats-claude-code-action-malicious-mcp-server-configura-mq8htfa4

Understanding and mitigating the risks of OpenClaw for non-technical users: A practical guide with Skill

1 min·1 sources·agent-threats-understanding-and-mitigating-the-risks-of-openclaw-mq7hdqi3

Assessing Automated Prompt Injection Attacks in Agentic Environments

1 min·1 sources·agent-threats-assessing-automated-prompt-injection-attacks-in-ag-mq7e5y1s

Securing Code Understanding: Detecting Natural Backdoor Vulnerability in Code Language Models

1 min·1 sources·agent-threats-securing-code-understanding-detecting-natural-bac-mq7e5y1r

Game-Theoretic Multi-Agent Control for Robust Contextual Reasoning in LLMs

1 min·1 sources·agent-threats-game-theoretic-multi-agent-control-for-robust-cont-mq7e5y1q

Training LLMs to Enforce Multi-Level Instruction Hierarchies via Gravity-Weighted Direct Preference Optimizati

1 min·1 sources·agent-threats-training-llms-to-enforce-multi-level-instruction-h-mq7e5y1p

Semantic Multi-Agent Intrusion Detection for IoT: Zero-Day and Adversarial Threats with Risk-Aware Reasoning

1 min·1 sources·agent-threats-semantic-multi-agent-intrusion-detection-for-iot-z-mq7e5y1o

Advancing the State-of-the-Art in Empirical Privacy Auditing

1 min·1 sources·agent-threats-advancing-the-state-of-the-art-in-empirical-privac-mq7e5y1n

MemVenom: Triggered Poisoning of Multimodal Memories in Web Agents

1 min·1 sources·agent-threats-memvenom-triggered-poisoning-of-multimodal-memori-mq7e5y1m

What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks

1 min·1 sources·agent-threats-what-the-eyes-see-the-llms-miss-exploiting-human-mq655kv3

RAILS: Verification-Native Clearing For Agentic Commerce

1 min·1 sources·agent-threats-rails-verification-native-clearing-for-agentic-co-mq60v89x

Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Cha

1 min·1 sources·agent-threats-unveiling-privacy-risks-in-multi-modal-large-langu-mq60v89w

Customization under Fire: Plugin Poisoning in Text-to-Image Ecosystem

1 min·1 sources·agent-threats-customization-under-fire-plugin-poisoning-in-text-mq60v89u

From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperability

1 min·1 sources·agent-threats-from-privacy-to-workflow-integrity-communication-mq4ja8dq

HAVE: Host Active Verification Engine for Closing the Contextual Reality Gap in Security Digital Twins

1 min·1 sources·agent-threats-have-host-active-verification-engine-for-closing-mq4ja8dp

Defending Jailbreak Attacks on Large Language Models via Manifold Trajectory Kinetics

1 min·1 sources·agent-threats-defending-jailbreak-attacks-on-large-language-mode-mq4ja8do

MCP Server Kubernetes: kubectl-generic flag injection enables Kubernetes bearer token exfiltration

1 min·1 sources·agent-threats-mcp-server-kubernetes-kubectl-generic-flag-inject-mq141hxi

RedEdit: Agentic Red-Teaming of Image Safety Classifiers via MCTS-Guided Photo-Editing

1 min·1 sources·agent-threats-rededit-agentic-red-teaming-of-image-safety-class-mq0kr3pi

SlotGCG: Exploiting the Positional Vulnerability in LLMs for Jailbreak Attacks

1 min·1 sources·agent-threats-slotgcg-exploiting-the-positional-vulnerability-i-mq0kr3pi

Steering LLM Viewpoints through Fabricated Evidence Injection

1 min·1 sources·agent-threats-steering-llm-viewpoints-through-fabricated-evidenc-mq0kr3ph

GenTI: Benchmarking LLMs for Autonomous IDPS Rule Generation for Unseen Attacks

1 min·1 sources·agent-threats-genti-benchmarking-llms-for-autonomous-idps-rule-mq0kr3pg

Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals

1 min·1 sources·agent-threats-will-the-agent-recuse-itself-measuring-llm-agent-mq0kr3pf

WebMCP Tool Surface Poisoning: Runtime Manipulation Attacks on LLM Agents

1 min·1 sources·agent-threats-webmcp-tool-surface-poisoning-runtime-manipulatio-mq0kr3pe

Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation

1 min·1 sources·agent-threats-cascading-hallucination-in-agentic-rag-the-charm-mpyzyauq

CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities

1 min·1 sources·agent-threats-cybergym-e2e-scalable-real-world-benchmark-for-ai-mpyzyauo

A-Live: Passive Liveness Detection via Neuromuscular Micro-Motion Signatures on Commodity Sensors

1 min·1 sources·agent-threats-a-live-passive-liveness-detection-via-neuromuscul-mpyzyaun

Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agent

1 min·1 sources·agent-threats-caught-in-the-act-ivation-toward-pre-output-and-mpyzyaum

Description-Code Inconsistency in Real-world MCP Servers: Measurement, Detection, and Security Implications

1 min·1 sources·agent-threats-description-code-inconsistency-in-real-world-mcp-s-mpyzyaul

From Control Boundary to Insurance Claim: Reconstructing AI-Mediated Losses Through the CER Framework

1 min·1 sources·agent-threats-from-control-boundary-to-insurance-claim-reconstr-mpxf5u98

Bastet: A Fine-Grained Expert-Labeled Dataset for DeFi Smart Contract Vulnerability Detection

1 min·1 sources·agent-threats-bastet-a-fine-grained-expert-labeled-dataset-for-mpxf5u97

FORGE: Multi-Agent Graduated Exploitation and Detection Engineering

1 min·1 sources·agent-threats-forge-multi-agent-graduated-exploitation-and-dete-mpxf5u96

AI Agents Enable Adaptive Computer Worms

1 min·1 sources·agent-threats-ai-agents-enable-adaptive-computer-worms-mpxf5u95

πCreds: Privately Inferred Credentials

1 min·1 sources·agent-threats-creds-privately-inferred-credentials-mpxf5u94

ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

1 min·1 sources·agent-threats-clawhub-security-signals-when-virustotal-static-mpw4069d

Benign Inputs, Harmful Outputs: Cross-Modal Jailbreaking via Distributed Semantic Recomposition

1 min·1 sources·agent-threats-benign-inputs-harmful-outputs-cross-modal-jailbr-mpw4069c

Needles at Scale: LLM-Assisted Target Selection for Windows Vulnerability Research

1 min·1 sources·agent-threats-needles-at-scale-llm-assisted-target-selection-fo-mpw4069b

Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs

1 min·1 sources·agent-threats-confused-chatgpt-cross-app-context-poisoning-via-mpvyn4ju

SS-ZKR: Spatial-Semantic Zero-Knowledge Routing for Privacy-Preserving Multi-Agent Collaboration

1 min·1 sources·agent-threats-ss-zkr-spatial-semantic-zero-knowledge-routing-fo-mpvyn4jt

@agenticmail/mcp Missing Authentication for Critical Function

1 min·1 sources·agent-threats-agenticmail-mcp-missing-authentication-for-critic-mpv9zr5j

Automatically Attacking Software Reverse Engineering AI Agents

1 min·1 sources·agent-threats-automatically-attacking-software-reverse-engineeri-mpuka1h2

Strengthening Polymorphic Prompt Assembling: Dynamic Separator Generation Against Emerging Prompt Injection At

1 min·1 sources·agent-threats-strengthening-polymorphic-prompt-assembling-dynam-mpuka1h1

Dissecting the Black Box: Circuit-Level Analysis of LLM Vulnerability Detection

1 min·1 sources·agent-threats-dissecting-the-black-box-circuit-level-analysis-o-mpqij50i

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

1 min·1 sources·agent-threats-agentdog-1-5-a-lightweight-and-scalable-alignment-mpqij50h

nono: Sandbox escape on Linux via D-Bus: `systemd-run --user`

1 min·1 sources·agent-threats-nono-sandbox-escape-on-linux-via-d-bus-systemd-mppx3buc

A Wolf in Sheep's Clothing: Targeted Routing Hijacking in Federated RAG

1 min·1 sources·agent-threats-a-wolf-in-sheep-s-clothing-targeted-routing-hijac-mpp45mwt

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

1 min·1 sources·agent-threats-can-it-reach-the-generator-investigating-the-surv-mpoysw7a

When Think-with-Image Meets Safety: What Determines Multimodal Jailbreak Robustness?

1 min·1 sources·agent-threats-when-think-with-image-meets-safety-what-determine-mpoysw79

MaskClaw: Edge-Side Personalized Privacy Arbitration for GUI Agents with Behavior-Driven Skill Evolution

1 min·1 sources·agent-threats-maskclaw-edge-side-personalized-privacy-arbitrati-mpoysw78

Technical Report: Exploring the Emerging Threats of the Agent Skill Ecosystem

1 min·1 sources·agent-threats-technical-report-exploring-the-emerging-threats-o-mpoysw78

Disentangling Adversarial Prompts: A Semantic-Graph Defense for Robust LLM Security

1 min·1 sources·agent-threats-disentangling-adversarial-prompts-a-semantic-grap-mpoysw77

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

1 min·1 sources·agent-threats-snare-adaptive-scenario-synthesis-for-eliciting-o-mpoysw76

Langroid has Prompt to SQL Injection, Leading to RCE

1 min·1 sources·agent-threats-langroid-has-prompt-to-sql-injection-leading-to-r-mpohno1e

Claude Code as a Daily Driver: Claude.md, Skills, Subagents, Plugins, and MCPs

1 min·1 sources·agent-threats-claude-code-as-a-daily-driver-claude-md-skills-mpnt06gg

Sandlock: Confining AI Agent Code with Unprivileged Linux Primitives

1 min·1 sources·agent-threats-sandlock-confining-ai-agent-code-with-unprivilege-mpng55ym

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

1 min·1 sources·agent-threats-sec-bench-pro-can-language-models-solve-long-hori-mpng55yl

Lessons from Penetration Tests on Large-Scale Agent Systems

1 min·1 sources·agent-threats-lessons-from-penetration-tests-on-large-scale-agen-mpng55yk

How Agentic AI Coding Assistants Become the Attacker's Shell

1 min·1 sources·agent-threats-how-agentic-ai-coding-assistants-become-the-attack-mpm4zlru

Demystifying the Mythos or Disrupting Bugonomics? From Zero-Day Asymmetry to Defender Remediation Throughput

1 min·1 sources·agent-threats-demystifying-the-mythos-or-disrupting-bugonomics-mpm4zlrt

APT-Agent: Automated Penetration Testing using Large Language Models

1 min·1 sources·agent-threats-apt-agent-automated-penetration-testing-using-lar-mpm4zlrs

Broken Object Level Authorization in the Wild: An Empirical Taxonomy from 100+ Bug Bounty Disclosures

1 min·1 sources·agent-threats-broken-object-level-authorization-in-the-wild-an-mpm4zlrr

Robust LLM Watermarking with Minimal Semantic Distortion for IP Protection

1 min·1 sources·agent-threats-robust-llm-watermarking-with-minimal-semantic-dist-mpkk75j9

Security, Privacy, and Ethical Risks in OpenClaw

1 min·1 sources·agent-threats-security-privacy-and-ethical-risks-in-openclaw-mpkk75j8

Kernel-Based ReLU Approximation for Homomorphic Encryption-Compatible Privacy-preserving Deep Learning Models

1 min·1 sources·agent-threats-kernel-based-relu-approximation-for-homomorphic-en-mpkk75j7

A Large Language Model Approach to Generating Bypass Rules for Malware Evasion in Analysis Sandbox

1 min·1 sources·agent-threats-a-large-language-model-approach-to-generating-bypa-mpggawht

Adversarial Reframing: A Framework for Targeted Generation in Language Models

1 min·1 sources·agent-threats-adversarial-reframing-a-framework-for-targeted-ge-mpggawhs

A First Measurement Study on Authentication Security in Real-World Remote MCP Servers

1 min·1 sources·agent-threats-a-first-measurement-study-on-authentication-securi-mpggawhr

Automated Repair of TEE Partitioning Issues via DSL-Guided and LLM-Assisted Patching

1 min·1 sources·agent-threats-automated-repair-of-tee-partitioning-issues-via-ds-mpggawhq

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

1 min·1 sources·agent-threats-benchmarking-autonomous-agents-against-temporal-s-mpggawhp

MCP Server Kubernetes: Tool Access Control Bypass via Presentation-Layer Filtering Without Execution-Layer Enf

1 min·1 sources·agent-threats-mcp-server-kubernetes-tool-access-control-bypass-mpfz5igw

Refusal Evaluation in Coding LLMs and Code Agents: A Systematic Review of Thirteen Malicious-Code Prompt Corpo

1 min·1 sources·agent-threats-refusal-evaluation-in-coding-llms-and-code-agents-mpevi1sa

Trusted Weights, Treacherous Optimizations? Optimization-Triggered Backdoor Attacks on LLMs

1 min·1 sources·agent-threats-trusted-weights-treacherous-optimizations-optimi-mpevi1s9

An Application-Layer Multi-Modal Covert-Channel Reference Monitor for LLM Agent Egress

1 min·1 sources·agent-threats-an-application-layer-multi-modal-covert-channel-re-mpevi1s7

Heartbeat-Bound Hierarchical Credentials: Cryptographic Revocation for AI Agent Swarms

1 min·1 sources·agent-threats-heartbeat-bound-hierarchical-credentials-cryptogr-mpevi1s6

VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers

1 min·1 sources·agent-threats-viper-mcp-detecting-and-exploiting-taint-style-vu-mpevi1s4

Surviving the Unseen: Predictive Defense for Novel Multi-Turn Multimodal Attacks

1 min·1 sources·agent-threats-surviving-the-unseen-predictive-defense-for-novel-mpdh4sqr

Agent Meltdowns: The Road to Hell Is Paved with Helpful Agents

1 min·1 sources·agent-threats-agent-meltdowns-the-road-to-hell-is-paved-with-he-mpdh4sqq

Hallucination as Exploit: Evidence-Carrying Multimodal Agents

1 min·1 sources·agent-threats-hallucination-as-exploit-evidence-carrying-multim-mpdh4sqp

Token by Token, Compromised: Backdoor Vulnerabilities in Unified Autoregressive Models

1 min·1 sources·agent-threats-token-by-token-compromised-backdoor-vulnerabilit-mpdh4sqp

SCARA: A Semantics-Constrained Autonomous Remediation Agent for Opaque Industrial Software Vulnerabilities

1 min·1 sources·agent-threats-scara-a-semantics-constrained-autonomous-remediat-mpdh4sqo

Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection

1 min·1 sources·agent-threats-hunting-vulnerability-variants-in-ai-infra-measur-mpdh4sqn

Measuring Safety Alignment Effects in Autonomous Security Agents

1 min·1 sources·agent-threats-measuring-safety-alignment-effects-in-autonomous-s-mpdh4sql

auth-fetch-mcp: SSRF and disk exfiltration via unvalidated auth_fetch and download_media URLs

1 min·1 sources·agent-threats-auth-fetch-mcp-ssrf-and-disk-exfiltration-via-unv-mpctjztf

Not What You Asked For: Typographic Attacks in Household Robot Manipulation

1 min·1 sources·agent-threats-not-what-you-asked-for-typographic-attacks-in-hou-mpc84djc

Prompts Don't Protect: Architectural Enforcement via MCP Proxy for LLM Tool Access Control

1 min·1 sources·agent-threats-prompts-don-t-protect-architectural-enforcement-v-mpc71tma

Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks

1 min·1 sources·agent-threats-overeager-coding-agents-measuring-out-of-scope-ac-mpc71tm8

AI Agents May Always Fall for Prompt Injections

1 min·1 sources·agent-threats-ai-agents-may-always-fall-for-prompt-injections-mpc0mdux

ADR: An Agentic Detection System for Enterprise Agentic AI Security

1 min·1 sources·agent-threats-adr-an-agentic-detection-system-for-enterprise-ag-mpc0mduw

ContraFix: Agentic Vulnerability Repair via Differential Runtime Evidence and Skill Reuse

1 min·1 sources·agent-threats-contrafix-agentic-vulnerability-repair-via-differ-mpc0mduw

Explainable Machine Learning for Phishing Detection on Heterogeneous Datasets with MCP-Enabled Deployment

1 min·1 sources·agent-threats-explainable-machine-learning-for-phishing-detectio-mpc0mduv

Babel: Jailbreaking Safety Attention via Obfuscation Distribution Optimized Sampling

1 min·1 sources·agent-threats-babel-jailbreaking-safety-attention-via-obfuscati-mpc0mduu

An Empirical Study of Privacy Leakage Chains via Prompt Injection in Black-Box Chatbot Environments

1 min·1 sources·agent-threats-an-empirical-study-of-privacy-leakage-chains-via-p-mpc0mdus

LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

1 min·1 sources·agent-threats-livepi-more-realistic-benchmarking-of-agents-agai-mpc0mdur

Acoustic Interference: A New Paradigm Weaponizing Acoustic Latent Semantic for Universal Jailbreak against Lar

1 min·1 sources·agent-threats-acoustic-interference-a-new-paradigm-weaponizing-mpc0mduq

A Multi-Layer Cloud-IDS Pipeline with LLM and Adaptive Q-Learning Calibration

1 min·1 sources·agent-threats-a-multi-layer-cloud-ids-pipeline-with-llm-and-adap-mpawyy1h

A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation

1 min·1 sources·agent-threats-a-cross-modal-prompt-injection-attack-against-larg-mpawyy1g

uGen: An Agentic Framework for Generating Microarchitectural Attack PoCs

1 min·1 sources·agent-threats-ugen-an-agentic-framework-for-generating-microarc-mpawyy1f

Veritas: A Semantically Grounded Agentic Framework for Memory Corruption Vulnerability Detection in Binaries

1 min·1 sources·agent-threats-veritas-a-semantically-grounded-agentic-framework-mp6bxk06

Toward Securing AI Agents Like Operating Systems

1 min·1 sources·agent-threats-toward-securing-ai-agents-like-operating-systems-mp6bxk05

WARD: Adversarially Robust Defense of Web Agents Against Prompt Injections

1 min·1 sources·agent-threats-ward-adversarially-robust-defense-of-web-agents-a-mp6bxk04

Exploiting LLM Agent Supply Chains via Payload-less Skills

1 min·1 sources·agent-threats-exploiting-llm-agent-supply-chains-via-payload-les-mp6bxk04

The Great Pretender: A Stochasticity Problem in LLM Jailbreak

1 min·1 sources·agent-threats-the-great-pretender-a-stochasticity-problem-in-ll-mp6bxk03

EVA: Editing for Versatile Alignment against Jailbreaks

1 min·1 sources·agent-threats-eva-editing-for-versatile-alignment-against-jailb-mp6bxk02

DeepSeek TUI: task_create Insecure Defaults Enable RCE via Prompt Injection in Project Files

1 min·1 sources·agent-threats-deepseek-tui-task-create-insecure-defaults-enable-mp5y00hu

Open WebUI has a SSRF Bypass via HTTP Redirect Following in Web-Fetch and Image-Load Endpoints (not addressed

1 min·1 sources·agent-threats-open-webui-has-a-ssrf-bypass-via-http-redirect-fol-mp5y00hq

dbt MCP Server has an Argument Injection in dbt CLI Tool Wrappers via node_selection and resource_type Paramet

1 min·1 sources·agent-threats-dbt-mcp-server-has-an-argument-injection-in-dbt-cl-mp5tpnfx

dbt MCP Server Logs Tool Arguments Including SQL Queries and Credentials in Plaintext Without Redaction When F

1 min·1 sources·agent-threats-dbt-mcp-server-logs-tool-arguments-including-sql-q-mp5tpnfw

dbt MCP Server Transmits All MCP Tool Arguments Including Raw SQL and --vars Credentials to dbt Labs Telemetry

1 min·1 sources·agent-threats-dbt-mcp-server-transmits-all-mcp-tool-arguments-in-mp5tpnft

Flowise has an MCP Security Bypass that Enables RCE

1 min·1 sources·agent-threats-flowise-has-an-mcp-security-bypass-that-enables-rc-mp5m7x33

Sleeper Channels and Provenance Gates: Persistent Prompt Injection in Always-on Autonomous AI Agents

1 min·1 sources·agent-threats-sleeper-channels-and-provenance-gates-persistent-mp4xk8mr

Quantifying LLM Safety Degradation Under Repeated Attacks Using Survival Analysis

1 min·1 sources·agent-threats-quantifying-llm-safety-degradation-under-repeated-mp4ucm2h

Large Language Models for Agentic NetOps and AIOps: Architectures, Evaluation, and Safety

1 min·1 sources·agent-threats-large-language-models-for-agentic-netops-and-aiops-mp4ucm2h

Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills

1 min·1 sources·agent-threats-do-skill-descriptions-tell-the-truth-detecting-un-mp4ucm2g

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

1 min·1 sources·agent-threats-no-attack-required-semantic-fuzzing-for-specifica-mp4ucm2f

claude-code-cache-fix vulnerable to local code execution via Python triple-quote injection in tools/quota-stat

1 min·1 sources·agent-threats-claude-code-cache-fix-vulnerable-to-local-code-exe-mp48ww7q

Obot has an authorization bypass in /mcp-connect/{id} that allows any authenticated user to use any registered

1 min·1 sources·agent-threats-obot-has-an-authorization-bypass-in-mcp-connect-mp48ww7m

LangSmith SDK: Public prompt pull deserializes untrusted manifests without trust boundary warning

1 min·1 sources·agent-threats-langsmith-sdk-public-prompt-pull-deserializes-unt-mp47ub79

Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization

1 min·1 sources·agent-threats-options-not-clicks-lattice-refinement-for-consen-mp3ewr7g

Behavioral Integrity Verification for AI Agent Skills

1 min·1 sources·agent-threats-behavioral-integrity-verification-for-ai-agent-ski-mp3ewr7f

Comment and Control: Hijacking Agentic Workflows via Context-Grounded Evolution

1 min·1 sources·agent-threats-comment-and-control-hijacking-agentic-workflows-v-mp3ewr7f

Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data

1 min·1 sources·agent-threats-context-aware-spear-phishing-generative-ai-enable-mp3ewr7e

Generate "Normal", Edit Poisoned: Branding Injection via Hint Embedding in Image Editing

1 min·1 sources·agent-threats-generate-normal-edit-poisoned-branding-injecti-mp26zt5e

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

1 min·1 sources·agent-threats-knowledge-poisoning-attacks-on-medical-multi-modal-mp26zt5c

Threat Modelling using Domain-Adapted Language Models: Empirical Evaluation and Insights

1 min·1 sources·agent-threats-threat-modelling-using-domain-adapted-language-mod-mp26zt5c

Agentic Fuzzing: Opportunities and Challenges

1 min·1 sources·agent-threats-agentic-fuzzing-opportunities-and-challenges-mp26zt5b

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

1 min·1 sources·agent-threats-autosoup-safety-oriented-unit-proof-generation-fo-mp26zt5a

MATRA: Modeling the Attack Surface of Agentic AI Systems -- OpenClaw Case Study

1 min·1 sources·agent-threats-matra-modeling-the-attack-surface-of-agentic-ai-s-mp26zt59

Re-Triggering Safeguards within LLMs for Jailbreak Detection

1 min·1 sources·agent-threats-re-triggering-safeguards-within-llms-for-jailbreak-mp26zt58

From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World

1 min·1 sources·agent-threats-from-controlled-to-the-wild-evaluation-of-pentest-mp26zt57

Why Do Aligned LLMs Remain Jailbreakable: Refusal-Escape Directions, Operator-Level Sources, and Safety-Utilit

1 min·1 sources·agent-threats-why-do-aligned-llms-remain-jailbreakable-refusal-mp1zh10e

Cross-Modal Backdoors in Multimodal Large Language Models

1 min·1 sources·agent-threats-cross-modal-backdoors-in-multimodal-large-language-mp0k15b5

Language Models Can Autonomously Hack and Self-Replicate

1 min·1 sources·agent-threats-language-models-can-autonomously-hack-and-self-rep-mp0k15b4

LangChain vulnerable to unsafe deserialization of attacker-controlled objects through overly broad `load()` al

1 min·1 sources·agent-threats-langchain-vulnerable-to-unsafe-deserialization-of-moxjsbf7

Open WebUI has Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection Overwrite

1 min·1 sources·agent-threats-open-webui-has-knowledge-base-destruction-and-rag-moxcaaje

Claude Code CVE-2026-39861: sandbox escape via symlink

1 min·1 sources·agent-threats-claude-code-cve-2026-39861-sandbox-escape-via-syml-mox0ijpx

Patch2Vuln: Agentic Reconstruction of Vulnerabilities from Linux Distribution Binary Patches

1 min·1 sources·agent-threats-patch2vuln-agentic-reconstruction-of-vulnerabilit-mowas0ya

Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models

1 min·1 sources·agent-threats-pop-quiz-attack-black-box-membership-inference-at-mowas0ya

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

1 min·1 sources·agent-threats-constraining-host-level-abuse-in-self-hosted-compu-mowas0y9

Root-Cause-Driven Automated Vulnerability Repair

1 min·1 sources·agent-threats-root-cause-driven-automated-vulnerability-repair-mouweu9g

Agentic Vulnerability Reasoning on Windows COM Binaries

1 min·1 sources·agent-threats-agentic-vulnerability-reasoning-on-windows-com-bin-mouweu9d

Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs

1 min·1 sources·agent-threats-misrouter-exploiting-routing-mechanisms-for-input-mouweu9e

AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use

1 min·1 sources·agent-threats-agenttrust-runtime-safety-evaluation-and-intercep-mouweu9c

rmcp Streamable HTTP server transport has a DNS rebinding vulnerability

1 min·1 sources·agent-threats-rmcp-streamable-http-server-transport-has-a-dns-re-moulowwz

Dependency-Aware Privacy for Multi-turn Agents

1 min·1 sources·agent-threats-dependency-aware-privacy-for-multi-turn-agents-motetvf8

Generating Proof-of-Vulnerability Tests to Help Enhance the Security of Complex Software

1 min·1 sources·agent-threats-generating-proof-of-vulnerability-tests-to-help-en-motetvf7

Tailored Prompts, Targeted Protection: Vulnerability-Specific LLM Analysis for Smart Contracts

1 min·1 sources·agent-threats-tailored-prompts-targeted-protection-vulnerabili-motetvf7

Exposing LLM Safety Gaps Through Mathematical Encoding: New Attacks and Systematic Analysis

1 min·1 sources·agent-threats-exposing-llm-safety-gaps-through-mathematical-enco-motetvf6

MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents

1 min·1 sources·agent-threats-mosaic-bench-measuring-compositional-vulnerabilit-motetvf5

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

1 min·1 sources·agent-threats-when-agents-handle-secrets-a-survey-of-confidenti-motetvf4

ciguard: discover_pipeline_files follows symlinks out of scan root

1 min·1 sources·agent-threats-ciguard-discover-pipeline-files-follows-symlinks-mot7clgy

APIOT: Autonomous Vulnerability Management Across Bare-Metal Industrial OT Networks

1 min·1 sources·agent-threats-apiot-autonomous-vulnerability-management-across-mos2lm4w

EvoPoC: Automated Exploit Synthesis for DeFi Smart Contracts via Hierarchical Knowledge Graphs

1 min·1 sources·agent-threats-evopoc-automated-exploit-synthesis-for-defi-smart-mos2lm4w

Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration

1 min·1 sources·agent-threats-trojan-hippo-weaponizing-agent-memory-for-data-ex-mos2lm4v

ContextualJailbreak: Evolutionary Red-Teaming via Simulated Conversational Priming

1 min·1 sources·agent-threats-contextualjailbreak-evolutionary-red-teaming-via-mos2lm4u

VisInject: Disruption != Injection -- A Dual-Dimension Evaluation of Universal Adversarial Attacks on Vision-L

1 min·1 sources·agent-threats-visinject-disruption-injection-a-dual-dimen-morze008

AgenticVM: Agentic AI for Adaptive Software Vulnerability Management

1 min·1 sources·agent-threats-agenticvm-agentic-ai-for-adaptive-software-vulner-morze007

Architectural Obsolescence of Unhardened Agentic-AI Runtimes

1 min·1 sources·agent-threats-architectural-obsolescence-of-unhardened-agentic-a-morze006

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

1 min·1 sources·agent-threats-latent-adversarial-detection-adaptive-probing-of-momcu6bk

Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives

1 min·1 sources·agent-threats-indirect-prompt-injection-in-the-wild-an-empirica-mom9mjun

Enhancing Linux Privilege Escalation Attack Capabilities of Local LLM Agents

1 min·1 sources·agent-threats-enhancing-linux-privilege-escalation-attack-capabi-mom9mjul

SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

1 min·1 sources·agent-threats-safetune-mitigating-data-poisoning-in-llm-fine-tu-mom9mjul

How Code Representation Shapes False-Positive Dynamics in Cross-Language LLM Vulnerability Detection

1 min·1 sources·agent-threats-how-code-representation-shapes-false-positive-dyna-mom9mjuk

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Ca

1 min·1 sources·agent-threats-security-attack-and-defense-strategies-for-autonom-mom9mjui

Prompt injection: the SQL injection of the AI era — real case

1 min·1 sources·lyrie-scheduled-20260430-1700-2ec36693

Claude Code refuses requests or charges extra if your commits mention "OpenClaw"

1 min·1 sources·agent-threats-claude-code-refuses-requests-or-charges-extra-if-y-moln48hf

47 CVEs this month targeting AI agent infrastructure

1 min·1 sources·lyrie-scheduled-20260430-0800-0a2c1461

The AI agent framework security gap — named with receipts

1 min·1 sources·lyrie-scheduled-20260430-0500-ae0a95eb

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts

1 min·1 sources·agent-threats-safereview-defending-llm-based-review-systems-aga-mokv95gl

OpenClaw: Webchat audio embedding could read local files without local-root containment

1 min·1 sources·agent-threats-openclaw-webchat-audio-embedding-could-read-local-mokllwo2

Anthropic's Champion Kit for engineers pushing Claude Code at their company

1 min·1 sources·agent-threats-anthropic-s-champion-kit-for-engineers-pushing-cla-mok8qz9r

From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native E

1 min·1 sources·agent-threats-from-crud-to-autonomous-agents-formal-validation-mojeqsbp

SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents

1 min·1 sources·agent-threats-snapguard-lightweight-prompt-injection-detection-mojeqsbo

The Protocol Is the Exploit: How MCP's Architectural Flaw Turned 150 Million AI Downloads Into an Attack Surface

9 min·0 sources·mcp-architectural-rce-ai-supply-chain-ox-security

Spore: Efficient and Training-Free Privacy Extraction Attack on LLMs via Inference-Time Hybrid Probing

1 min·1 sources·agent-threats-spore-efficient-and-training-free-privacy-extract-moilt1uf

GlassWorm escalates: 73 Open VSX sleeper extensions deploy malware to VS Code, Cursor, and every VSIX IDE

11 min·3 sources·glassworm-73-openvsx-2026-04-28

Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models

1 min·1 sources·agent-threats-layerwise-convergence-fingerprints-for-runtime-mis-moi0dg1o

MAS-SZZ: Multi-Agentic SZZ Algorithm for Vulnerability-Inducing Commit Identification

1 min·1 sources·agent-threats-mas-szz-multi-agentic-szz-algorithm-for-vulnerabi-moi0dg1o

System-aware contextual digital twin for ICS anomaly diagnosis

1 min·1 sources·agent-threats-system-aware-contextual-digital-twin-for-ics-anoma-moi0dg1n

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

1 min·1 sources·agent-threats-agentvisor-defending-llm-agents-against-prompt-in-moi0dg1m

Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents

1 min·1 sources·agent-threats-poster-clawdgo-endogenous-security-awareness-tra-moi0dg1l

TraceScope: Interactive URL Triage via Decoupled Checklist Adjudication

1 min·1 sources·agent-threats-tracescope-interactive-url-triage-via-decoupled-c-mohtjgqu

Automation-Exploit: Multi-Agent LLMs weaponized with digital-twin guardrails

4 min·1 sources·agent-threats-automation-exploit-a-multi-agent-llm-framework-fo-mohtjgqt

OpenClaw: MCP stdio server env could load dangerous startup variables from workspace config

1 min·1 sources·agent-threats-openclaw-mcp-stdio-server-env-could-load-dangerou-mohtjgqs

OpenClaw: Agent gateway config mutations could change protected operator settings

4 min·1 sources·agent-threats-openclaw-agent-gateway-config-mutations-could-cha-mohtjgqr

LiteLLM: Authenticated command execution via MCP stdio test endpoints

4 min·1 sources·agent-threats-litellm-authenticated-command-execution-via-mcp-s-mohtjgqq

11 ways agents get hijacked in 2026 — a defender's field guide

12 min·18 sources·agent-threats-field-guide-2026-04-27