In an era where Artificial Intelligence is increasingly integrated into critical development workflows, a stark reminder of its inherent vulnerabilities has emerged. A security researcher has uncovered a severe flaw in Anthropic’s Claude Code GitHub Action, demonstrating how a seemingly innocuous GitHub issue could be weaponized to seize control of public repositories. This discovery not only exposed vulnerable projects but also highlighted a potential pathway to compromise the AI action itself, creating a dangerous supply chain threat.
The Alarming Flaw: A Bot’s Betrayal
The vulnerability, meticulously detailed by RyotaK of GMO Flatt Security, centered on a critical bypass within the Claude Code GitHub Action. Designed to streamline CI/CD pipelines by triaging issues, applying labels, reviewing pull requests, and executing slash commands, the action typically operates with extensive read and write permissions across a repository’s code, issues, pull requests, discussions, and workflow files. Given these broad capabilities, a stringent permission check was in place, theoretically allowing only users with explicit write access to trigger the action.
However, this safeguard contained a significant loophole. The system was configured to permit any actor whose name ended in “[bot]” to bypass the trigger check, operating under the assumption that GitHub Apps are inherently trusted entities installed by administrators. This assumption proved fatal. As RyotaK demonstrated, anyone can register a GitHub App, install it on their own repository, and then leverage its token to open an issue or pull request on any public repository. The Claude Code Action, misinterpreting the bot’s origin as legitimate, would then process the attacker’s malicious content.
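The loophole can be illustrated with a minimal sketch. This is not the action's actual code: the `Actor` type, both function names, and the allowlist approach are hypothetical, chosen only to contrast a name-suffix check with an explicit allowlist.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Actor:
    """Hypothetical stand-in for whoever triggered the workflow."""
    login: str              # e.g. "alice" or "some-app[bot]"
    has_write_access: bool  # write permission on the target repository


def flawed_may_trigger(actor: Actor) -> bool:
    """The kind of check described above: a '[bot]' suffix skips the permission test."""
    if actor.login.endswith("[bot]"):
        return True
    return actor.has_write_access


def stricter_may_trigger(actor: Actor, allowed_bots: FrozenSet[str] = frozenset()) -> bool:
    """Assumed alternative: bots pass only if the repository owner allowlisted them."""
    if actor.login.endswith("[bot]"):
        return actor.login in allowed_bots
    return actor.has_write_access


# An app registered by an outsider: bot-style name, no write access.
attacker_app = Actor(login="attacker-app[bot]", has_write_access=False)
print(flawed_may_trigger(attacker_app))    # True
print(stricter_may_trigger(attacker_app))  # False
```

The point of the contrast is that a login suffix says what kind of account is acting, not who installed or controls it.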
The Prompt Injection Mechanism: Unmasking Secrets
Once the bot bypass was achieved, the attacker’s next move involved a sophisticated technique known as indirect prompt injection. This method involves subtly embedding malicious instructions within content that an AI model is designed to read, compelling the model to execute the attacker’s directives rather than its intended task. RyotaK crafted a GitHub issue whose body masqueraded as an error message. Through careful refinement, he engineered the prompt to trick Claude into “recovering” by executing hidden commands.
The primary target for this exfiltration was /proc/self/environ, a Linux file containing a process’s environment variables, including sensitive secrets. Despite Claude Code’s built-in safeguards against direct reads, RyotaK successfully bypassed these defenses, coercing Claude to write the extracted variable values back into the GitHub issue, where they could be easily retrieved by the attacker.
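For readers unfamiliar with the file, the sketch below shows its layout: on Linux, /proc/<pid>/environ holds KEY=VALUE records separated by NUL bytes. The function name and the sample data are made up for illustration, and the parser runs on a synthetic byte string rather than a real process.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE records used by /proc/<pid>/environ."""
    env: Dict[str, str] = {}
    for record in raw.split(b"\0"):
        if not record:
            continue  # skip the empty piece left by the trailing NUL
        key, sep, value = record.partition(b"=")
        if sep:  # ignore records without an '=' separator
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


# Synthetic sample; the names and values here are invented.
sample = b"HOME=/home/runner\0EXAMPLE_SECRET=s3cr3t\0"
print(sorted(parse_environ(sample)))  # ['EXAMPLE_SECRET', 'HOME']
```

Because every secret handed to a job through its environment ends up in this one file, a single successful read exposes all of them at once.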
The ultimate prize in those variables was the credential pair used by GitHub Actions to request an OIDC token – a signed token verifying the workflow’s identity within its repository. Claude Code exchanges this OIDC token with Anthropic’s backend for a Claude GitHub App installation token, which grants write access. By stealing these credentials and replaying the exchange, an attacker could gain full write access to the target repository’s code, issues, and workflows. Critically, if aimed at the claude-code-action repository itself, this could lead to the poisoning of the action, impacting all downstream projects that rely on it.
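A defensive sketch of how a job might check whether it carries that credential pair. The two variable names are the ones GitHub documents for requesting an OIDC token inside a job; that they are exactly the pair referred to above is an assumption, and the function name is hypothetical.

```python
from typing import Mapping

# Names GitHub documents for in-job OIDC token requests. Treating them as the
# exact pair targeted in the research is an assumption of this sketch.
OIDC_REQUEST_VARS = ("ACTIONS_ID_TOKEN_REQUEST_URL", "ACTIONS_ID_TOKEN_REQUEST_TOKEN")


def oidc_request_exposed(env: Mapping[str, str]) -> bool:
    """Return True when both halves of the OIDC request pair are present and non-empty."""
    return all(env.get(name) for name in OIDC_REQUEST_VARS)


# Invented environments for illustration.
with_pair = {
    "ACTIONS_ID_TOKEN_REQUEST_URL": "https://example.invalid/token",
    "ACTIONS_ID_TOKEN_REQUEST_TOKEN": "placeholder",
}
print(oidc_request_exposed(with_pair))           # True
print(oidc_request_exposed({"HOME": "/home"}))   # False
```

In real use the mapping would be `os.environ`. According to GitHub's OIDC documentation, these variables are set only for jobs granted the `id-token: write` permission, so a job that processes untrusted input without that permission should not see them.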
Real-World Implications and Other Vulnerabilities
RyotaK also identified additional, less direct attack vectors. One notable example involved Anthropic’s own issue-triage workflow, which shipped with allowed_non_write_users: "*". This setting, explicitly flagged as risky in Anthropic’s documentation, allowed anyone to trigger the workflow. Compounding the risk, Claude was found to be posting task summaries to the publicly visible workflow run’s summary panel, creating an immediate data leakage channel. Many repositories that copied this example workflow unwittingly inherited this vulnerability.
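Repositories that copied the example can be checked with a simple text scan. The sketch below is a rough heuristic, not a YAML parser: the function name is hypothetical, and the sample fragment is an invented, trimmed-down workflow step.

```python
import re

# Matches a line such as:  allowed_non_write_users: "*"
# Quotes may be single, double, or absent; a trailing comment is tolerated.
WILDCARD = re.compile(
    r"""^\s*allowed_non_write_users\s*:\s*["']?\*["']?\s*(#.*)?$""",
    re.MULTILINE,
)


def allows_any_user(workflow_text: str) -> bool:
    """Return True if the workflow text opens the trigger to every user."""
    return WILDCARD.search(workflow_text) is not None


# Invented fragment of a workflow step, for illustration only.
sample = '''
        with:
          allowed_non_write_users: "*"
'''
print(allows_any_user(sample))  # True
print(allows_any_user('allowed_non_write_users: "alice,bob"'))  # False
```

A scan like this could be run over every file in `.github/workflows/`. A hit is a prompt for manual review, not proof of exposure.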
Another path involved an attacker with issue-editing privileges but no direct trigger access. By editing a trusted user’s issue after the workflow had been initiated but before Claude processed it, the malicious payload could be injected as “trusted” input.
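This is a time-of-check to time-of-use gap: the trigger is authorized at one moment and the content is read at a later one. One way to close such a gap, sketched below, is to refuse content whose last edit postdates the trigger. The function and parameter names are hypothetical, and this is not necessarily the fix Anthropic shipped.

```python
from datetime import datetime, timezone
from typing import Optional


def edited_after_trigger(triggered_at: datetime, last_edited_at: Optional[datetime]) -> bool:
    """Return True when the content changed after the workflow was triggered."""
    return last_edited_at is not None and last_edited_at > triggered_at


# Invented timestamps for illustration.
trigger = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
late_edit = datetime(2025, 1, 1, 12, 0, 30, tzinfo=timezone.utc)

print(edited_after_trigger(trigger, late_edit))  # True
print(edited_after_trigger(trigger, None))       # False
```

A content hash captured at trigger time would serve the same purpose and does not depend on timestamps being accurate.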
These vulnerabilities are far from theoretical. A similar setup – an AI issue-triager combined with broad permissions and prompt injection – already led to a real-world supply chain incident in February. An attacker used a prompt-injected issue title against Cline’s claude-code-action triage workflow to steal an npm publish token, resulting in the unauthorized release of cline@2.3.0. While the rogue version only installed a non-malicious AI agent and was quickly removed, the incident underscored the potential for widespread malware distribution.
The autonomous “HackerBot-Claw” further demonstrated this threat by actively probing GitHub Actions misconfigurations across major organizations like Microsoft, Datadog, and CNCF projects. Although Claude successfully thwarted an attempt to prompt-inject a reviewer via a poisoned config file, the continuous probing highlights the persistent danger.
Mitigating the Risk: Essential Steps for Developers
Anthropic acted swiftly, fixing the core bypass within four days of RyotaK’s report in January, with further hardening implemented through the spring. The critical fixes are available in claude-code-action v1.0.94 and later versions. Anthropic rated the issues 7.8 under CVSS v4.0 and awarded a bug bounty for the responsible disclosure.
To protect against these threats, developers must:
- Update Immediately: Ensure all instances of claude-code-action are updated to version 1.0.94 or newer.
- Audit Workflows: Scrutinize any workflow that permits users without write access, or bots, to trigger Claude.
- Restrict Secrets: If a workflow processes untrusted input, limit the secrets it can access to only the Anthropic API key and GITHUB_TOKEN.
- Remove Exfiltration Tools: Eliminate any tools and permissions that could be exploited for data exfiltration.
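The first step can be partly automated. The sketch below checks whether a pinned reference is at or above 1.0.94, the fixed version named above. The function name is hypothetical, and the check only understands full `major.minor.patch` tags; floating tags, branches, and commit SHAs are reported as unknown rather than guessed.

```python
import re
from typing import Optional

MIN_FIXED = (1, 0, 94)  # first fixed release, per the advisory details above
_FULL_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def is_patched(ref: str) -> Optional[bool]:
    """Return True/False for full version tags, or None if the ref cannot be judged."""
    match = _FULL_VERSION.match(ref.strip())
    if match is None:
        return None  # floating tag, branch, or commit SHA
    version = tuple(int(part) for part in match.groups())
    return version >= MIN_FIXED


print(is_patched("v1.0.94"))  # True
print(is_patched("1.0.93"))   # False
print(is_patched("v1"))       # None
```

A `None` result means the reference has to be resolved by hand, for example by checking which release a floating tag or SHA currently points to.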
The Unsolved Challenge of Prompt Injection
RyotaK’s extensive research, which has reportedly uncovered around 50 distinct ways to bypass Claude Code’s permission system and execute commands, underscores a broader, persistent challenge in AI security: prompt injection. This vulnerability remains largely unsolved, and as AI agents gain more sophisticated tools and access to real tokens, their potential for misuse scales directly with the permissions they are granted. The incident serves as a critical warning: integrating powerful AI into automated workflows demands rigorous security scrutiny and a proactive approach to emerging threats.