A clean-looking GitHub repository can become a malware delivery path when an AI coding assistant is trusted to set it up.
Mozilla’s 0DIN team demonstrated that risk in a June 25, 2026, proof-of-concept involving Claude Code, Anthropic’s agentic coding tool. The demo showed how routine setup steps could lead an AI coding agent to fetch and run a DNS-hosted payload, opening a reverse shell on a developer’s machine.
AI coding assistants can read files, install dependencies, run shell commands, and access local developer environments. With that level of access, a successful prompt injection can expose credentials and compromise the endpoint itself, and vendors and researchers are still defining security controls for AI agents.
How the clean repo attack worked
In the 0DIN proof-of-concept, the repository did not contain an obvious malicious payload. It used a normal-looking Python setup flow that Claude Code read as project context.
The assistant installed requirements, encountered a routine initialization error, and ran the suggested fix. That command triggered a setup script that queried a DNS TXT record, decoded the returned value, and executed it as a shell command.
The result was a reverse shell back to an attacker-controlled system. Because the payload was fetched at runtime, the final command was not visible in the repository before execution, leaving static scanners, code reviewers, and dependency audits with little to flag.
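Although the payload itself never appears in the repository, a script that fetches a value at runtime and executes it generally has to contain both a lookup and an execution call. The sketch below is a hypothetical heuristic, not part of the 0DIN research. It assumes the setup script is written in Python and flags source that imports a network-capable module and also calls an execution function. The module list, the call list, and the name fetch_and_exec_signals are illustrative assumptions, and obfuscated or indirect code would evade a check this simple.

```python
import ast

# Illustrative lists only (assumptions); a real policy would be broader.
NETWORK_MODULES = {"socket", "dns", "urllib", "requests", "http"}
EXEC_CALLS = {"system", "popen", "Popen", "run", "call",
              "check_call", "check_output", "exec", "eval"}


def fetch_and_exec_signals(source: str) -> dict:
    """Report whether Python source both imports a network-capable
    module and calls something that can execute code or commands."""
    tree = ast.parse(source)
    imported = set()
    calls = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import dns.resolver" is recorded as top-level "dns"
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                calls.add(func.id)      # e.g. exec(...)
            elif isinstance(func, ast.Attribute):
                calls.add(func.attr)    # e.g. subprocess.run(...)
    network = sorted(imported & NETWORK_MODULES)
    execution = sorted(calls & EXEC_CALLS)
    return {
        "network": network,
        "execution": execution,
        "suspicious": bool(network and execution),
    }
```

A match is a prompt for human review rather than proof of malice. Names such as run are common in benign code, so false positives are expected.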
Once the shell opened, the attacker had access under the developer’s own user account. That could expose source code, browser sessions, API keys, GitHub tokens, AWS credentials, SSH material, and other local secrets.
The agent did not need to recognize the action as malicious. It was trying to complete a developer’s request, troubleshoot an error, and run a documented initialization step.
Separate research shows the problem is broader than one demo, even as vendors race to make AI coding tools safer. A study of agentic coding editors analyzed 314 prompt-injection payloads covering 70 MITRE ATT&CK techniques and found success rates as high as 84% for malicious command execution against GitHub Copilot and Cursor.
Why agentic coding tools need new guardrails
Any coding agent that ingests untrusted project content while operating with developer-level permissions can create a similar exposure path.
High-risk setups combine several of the following: broad filesystem access, terminal access, network access, persistent memory, external package ingestion, and access to credential stores. Those capabilities move the security boundary from code review alone to runtime controls around files, commands, network access, and secrets.
Security teams should review whether agents inherit full developer credentials or run under constrained service accounts. They should also define how agents handle setup instructions, package errors, README files, Model Context Protocol (MCP) tools, and other external content. That policy matters more as AI systems are also used to find and prioritize exploitable software flaws.
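One narrow piece of that review, whether an agent's subprocesses inherit the developer's environment variables, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the name scrubbed_env and the pattern of secret-looking variable names are hypothetical, and the approach covers only environment variables, not credential files on disk such as SSH keys or cloud credential files.

```python
import os
import re

# Hypothetical pattern for secret-looking variable names (an assumption).
SECRET_PATTERN = re.compile(
    r"(TOKEN|SECRET|PASSWORD|API_KEY|CREDENTIAL)|^(AWS_|GITHUB_|GH_|SSH_)",
    re.IGNORECASE,
)


def scrubbed_env(env=None):
    """Return a copy of the environment without secret-looking variables.

    Uses os.environ when no mapping is passed in.
    """
    source = dict(os.environ if env is None else env)
    return {
        name: value
        for name, value in source.items()
        if not SECRET_PATTERN.search(name)
    }
```

A wrapper that launches agent-requested commands could pass the result through the env parameter of subprocess.run, for example subprocess.run(cmd, env=scrubbed_env()). A deny pattern like this one is easy to outgrow. An explicit allowlist of variables the agent needs would be stricter.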
A January 2026 systematic review of 78 studies found that adaptive prompt-injection attacks can bypass many published defenses when attackers tune payloads to evade them. That makes filtering and warning banners insufficient as standalone controls.
Enterprises should prioritize least-privilege configurations, sandboxed execution, approval gates for file writes and shell commands, network egress controls, secret isolation, and audit logs for agent-executed actions. Anthropic’s sandboxing guidance also points to filesystem and network isolation as controls for limiting damage from a compromised coding agent.
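An approval gate for shell commands, paired with an audit record, might look like the following sketch. It is a simplified illustration, not Anthropic's or any vendor's implementation. The allowlist, the set of shell metacharacters, and the function names review_command and audit_record are all assumptions made for the example.

```python
import json
import shlex
import time

# Hypothetical allowlist of programs that run without a prompt (assumption).
AUTO_APPROVED = {"ls", "pwd"}
# Characters that can chain, redirect, or substitute commands.
SHELL_METACHARS = set(";|&`$><\n")


def review_command(command: str) -> str:
    """Return "allow" for allowlisted simple commands, otherwise "ask"."""
    if any(ch in SHELL_METACHARS for ch in command):
        return "ask"  # chained or redirected commands always need a human
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "ask"  # unbalanced quotes or similar parsing trouble
    if not tokens or tokens[0] not in AUTO_APPROVED:
        return "ask"
    return "allow"


def audit_record(command: str, decision: str) -> str:
    """Serialize one agent-requested command and its decision as JSON."""
    return json.dumps(
        {"ts": time.time(), "command": command, "decision": decision}
    )
```

Under this sketch, a suggested fix such as an initialization script would fall outside the allowlist and be routed to a person before running. The gate defaults to asking, so anything it cannot parse or does not recognize is not executed automatically.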
Treat every unfamiliar repository, setup script, package file, and README that an AI agent reads as untrusted input. As coding agents move deeper into enterprise development workflows, security controls have to extend to the runtime environment where agents execute commands.


