Development · Industry Guide · 13 min read · Published June 24, 2026

One fake Sentry error · ~85% reported success · Claude Code, Cursor & Codex

Agentjacking: Your AI Coding Agent Can Be Hijacked

A new attack class researchers call agentjacking hides instructions inside a fake error report. Your AI coding agent reads it through an MCP connection and runs the attacker’s command — using your real credentials. No phishing, no server breach, no policy violation to trip an alarm. Here is how it works and the five guardrails that cut your exposure.

Digital Applied Team
Senior strategists · Published Jun 24, 2026
Sources: Tenet, CSA, Unit 42, OWASP

Reported success rate: ~85% (Tenet controlled test)
Exposed organizations: 2,388 (public DSNs found)
Confirmed executions: 100+ (incl. a Fortune 100)
CVE assigned: None (architectural class)

Agentjacking is a newly disclosed attack class that hijacks AI coding agents — Claude Code, Cursor, and Codex among them — by hiding malicious instructions inside data the agent already trusts. A developer asks the agent to investigate an error; the agent pulls the error details through a Model Context Protocol (MCP) connection; and the attacker’s payload, planted inside that error, runs as if it were the agent’s own idea.

The research comes from Tenet Threat Labs, which coined the term and published the disclosure in June 2026. Their controlled testing reports a roughly 85% exploitation success rate across the major agents. That figure is Tenet’s own measurement — corroborated by several security outlets reporting the same number, but not independently replicated — so we treat it as vendor-stated throughout. What is not in dispute is the mechanism, and the mechanism is the part worth understanding.

This guide explains what agentjacking is, why conventional security tooling misses it entirely, who is exposed, why the vendor at the center of the disclosure declined to fix it, and — most usefully — five concrete guardrails you can apply today, mapped to the specific stage of the attack each one disrupts. We end with a short pre-run self-check you can adopt before your next agent touches a repository.

Key takeaways
  1. The injection rides on data, not on a breach. A fake error event carries hidden instructions. The agent retrieves it through MCP and cannot reliably tell the difference between data to read and a command to run, so it runs it.
  2. Prompt-layer warnings did not stop it. Tenet reports that agents executed the payload even when their system prompts explicitly told them to disregard untrusted external data. A warning in the prompt is not a control.
  3. Every step is technically authorized. Tenet calls this the Authorized Intent Chain. Real credentials, legitimate tools, no malware dropped. EDR, WAF, IAM, VPN, and firewalls see nothing wrong because nothing is unauthorized.
  4. The exposure is broad and the vendor declined a fix. Tenet reports 2,388 organizations with publicly exposed Sentry DSNs and 100+ confirmed executions. Sentry called it technically not defensible and applied only a string filter for the proof-of-concept.
  5. Defense belongs at the agent, not the prompt. Audit MCP connections, rotate or proxy DSNs, sandbox execution, require human approval for shell commands, and deploy hardened configs. Map each control to a specific attack stage.

01 · The Attack
A fake bug report that the agent treats as instructions.

Start with the workflow every developer using an AI agent already has: something breaks, you ask the agent to look into it, and the agent queries your error-tracking service to read the details. AI coding agents including Claude Code, Cursor, and Codex do this through MCP, the protocol that lets an agent call out to external services for context. The agent reads the error, proposes a fix, and often offers to run it.

Agentjacking abuses that loop. According to Tenet, the attacker sends a crafted error event to your Sentry project and embeds malicious instructions inside the event’s message field — formatted as a fake ## Resolution markdown section so it mimics the structure of legitimate Sentry MCP output. When the agent later retrieves that event, it sees what looks like an authoritative suggested fix and acts on it. The payload in Tenet’s proof-of-concept directed agents to run a single npm command that then probed the local machine for AWS credentials, npm tokens, Docker credentials, SSH keys, and git credential helpers.

The entry point is a Sentry DSN — the Data Source Name. A DSN is by design a public, write-only credential: it is embedded in frontend JavaScript so a visitor’s browser can report errors. That makes it trivially discoverable to anyone who can view a page’s source. Tenet found exposed DSNs using only public Sentry APIs and GitHub code search — no authentication, no breach. The design that makes Sentry easy to wire up is the same design that makes it injectable.
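
To make the "public, write-only" point concrete, here is a minimal Python sketch that splits a DSN into its parts, assuming the commonly documented layout of scheme, public key, host, and project ID. The DSN below is made up for illustration, and `parse_dsn` is a hypothetical helper, not part of any Sentry SDK.

```python
from urllib.parse import urlsplit


def parse_dsn(dsn: str) -> dict:
    """Split a Sentry-style DSN into its parts (hypothetical helper)."""
    parts = urlsplit(dsn)
    if not parts.username or not parts.hostname:
        raise ValueError("not a DSN: missing public key or host")
    # The project ID is the last path segment.
    project_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return {
        "public_key": parts.username,
        "host": parts.hostname,
        "project_id": project_id,
        # Assumption: a DSN meant for browsers carries no secret component.
        "has_secret": parts.password is not None,
    }


# Made-up DSN for illustration only; it does not point at a real project.
EXAMPLE_DSN = "https://abc123publickey@o0.ingest.example.com/4500000"
info = parse_dsn(EXAMPLE_DSN)
print(info)
```

Nothing in the parsed result is secret, which is the point: anyone who holds the string knows where to send events and which project they will land in.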

Step 1 · Discovery: Find the DSN (public Sentry APIs + GitHub search)
A Sentry DSN is a public, write-only key embedded in frontend JavaScript. Anyone who can read a page's source can find it. Tenet located thousands without authentication. No breach required.

Step 2 · Injection: Plant the payload (fake ## Resolution markdown)
The attacker sends a crafted error event. Malicious instructions sit in the message field, formatted to mimic a legitimate Sentry MCP resolution template the agent already trusts. Looks authoritative.

Step 3 · Execution: Agent runs the command (npx … --diagnose)
When the developer asks the agent to investigate, it retrieves the event, reads the injected fix, and executes it, probing for AWS keys, npm tokens, SSH keys, and git credentials. Real credentials used.

Important framing
This is not a zero-day and no CVE has been assigned. Tenet characterizes agentjacking as an architectural attack class living at the intersection of three systems (telemetry ingestion, MCP, and the agent’s trust model), not a single point vulnerability you can patch in one place. It is also not Sentry-specific: per the research, any MCP-connected service that surfaces externally controlled content carries the same pattern, including issue trackers, support queues, code-review platforms, and log aggregators.

The core problem, in Tenet's words
“AI coding agents cannot tell the difference between the data they read and an instruction to act.” That single sentence from Tenet Threat Labs is the whole vulnerability: formatting that looks like a suggested fix is read as a suggested fix, and acted on.

02 · Why Detection Fails
The Authorized Intent Chain: every step is permitted.

The reason agentjacking is dangerous is not that it is clever at the injection point — prompt injection is well documented. It is that the entire attack chain is, technically, authorized. Tenet names this the Authorized Intent Chain: the developer asks the agent to fix errors, the agent queries Sentry via MCP, MCP returns data, and the agent runs the suggested fix. No single step is unauthorized. The prevailing security model is built to catch unauthorized behavior — and this attack contains none.

That single property is what makes the attack pass straight through the controls most teams rely on. Per the reporting, agentjacking bypasses EDR, WAF, IAM systems, VPN, Cloudflare, and firewall controls — not by defeating them, but by never doing anything they are designed to flag. The agent is using legitimate credentials to execute legitimate tools. No malware is dropped. No policy is violated. There is no anomalous login, no exfiltration signature, no unsigned binary.

This reframes the entire defense problem. If your mental model of security is “detect the bad thing happening,” you have nothing to detect. The interpretation worth sitting with: the same properties that make agents productive — broad tool access, standing credentials, the autonomy to act on what they read — are exactly the properties an attacker borrows. You cannot harden the agent by making it more obedient to the data it reads. You harden it by constraining what it is allowed to do when it acts.

The Authorized Intent Chain
As Tenet Threat Labs frames it: “Every action in the chain is authorized. Tenet calls this the Authorized Intent Chain: the prevailing security model is built to catch unauthorized behavior, and this attack contains none.”

The most uncomfortable finding sharpens the point: prompt-layer defenses did not work. Tenet reports that agents executed the attacker’s payload even when their system prompts explicitly instructed them to disregard untrusted external data. The popular advice to “just add a warning to your system prompt” is, on this evidence, not a control at all. The same reasoning applies across the agent ecosystem we cover in our look at AI coding agents including Claude Code, Cursor, and Codex: every one of them can be configured to read external context, and external context is where the instruction hides.

The defensive consequence
If detection has nothing to detect and prompt warnings do not hold, the durable controls move to where the agent acts — sandboxing, egress limits, and approval gates — rather than where it reads. This is the through-line of the five-guardrail matrix later in the guide. For the broader pattern across agentic systems, our analysis of AI agent security best practices covers the same principle at the architecture level.

03 · The Scale
Who is exposed, and how far the footprint reaches.

Tenet’s exposure numbers are self-reported and we present them as such. During a validation period that ended June 17, 2026, the team reports identifying 2,388 organizations with injectable, publicly exposed Sentry DSNs — 71 of which rank within the Tranco top-one-million global websites. More than 100 real-world organizations had AI agents actually execute the researchers’ controlled validation payload, spanning Fortune 500 enterprises, hosting providers, scientific computing firms, cloud security vendors, and startups across FinTech, EdTech, and HealthTech.

The single most striking case — again, vendor-stated, with the company deliberately unnamed — is a Fortune 100 technology company valued at approximately $250 billion whose AI coding agents on corporate Windows devices confirmed execution of the payload, with cloud infrastructure tokens and git tokens accessible. Tenet reports confirmed victim environments spanning macOS, Windows (including WSL Ubuntu), CI/CD pipelines, sandboxed agents, VPN-protected internal networks, and GCP/AWS cloud containers, across six continents and more than 30 countries.

Figure (bar chart): Agentjacking exposure footprint, as reported by Tenet. Source: Tenet Threat Labs disclosure, June 2026. Figures are vendor-stated, not independently replicated.
  • Exposed organizations (injectable DSNs): 2,388. Identified via public APIs + GitHub search.
  • Confirmed agent executions: 100+. Ran the controlled validation payload.
  • In Tranco top-1M websites: 71. Production web infrastructure, not just side projects.
  • Reported exploitation success: ~85%. Across tested agents in controlled testing.

Treat these bars as a magnitude indicator rather than an audited census. The bar widths are an illustrative visual scaling of the reported figures, not a shared denominator — 2,388 and 100+ measure different things (exposed versus confirmed-executed). The honest takeaway is directional: the exposed population is large, real executions occurred at well-resourced organizations, and the geographic spread means this is a global pattern rather than a regional or sector-specific one. Notably, this lands against a wider backdrop in which 1 in 8 enterprise breaches now involve agentic systems — agentjacking is one named instance of a category that is already showing up in breach statistics.

04 · The Vendor Response
Why Sentry called it not defensible, and what that means for you.

Tenet disclosed the issue to Sentry on June 3, 2026. Per the reporting, Sentry acknowledged it the same day but declined to fix it at the root cause. Sentry characterized the attack as technically not defensible at its platform level — meaning it would not restrict event ingestion to authenticated sources or sanitize event data before returning it through the MCP server. Its stated rationale was that model vendors run middleware defenses. The only remediation Sentry took was to activate a global content filter for the specific payload string in Tenet’s proof-of-concept.

Be precise about what that means, because it is easy to misread. Sentry did not patch the vulnerability. It blocked one known exploit string while leaving the architectural pathway — untrusted event data flowing through MCP into an agent that will act on it — entirely open. A different payload, a different phrasing, the same outcome. This is an unusual disclosure precisely because the vendor at the center of it took the position that the defense belongs somewhere else.

Here we will take a frank editorial stance: whatever the merits of Sentry’s argument that model vendors should run middleware defenses, the practical effect is that responsibility lands on you, the developer or platform team. You cannot wait for a patch that the platform vendor has said it will not ship. That is not a reason to stop using Sentry or MCP — both remain genuinely useful — but it is the reason the rest of this guide focuses on controls you own rather than fixes you are waiting on.

What was and was not fixed
Fixed: a global content filter blocking the specific proof-of-concept payload string. Not fixed: the underlying pathway that lets untrusted, unauthenticated event data reach an agent that will execute on it. The correct framing is that the exploit string is filtered and the attack class remains open. No CVE has been assigned.
The one-line summary
Tenet Threat Labs reduces the whole disclosure to four words: “Telemetry is now an RCE vector.” The data you collect to debug your system has become a path to remote code execution inside it.

05 · Attack Surface
The agents in scope, and the shared weakness.

Tenet reports that the affected tools include Claude Code, Cursor, OpenAI Codex, Warp terminal agents, and VS Code extensions — any agent that can be configured to query Sentry (or another external service) over MCP. The common thread is not a flaw unique to any one product; it is the shared design where an agent reads external context and is empowered to act on it. The same loop powers always-on Cursor automations and agentic coding agents, which raises the stakes: an always-on agent retrieving external data without a human in the loop removes the one moment a person might have paused to question a suspicious “fix.”

The defensive controls a given agent ships with matter, but Tenet’s broader point is that you should not assume any default is safe. Several confirmed victim environments were already sandboxed, and the attack still succeeded. Sandboxing helps, but only when it denies outbound network access by default, which most default configurations do not. The three capabilities that actually move the needle are deny-by-default network egress, an explicit content-trust boundary for tool output, and a human-approval gate before shell execution.
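
As a rough illustration of deny-by-default egress, the check below allows an outbound request only when its host is on an explicit allowlist. This is a sketch of the policy logic only: the host names and the `egress_allowed` helper are hypothetical, and real enforcement belongs in the sandbox or network layer rather than in code the agent itself could bypass.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only hosts the agent genuinely needs to reach.
ALLOWED_HOSTS = frozenset({"git.internal.example", "packages.internal.example"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for exact host matches; everything else is denied."""
    host = (urlsplit(url).hostname or "").lower()
    # Exact match on purpose: no wildcard or suffix matching, so a lookalike
    # such as "git.internal.example.attacker.example" is rejected.
    return host in allowed_hosts


for candidate in (
    "https://git.internal.example/team/repo.git",
    "https://attacker.example/collect",
    "not a url",
):
    print(candidate, "->", "allow" if egress_allowed(candidate) else "deny")
```

The design choice worth noting is the direction of the default: an unknown or unparseable destination is denied, so a payload that tries to send credentials to a new host fails even if the command itself ran.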

Confirmed in scope · Agent families affected: 5
Claude Code, Cursor, OpenAI Codex, Warp terminal agents, and VS Code extensions: every agent configurable to query an external service over MCP. The weakness is the pattern, not the product. Tenet-reported.

Prompt warnings · Reliable protection from prompt text: 0
Agents executed the payload even when system prompts told them to ignore untrusted external data. A warning in the prompt is not a control; treat it as documentation, not defense. Verify at the action layer.

Open-source defense · agent-jackstop drop-in configs: 1
Tenet open-sourced configs for Cursor and Claude Code: deny-by-default network egress, macOS/Linux/WSL2 sandboxing, an Auto-review classifier, and an Allowlist-plus-Sandbox mode. Available at github.com/tenet-security/agent-jackstop.

06 · Defense
Five guardrails, mapped to the stage each one disrupts.

Most coverage gives you a flat list of advice. The more useful frame is to map each control to the attack stage it interrupts — Discovery, Injection, Retrieval, Execution, or Exfiltration — and to note whether it works without waiting on the agent vendor or on Sentry. The matrix below does exactly that. The five guardrails are practical recommendations drawn from Tenet’s agent-jackstop documentation, the Cloud Security Alliance research note, and general security practice; none of them is a claim about any product’s default behavior.

Table: Agentjacking defense layer matrix. Five guardrails mapped to the attack stage each disrupts, whether each works without agent or Sentry changes, an independence score derived as the count of those two yes answers, plus implementation complexity and cost.

Guardrail | Stage disrupted | No agent change | No Sentry change | Independence | Complexity | Cost
Audit MCP connections | Retrieval | Yes | Yes | 2 / 2 | Low | Free
Rotate / proxy Sentry DSNs | Discovery · Injection | Yes | No | 1 / 2 | Medium | Free
Sandbox agent execution | Execution · Exfiltration | No | Yes | 1 / 2 | Medium | Free
Require approval for shell commands | Execution | No | Yes | 1 / 2 | Low | Free
Deploy agent-jackstop configs | Execution · Exfiltration | No | Yes | 1 / 2 | Medium | Free (open source)

Independence = count of “Yes” across the two no-change columns (0–2).

Read the independence column as a prioritization signal, not a ranking of effectiveness. Auditing your MCP connections scores 2 of 2 — it needs no change from either the agent vendor or Sentry, so you can do it immediately. Rotating or proxying DSNs scores 1 because it touches your Sentry configuration; sandboxing, approval gates, and agent-jackstop each score 1 because they change how the agent runs. The controls that score lower are not weaker — sandboxing and approval gates are the ones that actually stop execution — they simply require you to change the agent’s configuration, which is well within your control.
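
The independence score is simple arithmetic over the two no-change columns. A short Python sketch, with the rows transcribed from the matrix above and a hypothetical `independence` helper, makes the derivation explicit:

```python
# Rows transcribed from the matrix:
# (guardrail, works without agent change, works without Sentry change)
GUARDRAILS = [
    ("Audit MCP connections", True, True),
    ("Rotate / proxy Sentry DSNs", True, False),
    ("Sandbox agent execution", False, True),
    ("Require approval for shell commands", False, True),
    ("Deploy agent-jackstop configs", False, True),
]


def independence(no_agent_change: bool, no_sentry_change: bool) -> int:
    """Count the 'Yes' answers across the two no-change columns (0 to 2)."""
    return int(no_agent_change) + int(no_sentry_change)


scores = {name: independence(agent, sentry) for name, agent, sentry in GUARDRAILS}
for name, score in scores.items():
    print(f"{name}: {score} / 2")
```

The score says how soon you can act, not how much the control stops, which is why the execution-stage guardrails still matter despite scoring 1.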

Pre-agent runtime · Audit MCP & rotate DSNs
Inventory every MCP connection and ask which surface externally-controlled content. Rotate or proxy exposed Sentry DSNs so a leaked public key cannot be used to inject events. This shrinks the attack surface before the agent ever reads anything. Do this first.

Agent runtime · Sandbox & gate execution
Run agents in a sandbox with deny-by-default network egress, require explicit human approval before any shell command, and deploy hardened configs such as agent-jackstop. This is where execution and exfiltration are actually stopped. The control that holds.

Post-execution · Rotate secrets & keep audit logs
Assume any credential reachable by a hijacked agent is exposed: rotate AWS keys, git tokens, and npm tokens, and retain agent action logs so you can reconstruct what a compromised run touched. Recovery, not prevention, but essential when prevention fails. Plan for failure.

Beyond Sentry · Treat all external context as untrusted
Apply the same posture to issue trackers, support queues, code-review platforms, and log aggregators: every MCP source that surfaces content a stranger can influence. Sentry is the disclosed instance, not the boundary of the risk. Generalize the rule.
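
For the post-execution layer, an append-only log of agent actions can be as simple as one JSON object per line. The sketch below is an assumption about shape, not a prescribed format: the field names, the `log_agent_action` and `read_actions` helpers, and the example tool labels are all hypothetical.

```python
import json
import tempfile
import time
from pathlib import Path


def log_agent_action(log_path: Path, tool: str, detail: str, approved: bool) -> dict:
    """Append one action record as a JSON line and return it."""
    record = {"ts": time.time(), "tool": tool, "detail": detail, "approved": approved}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_actions(log_path: Path) -> list:
    """Load every record back, in the order it was written."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo in a throwaway directory; a real log would live somewhere the agent
# cannot rewrite.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "agent-actions.jsonl"
    log_agent_action(path, "mcp:error-tracker", "fetched issue details", approved=True)
    log_agent_action(path, "shell", "npm test", approved=True)
    actions = read_actions(path)

print([a["tool"] for a in actions])
```

After a suspected compromise, filtering these records by tool and time window is what lets you scope which secrets the run could have reached.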

Tenet open-sourced agent-jackstop — drop-in configurations for Cursor and Claude Code that harden agents against prompt injection through untrusted tool output. Per its documentation, it provides deny-by-default network egress, macOS/Linux/WSL2 sandboxing, an Auto-review classifier, and an Allowlist mode paired with a sandbox. It is available on GitHub under tenet-security/agent-jackstop. It is one option among several; the principle it embodies — constrain the agent at the point of action — is the durable part, whichever tool you use to enforce it.
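
Independently of any particular tool, the "constrain the agent at the point of action" principle can be sketched as a gate that refuses to run a shell command unless an approval callback says yes. This is an illustrative sketch, not how agent-jackstop or any agent implements it; `run_with_approval` and `ask_on_terminal` are hypothetical names.

```python
import shlex
import subprocess
from typing import Callable, Optional


def run_with_approval(
    command: str, approve: Callable[[str], bool]
) -> Optional[subprocess.CompletedProcess]:
    """Run `command` only if `approve(command)` returns True; otherwise do nothing."""
    if not approve(command):
        return None
    # No shell=True: the command is split into argv, so shell metacharacters
    # in the string are not interpreted.
    return subprocess.run(
        shlex.split(command), capture_output=True, text=True, check=False
    )


def ask_on_terminal(command: str) -> bool:
    """Interactive approver: anything other than an explicit 'y' is a denial."""
    answer = input(f"Agent wants to run: {command!r}. Allow? [y/N] ")
    return answer.strip().lower() == "y"


# Non-interactive demo: a policy that denies everything.
result = run_with_approval("echo hello", lambda cmd: False)
print("denied" if result is None else result.stdout)
```

In interactive use you would pass `ask_on_terminal` as the approver. The default answer is "no", so an unattended or distracted session fails closed rather than open.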

Looking forward, expect the defensive center of gravity to keep shifting toward runtime reasoning monitoring rather than input or output filtering. The broader research direction points that way: controls that inspect what an agent is about to do, at the moment it decides to do it, are more durable than trying to scrub every possible malicious input. If your team is standing up agents in production and wants this baked in from the start, our AI & digital transformation engagements and web development practice build the sandboxing, egress controls, and approval gates into the agent architecture rather than bolting them on after an incident.

Where the defense has to live
Tenet Threat Labs puts the conclusion plainly: “The only place left to stop this is the moment the agent decides to act.” Every earlier layer — detection, prompt warnings, network perimeter — has already been shown to miss it.

07 · Self-Check
Run this before your next agent touches a repo.

None of these steps requires waiting on a vendor, and all five are free. Treat them as the minimum bar for any team running AI coding agents against repositories that hold real credentials.

Check 1 · Inventory MCP connections: Which surface external content?
List every MCP server your agents can reach. Flag any that returns content a third party can influence: error trackers, issue boards, support tickets, log streams. Those are the injection surfaces. Independence 2/2.

Check 2 · Rotate or proxy exposed DSNs: Public keys are injectable
Treat any DSN in your frontend as discoverable. Rotate it, and where possible route error ingestion through a proxy that drops or sanitizes events from unauthenticated sources. Shrinks the surface.

Check 3 · Sandbox with deny-by-default egress: Block outbound by default
Run agents in a sandbox that denies outbound network access unless explicitly allowed. Sandboxing alone was not enough in confirmed cases; the egress rule is the part that matters. Stops exfiltration.

Check 4 · Require approval for shell commands: Human in the loop at execution
Gate any shell command behind explicit human approval. This is the moment a person can question a suspicious 'fix', the one chokepoint prompt warnings could not provide. Stops execution.

Check 5 · Deploy hardened configs & log everything: agent-jackstop + audit trail
Apply a hardened config such as agent-jackstop, and retain agent action logs. If a run is ever compromised, the logs are how you scope exactly which secrets to rotate. Defense + recovery.
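
Check 1 lends itself to a small script. The sketch below assumes an MCP client config stored as JSON with a top-level `mcpServers` object, which is a common layout but should be verified against your own agent's config file. The keyword list and the `flag_mcp_servers` helper are hypothetical, and a keyword match is only a prompt for human review, not a verdict.

```python
import json

# Hypothetical hints for services that return third-party-influenced content.
EXTERNAL_CONTENT_HINTS = ("sentry", "issue", "ticket", "support", "log")


def flag_mcp_servers(config_text: str) -> dict:
    """Map each configured MCP server name to True if it looks like an
    external-content source worth reviewing."""
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})  # assumed key; check your client
    flagged = {}
    for name, spec in servers.items():
        haystack = (name + " " + json.dumps(spec)).lower()
        flagged[name] = any(hint in haystack for hint in EXTERNAL_CONTENT_HINTS)
    return flagged


# Made-up config for illustration; server names and URLs are placeholders.
EXAMPLE_CONFIG = """
{
  "mcpServers": {
    "sentry": {"url": "https://mcp.example.com/sentry"},
    "local-files": {"command": "files-server", "args": ["--root", "."]}
  }
}
"""
report = flag_mcp_servers(EXAMPLE_CONFIG)
for name, needs_review in report.items():
    print(f"{name}: {'REVIEW' if needs_review else 'ok'}")
```

Anything marked for review is a candidate injection surface, so those are the connections to pair with egress limits and approval gates first.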
A note on tone
This is a serious attack class, but it is not a reason to abandon AI agents or MCP. Both deliver real value. The right response is the same one good engineering always asks for: understand the trust boundaries, constrain what runs with your credentials, and keep a human at the point of action. The teams that do this keep the productivity and close the exposure.

08 · Conclusion
The defense moves to where the agent acts.

The shape of agentic risk, June 2026

When every step is authorized, you cannot detect your way out — you have to constrain the action.

Agentjacking is a clean illustration of a category that is only going to grow: attacks that weaponize an agent’s own legitimate behavior. There is no malware to catch, no unauthorized login to flag, no policy violation to alert on. A fake error report carries an instruction, the agent reads it as guidance, and it runs the command with your credentials. The reported numbers — roughly 85% success, 2,388 exposed organizations, 100-plus confirmed executions — are Tenet’s own, and we have treated them as such throughout. The mechanism, which is what matters, is not in dispute.

The single most important lesson is that prompt-layer defenses do not hold. Agents executed the payload even when told to ignore untrusted data. That finding alone retires the most common piece of advice and forces the real work to where it belongs: auditing MCP connections, rotating exposed DSNs, sandboxing with deny-by-default egress, requiring human approval before execution, and treating every external context source as untrusted — not just Sentry.

Sentry declined to fix the root cause, which means the responsibility is already yours. That is not cause for alarm so much as a prompt to act: the five guardrails in this guide are free, most can be applied today, and together they cut your real exposure without giving up the productivity that made you adopt agents in the first place. Build the constraints into the agent architecture now, and an incident becomes a contained event instead of an open-ended one.

FAQ · Agentjacking guide

The questions teams ask about agent security.

What is agentjacking?

Agentjacking is an attack class, named by Tenet Threat Labs in June 2026, that hijacks AI coding agents such as Claude Code, Cursor, and Codex by hiding malicious instructions inside data the agent already trusts — most notably error reports it retrieves through a Model Context Protocol (MCP) connection. A developer asks the agent to investigate an error; the agent pulls the error details, which contain an injected fake resolution; and the agent runs the attacker's command using the developer's own credentials. There is no phishing, no server breach, and no malware. The key insight Tenet stresses is that an AI coding agent cannot reliably tell the difference between the data it reads and an instruction to act on, so cleverly formatted data becomes a command.