Agent washing — Gartner’s term for rebranding chatbots, RPA, and AI assistants as “agentic” without substantial agentic capability — has become the defining diligence problem in enterprise software buying. Gartner estimated in June 2025 that of the thousands of vendors claiming agentic AI, only about 130 are the real thing. Statistically, almost all of the remaining vendor decks on your desk are selling a relabel.

The stakes are not abstract. The same Gartner release predicts that more than 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls — and a mislabeled product is a head start on all three. Meanwhile the term itself has jumped from analyst vocabulary into vendor marketing, legal risk memos, and a second Gartner warning in May 2026.

This guide does three things. It pins the definition to its primary sources with the dates most coverage gets wrong. It grounds the pattern in two regulator-confirmed enforcement cases that predate the buzzword. And it gives you the Digital Applied Agent-Washing Test: six axes, scored 0-5, a 30-point maximum, and bands that tell you whether you’re buying an agent or a rebrand.

Key takeaways

01
Agent washing has a precise, citable definition. Gartner’s June 25, 2025 release defines it as rebranding existing products — AI assistants, RPA, chatbots — without substantial agentic capabilities, and estimates only ~130 of thousands of self-described agentic vendors are real.
02
Market conditions reward the label. Gartner’s 2026 CIO survey finds only 17% of organizations have deployed AI agents while more than 60% expect to within two years — buying intent is running far ahead of hands-on experience, which is exactly the gap a relabel exploits.
03
The pattern is already regulator territory. SEC charges against Presto Automation and the FTC’s DoNotPay order both punished claims of autonomous capability that reality didn’t match. Neither regulator used the term agent washing — but both cases document the exact claims-vs-reality gap Gartner describes.
04
Gartner expects over 40% of agentic projects to be canceled by end-2027. The cited causes — escalating costs, unclear business value, inadequate risk controls — all get worse when the underlying product was never agentic to begin with. The same firm still forecasts 15% of daily work decisions running through agentic AI by 2028.
05
Our 30-point test turns the question into a number. Six axes — autonomy, planning, tool use, memory, feedback loop, human-in-loop boundaries — scored 0-5 against anchored rubrics. 0-10 reads as a rebranded chatbot or RPA, 11-20 a partial or assisted agent, 21-30 genuinely agentic.

01 — The Definition

What agent washing actually means.

The term comes from Gartner, introduced in mid-2025. In its June 25, 2025 press release — the one predicting the 40%-plus cancellation rate — the firm named the practice and sized it. Vendors were, in Gartner’s words, engaging in “agent washing”: the rebranding of existing products, such as AI assistants, robotic process automation (RPA) and chatbots, without substantial agentic capabilities. The deliberate echo of greenwashing is the point — it’s a claims problem, not a technology problem.

What made the release land was the number attached to it. Gartner estimated that of the thousands of vendors marketing agentic AI products, only about 130 were genuinely agentic. That is not a rounding-error minority — it implies the overwhelming majority of "agentic" pitches a buyer receives are repackaged automation wearing this year’s label.

The primary source

Gartner, June 25, 2025: agent washing is “the rebranding of existing products, such as AI assistants, robotic process automation (RPA) and chatbots, without substantial agentic capabilities” — and only about 130 of the thousands of agentic AI vendors are estimated to be real. The same release predicts over 40% of agentic AI projects will be canceled by end-2027 “due to escalating costs, unclear business value or inadequate risk controls.”

Gartner analyst Anushree Verma’s framing in that release is worth internalizing rather than quoting back at vendors: most agentic projects today are early-stage experiments driven by hype and often misapplied, which blinds organizations to the real cost and complexity of deploying agents at scale. Her sharper observation cuts both ways for buyers — many use cases positioned as agentic today don’t require agentic implementations at all. Sometimes the washed product isn’t just mislabeled; it’s answering a question nobody in your workflow asked.

02 — Get the Dates Right

Two Gartner warnings — eleven months apart.

Here is the detail nearly all secondary coverage gets wrong. Gartner has issued two separate, dated agent-washing warnings, and they don’t say the same thing. The ~130-vendors statistic belongs exclusively to the June 25, 2025 release. On May 20, 2026, Gartner warned again — this time about agent washing in the supply chain planning technology market — without restating that figure. Articles that attribute “130 vendors” to the 2026 release are conflating two distinct statements made nearly eleven months apart.

Gartner’s two agent-washing warnings side by side — the June 25, 2025 press release and the May 20, 2026 supply chain planning press release, compared by venue, market scope, analyst, core warning, and whether the ~130-vendors figure appears. Compiled from both Gartner Newsroom releases, retrieved July 3, 2026.
Aspect	June 25, 2025 release	May 20, 2026 release
Announced	June 25, 2025 · Sydney, Australia	May 20, 2026 · Barcelona, Spain
Market scope	The agentic AI market at large	Supply chain planning (SCP) technology specifically
Analyst	Anushree Verma, Senior Director Analyst	Jan Snoeckx, Senior Director Analyst, Supply Chain practice
Core warning	Over 40% of agentic AI projects will be canceled by end-2027; “agent washing” defined as rebranding assistants, RPA, and chatbots without substantial agentic capability	Relabeling conventional automation as agentic obscures real differences between planning tools; vendors claiming end-to-end autonomous supply chain planning before 2027 are overstating what is possible
The “~130 vendors” figure	Yes — this release is the origin of the statistic	No — it repeats the warning, not the number
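The “nearly eleven months” gap between the two releases is easy to verify. A minimal sketch using only the two announcement dates from the table above; the variable names are illustrative.

```python
from datetime import date

# Announcement dates taken from the comparison table above.
first_warning = date(2025, 6, 25)   # agentic AI market at large
second_warning = date(2026, 5, 20)  # supply chain planning

gap_days = (second_warning - first_warning).days

# Count whole calendar months, then step back one if the day-of-month
# in the later date has not yet been reached.
whole_months = (second_warning.year - first_warning.year) * 12 + (
    second_warning.month - first_warning.month
)
if second_warning.day < first_warning.day:
    whole_months -= 1

print(f"{gap_days} days apart, {whole_months} whole months")
```

Ten whole months plus a few weeks is why “nearly eleven months” is the accurate phrasing, and why a source that merges the two releases is off by most of a year.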

The 2026 release matters for a different reason: it shows the warning aging into specific markets rather than fading. Per the release, Snoeckx warned that agent washing obscures the real differences between planning tools “by relabeling conventional automation as agentic, increasing the risk of misaligned investments and long-term lock-in.” And it sets a concrete autonomy bar: “True autonomous planning would require the automatic generation of plans, automatic selection of the optimal plan and seamless execution without human intervention. Most current solutions have not reached that level of end-to-end autonomy, and vendors claiming end-to-end autonomous supply chain planning before 2027 are overstating what is possible in the near term.” Swap “planning” for your own domain and that sentence is a usable diligence question.
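One way to turn that autonomy bar into a diligence checklist is to treat its three conditions as separate yes/no observations. The sketch below is our own encoding, not anything Gartner publishes; the class and field names are hypothetical, and the answers should come from what you observe in a live demo.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AutonomyEvidence:
    """Three conditions paraphrased from the quoted autonomy bar (names are ours)."""
    generates_plans_automatically: bool
    selects_optimal_plan_automatically: bool
    executes_without_human_intervention: bool


def missing_conditions(evidence: AutonomyEvidence) -> list[str]:
    """Return the names of the conditions the product did not demonstrate."""
    return [name for name, met in asdict(evidence).items() if not met]


def meets_end_to_end_bar(evidence: AutonomyEvidence) -> bool:
    """All three conditions must hold; any gap means the claim is overstated."""
    return not missing_conditions(evidence)


# Hypothetical demo result: plans are generated, but a person picks and runs them.
observed = AutonomyEvidence(True, False, False)
print(meets_end_to_end_bar(observed), missing_conditions(observed))
```

The list of missing conditions is the useful output: it tells you which part of the “end-to-end” claim to ask the vendor about next.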

03 — Why It Works

Washing works because the market wants to believe.

Agent washing isn’t a con that succeeds against buyers' better judgment — it succeeds because of a structural gap between enthusiasm and experience. Gartner’s 2026 CIO survey, cited in its April 15, 2026 Hype Cycle for Agentic AI, found that only 17% of organizations have deployed AI agents to date, yet more than 60% expect to do so within the next two years — described as the most aggressive adoption curve among the emerging technologies measured. When intent runs more than three times ahead of deployment, most buyers evaluating an “agent” have never operated one. That inexperience is the surface agent washing sticks to.
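The “more than three times” claim is simple arithmetic on the two survey figures. A quick check, treating “more than 60%” as a lower bound of 60 (that floor is our assumption, since the exact figure is not given here):

```python
# Figures quoted from Gartner's 2026 CIO survey as reported in the text above.
deployed_pct = 17       # organizations that have deployed AI agents
expect_pct_floor = 60   # "more than 60%" expect to deploy, so 60 is a lower bound

intent_to_deployment = expect_pct_floor / deployed_pct
print(f"intent runs at least {intent_to_deployment:.1f}x ahead of deployment")
```

Because 60 is a floor, the true ratio can only be higher than the result printed here.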

Deployed to date

Orgs running AI agents

17%

Per Gartner’s 2026 CIO survey, cited in the April 15, 2026 Hype Cycle for Agentic AI article. Agentic AI currently sits at the Peak of Inflated Expectations on that curve.

Gartner, Apr 2026

Expect to deploy

Within the next two years

60%+

The steepest intent curve of any emerging technology in the survey — a 3.5x gap between expectation and lived deployment experience.

Same survey

Predicted canceled

Agentic projects by end-2027

40%+

Gartner’s June 25, 2025 forecast, attributed to escalating costs, unclear business value, and inadequate risk controls — all three amplified when the product was washed.

Gartner, Jun 2025

The money tells the same story. A January 2025 Gartner poll of 3,412 webinar attendees found the largest bloc — 42% — making only conservative investments in agentic AI, with 19% investing significantly, 31% waiting or unsure, and 8% not investing at all. Commitment is cautious even while the vocabulary is everywhere.

How organizations were investing in agentic AI · early 2025

Source: Gartner poll of 3,412 webinar attendees, January 2025 (reported June 25, 2025)

Conservative investment (dipping a toe, small commitments): 42%

Wait-and-see / unsure (no committed position yet): 31%

Significant investment (the committed minority): 19%

No investment (sitting out entirely): 8%

Gartner’s own hype check

Even Gartner’s bullish read comes with a governor. Its April 15, 2026 Hype Cycle article states that “most deployments remain narrowly scoped, and fully autonomous agents are not ready for the majority of enterprise use cases.” The same Hype Cycle newly tracks agentic AI governance, agentic AI security, and FinOps for agentic AI as distinct profiles — a signal the market is maturing past pure hype into operational discipline.

None of this means the category is fake — the same June 2025 Gartner release forecasts that by 2028, at least 15% of day-to-day work decisions will be made autonomously through agentic AI (up from 0% in 2024) and 33% of enterprise software applications will include agentic AI (up from under 1% in 2024). The washing problem and the genuine trend coexist; that’s precisely what makes the label so profitable to misuse. It also compounds a separate, well-documented problem: even honestly built agent projects struggle to ship — most AI agent projects never reach production for execution reasons that have nothing to do with labeling. A washed product stacks a definitional failure on top of an already unfavorable base rate.

The clearest evidence the term has escaped analyst-land: vendors now use it against each other on stage.

“We want companies to build agents and orchestrate agents and there’s a lot of agent washing going on in the market. So that’s really important.”
Kevin Li, SVP of Product, Optimizely · CMSWire, September 10, 2025

04 — Named Cases

The same pattern, regulator-confirmed.

An important precision: no major current AI-agent vendor has been publicly accused of “agent washing” by that name, and neither of the two regulatory cases below uses the term. What both cases document — with the evidentiary weight of federal enforcement — is the exact claims-versus-reality pattern Gartner describes: products marketed as autonomous that, in practice, ran on undisclosed human labor or untested capability claims. Treat them as the pattern’s case law, not its name-and-shame list.

SEC · Jan 14, 2025

Presto Automation

“AI drive-thru” · settled charges

The SEC found Presto made materially false and misleading statements about its Presto Voice product between November 2021 and May 2023 — claiming its AI speech recognition eliminated the need for human order-taking when the vast majority of orders required human intervention, and for a period not disclosing that the AI in deployed units was owned and operated by a third party. Presto consented to a cease-and-desist without admitting or denying findings; cooperation spared it a civil penalty.

SEC File No. 3-22413

FTC · Feb 11, 2025

DoNotPay

“The world’s first robot lawyer” · final order

Under its Operation AI Comply initiative, the FTC alleged DoNotPay never tested whether its “AI lawyer” performed at the level of a human lawyer and didn’t retain attorneys to check its law-related outputs. The final order — approved 5-0 — imposed $193,000 in monetary relief, required notice to 2021-2023 subscribers, and prohibits advertising that the service performs like a real lawyer without evidence to back it.

FTC final order, 2025

Read the SEC’s Presto order and the FTC’s DoNotPay release with a buyer’s eye and two diligence questions fall straight out of them. From Presto: what percentage of this product’s output currently requires human intervention, and who exactly operates the AI? From DoNotPay: what testing supports the capability claim, and can we see it? Both questions reappear as scored axes in the test below.

The legal-risk turn

Law firm Debevoise & Plimpton, writing on the Harvard Law School Forum on Corporate Governance (April 16, 2026), defines agent washing as misrepresenting AI tools as agentic when they lack genuine autonomy, or overstating agent capabilities, reliability, and business impact while underplaying risks — and warns that “once companies link agents to growth or efficiency, subsequent disclosures about limitations become evidence of earlier misstatements.” Agent washing is migrating from a marketing-ethics issue to a securities-disclosure risk.

05 — The Baseline

What a real agent does that a rebrand can’t.

This post won’t re-litigate the full definition of agentic AI — our agentic AI glossary and AI agent glossary cover the vocabulary in depth. What matters here is the dividing line the washing claim blurs. IBM’s reference definition draws it cleanly: an AI agent autonomously performs tasks by designing workflows with available tools — deciding, acting, and interacting with external systems beyond its training data. A chatbot is a modality; agency is a capability. An LLM-powered chat interface with no tools, no memory, and no reasoning loop is non-agentic no matter how fluent it sounds, and traditional automation that follows predefined rules doesn’t become an agent by being renamed.

AWS’s prescriptive-guidance comparison makes the same cut as a spectrum, scoring traditional AI, software agents, and agentic AI across characteristics like autonomy, proactivity, and agency — from “tools for humans” through “operates independently within predefined bounds” to “operates with purpose, goals, and self-direction.” Its strategic guide sketches the autonomy ladder most washed products quietly sit at the bottom of:

Rung 1

Chain

Rule-based, pre-defined sequence

Every step and its order are authored by a human. Reliable, auditable, and categorically not an agent — this is where classic RPA lives.

Automation

Rung 2

Workflow

Pre-defined actions, dynamic sequencing

The system picks the order of known actions at runtime. Useful flexibility, still bounded by a human-authored action set.

Automation+

Rung 3

Partially autonomous

Plans and adjusts, minimal oversight

The agent plans, executes, and adjusts with a human at defined checkpoints. Most genuine enterprise agents in 2026 live here — and honest vendors say so.

Agentic

Rung 4

Fully autonomous

Sets goals proactively, little oversight

Rare in production. Gartner’s May 2026 warning was specific to supply chain planning, where it said vendors claiming end-to-end autonomy before 2027 are overstating what’s possible; treat the same claim in any other complex domain with equal suspicion.

Mostly aspirational
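The four rungs above can be sketched as an ordered type with a deliberately conservative classifier. This is a simplification for illustration, not AWS’s own scheme: the three boolean inputs are hypothetical observations from a demo, and a product only reaches a rung if it also shows everything the lower rungs require.

```python
from enum import IntEnum


class Rung(IntEnum):
    """The four rungs of the autonomy ladder described above."""
    CHAIN = 1                 # pre-defined actions, pre-defined sequence
    WORKFLOW = 2              # pre-defined actions, dynamic sequencing
    PARTIALLY_AUTONOMOUS = 3  # plans and adjusts, human at checkpoints
    FULLY_AUTONOMOUS = 4      # sets goals proactively, little oversight


def classify_rung(
    *, sequences_at_runtime: bool, plans_own_steps: bool, sets_own_goals: bool
) -> Rung:
    """Place a product on the ladder from three observed behaviors.

    Conservative by design: a higher-rung behavior without the lower-rung
    ones falls back to the highest rung that is fully supported.
    """
    if not sequences_at_runtime:
        return Rung.CHAIN
    if not plans_own_steps:
        return Rung.WORKFLOW
    if not sets_own_goals:
        return Rung.PARTIALLY_AUTONOMOUS
    return Rung.FULLY_AUTONOMOUS


# Hypothetical product: picks the order of known actions, but never plans new ones.
print(classify_rung(sequences_at_runtime=True, plans_own_steps=False, sets_own_goals=False).name)
```

Using an ordered enum means rungs can be compared directly, so a check such as `rung >= Rung.PARTIALLY_AUTONOMOUS` reads the same way the ladder does.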

Buyers aren’t short of definitions — they’re short of a way to score a specific product against them. Frameworks exist, but they’re fragmented: AWS’s comparison is taxonomic rather than evaluative; enterprise vendor WRITER published a four-level autonomy framework (noting most misrepresentation involves marketing level-1 tools as level-3 or level-4 — though WRITER sells its own agent product, so read it as positioned analysis); and other vendor blogs offer question checklists without scoring. Nothing in that pile gives a procurement team a number. That’s the gap the next section fills.

06 — Our Rubric

The Digital Applied Agent-Washing Test.

The test scores a product on six axes, 0-5 each, for a maximum of 30 points. The axes are a Digital Applied synthesis — informed by IBM’s agent framing, AWS’s autonomy spectrum, and the public Gartner warnings above, but the anchors, weights, and bands are our editorial methodology. Score against what you observe in a live demonstration on your own scenario, never against the deck. Anchor descriptions are written for 0, 3, and 5; score 1-2 or 4 when the product sits between anchors.

The Digital Applied Agent-Washing Test — six axes (autonomy, planning, tool use, memory, feedback loop, human-in-loop boundaries) each scored 0 to 5 with anchor descriptions at 0, 3, and 5, for a 30-point maximum. Digital Applied methodology, July 2026.
Axis	0 · Washed	3 · Assisted	5 · Agentic
Autonomy: Does it act toward a goal, or wait for a prompt at every step?	Responds only when a human prompts it; every step is human-initiated. A chat window with a new label.	Executes a multi-step sequence once kicked off, but halts at any branch it wasn’t configured for.	Owns a goal end to end — initiates steps, works through branches, and escalates only at defined checkpoints.
Planning: Does it decompose and replan, or run a fixed script?	A fixed flow authored by a human; changing the process means reconfiguring the tool.	Chooses among pre-built paths, with limited re-sequencing when inputs vary.	Decomposes a novel goal into steps and replans when a step fails or new information arrives.
Tool use: Does it choose and call systems itself?	No live tool calls — output is text a human then executes somewhere else.	Calls a fixed menu of pre-wired integrations in a pre-set order.	Selects which system or API to call per step, sets its own parameters, and handles errors and retries.
Memory: Does state persist across sessions and change behavior?	Resets every session; no recall of prior work or outcomes.	Holds session context and retrieves documents, but nothing it learns persists as state.	Retains durable state across sessions and uses it to make different decisions next time.
Feedback loop: Does it verify its own output and self-correct?	No self-check; every unit of verification is human or external.	Applies basic validation rules it can flag against, but cannot fix its own failures.	Evaluates output against the goal, detects failure, and retries or corrects before handing off.
Human-in-loop boundaries: Are the human checkpoints explicit and disclosed?	Humans quietly do the work behind an “autonomous” label; the intervention rate is undisclosed.	Checkpoints exist and are acknowledged, but intervention rates aren’t measured or shared.	Checkpoints are documented, intervention rates are measured and reported, and the autonomy claim matches them.
Total	Max 30 points (6 axes × 5). Digital Applied bands: 0-10 rebranded chatbot or RPA · 11-20 partial / assisted agent · 21-30 genuinely agentic.

Two scoring notes. First, the sixth axis is deliberately about honesty, not capability: a vendor with explicit, measured human checkpoints scores a 5 there even if the product is only partially autonomous — while a product hiding its humans behind an “autonomous” claim scores 0 on that axis regardless of how good the demo looks. That is the Presto lesson, encoded. Second, the bands measure labeling accuracy, not product quality — an 11-20 assisted agent can be an excellent purchase at an assisted-agent price. What the test catches is paying agentic prices, and accepting agentic risk assumptions, for rung-1 automation. For the price side of that equation, pair this test with what a real agent actually costs to build and run.
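The arithmetic of the test is small enough to sketch in a few lines. The axis names, the 0-5 range, and the three bands come from the rubric above; the function name, dictionary keys, and example scores are illustrative assumptions, not a published tool.

```python
AXES = (
    "autonomy",
    "planning",
    "tool_use",
    "memory",
    "feedback_loop",
    "human_in_loop_boundaries",
)

# (low, high, label) for each band, inclusive on both ends.
BANDS = (
    (0, 10, "rebranded chatbot or RPA"),
    (11, 20, "partial / assisted agent"),
    (21, 30, "genuinely agentic"),
)


def score_product(scores: dict[str, int]) -> tuple[int, str]:
    """Sum six 0-5 axis scores and return (total, band label)."""
    if set(scores) != set(AXES):
        raise ValueError(f"expected exactly these axes: {AXES}")
    for axis, value in scores.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 5:
            raise ValueError(f"{axis} must be an integer from 0 to 5, got {value!r}")
    total = sum(scores.values())
    for low, high, label in BANDS:
        if low <= total <= high:
            return total, label
    raise AssertionError("unreachable: totals always fall in 0-30")


# Hypothetical scores from one live demo.
example = {
    "autonomy": 3,
    "planning": 2,
    "tool_use": 3,
    "memory": 1,
    "feedback_loop": 2,
    "human_in_loop_boundaries": 3,
}
print(score_product(example))
```

Rejecting missing or extra axes is deliberate: a total built from five axes can land in the wrong band without anyone noticing.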

07 — In the Vendor Call

Running the test — and acting on the score.

Scoring takes one structured session. Insist on a live run against a scenario you supply — not a recorded demo — and ask three questions the enforcement cases taught us to ask: What percentage of production outputs currently involve human intervention, and will you contractually report that number? Which steps in the flow we just watched were pre-scripted versus planned by the system? What testing evidence backs the headline capability claim? A vendor’s comfort with those questions is itself a signal — the genuinely agentic minority tends to answer them eagerly, because the answers are their moat.
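If a vendor agrees to report intervention, it helps to pin down what the number means. A minimal sketch, assuming the vendor can export one record per production run with a flag for whether a human stepped in; the `RunRecord` shape is hypothetical and real logs will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """One production run, as a vendor might report it (hypothetical shape)."""
    run_id: str
    human_intervened: bool


def intervention_rate(records: list[RunRecord]) -> float:
    """Share of runs where a human stepped in, as a fraction from 0.0 to 1.0."""
    if not records:
        raise ValueError("need at least one run to compute a rate")
    intervened = sum(1 for record in records if record.human_intervened)
    return intervened / len(records)


# Hypothetical sample: three of four runs needed a person.
sample = [
    RunRecord("run-1", True),
    RunRecord("run-2", True),
    RunRecord("run-3", False),
    RunRecord("run-4", True),
]
print(f"{intervention_rate(sample):.0%} of runs involved human intervention")
```

Agree in the contract on what counts as a run and what counts as intervention; the rate means nothing if the vendor picks those definitions after the fact.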

Gartner’s May 2026 release offers three buyer missteps to avoid that map cleanly onto the test: don’t mistake vendor positioning for true autonomy, avoid monolithic transformations and legacy retrofits, and don’t pursue high-risk autonomous use cases too early. Then act on the band you scored:

Scored 0-10

Treat it as automation procurement

Nothing wrong with buying automation — RPA and chatbots earn their keep. But reprice against automation alternatives, strip agentic assumptions out of the ROI model, and ask why the vendor needed the label.

Reprice or walk

Scored 11-20

Buy the assist, not the story

Real value with a human in the loop. Contract on measured behavior: intervention rates, checkpoint definitions, and roadmap commitments in writing — not the word “autonomous” in a slide.

Contract on measured autonomy

Scored 21-30

Verify with a scoped pilot

Plausibly one of the genuinely agentic minority. Run a bounded pilot with intervention-rate reporting before production rollout: the causes behind Gartner’s 40%-plus cancellation forecast (escalating costs, unclear business value, inadequate risk controls) are the ones a scoped pilot surfaces early.

Pilot, then scale

Any score, hidden humans

Undisclosed intervention is a red line

If the demo can’t reveal where humans sit in the loop, you’re looking at the exact pattern the SEC documented at Presto. Escalate diligence or exit — the label problem is now a disclosure problem.

Escalate or exit
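The four responses above reduce to a short decision rule, with the hidden-humans red line checked first because it overrides any score. A sketch under that reading; the function name and the `humans_disclosed` flag are our own labels.

```python
def recommended_action(total: int, humans_disclosed: bool) -> str:
    """Map a 0-30 test total to the action label for its band.

    Undisclosed human intervention overrides the score entirely.
    """
    if not 0 <= total <= 30:
        raise ValueError(f"total must be between 0 and 30, got {total}")
    if not humans_disclosed:
        return "Escalate or exit"
    if total <= 10:
        return "Reprice or walk"
    if total <= 20:
        return "Contract on measured autonomy"
    return "Pilot, then scale"


# Hypothetical case: a strong demo that cannot show where humans sit in the loop.
print(recommended_action(27, humans_disclosed=False))
```

Ordering the checks this way keeps a high score from masking the one finding that should stop the deal.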

The scorecard also sharpens two adjacent decisions. If you’re evaluating the big CRM platforms’ agent offerings, run the test axis by axis against our guide to CRM AI agents from Salesforce, HubSpot, and Zoho — the same six questions expose exactly where each platform’s "agent" sits on the autonomy ladder. And a low score reframes the build-vs-buy decision: if the branded “agent” scores 8, a purpose-built workflow you own — scoped to rung 2 or 3 honestly — often beats renting the label. That comparative evaluation is the first step of our AI transformation engagements, where we score shortlisted vendors against live demos before any budget is committed.

Looking forward, we expect the test’s sixth axis to become the whole game. With securities lawyers now framing overstated agent claims as disclosure risk, and Gartner tracking governance and FinOps profiles for agentic AI on its 2026 Hype Cycle, the direction of travel is toward measured, reported autonomy — the intervention rate as a contract term, not a secret. Vendors who can publish that number will take the premium; vendors who can’t will quietly drop the label. Buyers who score today are simply ahead of where procurement standards are headed by 2027.

08 — Conclusion

Score the claim, then buy the product.

The buyer’s position, July 2026

Agent washing is a labeling problem — so measure the label.

The definition is settled and citable: Gartner named agent washing in June 2025 — rebranding assistants, RPA, and chatbots without substantial agentic capability — and estimated only about 130 of thousands of claimed agentic vendors were real. Eleven months later it was warning specific markets about the same pattern, and regulators had already punished the underlying behavior at Presto and DoNotPay without ever needing the word.

The buyer’s error isn’t believing in agents — Gartner itself forecasts agentic AI inside a third of enterprise software by 2028. The error is letting the vendor’s label substitute for measurement while intent runs three times ahead of deployed experience. Six axes, thirty points, one live demo on your own scenario: the score is the diligence.

And keep the test’s spirit rather than just its arithmetic: an honest 14 that’s priced and contracted as an assisted agent will outperform a washed “28” every time. The point of catching agent washing isn’t to avoid the category — it’s to pay the right price, set the right risk controls, and reserve agentic budgets for the minority of products that have earned the noun.

Agent Washing: The Definition — and a Scorecard to Catch It
