The cost to build an AI agent in 2026 is the most-searched, worst-answered question in the agentic economy: the top-ranking guides quote anywhere from $8,000 to $500,000+ for functionally identical descriptors, and not one of them publishes the math behind the range. This index takes the opposite approach: a stated-assumption model where every dollar figure traces to a formula you can recompute with your own numbers.

Getting this number wrong is expensive in both directions. Overestimate and you shelve an automation that would have paid for itself in a quarter; underestimate and you join the projects Gartner predicts will be canceled — the firm forecast in June 2025 that more than 40% of agentic AI projects will be scrapped by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls.

Below: the five competing cost pages and why they disagree with each other by 10x on the same tier, the Digital Applied build and run cost model with every assumption stated, a per-model price sensitivity table spanning 1x to 9.1x, and the July 2026 price-war context that will force this index to be re-dated. Everything is a snapshot, deliberately labeled with its date.

Key takeaways

01
The SERP disagrees with itself by 10x. Five dev-shop guides published within four months quote “simple agent” builds from $8,000 to $80,000 — none state token volumes, engineering days, or a day rate, so none can be recomputed or compared.
02
Our index: three classes, every cell recomputable. Digital Applied methodology: simple task agent $5,400–$9,000 build / ~$800 monthly run; RAG-workflow agent $13,500–$22,500 / ~$2,700; multi-agent system $27,000–$45,000 / ~$6,250. Build bands come from stated engineering days × $900/day; run medians sum July 2026 token prices applied to stated volumes plus hosting, vector DB, monitoring, and oversight line items.
03
Tokens are no longer the dominant run-cost line. At July 2026 rates, model tokens are 8% of the simple-agent run median, roughly 16% of the RAG median, and roughly 27% of the multi-agent median. Senior oversight time is the largest line in all three classes.
04
Model choice is a 1x-to-9.1x multiplier. The same 1,000 agent turns cost $3.86 on GLM-5.2 and $35.00 on Claude Fable 5 at list prices — with Sonnet 5, GPT-5.6 Terra preview, Opus 4.8, and GPT-5.5 spread between them.
05
Serverless compute is a rounding error. A worked example on Vercel’s published rate card puts raw compute for a 10,000-request/month agent at ≈$0.26 — active-CPU billing pauses during LLM waits, so the real hosting line is the $20/month Pro seat.

01 — The Problem

Five guides, one descriptor, a 10x spread.

Search “AI agent development cost” and the results are dominated by dev-shop guides — SoftTeco, DestiLabs, ProductCrafters, Cleveroad, and Azilen all published or refreshed theirs between February and June 2026. Put their numbers side by side and the genre’s problem is immediate: the same “simple agent” descriptor spans $8,000 at the bottom of DestiLabs’ band to $80,000 at the top of SoftTeco’s — a 10x disagreement on the same tier, published within a four-month window.

Commonly quoted AI agent build-cost ranges from five dev-shop guides published February through June 2026 — SoftTeco, DestiLabs, ProductCrafters, Cleveroad, and Azilen — with each page’s simple, mid, and complex tiers and whether any reproducible methodology is stated. All retrieved July 3, 2026.
Source	Published	“Simple” tier	Mid tier	Complex tier	Methodology shown?
SoftTeco	Jun 9, 2026	$10K–$30K (POC) · $20K–$80K (simple)	$20K–$60K (MVP)	$100K–$500K+	Partial — cites $40–$180/hr US rates, never ties them to the quoted totals
DestiLabs	Feb 23, 2026	$8K–$25K (conversational)	$25K–$80K (task) · $80K–$200K (multi-agent)	$200K–$500K+ (platform)	Claims a “50+ projects” basis; underlying dataset undisclosed
ProductCrafters	Feb 24, 2026	$20K–$35K+	$40K–$70K+ · $80K–$120K+	$100K–$200K+	None — explicit “no one-size-fits-all price tag” disclaimer
Cleveroad	Feb 18, 2026	$20K–$35K+	$40K–$70K+ · $80K–$120K	$100K–$200K+	None — ranges read as experience-based, no derivation shown
Azilen	Feb 18, 2026	$10K–$50K (chatbot)	$50K–$120K+ (task) · $80K–$180K+ (RAG)	$150K–$400K+	None disclosed; states $3,200–$13,000/mo average run cost

To be fair to the genre, some of the spread is real scope variation — enterprise compliance, on-prem deployment, and deep legacy integrations legitimately multiply cost. But none of the five pages state token volumes, engineering-day counts, or the day rate behind their totals, so a reader cannot tell scope variation from guesswork. ProductCrafters says it outright: “there isn’t a one-size-fits-all price tag.” True — which is exactly why the useful artifact is a formula with stated assumptions, not another point estimate. Even the recurring maintenance convention — three of the five cite 15–25% of build cost per year — is quoted as a rule of thumb with no derivation anywhere.

Why this matters

The internal inconsistency of the cost-guide SERP is itself the evidence: when five pages publishing in the same four-month window disagree by an order of magnitude on the same descriptor, the genre’s numbers cannot all be load-bearing. A cost model you can recompute is worth more than a range you have to trust.

02 — Methodology

One day rate, stated volumes, published token prices.

Everything in this index derives from two formulas. Build cost = stated engineering days × a stated day rate. Monthly run cost = stated token volumes × verified July 2026 model prices, plus hosting, vector database, monitoring, and senior-oversight maintenance. That’s it — every cell in the tables below is one of those two multiplications plus a sum.
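The two formulas can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function and parameter names are illustrative, and the figures in the example are the simple task agent's low-end line items as stated later in this index.

```python
def build_cost(eng_days: float, day_rate: float = 900.0) -> float:
    """Build cost = stated engineering days x stated day rate (USD)."""
    return eng_days * day_rate


def monthly_run_cost(
    input_mtok: float,        # millions of input tokens per month
    output_mtok: float,       # millions of output tokens per month
    price_in: float,          # USD per million input tokens
    price_out: float,         # USD per million output tokens
    hosting: float = 0.0,
    vector_db: float = 0.0,
    monitoring: float = 0.0,
    oversight_days: float = 0.0,
    day_rate: float = 900.0,
) -> float:
    """Monthly run cost = token spend + fixed lines + oversight days x day rate."""
    tokens = input_mtok * price_in + output_mtok * price_out
    return tokens + hosting + vector_db + monitoring + oversight_days * day_rate


# Simple task agent, low end of each line item, Sonnet 5 intro pricing ($2 / $10).
print(build_cost(6))                                                        # 5400.0
print(monthly_run_cost(12, 4, 2.0, 10.0, hosting=20, oversight_days=0.5))   # 534.0
```

Swapping any argument for your own figure is the whole exercise; nothing else in the model is hidden.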

The day rate is $900 per engineering day, and it is explicitly Digital Applied’s own assumption, not an industry benchmark. It derives from our published pricing: 1 unit = €100, framed as roughly one hour of senior-level work delivered as senior-led, agent-assisted capacity — a senior person scopes, briefs, and reviews while agents compress execution. Eight units a day is €800; converted at the July 3, 2026 EUR/USD spot rate of roughly 1.146 that is about $917, rounded down to $900 for clean arithmetic. Our own published engagements cross-check the scale: packaged tiers run €2,000–€4,000 for 20–40 units, and a typical 10-week AI transformation engagement is scoped at 100–150 units (€10,000–€15,000).
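The day-rate derivation above is short enough to check by hand, but a sketch makes the rounding step explicit. The constant names are illustrative, and rounding down to the nearest $100 is our reading of "rounded down to $900".

```python
UNIT_PRICE_EUR = 100   # 1 unit = EUR 100 (stated)
UNITS_PER_DAY = 8      # eight units a day (stated)
EUR_USD = 1.146        # approximate spot rate on July 3, 2026 (stated)

day_rate_eur = UNIT_PRICE_EUR * UNITS_PER_DAY   # EUR per engineering day
day_rate_usd = day_rate_eur * EUR_USD           # about 916.8 USD

# Assumption: "rounded down" means down to the nearest $100.
day_rate_index = (int(day_rate_usd) // 100) * 100

print(day_rate_eur, round(day_rate_usd), day_rate_index)  # 800 917 900
```

If the exchange rate moves, only `EUR_USD` changes; a spot rate below about 1.125 would drop the rounded figure to $800 under this rounding rule.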

For market color — and only color — freelance marketplaces quote wide bands: Upwork lists a $50/hour median for AI engineers with senior specialists at $100–$300/hour, and Toptal AI/ML listings are publicly reported at $100–$200+/hour billed to clients. If your delivery model prices differently, substitute your day rate into the formula; the index survives the swap, which is the point.

Digital Applied methodology — stated, not surveyed

These bands are our stated-assumption model, not an industry survey. Build = engineering days × $900/day (senior-led, agent-assisted delivery). Run = monthly tokens × July 2026 list prices + hosting + vector DB + monitoring + senior oversight days × $900. Every assumption is printed so you can replace any of them and recompute.

The index prices three agent classes, each with a stated monthly workload. These volumes are the second lever you should swap for your own — they drive every token figure downstream.

Class 01

Simple task agent

10,000 requests/mo · 1,200 in / 400 out per request

One LLM call per request behind a defined trigger — form triage, inbox routing, structured extraction, FAQ answering over a small prompt. Short system prompt, short response, no retrieval.

6–10 engineering days

Class 02

RAG-workflow agent

30,000 requests/mo · 4,000 in / 600 out per request

Retrieval over your own corpus plus a multi-step workflow — the input tokens carry retrieved context, and a vector database joins the bill of materials.

15–25 engineering days

Class 03

Multi-agent system

20,000 tasks/mo · ~6 calls each · 3,500 in / 700 out

An orchestrator decomposing tasks across subagents — 120,000 total model calls a month. Coordination overhead multiplies token volume and makes evaluation tooling non-optional.

30–50 engineering days
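As a rough illustration, the three stated workloads convert to monthly token volumes as follows. The `Workload` class is a hypothetical helper, and the sketch treats "~6 calls each" as exactly 6 and the first two classes as one model call per request, which is how the token totals in this index are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    units_per_month: int   # requests or tasks per month
    calls_per_unit: int    # model calls per request or task
    tokens_in: int         # input tokens per call
    tokens_out: int        # output tokens per call

    @property
    def calls(self) -> int:
        """Total model calls per month."""
        return self.units_per_month * self.calls_per_unit

    def monthly_mtok(self) -> tuple[float, float]:
        """Monthly (input, output) volume in millions of tokens."""
        return (self.calls * self.tokens_in / 1e6,
                self.calls * self.tokens_out / 1e6)


WORKLOADS = [
    Workload("Simple task agent", 10_000, 1, 1_200, 400),
    Workload("RAG-workflow agent", 30_000, 1, 4_000, 600),
    Workload("Multi-agent system", 20_000, 6, 3_500, 700),
]

for w in WORKLOADS:
    print(w.name, w.calls, w.monthly_mtok())
# Simple task agent 10000 (12.0, 4.0)
# RAG-workflow agent 30000 (120.0, 18.0)
# Multi-agent system 120000 (420.0, 84.0)
```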

03 — The Index

The Build & Run Cost Index — July 2026.

The headline table. Build bands are engineering days × $900; run bands are the sums of the line items broken out in the next section, with tokens priced at Claude Sonnet 5’s introductory rate ($2 input / $10 output per million tokens) — the realistic mid-market default, since Sonnet 5 is Anthropic’s default model on its Free and Pro tiers as of July 1, 2026. This index is dated July 2026 and priced from vendor pages verified July 3, 2026.

The Digital Applied AI Agent Build and Run Cost Index, dated July 2026 — for each of three agent classes, the stated engineering days, build cost band and median at $900 per day, and the monthly run cost band and median summed from token, hosting, vector database, monitoring, and maintenance line items at July 2026 published rates. Digital Applied stated-assumption methodology, not an industry survey.
Agent class	Eng-days (stated)	Build cost (days × $900)	Build median	Monthly run band	Run median
Simple task agent	6–10	$5,400–$9,000	$7,200 (8 days)	$534–$1,068	~$800
RAG-workflow agent	15–25	$13,500–$22,500	$18,000 (20 days)	$1,925–$3,420	~$2,700
Multi-agent system	30–50	$27,000–$45,000	$36,000 (40 days)	$4,709–$7,790	~$6,250

Two honest notes on reading it. First, our build bands sit below most of the SERP’s — a multi-agent system tops out at $45,000 here against $80,000–$500,000 elsewhere — because the day count assumes senior-led, agent-assisted delivery rather than a traditional team-weeks staffing model, and because the bands price the agent itself, not enterprise procurement, compliance programs, or legacy integration projects that can legitimately dominate larger quotes. Second, the run medians are workload-dependent: double the request volumes and the token line doubles with them, which is why the volumes are printed. If you want a spend ceiling before you commit, our guide to budgeting a token spend cap pairs directly with this table.

04 — Run Cost

Where the monthly money actually goes.

This is the table the dev-shop guides don’t publish: the line items. Sum any column’s monthly items and you get the run band in the headline table — that’s the recompute check, and it should be run on every number here before you budget from it.

Monthly run-cost line items for the three agent classes in the Digital Applied index — stated workload assumptions, then token costs at Claude Sonnet 5 introductory pricing, Vercel hosting, Pinecone vector database, monitoring tooling, and senior oversight maintenance, with the total monthly run band and median each column sums to. July 2026 published rates.
Line item	Simple task agent	RAG-workflow agent	Multi-agent system
Workload assumption (stated)
Requests / model calls per month	10,000 requests	30,000 requests	120,000 calls (20,000 tasks × ~6)
Tokens per call (input / output)	1,200 / 400	4,000 / 600	3,500 / 700
Monthly tokens (input / output)	12M / 4M	120M / 18M	420M / 84M
Monthly line items (July 2026 rates)
Model tokens — Sonnet 5 intro, $2 / $10 per Mtok	$64	$420	$1,680
Hosting — Vercel Pro seat + Fluid Compute	$20–$25	$25–$40	$30–$60
Vector DB — Pinecone Standard floor	—	$50–$100	$50–$150
Monitoring — free tier → first paid step	$0–$79	$80–$160	$249–$500
Maintenance — senior oversight days × $900	$450–$900 (0.5–1 day)	$1,350–$2,700 (1.5–3 days)	$2,700–$5,400 (3–6 days)
Totals
Monthly run band (sum of line items)	$534–$1,068	$1,925–$3,420	$4,709–$7,790
Run median	~$800	~$2,700	~$6,250
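The recompute check on the table above can be sketched directly: copy each line item's low and high figure, sum them, and compare with the printed band. The dictionary layout and function name are illustrative; the third value returned is the band midpoint, which is our simplification and lands close to the rounded run medians.

```python
# (low, high) USD per month for each line item, copied from the table above.
LINE_ITEMS = {
    "Simple task agent": {
        "tokens": (64, 64), "hosting": (20, 25), "vector_db": (0, 0),
        "monitoring": (0, 79), "maintenance": (450, 900),
    },
    "RAG-workflow agent": {
        "tokens": (420, 420), "hosting": (25, 40), "vector_db": (50, 100),
        "monitoring": (80, 160), "maintenance": (1350, 2700),
    },
    "Multi-agent system": {
        "tokens": (1680, 1680), "hosting": (30, 60), "vector_db": (50, 150),
        "monitoring": (249, 500), "maintenance": (2700, 5400),
    },
}


def run_band(items):
    """Return (low, high, midpoint) of the monthly run band."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high, (low + high) / 2


for name, items in LINE_ITEMS.items():
    print(name, run_band(items))
# Simple task agent (534, 1068, 801.0)
# RAG-workflow agent (1925, 3420, 2672.5)
# Multi-agent system (4709, 7790, 6249.5)
```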

The trend hiding in this table is the one most 2026 cost guides miss because they were written before the mid-2026 price moves: tokens are no longer the dominant run-cost line for most agents. At July 2026 rates, the token line is 8% of the simple-agent run median ($64 of ~$800), roughly 16% of the RAG median ($420 of ~$2,700), and roughly 27% of the multi-agent median ($1,680 of ~$6,250). The largest line in every class is senior oversight — the human hours that keep an agent accurate, evaluated, and improving. Several competitor pages still frame LLM API usage as the budget driver, quoting $1,000–$8,000+ per month for it; at current prices that framing is a year out of date.
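The shares quoted above are simple ratios against the rounded run medians. The sketch below recomputes them and, for comparison, the share taken by senior oversight; using the midpoint of each oversight band is our simplification, not a figure the index states.

```python
# class: (token line, oversight low, oversight high, rounded run median), USD/month
RUN = {
    "Simple task agent": (64, 450, 900, 800),
    "RAG-workflow agent": (420, 1350, 2700, 2700),
    "Multi-agent system": (1680, 2700, 5400, 6250),
}


def line_shares(tokens, oversight_low, oversight_high, median):
    """Return (token share, oversight share) of the run median."""
    oversight_mid = (oversight_low + oversight_high) / 2  # assumed midpoint
    return tokens / median, oversight_mid / median


for name, row in RUN.items():
    token_share, oversight_share = line_shares(*row)
    print(f"{name}: tokens {token_share:.1%}, oversight {oversight_share:.1%}")
```

Under these assumptions the token share comes out at roughly 8%, 16%, and 27%, while oversight stays above 60% of the median in every class.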

Hosting is the clearest example of how cheap the plumbing has become. On Vercel’s published rate card (updated June 16, 2026), Fluid Compute bills active CPU at $0.128 per CPU-hour in US regions — and billing pauses during I/O waits, which for an agent means the long seconds spent waiting on the LLM API cost nothing in CPU. A worked example: 10,000 requests a month at 1GB memory, ~5 seconds average lifetime but only ~0.3 seconds of active CPU per request, comes to roughly $0.11 of CPU, $0.15 of provisioned memory, and under a cent of invocation fees — ≈$0.26 a month of raw compute. The real hosting line is the $20/month Pro seat (which itself includes a $20 usage credit); EU regions run higher at $0.184 per CPU-hour in Frankfurt, and usage-based observability adds $1.20 per million events if you turn it on.
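The worked example can be reproduced as below. Only the active-CPU rate is quoted in the text; the memory and invocation rates in this sketch are assumptions chosen to match the roughly $0.15 and sub-cent figures above, so verify both against the current rate card before relying on them.

```python
REQUESTS = 10_000
ACTIVE_CPU_S = 0.3       # active CPU seconds per request (stated)
LIFETIME_S = 5.0         # average lifetime per request in seconds (stated)
MEMORY_GB = 1.0          # provisioned memory (stated)

CPU_RATE = 0.128         # USD per active CPU-hour, US regions (stated)
MEM_RATE = 0.0106        # USD per GB-hour of provisioned memory (ASSUMED)
INVOCATION_RATE = 0.60   # USD per million invocations (ASSUMED)

cpu = REQUESTS * ACTIVE_CPU_S / 3600 * CPU_RATE
memory = REQUESTS * LIFETIME_S / 3600 * MEMORY_GB * MEM_RATE
invocations = REQUESTS / 1e6 * INVOCATION_RATE
total = cpu + memory + invocations

print(f"cpu={cpu:.2f} memory={memory:.2f} "
      f"invocations={invocations:.3f} total={total:.2f}")
# cpu=0.11 memory=0.15 invocations=0.006 total=0.26
```

The design point is that CPU is billed on the 0.3 active seconds while memory is billed on the full 5-second lifetime, which is why memory ends up the larger of two very small numbers.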

The other two lines deserve their floors stated plainly. Pinecone’s Standard serverless tier carries a $50/month minimum commitment regardless of usage — the well-corroborated anchor for a managed vector database (per-unit read/write rates vary by plan; verify them on the pricing page before modeling to the cent). And monitoring runs free at light volume — LangSmith’s Developer tier covers 5,000 traces a month, Helicone’s free tier 10,000 requests, Braintrust’s roughly 10,000 scores — before the first paid step at $39/seat (LangSmith Plus), $79 flat (Helicone Pro), or $249 flat (Braintrust Pro). Enterprise platforms are murkier: Datadog’s LLM Observability pricing is not published on its pricing pages, and third-party reports of a premium that activates automatically when LLM spans are detected are just that — reported, not vendor-confirmed. Treat enterprise observability as a line that adds meaningful per-day cost and demands a quote. Our full breakdown of evals, traces, and what they cost goes deeper on this line item.

Hosting compute

Raw serverless compute, worked example

$0.26/mo

10,000 requests/month at 1GB memory, ~5s lifetime, ~0.3s active CPU on Vercel Fluid Compute (US rates). Active-CPU billing pauses during LLM waits — the seat fee, not usage, is the hosting line.

Vercel rate card, Jun 16, 2026

Vector DB floor

Pinecone Standard minimum

$50/mo

The Standard serverless tier carries a $50/month minimum commitment regardless of usage — the reliable anchor. Per-unit read/write rates vary; verify on the pricing page before modeling to the cent.

pinecone.io/pricing, Jul 2026

Observability step

Free tier → first paid tier

$0–$249/mo

Free tiers cover light volume (LangSmith 5K traces, Helicone 10K requests, Braintrust ~10K scores). First paid steps: LangSmith Plus $39/seat, Helicone Pro $79 flat, Braintrust Pro $249 flat.

Vendor pricing pages, Jul 2026

05 — Model Sensitivity

One workload, seven prices: 1x to 9.1x.

The single biggest run-cost decision is the model. To compare fairly, we standardize one “agent turn” at 1,500 input and 400 output tokens and price 1,000 of them at each model’s verified July 2026 list rate. The spread is 9.1x between the cheapest and most expensive current-generation options — for the identical workload.

Model price sensitivity for AI agents, July 2026: for seven price points across six current models (Claude Sonnet 5 appears at both its introductory and standard rates), the list price per million input and output tokens, the computed cost of 1,000 standardized agent turns of 1,500 input and 400 output tokens each, and the multiplier versus the cheapest option. Vendor list prices verified July 3, 2026; Digital Applied arithmetic.
Model	$/Mtok in	$/Mtok out	Per 1,000 turns	vs cheapest
GLM-5.2 (Z.ai list)	$1.40	$4.40	$3.86	1.0x (baseline)
Claude Sonnet 5 (intro, through Aug 31)	$2.00	$10.00	$7.00	1.8x
GPT-5.6 Terra (preview pricing)	$2.50	$15.00	$9.75	2.5x
Claude Sonnet 5 (from Sep 1, 2026)	$3.00	$15.00	$10.50	2.7x
Claude Opus 4.8	$5.00	$25.00	$17.50	4.5x
GPT-5.5 (standard, under 272K context)	$5.00	$30.00	$19.50	5.1x
Claude Fable 5	$10.00	$50.00	$35.00	9.1x
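The table above is one multiplication per row. A minimal sketch, with illustrative names and the list prices copied from the table:

```python
PRICES = {  # model: (USD per Mtok in, USD per Mtok out), July 2026 list
    "GLM-5.2": (1.40, 4.40),
    "Claude Sonnet 5 (intro)": (2.00, 10.00),
    "GPT-5.6 Terra (preview)": (2.50, 15.00),
    "Claude Sonnet 5 (from Sep 1)": (3.00, 15.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Fable 5": (10.00, 50.00),
}
TURN_IN, TURN_OUT, TURNS = 1_500, 400, 1_000   # standardized agent turn


def cost_per_turns(price_in, price_out, turns=TURNS):
    """USD for `turns` standardized turns at the given per-Mtok prices."""
    return (TURN_IN * price_in + TURN_OUT * price_out) * turns / 1e6


costs = {model: cost_per_turns(*price) for model, price in PRICES.items()}
cheapest = min(costs.values())

for model, cost in costs.items():
    print(f"{model}: ${cost:.2f} ({cost / cheapest:.1f}x)")
```

Changing `TURN_IN` and `TURN_OUT` to your own per-turn token counts shifts the multipliers, because output-heavy turns weigh the output price more heavily.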

Cost per 1,000 agent turns · July 2026 list prices

Source: vendor list prices (Z.ai, Anthropic, OpenAI) verified Jul 3, 2026 · cost per 1,000 agent turns of 1,500 in / 400 out tokens · Digital Applied arithmetic

GLM-5.2 · $1.40 / $4.40 per Mtok · Z.ai list: $3.86
Sonnet 5 (intro) · $2 / $10 · through Aug 31, 2026: $7.00
GPT-5.6 Terra (preview) · $2.50 / $15 · not GA, subject to change: $9.75
Sonnet 5 (from Sep 1) · $3 / $15 · standard rate: $10.50
Opus 4.8 · $5 / $25 · 1M context: $17.50
GPT-5.5 · $5 / $30 · standard, under 272K context: $19.50
Fable 5 · $10 / $50 · cache reads $1/M, Batch $5/$25: $35.00

The Sonnet 5 caveat — mandatory for run-cost math

Anthropic’s own framing of the introductory price is unusually candid: “The introductory pricing is set so that the transition to Sonnet 5 is roughly cost-neutral.” The reason is the new tokenizer, which Anthropic says produces approximately 30% more tokens for the same text (up to 1.35x in some analyses) — so the sticker drop from Sonnet 4.6 is partly offset by higher token counts per request. After August 31, 2026 the rate itself rises to $3/$15, and the tokenizer inflation compounds on top of it. Vendor-stated framing; budget on effective per-request cost, not the sticker.
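To budget on effective cost rather than the sticker, one rough approach is to scale the list price by the tokenizer inflation factor. This is a sketch under the vendor-stated 30% figure; it applies when your token counts were measured under an older tokenizer, and the function name is illustrative.

```python
TOKEN_INFLATION = 1.30   # ~30% more tokens for the same text (vendor-stated)


def effective_price(sticker_per_mtok: float,
                    inflation: float = TOKEN_INFLATION) -> float:
    """Price per million old-tokenizer tokens' worth of the same text."""
    return sticker_per_mtok * inflation


SONNET_5 = {"intro": (2.0, 10.0), "from Sep 1": (3.0, 15.0)}

for label, (p_in, p_out) in SONNET_5.items():
    print(label, round(effective_price(p_in), 2), round(effective_price(p_out), 2))
# intro 2.6 13.0
# from Sep 1 3.9 19.5
```

Passing `inflation=1.35` gives the upper bound mentioned above.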

Three labeling notes so this table stays honest. GPT-5.6 Terra is preview-only — OpenAI positions it as delivering competitive performance to GPT-5.5 at roughly half the price, but it is not GA and the $2.50/$15 rate is preview pricing, subject to change. Claude Fable 5 returned to full global availability on July 1, 2026 after a June 12–30 export-control suspension — worth knowing before treating the premium tier as a stable default. And GLM-5.2’s $1.40/$4.40 is Z.ai’s list price; third-party hosts run $0.93–$1.40 in and $3.00–$4.40 out, a spread we map provider-by-provider in our GLM-5.2 pricing comparison. For the running picture across every major vendor, our LLM pricing index tracks these rates as they move.

06 — Market Context

The price war that keeps re-dating this index.

Why date a cost index to the month? Because the inputs keep moving. Forbes reported in June 2026 that Anthropic cut Opus pricing 67% at the Opus 4.5 launch in November 2025, that OpenAI’s Flex processing offers a 50% discount against standard rates, and, citing underlying usage data, that business token consumption grew roughly 1,001% between January 2025 and April 2026 while token spend grew only about 497%. Volume is outrunning spend because per-token prices are falling, though not fast enough to offset usage growth: those two figures imply a blended unit price roughly 46% lower over the period, while spend still rose about sixfold. Secondary trackers reportedly put the blended cost decline at roughly $18.40 to $6.07 per million tokens between Q1 2025 and Q1 2026.
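The implied unit-price change can be backed out of the two growth figures quoted above. This is a back-of-envelope sketch: it treats the percentages as exact, and the tracker comparison covers a different window, so the two results are not expected to match.

```python
def growth_to_multiple(pct_growth: float) -> float:
    """A 1,001% increase means the end value is 11.01x the start value."""
    return 1 + pct_growth / 100


volume_x = growth_to_multiple(1001)   # token consumption, Jan 2025 to Apr 2026
spend_x = growth_to_multiple(497)     # token spend over the same period

# Spend = volume x unit price, so the unit-price multiple is spend / volume.
implied_price_change = spend_x / volume_x - 1

# Blended USD per million tokens, Q1 2025 to Q1 2026 (secondary trackers).
tracker_change = 6.07 / 18.40 - 1

print(f"{implied_price_change:.0%}", f"{tracker_change:.0%}")  # -46% -67%
```

Either way the direction is the same: unit prices fell sharply, and total spend still rose roughly sixfold.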

Falling unit prices do not automatically mean falling bills. The same Forbes piece reported that Uber consumed its entire 2026 AI coding budget in four months, with per-engineer costs for tools like Claude Code and Cursor running $500–$2,000 a month. Cheaper tokens invite more usage; without stated volumes and a cap, the budget conversation happens after the money is gone. That failure mode — cost without a traceable line to value — is the same one behind the pilot-graveyard statistics we unpacked in our analysis of why most agent pilots never reach production.

"If you're not actually able to draw a direct line to useful features and functionality you're shipping to your users, that trade becomes harder to justify."— Andrew Macdonald, COO, Uber — quoted in Forbes, June 11, 2026

The enterprise stakes — Gartner

Gartner’s July 1, 2026 press release estimates that up to $234 billion of enterprise application software spend — roughly 20% of enterprise application SaaS spend — is exposed to “agentic arbitrage” between now and 2030. “Agentic AI changes the economics of software,” as Gartner’s George Brocklehurst puts it. The same firm predicts more than 40% of agentic AI projects will be canceled by end-2027 on escalating costs and unclear value — both claims are Gartner’s analysis, and together they frame the stakes: the economics are real, and so is the failure rate for teams that don’t cost their builds honestly.

Projecting forward from the July snapshot: at least three events will move this index before year-end, though only the first has a fixed date. Sonnet 5’s introductory pricing ends August 31, taking the mid-market reference rate from $2/$10 to $3/$15, on top of a tokenizer that produces more tokens per request. GPT-5.6 Terra will exit preview at some point, and if its GA pricing holds near $2.50/$15 it undercuts the post-intro Sonnet rate directly. And if the past 18 months of cuts are any guide, at least one vendor reprices again before January. The build side of the index moves slower (day rates don’t drop 67% in a quarter), which is precisely why the run-cost side needs a dated, recomputable formula rather than a static range.
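The end of Sonnet 5's introductory pricing can be priced directly from the stated volumes. A minimal sketch, holding every other line item and the token counts constant; the names are illustrative.

```python
TOKENS_MTOK = {  # class: (input Mtok, output Mtok) per month, as stated
    "Simple task agent": (12, 4),
    "RAG-workflow agent": (120, 18),
    "Multi-agent system": (420, 84),
}
INTRO, STANDARD = (2, 10), (3, 15)   # Sonnet 5 USD per Mtok (input, output)


def token_line(mtok, price):
    """Monthly token spend in USD for (input, output) volumes and prices."""
    return mtok[0] * price[0] + mtok[1] * price[1]


for name, mtok in TOKENS_MTOK.items():
    before = token_line(mtok, INTRO)
    after = token_line(mtok, STANDARD)
    print(name, before, after, after - before)
# Simple task agent 64 96 32
# RAG-workflow agent 420 630 210
# Multi-agent system 1680 2520 840
```

Because both rates rise by half, the token line rises 50% in every class; the effect on the run median is largest for the multi-agent system, where tokens are the biggest share.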

07 — Budgeting Levers

Four levers that move your number.

The index is a starting grid, not a quote. Four substitutions adapt it to any project — and each is a genuine decision, not a formality.

Lever 01

Swap the model

The 1x–9.1x per-turn spread is the biggest run-cost dial. Route routine turns to a budget model and reserve premium models for the reasoning-heavy fraction — a two-tier stack can cut the token line hard without capping capability.

Biggest run-cost impact

Lever 02

Swap the volumes

Our stated workloads (10K/30K/120K calls per month) drive every token figure. Replace them with your real traffic and recompute — then set a hard monthly token budget so growth is a decision, not a surprise.

Do this before anything else

Lever 03

Swap the day rate

Our $900/day is a stated Digital Applied assumption from senior-led, agent-assisted delivery. A US contract-senior staffing model or an offshore team model will produce a different — and equally recomputable — build band.

Changes build cost only

Lever 04

Question the build itself

Some agents shouldn’t be built — an off-the-shelf tool or a paid feature that funds its own run cost can beat a custom build. Run the build-vs-buy test before pricing engineering days.

The zero-cost option
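For Lever 01, the effect of a two-tier stack can be estimated as a weighted average of per-turn costs. This is a sketch only: the model pairing and the 20% premium share are hypothetical choices for illustration, and the per-1,000-turn costs are taken from the sensitivity table.

```python
COST_PER_1K_TURNS = {   # USD, from the model sensitivity table
    "GLM-5.2": 3.86,
    "Claude Opus 4.8": 17.50,
}


def blended_cost(premium_share: float,
                 budget: str = "GLM-5.2",
                 premium: str = "Claude Opus 4.8") -> float:
    """Cost per 1,000 turns when `premium_share` of turns use the premium model."""
    if not 0.0 <= premium_share <= 1.0:
        raise ValueError("premium_share must be between 0 and 1")
    return ((1 - premium_share) * COST_PER_1K_TURNS[budget]
            + premium_share * COST_PER_1K_TURNS[premium])


for share in (0.0, 0.2, 1.0):   # 0.2 is a hypothetical routing split
    print(share, round(blended_cost(share), 2))
# 0.0 3.86
# 0.2 6.59
# 1.0 17.5
```

The sketch ignores the cost of the routing decision itself and any quality difference between tiers, both of which a real evaluation would need to measure.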

Two levers that mostly don’t work in 2026 deserve a warning label. Self-hosting an open-weight model to escape API pricing sounds like a run-cost lever, but the hardware reality is brutal at the frontier — GLM-5.2 at FP16 needs roughly 1.57TB of VRAM, on the order of 25 A100-80GB cards — so for most SMB and mid-market buyers it isn’t a real option; our self-hosting decision guide covers when it genuinely beats API pricing. And squeezing the maintenance line to zero is the false economy behind most agent decay: the oversight days are what keep the agent worth running. As Alex Salazar, CEO of Arcade.dev, wrote in CIO.com in December 2025: “Stop chasing moonshots. These vaguely scoped, overgeneralized agent dreams are often expensive, they rarely ship and they burn resources faster than they can create value.” Scope tightly, cost honestly. For the upstream question of whether to build at all, see the build-vs-buy decision — and if the agent faces customers, consider charging for AI-built features to cover run cost.

08 — Conclusion

Own the formula, not the range.

The index, restated

A cost model you can recompute beats a range you have to trust.

The July 2026 numbers: a simple task agent at $5,400–$9,000 to build and ~$800 a month to run; a RAG-workflow agent at $13,500–$22,500 and ~$2,700; a multi-agent system at $27,000–$45,000 and ~$6,250. Every cell derives from stated engineering days × $900/day and stated token volumes × verified July 2026 prices — swap any assumption and the index recomputes for your project.

The structural finding matters more than any single band: tokens have stopped being the budget driver. At current prices the model line is 8–27% of the monthly run median across our three classes, serverless compute is effectively a rounding error, and the largest recurring cost is the senior oversight that keeps an agent accurate. Teams still budgeting agents as “an API bill plus hosting” are pricing the smallest lines and ignoring the biggest one.

This page is a snapshot, deliberately dated. Sonnet 5’s introductory pricing ends August 31, 2026; GPT-5.6 Terra is still in preview; the price war has more rounds in it. When the inputs move, re-run the formula — that’s the entire point of publishing one.

The AI Agent Build & Run Cost Index 2026
