Development · New Release · 11 min read · Published June 11, 2026

One pre-push command · ~90s avg review · vendor-stated −22% run cost

Cursor Bugbot Reviews in 90 Seconds: The June Update

Cursor's June 10, 2026 update cut Bugbot's average review from about five minutes to roughly 90 seconds, powered by Composer 2.5 — while finding 10% more bugs per run at a 22% lower run cost (all vendor-stated). The new /review command moves the check before you push. Here is what changed and how lean teams should use it.

Digital Applied Team
Senior strategists · Published June 11, 2026
Sources: Cursor changelog + blog
Avg review time: ~90s (was ~5 min; over 3x faster)
Bugs per run (default): 0.62 (up from 0.56; +10% found)
Run cost change: −22% per Bugbot run
Runs under 3 min: 90% (vendor-stated)

Cursor Bugbot now reviews code in about 90 seconds. On June 10, 2026, Cursor shipped an update that — by the company's own measurements — cut Bugbot's average review time from roughly five minutes to around 90 seconds, while finding 10% more bugs per run at a 22% lower cost per run. For teams already leaning on AI code review, the change moves the tool from "nice async check" to "fast enough to run before every push."

Speed is the headline, but the more interesting shift is where the review happens. The same release window introduced a /review command that runs Bugbot before you open a pull request, syncing with GitHub and GitLab so it does not double-charge you to re-review an identical diff. That reframes Bugbot from a PR-stage reviewer into a pre-push gate — the point in the workflow where catching a bug is cheapest.

This guide covers exactly what the June update ships, the Composer 2.5 model that now powers Bugbot, the cost-per-bug-caught math for a lean agency team, where each review stage fits in the pipeline, and the due-diligence caveats every figure below carries. Every number is sourced from Cursor's changelog and blog posts and is vendor-stated unless noted otherwise.

Key takeaways
  1. Reviews dropped from ~5 minutes to ~90 seconds. Cursor states the June 10 update made Bugbot over 3x faster on average, with 90% of runs now finishing in under three minutes. The speed comes from Composer 2.5 now powering Bugbot.
  2. More bugs found, lower cost per run (vendor-stated). Cursor reports 0.62 bugs per default run, up from 0.56 (about 10% more), and roughly 22% lower cost per run. These are run-cost figures, separate from the June 1 Teams seat-pricing change.
  3. The /review command moves the check before the push. Available in Cursor 3.7+ and on cursor.com/agents, /review runs Bugbot and Security Review locally. If you later open a PR with the same diff, Bugbot recognizes it, skips the duplicate review, and notes it already looked.
  4. The cost-per-bug math now clears most ROI thresholds. At a vendor-stated $1.00-$1.50 per run and 0.62 bugs found per default review, the cost per caught bug lands in low single-digit dollars, trivial against the widely cited four-to-five-figure cost of a production incident.
  5. Vendor benchmarks need scrutiny, not blind trust. CursorBench is Cursor's own eval; on independent SWE-Bench Multilingual, Composer 2.5 sits just behind Claude Opus 4.7, not ahead. Treat headline numbers as a starting point and benchmark on your own repos.

01 · What Shipped
Faster, cheaper, and earlier in the workflow.

The June 10, 2026 Bugbot update — published by Cursor team members Jason Smale, Yuri Volkov, and Michael Zhao — bundles three changes that reinforce one another. Reviews got dramatically faster, each run got cheaper and slightly more thorough, and a new command lets you trigger a review before a pull request even exists.

Two smaller-but-useful capabilities round out the release. Bugbot can now be configured to review only the changes since its previous run, keeping feedback focused on the latest delta instead of re-scanning the whole changeset. And the duplicate-diff detection tied to GitHub and GitLab sync means a diff reviewed locally with /review is not re-billed when the matching PR opens.

Speed: ~90-second reviews (was ~5 min · over 3x faster)
Cursor states the average Bugbot review now completes in roughly 90 seconds, with 90% of runs finishing in under three minutes. Vendor-stated, June 10, 2026. Source: cursor.com/changelog

Thoroughness: 0.62 bugs/run (up from 0.56 · default effort, post-update)
On the same update, Cursor reports default-effort runs now surface 0.62 bugs each, up from 0.56 (about 10% more), at roughly 22% lower cost per run.

Timing: /review command (Cursor 3.7+ · pre-push gate · released June 5, 2026)
Run Bugbot and Security Review before pushing. Shortcuts /review-bugbot and /review-security target a single agent. Available in 3.7+ and on cursor.com/agents; CLI support is coming soon.

Update snapshot
The June 10, 2026 Bugbot update is vendor-stated at: average review time down from ~5 minutes to ~90 seconds (over 3x faster, 90% of runs under three minutes), 0.62 bugs found per default run (up from 0.56), and roughly 22% lower cost per run — with average runs costing $1.00-$1.50 depending on PR size. The improvements are attributed to Composer 2.5 now powering Bugbot. These are run-economics changes, distinct from the separate June 1 Teams seat-pricing restructure.
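
The headline ratios in the snapshot follow directly from the vendor-stated inputs. The short Python sketch below rechecks that arithmetic; the constant and function names are ours, and the inputs are Cursor's rounded figures ("about five minutes", "roughly 90 seconds"), so treat the outputs as approximate.

```python
# Vendor-stated inputs from the June 10, 2026 changelog (rounded values).
OLD_REVIEW_SECONDS = 5 * 60   # "~5 minutes"
NEW_REVIEW_SECONDS = 90       # "~90 seconds"
OLD_BUGS_PER_RUN = 0.56
NEW_BUGS_PER_RUN = 0.62


def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new review is than the old one."""
    return old_seconds / new_seconds


def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100


if __name__ == "__main__":
    print(f"speedup: {speedup(OLD_REVIEW_SECONDS, NEW_REVIEW_SECONDS):.1f}x")
    print(f"bugs per run: {percent_change(OLD_BUGS_PER_RUN, NEW_BUGS_PER_RUN):+.1f}%")
```

With these inputs the speedup works out to about 3.3x and the bugs-per-run gain to about 10.7%, consistent with Cursor's "over 3x faster" and "about 10% more" wording.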

02 · The Speed Jump
From a coffee break to a single breath.

A five-minute review is something you wait for; a 90-second review is something you run reflexively. That is the practical meaning of Cursor's vendor-stated "over 3x faster" claim. When a check costs you 90 seconds and 90% of runs finish in under three minutes, it stops being a deliberate decision and becomes part of muscle memory — the same way a fast test suite gets run far more often than a slow one.

Chart: Bugbot review latency, before vs after the June 10 update (source: Cursor Bugbot changelog, June 10, 2026; vendor-stated)
  • Pre-update average review: ~5 minutes per run
  • Post-update average review: ~90 seconds per run (over 3x faster)
  • Share of runs under 3 minutes, post-update: 90%

The downstream effect is a behavior change, not just a time saving. Industry data on AI-assisted review points in the same direction: an analysis of the state of AI code review in 2026 reports that 47% of professional developers used AI-assisted code review in 2025, and that repositories using it showed roughly 32% faster merge times and 28% fewer post-merge defects versus human-only review (attributed to Stack Overflow and GitHub 2025 data respectively). The faster the tool, the more often it runs, and the earlier defects surface.

"A faster, less expensive, more thorough Bugbot allows you to find issues sooner and merge code faster."— Jason Smale, Yuri Volkov & Michael Zhao, Cursor team

03 · The Engine
Composer 2.5 is what powers the new Bugbot.

In Cursor's words, the gains are made possible by progress training Composer 2.5, which now powers Bugbot. Composer 2.5 launched on May 18, 2026 and, per Cursor, was trained on 25x more synthetic tasks than the prior Composer. We covered the launch in depth in our Composer 2.5 agent-coding launch analysis; the short version is that it is the model that now sits behind both Cursor's agent and Bugbot.

Two facts matter when you read the benchmark numbers. First, Composer 2.5 is exclusive to the Cursor IDE — it is not available through public APIs, Amazon Bedrock, Google Vertex, or OpenRouter, so you cannot wire it into your own pipeline outside Cursor. Second, its headline CursorBench score is a vendor-controlled benchmark: CursorBench is Cursor's own eval, not an independent one.

CursorBench v3.1 (vendor-controlled eval, Cursor's own benchmark): 63.2%
Composer 2.5 scores 63.2% on Cursor's own CursorBench v3.1, vendor-reported above Claude Opus 4.7 (61.6%) and GPT-5.5 (59.2%). Treat as vendor-stated, not independently verified.

SWE-Bench Multilingual (independent yardstick): 79.8% (−0.7 vs Opus 4.7)
On the more independent SWE-Bench Multilingual, Composer 2.5 lands at 79.8%, about 0.7 points behind Claude Opus 4.7 at 80.5%. Near-parity, not superiority.

Cost vs Opus 4.7 (vendor cost claim): ~1/10
Cursor positions Composer 2.5 at roughly a tenth of Claude Opus 4.7's per-task cost on CursorBench. The cost story, more than raw capability, is what makes a cheaper Bugbot run feasible. Vendor-stated.

The picture is nuanced rather than triumphant, and that is the point. On Cursor's own benchmark Composer 2.5 edges ahead of Opus 4.7; on the more independent SWE-Bench Multilingual it sits just behind; and on Terminal-Bench 2.0 the two are essentially tied. What is not in dispute is the cost trajectory — a substantially cheaper model underneath is precisely what lets a more thorough review also be a cheaper one. For a fuller treatment of why vendor-controlled scores deserve a second look, see our analysis of the vendor-controlled CursorBench methodology.

04 · The Pre-Push Gate
/review moves the check before the push.

Most coverage treats Bugbot as a pull-request tool. The /review command, introduced with Cursor 3.7 on June 5, 2026, reframes it as a pre-push gate. You run it locally before pushing; it executes Bugbot and Security Review and prompts you to select an agent. The dedicated shortcuts /review-bugbot and /review-security target one or the other directly. It is available inside Cursor 3.7+ and on cursor.com/agents, with CLI support stated as coming soon, so it is not yet a CLI command you can drop into a shell script.

The duplicate-diff detection is what makes the pre-push habit economical. If you run /review locally and then open a pull request with the same diff, Bugbot recognizes the duplication, skips the PR review, and posts a comment noting it has already reviewed that diff. You get the early signal without paying twice for the identical change.
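
Cursor has not published how Bugbot recognizes an already-reviewed diff, so the following is only a sketch of one plausible approach: fingerprint the normalized diff text and skip the PR review when the fingerprint has been seen. Every name here (`diff_fingerprint`, `record_local_review`, `should_review_pr`, the in-memory set) is hypothetical and is not Cursor's API or implementation.

```python
import hashlib

# Hypothetical in-memory record of diffs that already received a local review.
# A real system would persist this somewhere shared with the PR integration.
reviewed_fingerprints: set[str] = set()


def diff_fingerprint(diff_text: str) -> str:
    """Hash a diff after normalizing line endings and trailing whitespace."""
    normalized = "\n".join(line.rstrip() for line in diff_text.splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def record_local_review(diff_text: str) -> None:
    """Remember that this diff was reviewed before the push."""
    reviewed_fingerprints.add(diff_fingerprint(diff_text))


def should_review_pr(diff_text: str) -> bool:
    """Return False when an identical diff was already reviewed locally."""
    return diff_fingerprint(diff_text) not in reviewed_fingerprints
```

The normalization step is a design choice in this sketch: it keeps a change in line endings from being billed as a new diff, while any change to the actual content produces a new fingerprint and a fresh review.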

Why pre-push is the high-value moment

The earlier a defect is caught, the cheaper it is to fix: a bug found before merge is dramatically less costly than the same bug found in production. A ~90-second review before every push that surfaces 0.62 bugs per run on average (vendor-stated) changes the calculus for small teams with no dedicated QA function. As Cursor itself notes about agent output, AI-generated code can look right while being subtly wrong, which is exactly why a fast pre-push read of the diff matters more as agent-written code increases.

05 · The ROI Math
The cost-per-bug-caught math agencies actually need.

Here is the calculation most Bugbot coverage skips. Cursor states an average run costs $1.00-$1.50 depending on PR size (a 500-line PR runs around $1.20; a 5,000-line PR can exceed $4), and that default-effort runs find 0.62 bugs on average. Dividing a run cost by bugs found gives an effective cost per caught bug in the rough range of $1.60-$2.40 at default effort. The numbers below are illustrative arithmetic from those vendor-stated inputs, not a Cursor-published table — actual figures vary with PR size and effort level.

Illustrative Bugbot cost-per-bug-caught estimate by effort level, derived from vendor-stated run cost and bugs-per-run figures. Default-effort figures are from the June 10, 2026 changelog; the high-effort bugs-per-run figure is from the separate May 11, 2026 effort-levels post and high-effort run cost is an estimate.
| Effort level | Bugs / run (avg) | Est. cost / run | Est. cost / bug caught | Best used on |
|---|---|---|---|---|
| Default | 0.62 | $1.00-$1.50 | ~$1.60-$2.40 | Everyday feature-branch PRs (most volume) |
| High | 0.95* | higher (slower, more compute) | not directly comparable* | Infra, backend, billing, auth PRs |
| Custom | natural-language routing | varies by rule | varies by rule | Teams with per-path review policies |

Read the asterisk carefully. The 0.62 default figure is from the June 10 changelog (post-Composer 2.5). The 0.95 high-effort figure comes from a different post, the May 11, 2026 effort-levels changelog, which used its own test set and reported a different default baseline (0.7 bugs per run) in that context. Those two default baselines, 0.62 and 0.7, are not the same measurement and should not be read as a before-and-after trend. We keep them separate deliberately, which is why the 0.7 figure does not appear in the table.
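
The default-effort row can be reproduced in a few lines. This is a minimal sketch of the division described above, using only the vendor-stated run cost and bugs-per-run inputs; the function name `cost_per_bug` is ours, and the result is an average that will not hold for any individual PR.

```python
DEFAULT_BUGS_PER_RUN = 0.62  # vendor-stated, June 10, 2026 changelog


def cost_per_bug(run_cost_usd: float, bugs_per_run: float) -> float:
    """Return the average cost per caught bug for a given run cost."""
    if bugs_per_run <= 0:
        raise ValueError("bugs_per_run must be positive")
    return run_cost_usd / bugs_per_run


if __name__ == "__main__":
    low = cost_per_bug(1.00, DEFAULT_BUGS_PER_RUN)   # cheap end of the range
    high = cost_per_bug(1.50, DEFAULT_BUGS_PER_RUN)  # expensive end of the range
    print(f"${low:.2f} to ${high:.2f} per caught bug")
```

At $1.00 and $1.50 per run this gives roughly $1.61 and $2.42, which is where the rounded $1.60-$2.40 range in the table comes from.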

Even at the conservative end, the economics are not close. Treat the widely cited cost of a single production incident as a four- to five-figure number: analyst write-ups on AI review put the avoided cost per prevented production incident in the $5,000-$15,000 range and report that most teams recoup their investment within the first quarter. Against that, a couple of dollars per caught bug at default effort is a rounding error. The decision is no longer whether the ROI clears; it is which PRs deserve the more expensive high-effort pass.

Industry benchmark, not a guarantee
The $5,000-$15,000 avoided-cost-per-prevented-incident figure and the "recoup within Q1" framing come from a third-party 2026 state-of-AI-code-review write-up, drawing on Stack Overflow and GitHub 2025 data. Treat them as industry benchmarks for sizing the opportunity — not as a promise about your specific codebase. Your real number depends on incident frequency, severity, and how much of your code is agent-written.
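
One way to size that gap for your own team is to ask how many Bugbot runs a single avoided incident would pay for. The sketch below uses the third-party $5,000-$15,000 range and the vendor-stated $1.00-$1.50 run cost as inputs; `break_even_runs` is our own helper, and the result says nothing about how often a run actually prevents an incident.

```python
def break_even_runs(incident_cost_usd: float, run_cost_usd: float) -> float:
    """Return how many runs cost the same as one avoided production incident."""
    if run_cost_usd <= 0:
        raise ValueError("run_cost_usd must be positive")
    return incident_cost_usd / run_cost_usd


if __name__ == "__main__":
    # Conservative case: cheapest incident, most expensive average run.
    conservative = break_even_runs(5_000, 1.50)
    # Optimistic case: costliest incident, cheapest average run.
    optimistic = break_even_runs(15_000, 1.00)
    print(f"{conservative:,.0f} to {optimistic:,.0f} runs per avoided incident")
```

On those inputs, one prevented incident covers somewhere between about 3,333 and 15,000 default-effort runs. Substitute your own incident cost and observed run cost before drawing conclusions.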

06 · Pipeline Placement
Three review stages, three jobs.

Cursor's documentation frames pre-push review, PR-opened review, and CI-stage review as separate features. In practice they form a decision matrix: each stage catches a different class of problem and suits a different kind of change. The faster pre-push review does not replace PR-level analysis — it front-loads the cheap catches so the PR review can focus on cross-file regressions.

Pre-push: /review-bugbot before you push (default on feature branches)
Roughly 90 seconds, run locally on the working diff. The cheapest catch point. Best for everyday feature-branch work where you want a fast read before the change ever reaches a PR.

PR-opened: GitHub / GitLab Bugbot auto-review (keep for cross-file review)
Fires when a PR opens; skips the run if it already reviewed an identical diff via /review. Best for cross-file regressions and changes that only make sense in full PR context.

CI / CD gate: CI-stage review (reserve for must-block paths)
A blocking gate in your pipeline for changes that must not merge unreviewed. Complements rather than replaces the earlier stages. See our Vercel Agent walkthrough for a CI-stage pattern.

High effort: escalate the risky PRs (infra / auth / billing only)
Use high or custom effort only on infrastructure, backend, billing, and auth changes where the cost delta is justified. Default effort handles the bulk of routine PRs.
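
The escalation rule for risky PRs can be written down as a small routing function. This is an illustrative sketch only: the path patterns are hypothetical examples of where infrastructure, backend, billing, and auth code might live in a repo, and the function merely returns a label. How that label is handed to Bugbot is not covered by the sources cited in this post.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns for high-risk areas; adjust to your repo layout.
HIGH_EFFORT_PATTERNS = ("infra/*", "backend/*", "billing/*", "auth/*")


def pick_effort(changed_paths: list[str]) -> str:
    """Return "high" if any changed file sits in a high-risk area, else "default"."""
    for path in changed_paths:
        if any(fnmatchcase(path, pattern) for pattern in HIGH_EFFORT_PATTERNS):
            return "high"
    return "default"


if __name__ == "__main__":
    print(pick_effort(["web/src/button.tsx"]))
    print(pick_effort(["web/src/cart.ts", "billing/invoice.py"]))
```

A single high-risk file is enough to escalate the whole PR in this sketch, which errs on the side of the more thorough review for mixed changes.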

If you are deciding when AI code review belongs in your workflow at all, our guide to where AI code review fits into your workflow walks through the trade-offs, and the Vercel Agent tutorial covers a CI-stage AI code review setup that pairs naturally with Bugbot's earlier stages.

07 · Effort Levels
What the effort dial actually changes.

Bugbot's effort levels shipped earlier, on May 11, 2026, and are worth understanding in their own right — but their figures live in their own context. In that May post, Cursor reported a Default level finding 0.7 bugs per run at a 79% resolution rate, a High level at 0.95 bugs per run (about 36% more than Default, but more expensive and slower), and a Custom level that routes effort via natural-language rules. Cursor noted it uses high effort for changes to its own infrastructure and backend.

A deliberate note on the numbers. The 0.7 default-effort figure here and the 0.62 default figure from the June 10 speed update are not the same metric — they come from different posts using different test sets or measurement periods. Use 0.62 when discussing the June 10 speed update, and 0.7 only inside this May effort-levels context. Cursor’s internal measurement methodology is not fully transparent, so resist the temptation to chart them as a single trend line.

Chart: Bugbot effort levels, bugs found per run, May 11 context only (source: Cursor effort-levels changelog, May 11, 2026; vendor-stated, separate test set from the June 10 figures)
  • Default effort (May context): 0.7 bugs/run · 79% resolution rate
  • High effort (May context): 0.95 bugs/run · ~36% more than default

08 · Reading The Claims
The due-diligence note a thoughtful buyer would want.

None of this is a reason to dismiss Bugbot — it is a reason to read the numbers correctly. Every speed, cost, and bug-count figure in this post is vendor-stated by Cursor. CursorBench is Cursor's own benchmark; on the more independent SWE-Bench Multilingual, Composer 2.5 sits just behind Claude Opus 4.7 rather than ahead. And on the broader point of transparency, Cursor co-founder Aman Sanger publicly acknowledged it was a miss not to disclose the Moonshot AI Kimi K2.5 base when the earlier Composer version launched. That candor is itself reassuring, but it is also a reminder to verify vendor claims against your own runs.

Developer sentiment is mixed in the honest way you would expect. Some report that Bugbot misses issues other tools catch; others say the new Composer made them a returning subscriber. The signal for a buyer is not to pick a side from forum threads but to run a short internal trial: turn on /review-bugbot across a sprint, log what it catches and misses against your actual review process, and decide from your data rather than the changelog.
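
For the sprint-long trial, a spreadsheet is enough, but the tallies are easy to sketch in code. The example below assumes a hand-kept log in which a human reviewer marks each finding or known issue as real or not; the `TrialEntry` fields and the `summarize` helper are our own illustration, not part of any Cursor tooling.

```python
from dataclasses import dataclass


@dataclass
class TrialEntry:
    bugbot_flagged: bool  # did Bugbot raise this issue?
    real_issue: bool      # did a human reviewer confirm it as a real bug?


def summarize(entries: list[TrialEntry]) -> dict:
    """Tally caught, missed, and false-alarm issues from a trial log."""
    caught = sum(1 for e in entries if e.bugbot_flagged and e.real_issue)
    missed = sum(1 for e in entries if not e.bugbot_flagged and e.real_issue)
    false_alarms = sum(1 for e in entries if e.bugbot_flagged and not e.real_issue)
    real_total = caught + missed
    flagged_total = caught + false_alarms
    return {
        "caught": caught,
        "missed": missed,
        "false_alarms": false_alarms,
        # Share of real issues that Bugbot found; None if no real issues logged.
        "catch_rate": caught / real_total if real_total else None,
        # Share of Bugbot flags that were real; None if Bugbot flagged nothing.
        "precision": caught / flagged_total if flagged_total else None,
    }
```

The catch rate only means something if misses are logged honestly, which requires your normal human review to keep running alongside Bugbot for the length of the trial.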

"it tells me there's no issues (but claude or copilot both find real things)"— rcleveng, Hacker News thread on Composer 2.5

Balance that against the other side of the same conversation. Whatever the headline benchmarks claim, the experience that matters is the one your team has on your repos — which is why we always recommend a short, logged trial before changing a default.

"Composer 2.5 is fast and effective...I was ready to end my subscription a week ago, and now I'm back."— jmcqk6, Hacker News thread on Composer 2.5

09 · For Lean Teams
What a boutique agency should actually do.

For a boutique team running Cursor across two or three active projects with tight sprint cycles, a sub-two-minute pre-push review is worth more than waiting for CI to flag the same issues. The workflow we would set as the default is simple: always run /review-bugbot pre-push on feature branches as a cheap, fast gate; rely on GitHub Bugbot for PR-level analysis that catches cross-file regressions; and reserve high effort for the infrastructure, billing, and auth PRs where the extra cost is obviously justified.

Looking forward, the trajectory is clear: as more of a codebase is written by agents, the review stage becomes the load-bearing quality control, and the cheapest place to run it is before the push. A 90-second gate that costs a dollar or two per run and surfaces about 0.62 bugs per run on average (vendor-stated) is the kind of small, compounding discipline that separates teams who ship agent-written code safely from those who clean it up in production. Pairing that gate with senior human judgment on the diffs is exactly how we run agentic web development engagements, and the same approach extends to our broader AI transformation work.

For the wider context on how Cursor, Claude Code, and Codex have shifted through the first half of the year, our H1 2026 AI coding retrospective and the Cursor 3.7 release that introduced Design Mode and voice input put this Bugbot update in its release-cycle frame.

10 · Conclusion
A faster gate at the cheapest moment to catch a bug.

The shape of AI code review, June 2026

The fix that matters is moving the check earlier, not just making it faster.

The June 10 Bugbot update reads as a speed story, but the durable change is one of timing. A review that takes 90 seconds and can run before you push turns code review into a reflex rather than a ceremony — and the duplicate-diff detection means doing it early costs you nothing extra when the PR follows. The Composer 2.5 engine underneath is what makes a more thorough review also a cheaper one.

Read the figures with the right caution. Every speed, cost, and bug-count number here is vendor-stated; CursorBench is Cursor's own eval; and on independent SWE-Bench Multilingual, Composer 2.5 trails Claude Opus 4.7 by a hair rather than leading. None of that undercuts the practical case — it just means you size the rollout from your own run logs, not the headline.

For a lean team, the move is unambiguous: make /review-bugbot the default pre-push gate on feature branches, keep PR-level Bugbot for cross-file regressions, and spend high effort only where a bug would be expensive. At a couple of dollars per caught bug against four- and five-figure production incidents, the ROI question is settled. The remaining work is discipline — running the gate every time, and keeping a senior human reading the diffs the way you always should with agent-written code.

FAQ · Cursor Bugbot June update

The questions we get every week.

Cursor states the June 10, 2026 update cut Bugbot's average review time from about five minutes to roughly 90 seconds — described as over 3x faster — with 90% of runs now finishing in under three minutes. The speed improvement is attributed to Composer 2.5, which now powers Bugbot. These are vendor-stated figures from Cursor's own changelog rather than independent measurements, so the practical takeaway is directional: reviews are fast enough to run before every push instead of being a deliberate wait. As always, confirm against your own run logs once you turn it on, since latency varies with pull-request size.