Development · 9 min read

Cursor Cloud Agents: Build and Test in Isolated VMs

Cursor Cloud Agents run AI coding tasks in isolated virtual machines with full environment access. Setup guide, PR workflow, and performance benchmarks.

Digital Applied Team
February 27, 2026
9 min read
  • 30%+ PR merge rate
  • Isolated VM per task
  • Video-recorded output
  • Async background execution

Key Takeaways

Isolated VMs prevent side effects: Each task runs in a dedicated virtual machine with its own filesystem, terminal, and package manager, so failed builds or broken dependencies never affect your local environment.
30%+ of generated PRs merge to production: Cursor's internal benchmarks show that Cloud Agents produce pull requests that pass CI and merge at rates above 30%, a meaningful improvement for async development workflows.
Video recordings show every step: Each completed task produces a video recording of the agent's actions inside the VM, giving developers full visibility into what changed, what was tested, and why decisions were made.
Background execution frees developer time: Tasks run asynchronously in the cloud while developers continue working locally, with notifications when the PR is ready for review.

AI-powered coding tools have shifted from autocomplete assistants to autonomous agents that write, test, and ship code independently. Cursor Cloud Agents represent the next step in that evolution: isolated virtual machines in the cloud that clone your repository, execute a coding task from start to finish, and deliver a pull request with a video recording of every action taken.

This guide covers the architecture behind Cloud Agents, how VM isolation prevents side effects, the video output and PR workflow, real-world performance benchmarks, supported languages and frameworks, and how Cloud Agents compare to working with Cursor locally. Whether you are evaluating Cloud Agents for your team or deciding how they fit alongside tools like Claude Code, this breakdown covers what matters.

What Are Cursor Cloud Agents?

Cursor Cloud Agents are autonomous coding agents that execute tasks inside cloud-hosted virtual machines. Unlike local Cursor, which runs the AI model alongside your editor on your machine, Cloud Agents offload the entire execution to a remote VM. You describe a task in natural language, the agent spins up a dedicated environment, clones your repository, installs dependencies, and begins working. When finished, it creates a pull request targeting your specified branch.

Assign

Describe the task in natural language. Reference specific files, functions, or acceptance criteria. The agent parses your instructions and plans an execution strategy.

Execute

The agent works in an isolated VM: reading files, writing code, running tests, installing packages, and iterating until the task is complete or it reaches a defined stopping point.

Deliver

A pull request is opened with the changes, a summary of what was done, and a video recording of the agent working inside the VM. You review, request changes, or merge.

The key distinction from local AI coding assistants is that Cloud Agents run asynchronously. You do not need to keep your editor open, watch the agent work, or stay online. Assign a task, continue with your own work, and receive a notification when the PR is ready. This makes Cloud Agents particularly effective for parallelizing independent tasks across a team or backlog.

VM Isolation Architecture

Each Cloud Agent task receives a fresh virtual machine with its own filesystem, terminal sessions, network stack, and package manager environment. The VM clones your repository at the current HEAD of your specified branch, runs your setup commands (such as npm install or pip install -r requirements.txt), and then begins executing the assigned task.

Isolation Guarantees
  • Filesystem isolation. Each VM has its own disk. File changes, new dependencies, and generated artifacts are scoped to that VM only.
  • Process isolation. Terminal commands run in the VM, not on your machine. A runaway process or infinite loop terminates when the VM shuts down.
  • Dependency isolation. Package installations and version changes are confined to the VM. If the agent upgrades a library that breaks your build, nothing outside that VM is affected.
  • Branch isolation. The agent works on a new branch created from your target. Your main branch and other feature branches remain untouched until you merge.

This isolation model mirrors how CI/CD pipelines work: each run gets a clean environment, executes in isolation, and produces artifacts without polluting shared state. The difference is that Cloud Agents are not just running tests. They are writing the code, running the tests on that code, and iterating until the tests pass.
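At a smaller scale, the ephemeral-environment pattern behind this isolation can be sketched in a few lines of Python. This is an illustrative analogy, not Cursor's implementation: each task gets a throwaway working directory, commands run inside it, and everything is discarded when the run ends.

```python
import shutil
import subprocess
import sys
import tempfile

def run_in_clean_env(cmd: list[str]) -> int:
    """Run a command in a throwaway working directory, CI-style.

    Every call gets a fresh directory; files the command creates
    never touch the caller's environment and are deleted afterwards.
    """
    workdir = tempfile.mkdtemp(prefix="agent-task-")
    try:
        result = subprocess.run(cmd, cwd=workdir)
        return result.returncode
    finally:
        shutil.rmtree(workdir)  # artifacts are scoped to this run only

# Each invocation is isolated from every other one: the scratch file
# below never appears in the caller's own working directory.
run_in_clean_env([sys.executable, "-c", "open('scratch.txt', 'w').write('x')"])
```

The same discipline scales up from a temp directory to a full VM: the stronger the boundary, the less a failed run can leak.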

Video Output and PR Workflow

One of the most distinctive features of Cloud Agents is the video recording attached to every completed task. The agent records its entire session inside the VM: opening files, writing code, running terminal commands, reading error messages, and iterating on solutions. This recording is attached to the pull request as a playable video.

What the Video Shows
  • Files opened and edited with visible diffs
  • Terminal commands executed and their output
  • Test runs with pass/fail results
  • Error handling and retry decisions

PR Contents
  • Code changes with standard GitHub diff view
  • Written summary of what was implemented
  • Link to the video recording of the session
  • Test results and build status from the VM

The video output addresses a common concern with autonomous coding agents: the black box problem. When an agent produces a diff, you see what changed but not why. The video provides the reasoning trail. You can watch the agent encounter an error, read the stack trace, decide on a fix, implement it, and verify the solution. This transparency makes code review faster because you understand the intent behind each change.

Performance: 30%+ PR Merge Rate

Cursor reports that Cloud Agents achieve a PR merge rate above 30% on internal benchmarks. This means roughly one in three pull requests generated by Cloud Agents passes CI, clears code review, and merges to the target branch without requiring significant manual intervention. For context, this is not a pass rate on toy benchmarks like SWE-bench. It measures real production codebases where the PR must satisfy linting, type checking, test suites, and human review.

What Affects Merge Rate
  • Task specificity. Well-scoped tasks with clear acceptance criteria produce higher merge rates than vague instructions like "improve the codebase."
  • Test coverage. Projects with existing test suites give the agent a feedback loop. It can run tests, see failures, and iterate before submitting the PR.
  • Codebase documentation. READMEs, inline comments, and type definitions help the agent understand project conventions and architectural patterns.
  • Task complexity. Bug fixes and feature additions in well-typed codebases merge at higher rates than architectural refactors or cross-cutting changes.

A 30% merge rate might seem low in isolation, but the economics shift when you consider that Cloud Agents work asynchronously and in parallel. Assigning five tasks simultaneously means an expected return of 1-2 merged PRs with no hands-on development time. For teams with large backlogs of well-defined tickets, this translates directly into throughput gains.
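The arithmetic behind that claim is a simple expected-value calculation. A quick sketch (the 30% figure is the benchmark number cited above; the five-task batch is illustrative):

```python
def expected_merged_prs(tasks: int, merge_rate: float) -> float:
    """Expected number of merged PRs from a batch of independent tasks."""
    return tasks * merge_rate

def prob_at_least_one_merge(tasks: int, merge_rate: float) -> float:
    """Chance that at least one task in the batch merges."""
    return 1 - (1 - merge_rate) ** tasks

# Five parallel tasks at a 30% merge rate:
print(expected_merged_prs(5, 0.30))       # 1.5 expected merged PRs
print(prob_at_least_one_merge(5, 0.30))   # ~0.83: an 83% chance of at least one
```

The second number is often the more persuasive one: even at a modest per-task merge rate, a small parallel batch almost always yields something shippable.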

Supported Languages and Frameworks

Cloud Agents run inside Linux-based VMs with standard development toolchains pre-installed. Any language and framework that builds in a typical CI environment will work. The agent reads your project configuration files, installs dependencies from lock files, and adapts to your toolchain automatically.

Languages
  • TypeScript and JavaScript (Node.js, Deno, Bun)
  • Python (3.9+, with pip and Poetry support)
  • Go, Rust, Java, C++, Ruby, PHP, Swift
  • Any language with a Linux-compatible toolchain

Frameworks
  • Next.js, React, Vue, Svelte, Angular
  • Django, FastAPI, Flask, Rails, Spring Boot
  • Tailwind CSS, Prisma, tRPC, GraphQL
  • Testing: Jest, Vitest, Pytest, Go test

The agent performs best on strongly-typed codebases with good test coverage. TypeScript projects with Vitest or Jest suites, Python projects with Pytest, and Go projects with built-in testing all provide the feedback loops that let the agent self-correct during execution. Dynamically-typed codebases without tests still work, but the merge rate is lower because the agent has fewer signals to validate its own output.
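A concrete example of such a feedback loop: even a minimal test suite gives the agent a pass/fail signal to iterate against before it opens a PR. The utility function and tests below are hypothetical, but any suite runnable with pytest (or go test, Jest, and so on) plays the same role:

```python
# slugify.py -- hypothetical utility the agent is asked to fix or extend
import re

def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens for use in URLs."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

# test_slugify.py -- running `pytest` after each change gives the agent
# a concrete pass/fail signal it can iterate against inside the VM.
def test_slugify_basic():
    assert slugify("Hello, World!") == "hello-world"

def test_slugify_collapses_whitespace():
    assert slugify("  Cloud   Agents  ") == "cloud-agents"
```

Without tests like these, the agent's only validation signal is whether the code parses and builds, which is why untested codebases see lower merge rates.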

Cloud Agents vs Local Cursor

Cloud Agents and local Cursor are complementary tools, not replacements for each other. Understanding when to use each one determines how much value you extract from both. The decision comes down to task type, interaction model, and isolation requirements.

Cloud Agents
  • Async execution while you work on other tasks
  • Full VM isolation prevents local side effects
  • Video recording provides audit trail
  • Best for well-scoped, independent tasks

Local Cursor
  • Real-time interaction with immediate feedback
  • Direct filesystem access for multi-step work
  • Tab completion and inline suggestions
  • Best for exploratory and architectural work

The practical workflow for most teams combines both. Use Cloud Agents to parallelize well-defined tickets from your backlog: bug fixes, test additions, documentation updates, dependency upgrades, and small feature implementations. Use local Cursor for complex architectural decisions, multi-file refactors that require ongoing dialogue, and any task where you need to guide the AI step by step. For comparison with terminal-based agents, see our Claude Code Remote Control guide, which covers a different execution model for autonomous coding.

Pricing and Access

Cloud Agents are available on Cursor Pro and Business plans. The pricing model ties to compute credits consumed by VM runtime rather than a flat per-task fee. A typical task that runs for 10-15 minutes in the VM consumes a moderate number of credits. Longer tasks (such as large refactors or complex feature implementations) consume proportionally more.

Access and Plan Details
  • Pro plan. Includes a monthly allocation of Cloud Agent credits. Suitable for individual developers running 10-30 tasks per month.
  • Business plan. Higher credit limits, priority execution queue, and team management features. Designed for engineering teams with shared backlogs.
  • Additional credits. Available for purchase if your monthly allocation runs out. Credits do not roll over between billing cycles.

For teams evaluating whether Cloud Agents justify the cost, the calculation is straightforward: compare the credit cost per merged PR against the hourly cost of developer time for equivalent work. If a Cloud Agent produces a merged PR for the equivalent of $5-15 in credits, and the same task would take a developer 1-3 hours at $75-200 per hour, the ROI is significant even at a 30% merge rate. For more on building and selling AI agent solutions, see our guide to building and selling custom AI agents.
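That comparison can be made explicit. The figures below are the illustrative ranges from the paragraph above, not published pricing:

```python
def cost_per_merged_pr(credit_cost_per_task: float, merge_rate: float) -> float:
    """Effective credit cost of one merged PR, counting failed attempts."""
    return credit_cost_per_task / merge_rate

def developer_cost(hours: float, hourly_rate: float) -> float:
    """Cost of a developer doing the same task by hand."""
    return hours * hourly_rate

# Illustrative midpoints from the ranges above: $10 of credits per task
# at a 30% merge rate, vs. 2 hours of developer time at $100/hour.
agent = cost_per_merged_pr(10, 0.30)   # ~$33.33 per merged PR
human = developer_cost(2, 100)         # $200 for the same task
print(f"agent ${agent:.2f} vs developer ${human:.2f}")
```

Even after dividing the per-task credit cost by the merge rate to account for failed attempts, the agent comes out well ahead for this class of task.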

Putting It All Together

Cursor Cloud Agents represent a meaningful shift in how development teams can approach their backlog. The combination of VM isolation, asynchronous execution, video-recorded sessions, and automatic PR creation turns well-defined tickets into a parallelizable queue. You assign tasks, the agents execute in isolated environments, and you review the results when they arrive.

The 30%+ merge rate is an honest benchmark: not every task will produce a mergeable result, but the cost of a failed attempt is limited to compute credits and review time. For teams with large backlogs of well-scoped tickets, the math works in your favor. For complex architectural work, local Cursor and tools like Claude Code with Claude Opus 4.6 or GPT-5.2 remain the better choice because they support the real-time dialogue that nuanced decisions require.

The practical path forward is not choosing one tool over another. It is building a workflow that assigns the right tasks to the right execution model: Cloud Agents for parallelizable, well-defined work, and local AI tools for interactive, complex sessions.

Ready to Build with AI-Powered Development?

Our development and AI teams help engineering organizations integrate autonomous coding agents, CI/CD pipelines, and modern toolchains. From evaluation to production deployment.

  • Free consultation
  • AI development expertise
  • Modern infrastructure
