Google Jules: Gemini Async Coding Agent Guide 2026
Google Jules complete guide — Gemini-powered async agentic coding, task pool architecture, pull request generation, and how async beats synchronous flows.
Jules is the only major coding agent that makes you stop typing. While Claude Code, Cursor, and Codex CLI all keep you in a live conversation with the model, Google's Jules takes your brief, disappears into a cloud VM, and hands back a pull request when it is done. That async queueing forces a different discipline and a different workflow from the synchronous tools — and in the right scenarios, it is dramatically more productive.
This guide covers how Jules is architected, what the async workflow actually looks like in a working day, how to configure your repository for it, where Gemini 3.1 fits in, what the agent is good and bad at, and when async beats sync in agency and platform-team contexts. If you already use Claude Code or Codex, the interesting question is not whether Jules replaces them — it usually does not — but whether it earns a place alongside them.
One-line definition: Google Jules is an async agentic coding tool from Google Labs that runs Gemini models in isolated cloud VMs, queues tasks from GitHub issues or plain descriptions, and delivers results as pull requests to your repository.
Why Async Matters: The Synchronous IDE Fatigue
A year into the agentic coding era, most developers have settled into a synchronous rhythm. You open Claude Code, Cursor, or Codex CLI, describe a task, watch the model edit files, approve a tool call, read the diff, reply with a clarification, and repeat. That loop is astonishingly capable — but it is also, in a literal sense, exhausting. Your attention is held hostage to whichever task the agent is working on right now.
Async queueing breaks that hostage relationship. You describe the work once, in enough detail that it can run without you, and then get on with something else. When the agent finishes, you come back to a pull request, review it the way you would review a junior engineer's PR, and either merge, comment, or discard. If you queue five tasks in the morning, five PRs are waiting by mid-afternoon. The unit of productivity is not "how fast does one agent run" but "how many agents can you keep busy at once."
For agencies running parallel client workloads, for platform teams maintaining dozens of services, or for any developer with a backlog of well-specified tasks, this reframing matters more than the per-task performance differences between one model and another.
Where this lands in your stack: Async agents like Jules are a natural fit alongside broader AI delivery pipelines. Explore our AI Digital Transformation service to map async coding agents onto your actual delivery workflow.
Jules Architecture: Task Pool + Cloud VMs
At a high level, Jules has three moving parts: a task pool, a fleet of ephemeral cloud virtual machines, and the Gemini models that actually do the work. A task enters the pool when you submit it through the Jules UI, a GitHub issue mention, or the Jules API. The scheduler picks it up, provisions a fresh VM, checks out your repository into that VM, and hands the brief to a Gemini planner instance.
- Task intake: You submit a brief. Jules parses it and queues a task record.
- VM provisioning: A Google-managed VM spins up, installs your declared dependencies, and clones your repo.
- Planning: A Gemini Pro instance drafts a plan, broken into concrete steps with files to touch.
- Execution: The agent edits files, runs tests, iterates on failures, and updates the plan as it learns.
- Delivery: Jules opens a pull request against a new branch with the summary, diff, and test output.
- Teardown: The VM is destroyed; you review the PR on GitHub like any other contribution.
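The lifecycle above can be sketched as a small state machine. This is purely illustrative: the state names and class below are assumptions made for the sketch, not Jules internals.

```python
# Illustrative sketch of the task lifecycle described above.
# Not Jules internals: every name here is an assumption for the sketch.

LIFECYCLE = ["queued", "provisioning", "planning", "executing", "delivering", "done"]

class Task:
    def __init__(self, brief: str, repo: str):
        self.brief = brief
        self.repo = repo
        self.state = "queued"  # every task enters the pool queued

    def advance(self) -> str:
        """Move the task to the next lifecycle stage and return it."""
        i = LIFECYCLE.index(self.state)
        if i < len(LIFECYCLE) - 1:
            self.state = LIFECYCLE[i + 1]
        return self.state

task = Task("Add retry logic to the webhook client", "acme/webhooks")
while task.state != "done":
    task.advance()
print(task.state)  # done
```

The point of the shape is that nothing after "queued" requires your machine or your attention; each stage runs on Google's side until the PR lands.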
Why Cloud VMs Change the Feel
Running in a cloud VM sounds mundane, but it is the single most important architectural choice Jules made. Synchronous tools run on your machine, which means they compete with your other work for CPU, memory, and your attention. A Jules task takes nothing from you once it is submitted. Running a five-hour migration across ten services at once is a viable strategy with Jules; it is a laptop fire with synchronous tooling.
Planning, Execution, and Self-Review
Internally, Jules typically separates planning from execution. A stronger model produces the plan, a faster model executes the concrete steps, and a review pass verifies the work before the pull request is opened. That separation is why Jules tends to behave more like an engineer working from a ticket than like a pair-programming partner improvising in real time.
The Jules Workflow in Practice
The Jules daily rhythm is deliberately boring. It is less about skilled prompting in the moment and more about building a habit around queueing, waking, reviewing, and merging. A productive day looks something like this.
1. Queue (first 30 minutes)
Start the morning by walking through your backlog and filing tasks into Jules. Good briefs read like well-written tickets: they state the problem, list the files or areas involved, include acceptance criteria, and call out known pitfalls. Bad briefs are one-liners like "fix the login bug" that would confuse a human engineer too. If you have GitHub issues already, you can mention Jules in the issue to have it pick the work up directly.
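As a concrete illustration, a workable brief might look like the following. The file paths, issue number, and criteria are invented for the example:

```markdown
## Task: Fix intermittent 401s after token refresh

**Problem:** Users are logged out when the access token refreshes while a
request is in flight. Reported in issue #142.

**Where to look:** `src/auth/refresh.ts` and the fetch wrapper in
`src/lib/http.ts`.

**Acceptance criteria:**
- In-flight requests retry once with the new token instead of failing.
- A regression test covers the refresh-during-request race.
- No changes to the public API of `http.ts`.

**Known pitfalls:** The token store is also read by the service worker;
do not change its storage key.
```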
2. Work on something else (an hour or two)
This is where async tools actually earn their place. While Jules is running, you are not babysitting it. You move on to a design doc, a meeting, a live coding session with Claude Code on an interactive task, whatever else belongs in your day. Jules sits in a tab; notifications tell you when PRs are ready.
3. Review and merge (after lunch)
Come back to GitHub, triage the queue of Jules pull requests, and treat them exactly like junior-engineer PRs. Read the summary, skim the diff, run the tests locally if needed, and either merge, leave comments, or close with a rejection note. Jules learns from comments the same way an engineer does — corrections feed back into its next run on related work.
4. Re-queue or close
Some PRs merge cleanly. Some need a second pass, which you can request by responding on the PR or re-filing as a new task with additional constraints. Some are dead ends that reveal the brief was wrong — close them, rewrite the brief, and queue again.
Repo Setup and Configuration
Jules behaves best in repositories that a new human engineer could also navigate on day one. The agent has the codebase, whatever configuration you provide, and the planning model's general knowledge — nothing else. Setup work usually takes under an hour per repository and pays for itself within the first few tasks.
Connect the GitHub app
Install the Jules GitHub app and grant access to the specific repositories you want the agent to read and write. Prefer per-repo access over org-wide access, particularly for agencies handling multiple clients — the principle of least privilege applies to agents too.
Declare setup, lint, and test commands
Jules needs to know how to install dependencies, run your linter, and execute your test suite inside the fresh VM. Configure these at the Jules project level (or in the repo-level config file Jules expects) rather than assuming defaults. A Jules VM with no test command is a VM that will not self-verify before submitting a PR — and that is the single most common reason Jules ships bad work.
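The exact configuration surface is Jules's own, but the shape of what you declare is roughly this. The key names below are invented to show the idea, not the real schema; check the Jules docs for the actual format:

```yaml
# Illustrative only: these keys are invented, not Jules's real config schema.
setup: npm ci                 # how to install dependencies in the fresh VM
lint: npm run lint            # style gate the agent runs before submitting
test: npm test -- --ci        # the suite Jules uses to self-verify its work
```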
Write a CONTRIBUTING.md for the agent
Jules reads the repository's README, CONTRIBUTING, and any convention files you keep. Treat them as briefs for the agent. If your project has a CLAUDE.md or AGENTS.md file that already documents code conventions, directory layout, and forbidden patterns, Jules benefits from it just as much as a synchronous agent would. Keep it up to date.
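For example, a conventions file might spell out the rules you would otherwise repeat in every brief. The contents here are invented for illustration:

```markdown
# AGENTS.md

- Directory layout: features live under `src/features/<name>`; shared code
  under `src/lib`. Do not add new top-level directories.
- All new code is TypeScript; no `any` without a justifying comment.
- Tests sit next to the module as `*.test.ts` and must pass `npm test`.
- Forbidden: editing generated files under `src/gen/`, or adding runtime
  dependencies without a note in the PR summary.
```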
Configure secrets carefully
Environment variables required for tests or build steps can be supplied to the VM through Jules's secret store rather than being committed to the repo. Never put a production key in a Jules config; use stub keys or local-only credentials. For client engagements, make this secrets review part of onboarding — our web development team does a similar pass before any client project goes live.
Gemini Integration: 3.1 Pro and Flash in Jules
Jules is the most direct way to put Gemini's latest coding capability to work on a real repository. Internally the agent typically routes planning and hard-reasoning steps to Gemini Pro tier models and executes lighter work on Flash — giving you the benefits of both without needing to manage the routing yourself.
Per Google's recent announcements, Jules tracks the general Gemini release cadence closely, with Gemini 3.1 Pro becoming the default planner tier as that model family rolls out. If you want to understand the underlying model's coding behavior in isolation, our Gemini 3.1 Pro benchmarks and pricing guide has the details. Jules essentially puts that model into an agentic harness with cloud execution attached.
Why Gemini's Strengths Fit Jules
Two Gemini traits in particular suit async agentic use. First, the family's long context window gives Jules the headroom to load a substantial chunk of a repository without aggressive summarization — useful for the whole-repo refactors that Jules is often asked to run. Second, Gemini's multimodal strength pays off when tasks include screenshots or design artifacts attached to the brief.
Jules vs Gemini Code Assist
Google ships two coding-agent products, and it is worth being clear on the split. Gemini Code Assist lives in your IDE and operates synchronously alongside you; Jules lives in the cloud and operates asynchronously without you. They share a model family but serve different workflows. Our Gemini Code Assist agent mode guide covers the synchronous side; this post covers the async side.
Pull Request Quality: What Jules Does Well
The practical output of Jules is a pull request, so the thing worth evaluating is not "is the model clever" but "is the PR mergeable." A few patterns show up consistently in production use.

Jules handles well:

- Test authoring against existing modules
- Dependency bumps and lockfile updates
- Config-driven refactors across many files
- Straightforward bug fixes from issue reports
- Documentation and README maintenance
- Boilerplate scaffolding for new endpoints

Jules struggles with:

- Genuinely novel architecture decisions
- Interactive debugging with unclear repro
- Design-heavy UI work needing taste calls
- Ambiguous or contradictory briefs
- Tasks requiring production data access
- Live pair-programming exploration
Summary and Commit Hygiene
Jules pull requests tend to ship with readable commit messages and a plain-language summary of what changed and why. That is a deliberate design choice: review cost is the limiting factor on async agents, so a PR that is painful to review is a PR that costs more than it saves. Jules defaults toward small commits and descriptive titles even when the underlying change is multi-file.
Self-Verification Before Submission
Most Jules runs include a test-and-lint pass before the PR is opened. If the declared test command fails, Jules typically iterates until the suite passes or reports back that it could not resolve the failure. That means a Jules PR landing in your queue has at least been checked against your test gate — the agent cannot hide behind "works on my machine" because there is no machine.
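The shape of that loop can be sketched in a few lines. This is a hypothetical harness, not Jules's actual code; `attempt_fix` stands in for the model's repair step.

```python
# Hypothetical sketch of a self-verification loop like the one described
# above; not Jules's actual code. attempt_fix stands in for the model.
import subprocess

def verify(test_cmd, attempt_fix, max_retries=3):
    """Run the declared test command; feed failures back until green or out of retries."""
    for _ in range(max_retries):
        result = subprocess.run(test_cmd, shell=True, capture_output=True, text=True)
        if result.returncode == 0:
            return True   # suite passed: safe to open the PR
        attempt_fix(result.stdout + result.stderr)  # model patches based on test output
    return False          # unresolved: report the failure in the PR instead

print(verify("true", attempt_fix=lambda log: None))  # a command that always passes
```

The design point is the return value: a green run gates the PR, and a red run after retries becomes an explicit failure report rather than a silent bad submission.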
When Async Beats Sync (and When It Doesn't)
Picking the right model-and-harness combination for a given task is the real skill of working with agentic tools. Async is not universally better; it trades responsiveness for parallelism.
| Scenario | Prefer Async (Jules) | Prefer Sync (Claude Code, Cursor, Codex) |
|---|---|---|
| Well-specified backlog tickets | Yes — batch through the queue | Overkill for this pattern |
| Unfamiliar codebase exploration | Limited — brief is hard to write | Yes — iterate in real time |
| Batch refactors across repos | Strong fit — run in parallel | Serial and slow |
| Debugging with unclear repro | Usually poor | Strong — live feedback loop |
| Scheduled maintenance (bumps, CVE patches) | Ideal — queue overnight | Wasteful of attention |
| Prototype or greenfield scaffolding | Useful for repeat scaffolds | Better for creative work |
The rule that holds up in practice: if you could hand the task to a competent engineer with a paragraph of context and walk away, give it to Jules. If you would need to sit next to that engineer and iterate, use a synchronous tool.
Jules vs Claude Code vs Codex
A high-level view of where each tool fits, without manufactured benchmark scores. Real performance depends heavily on the task and codebase, and published benchmarks tend to lag the tools' actual release cadence.
| Attribute | Google Jules | Claude Code | OpenAI Codex |
|---|---|---|---|
| Interaction Model | Async, queue-based | Synchronous terminal agent | Hybrid (CLI + cloud) |
| Execution Location | Google-managed cloud VM | Your local machine | Local or cloud sandbox |
| Model Family | Gemini (Pro + Flash) | Claude (Opus + Sonnet) | OpenAI GPT series |
| Output Format | GitHub pull request | Live edits in working dir | Live edits or PR |
| Best For | Batched, well-specified work | Interactive engineering sessions | Mixed interactive + queued work |
| Parallelism | Many tasks at once | One session at a time (per terminal) | Multi-session (cloud mode) |
For a feature-by-feature tear-down of the three tools, see our Claude Code vs Codex vs Jules matrix, and for the broader agentic coding landscape our 20-platform matrix covers the full field. The model-level coding comparison between Gemini 3.1 Pro, Claude Opus, and GPT-Codex is in our agentic coding model comparison.
Agency Use Cases
Agencies running AI-assisted delivery get disproportionate value from Jules because the patterns it suits — batched, repeatable, well-specified work — map cleanly onto how agency delivery actually looks across many client projects at once.
Batch Refactors Across Client Repos
When a framework ships a breaking change or a shared pattern needs migrating across every client that uses it, Jules can queue the same refactor against ten or twenty repositories in parallel. One human engineer reviews the PRs as they land, instead of doing each migration by hand.
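A sketch of what batching that looks like is below. The task-record shape and branch naming are assumptions for illustration; actual submission goes through the Jules UI, a GitHub issue mention, or the Jules API.

```python
# Illustrative: build one identical migration task per client repo.
# The record shape and branch naming are invented, not the real Jules API.

BRIEF = (
    "Migrate from the deprecated v2 router API to v3. "
    "Run the full test suite; do not change public route names."
)

def build_tasks(repos, brief):
    """One task record per repository, each on its own working branch."""
    return [
        {"repo": repo,
         "brief": brief,
         "branch": f"jules/router-v3-{repo.split('/')[-1]}"}
        for repo in repos
    ]

repos = ["agency/client-a-web", "agency/client-b-shop", "agency/client-c-api"]
for task in build_tasks(repos, BRIEF):
    print(task["repo"], "->", task["branch"])
```

One brief, written once, fans out to every affected repository; the human cost collapses to reviewing the PRs as they land.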
Test Authoring on Legacy Codebases
Inherited codebases frequently come with thin test coverage. Filing a Jules task per module with "write meaningful tests for X" turns a months-long retrofit into a week of PR review. This also fits well with broader automation engagements where bulletproof test coverage is a precondition for shipping changes to production systems.
Scheduled Maintenance and Dependency Upkeep
Quarterly dependency bumps, CVE patches, lockfile upgrades, and config migrations all suit a queued, overnight workflow. Set up standing tasks per client repo, let Jules run during off-hours, and start the workday with a review queue of candidate PRs.
Boilerplate Scaffolding for New Client Work
New engagements often need the same kind of bootstrap scaffolding: auth wiring, CI configuration, initial component libraries, and deploy pipelines. A Jules task template per scaffold lets you kick off new client projects from a brief rather than by copy-pasting from a template repo.
Documentation Backfills
Jules's tendency to write readable summaries extends naturally to documentation tasks. "Write a README for this module explaining its public API and how it plugs into the rest of the system" is a task Jules handles well, and it scales to the long tail of under-documented modules that every codebase accumulates. See our breakdown of AI coding tool adoption trends for context on how teams are splitting work between synchronous and async agents.
Conclusion
Jules is not a drop-in replacement for Claude Code or Codex. It is a complementary tool that rewards a different discipline — write a good brief, queue the work, and come back to a pull request. For well-specified tasks, batched refactors, scheduled maintenance, and any context where parallelism matters more than real-time control, the async model is a genuine productivity unlock.
The honest evaluation is not "which tool is best" but "which tool fits this task right now." Sync tools win on interactive exploration; Jules wins when the unit of work is a ticket, the brief is solid, and you have five other things to do while it runs. The agencies and platform teams getting the most from agentic coding in 2026 are the ones running both patterns in parallel, not picking a favorite.
Ready to Build an Agentic Delivery Pipeline?
Whether you are evaluating Jules, Claude Code, or running async and sync tools side by side, we can help you map the tooling onto real delivery workflows for your team or clients.