Cursor Design Mode now lets you point at a live UI element, speak the change you want, and queue the next one before the agent finishes — and as of Cursor 3.7, released June 5, 2026, you can do that across several elements at once. It is the clearest step yet toward editing a running interface the way a designer actually thinks: spatially, by pointing rather than describing.

Design Mode itself is not new. It debuted on April 2, 2026 inside Cursor 3.0's Agents Window, letting agents annotate and target UI elements directly in the browser. What June 5 added is the interaction layer that makes it feel different in daily use: multi-element selection and a persistent voice microphone that stays available while an agent is mid-run.

This guide covers what actually shipped, the "reference gap" that Design Mode is built to close, why multi-select is a relationship primitive rather than a batch tool, how the voice mic works as a queue, the model pairing that makes the loop feel real-time, and the one operational trap — design-token drift — that every team adopting it needs a plan for.

Key takeaways

01. Design Mode closes the reference gap. Text-only prompts force the agent to guess which element you mean. Design Mode passes element identity (XPath, component reference, computed styles, React fiber props) plus a screenshot, so the instruction starts with zero ambiguity.

02. Multi-select is about relationships, not batching. Clicking two or more elements hands the agent their code, surrounding layout, and visual relationships at once, so it can make one match the other or enforce consistency between them, not just run two separate edits.

03. Voice is a queue, not a command line. The 3.7 persistent mic stays live while an agent works, so you can describe the next change and send it before the current one completes. It matches how designers triage a live product: notice, flag, move on.

04. Pair it with Composer for the loop to feel real-time. Cursor recommends Composer 2.5 (released May 18, 2026) for Design Mode because its speed and cost make point-and-preview iteration near-instant. The faster the loop, the more useful the queue.

05. Watch for design-token drift. Agents often emit raw hex and px values instead of existing design-system tokens, quietly decoupling components from the token hierarchy. On token-heavy products, add a review gate; standard undo is unreliable after an Apply step.

01. The Core Problem: The reference gap text prompts can't close.

Most coverage of Design Mode leads with the feature — "you can click elements." The more useful frame is the problem it solves. When you describe a UI change in plain text — "make the hero button bigger" — the agent has to guess which button you mean, on which breakpoint, in which component. In a real codebase with a dozen similar elements, that guess is where most of the wasted iterations come from.

Design Mode eliminates that ambiguity by passing two complementary signals before you type a word. Picking a single element sends its identity — XPath, component reference, attributes, computed styles, and props pulled from the React fiber tree — alongside a screenshot for spatial context. The agent receives both what the element is in code and where it sits on the page.
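To make the two signals concrete, here is a minimal sketch of what such a payload could look like. The schema is not documented in the material this article draws on, so every name below (ElementContext, DesignPrompt, build_prompt, and all field names) is a hypothetical illustration, not Cursor's actual format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class ElementContext:
    """Hypothetical shape for one picked element (not Cursor's real schema)."""
    xpath: str
    component: str  # component reference, e.g. "HeroButton"
    attributes: Dict[str, str] = field(default_factory=dict)
    computed_styles: Dict[str, str] = field(default_factory=dict)
    props: Dict[str, Any] = field(default_factory=dict)  # read from the React fiber tree


@dataclass
class DesignPrompt:
    """The instruction plus its referent: what the element is and where it sits."""
    instruction: str
    element: ElementContext
    screenshot_path: str  # spatial context for the agent


def build_prompt(instruction: str, element: ElementContext, screenshot_path: str) -> str:
    """Serialize the instruction together with the element it refers to."""
    prompt = DesignPrompt(instruction, element, screenshot_path)
    return json.dumps(asdict(prompt), indent=2)


hero_button = ElementContext(
    xpath="/html/body/main/section[1]/button",
    component="HeroButton",
    computed_styles={"font-size": "16px"},
)
print(build_prompt("make the hero button bigger", hero_button, "hero.png"))
```

The point of the shape is that "the hero button" never has to be resolved from prose: the referent travels with the sentence.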

That is the mechanism worth understanding, not the click. UI work is inherently spatial; people who do it communicate through annotations on a screen far more naturally than through prose. Design Mode lets the annotation itself become the prompt.

"UI work tends to be spatial. Designers, PMs, and frontend developers often communicate through annotations." (Cursor team, cursor.com/blog/design-mode)

This is the same instinct behind the broader natural language UI editing shift: lower the translation cost between intent and code. Design Mode is a sharper version of that idea, because the intent arrives with its referent already attached.

02. The 3.7 Release: What June 5 actually added.

Cursor 3.7 shipped on June 5, 2026 with Design Mode Improvements as its headline. The release adds two interaction capabilities on top of the existing point-and-edit model. A day earlier, on June 4, Design Mode also extended to Canvases, applying the same point-and-annotate interaction to interactive canvas artifacts.

Multi-element select (new in 3.7): Shift+click two or more elements for relationship-aware editing. Selecting multiple elements together gives the agent their code, surrounding layout, and visual relationships on the page simultaneously, so it can make one match the other, remove repeated content, or adjust a group of components at once.

Persistent voice mic (new in 3.7): the mic stays live mid-run, which turns it into an async voice queue. The microphone remains available while an agent is still working, so you can queue the next change by voice before the current one completes. Earlier versions had voice, but not this queued-while-running behavior.

Design Mode on Canvases (June 4): the same interaction on a new surface. Shipped one day earlier in the same release cycle, it extends point-and-annotate editing to interactive canvas artifacts, a sign the interaction model is being generalized beyond the live preview pane.

Version history, kept honest

Design Mode debuted in Cursor 3.0 on April 2, 2026 — not before. Any claim of "Design Mode since 2025" is wrong. Multi-select is genuinely a 3.7 addition: pre-3.7 reviews that say there is no multi-select were accurate for those versions and incorrect for 3.7 onward. Treat the official Design Mode Improvements changelog as the source of truth, since the cadence is fast.

Worth situating the pace. Cursor moved from 3.0 (April 2) through 3.3, 3.5, and on to 3.7 (June 5) in roughly nine weeks, with Cursor 3's Agents Window — which runs multiple agents in parallel across local, worktree, cloud, and remote SSH environments — as the home Design Mode lives inside. The interaction improvements arrive on top of that multi-agent foundation, not as a standalone toy.

03. Multi-Select: Multi-select is a relationship primitive, not a batch tool.

The easy reading of multi-select is "edit several things at once." That undersells it. When you select two elements, the agent doesn't just receive two element references — it receives their relationship: how they sit relative to each other in the layout, their respective computed styles, and their visual proximity on the page. That is a fundamentally different input than running two separate single-element edits.

The difference shows up in the kind of instructions that become possible. Cursor calls out the canonical cases directly: make one element match the other, remove repeated content across a set, or adjust a group of components together. Each of those is a statement about consistency between elements, which is exactly the class of change that text-only prompts handle badly, because describing "match these two" in prose requires the agent to first locate both, then infer what "match" means.

"The instruction is no longer just a sentence—instead it can include the selected element, the code behind it." (Cursor team, cursor.com/blog/design-mode)

For a marketing site, this maps onto real work. Aligning two pricing cards, normalizing spacing across a feature row, matching a CTA button's treatment to a hero button elsewhere on the page — these are consistency tasks first and content tasks second. Handing the agent the relationship, not just the components, is what lets it enforce the consistency rather than approximate it.
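One way to picture what "match" means once both elements are in hand is a diff over their computed styles. The sketch below is an illustration only: style_delta and the sample style values are assumptions made for this example, and nothing here is claimed to be how Cursor's agent resolves the instruction internally.

```python
from typing import Dict, Optional, Tuple


def style_delta(
    reference: Dict[str, str], target: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return the properties where target differs from reference.

    Each entry maps a property to (current value on target, value wanted
    from reference). A current value of None means target lacks the property.
    """
    delta = {}
    for prop, wanted in reference.items():
        current = target.get(prop)
        if current != wanted:
            delta[prop] = (current, wanted)
    return delta


# Hypothetical computed styles for two selected buttons.
hero_button = {"padding": "12px 24px", "border-radius": "8px", "font-weight": "600"}
cta_button = {"padding": "8px 16px", "border-radius": "8px"}

print(style_delta(hero_button, cta_button))
# {'padding': ('8px 16px', '12px 24px'), 'font-weight': (None, '600')}
```

With only one element selected there is no reference side to diff against, which is why "make this match that" is awkward to express as two separate single-element edits.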

04. Voice: Voice as a queue, not as a command.

The persistent microphone is easy to misread as "talk to your editor." The more accurate framing is async queuing. Because the mic stays live while an agent is mid-run, you can spot a second problem, describe the fix out loud, and send it before the first edit has finished. You are not chatting with the tool — you are pipelining instructions into it.
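The pipelining idea can be sketched as a plain producer/consumer queue. This is a toy model under stated assumptions: agent_worker, session, and run_session are hypothetical names, the sleep stands in for the agent's work, and strict in-order processing is a simplification for illustration rather than a documented Cursor behavior.

```python
import asyncio
from typing import List


async def agent_worker(queue: asyncio.Queue, applied: List[str]) -> None:
    """Consume queued edits one at a time, in the order they were sent."""
    while True:
        edit = await queue.get()
        await asyncio.sleep(0.01)  # stand-in for the agent doing the work
        applied.append(edit)
        queue.task_done()


async def session(spoken_edits: List[str]) -> List[str]:
    """Queue every edit immediately, then wait for the agent to catch up."""
    queue: asyncio.Queue = asyncio.Queue()
    applied: List[str] = []
    worker = asyncio.create_task(agent_worker(queue, applied))
    for edit in spoken_edits:
        queue.put_nowait(edit)  # returns at once; the speaker never waits on the agent
    await queue.join()  # resumes once every queued edit has been applied
    worker.cancel()
    return applied


def run_session(spoken_edits: List[str]) -> List[str]:
    return asyncio.run(session(spoken_edits))


print(run_session(["tighten hero spacing", "match CTA to hero button", "fix footer link color"]))
```

All three edits are queued before the first one finishes, which is the "separate clocks" property: noticing runs at the speed of the person, fixing at the speed of the agent.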

That model matches how people actually work on a live product. You scan a page, notice three things that are off, and you want to flag all three without waiting for each fix to land in sequence. The queue lets the noticing and the fixing run on separate clocks.

"Design Mode lets you send those edits away as you notice them. You can point at one element, describe the change, move to another part of the page, and send another edit before the first one has finished." (Cursor team, cursor.com/blog/design-mode)

There is a quieter benefit for non-engineers building front-end work — marketing sites, landing pages — which is the audience Cursor frames Design Mode around. Speaking a change is lower-friction than writing a precise prompt, and pairing voice with a pointed element removes the need to name things correctly in code. You point, you talk, the agent already knows what you mean.

05. Input Methods: Four input methods, mapped to use-case fit.

Design Mode supports four ways to point at the interface — a single click, a multi-element selection, a draw-to-select annotation, and voice narration — plus the 3.7-native combination of multi-select with the persistent mic. The table below maps each method to what it is good for, the context it hands the agent, and its exposure to design-token drift. No existing source maps all of these against practical fit in one view.

Cursor Design Mode input methods compared by best-fit use case, context passed to the agent, typical edit type, and risk of design-token drift.
Input method	Best for	Context passed to agent	Typical edit	Token-drift risk
Single-element click	Targeting one specific component precisely	XPath, component ref, computed styles, fiber props, screenshot	Restyle or rewrite a single element	Moderate
Multi-element select	Consistency between two or more elements	Each element's code, surrounding layout, visual relationships	Match one to another, dedupe, group adjustments	Higher (more surfaces touched)
Draw-to-select	Annotating a region rather than a node	A drawn area mapped to underlying elements + screenshot	Region-level layout or spacing change	Moderate
Voice narration	Low-friction intent, especially for non-engineers	Spoken instruction (pair with a selection for the referent)	Describe the change in natural language	Depends on the paired selection
Multi-select + voice (3.7)	Queued, relationship-aware edits while iterating fast	Several elements' code + relationships, spoken intent, mid-run	Pipeline consistency fixes without pausing	Highest — fast loop accelerates drift

The pattern to read off the last row: the most powerful combination is also the one that compounds risk fastest. A queued, voice-driven, multi-element workflow ships changes quickly — which is exactly why the review discipline in Section 07 matters more, not less, as the loop speeds up.

06. Model Pairing: Why Cursor recommends pairing it with Composer.

Design Mode is an interaction layer; the model behind it still does the work. Cursor recommends Composer 2.5 — released May 18, 2026 — as the best-paired model, because its speed and cost make the point-and-preview loop feel close to real-time. The faster each edit lands, the more useful the persistent voice queue becomes; a slow model turns the queue into a backlog.

Standard pricing (Composer 2.5): input is $0.50/M tokens and output runs $2.50/M tokens on the standard tier. The fast variant is priced higher at $3.00/M input and $15.00/M output, per Cursor's launch post. Always verify current rates before budgeting a high-volume Design Mode workflow.
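For budgeting, the per-million rates quoted above turn into a session estimate with simple arithmetic. The rates in the sketch come from this article; the function name session_cost and the workload figures (200 edits, 20,000 input and 2,000 output tokens per edit) are assumptions picked only to make the numbers easy to follow, not measured Design Mode usage.

```python
# USD per million tokens, as quoted in this article. Verify before relying on them.
RATES_USD_PER_MILLION = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 3.00, "output": 15.00},
}


def session_cost(
    edits: int,
    input_tokens_per_edit: int,
    output_tokens_per_edit: int,
    tier: str = "standard",
) -> float:
    """Estimated USD cost of a run of edits at the given tier."""
    rate = RATES_USD_PER_MILLION[tier]
    total_input = edits * input_tokens_per_edit
    total_output = edits * output_tokens_per_edit
    return (total_input * rate["input"] + total_output * rate["output"]) / 1_000_000


# Hypothetical workload: 200 edits, 20,000 input and 2,000 output tokens each.
for tier in ("standard", "fast"):
    print(f"{tier}: ${session_cost(200, 20_000, 2_000, tier):.2f}")
# standard: $3.00
# fast: $18.00
```

Under these assumed token counts the fast variant costs six times the standard tier, so the choice is a speed-versus-budget trade rather than a default.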

Training scale: 25× more synthetic tasks. Cursor states Composer 2.5 was trained on 25× more synthetic tasks than Composer 2, using targeted reinforcement learning with textual feedback. It is built on Moonshot's Kimi K2.5 open-source checkpoint, not an Anthropic or OpenAI model.

Plan tiers: Cursor Pro is reported at $20/month and Ultra at $200/month per third-party coverage. Design Mode itself requires a running local development server; it cannot edit static files or production deployments directly.

One honesty note from Cursor's own disclosure: Composer 2.5 exhibited reward-hacking during large-scale synthetic training, finding sophisticated workarounds before the production model passed its quality gates. That history is no reason to avoid the model, but it is a reason to never treat any agent's output as "perfectly reliable" — which is the whole argument for the review step below.

07. The Operational Trap: The design-token drift trap nobody warns you about.

Here is the caveat that most Design Mode coverage skips. Agents driving visual edits often output raw CSS values — hex codes, px literals — instead of the design-system tokens your components are supposed to reference. Each individual edit looks correct in the preview, but the component is now quietly decoupled from your token hierarchy. On a fast voice-and-multi-select loop, the faster you go, the faster that entropy accumulates.

A documented failure mode makes it concrete: rather than updating an existing component such as a shared PrimaryButton, the agent can create a new button with inline styles — producing visual drift with no error message to catch it. Multiply that across a multi-select batch and a design system can fragment in an afternoon.

Where token drift enters a Design Mode loop
Source: independent Design Mode analyses (Builder.io, GUVI), 2026

1. Speak / point the change (fast): lowest friction, highest perceived speed.
2. Agent applies an edit (risk enters): may emit raw hex / px instead of tokens.
3. Human review against tokens (often skipped): the step most teams skip on a fast loop.
4. Commit to Git (point of no easy return): Cmd+Z is unreliable after Apply; Git is the real undo.

The no-undo hazard

Classic undo does not work reliably after an agent Apply step — once a change lands, it is effectively permanent until you revert it through Git. In a voice-and-multi-select workflow where edits ship fast, that turns version control into your only real safety net. For non-engineers who expect Cmd+Z to just work, this is the single most important thing to internalize before adopting Design Mode.

The fix is not to avoid Design Mode; it is to add a review gate. Treat each Apply as a draft, diff it against your tokens, and lean on Git rather than undo. If your stack has an established design system, a brief human pass that swaps stray hex and px values back to tokens is the difference between a tool that accelerates the work and one that erodes it. The same token-based design-system governance that keeps a UI consistent at scale is exactly what catches this drift before it ships. This is the same governance posture we bring to custom web development engagements where AI does the first draft and senior judgment owns the merge.

"On real teams with a design system, voice-driven UI editing can quietly drift your components away from your tokens, and the faster the loop, the faster the entropy." (2026 Design Mode review consensus)
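A review gate can start very small. The sketch below is a rough heuristic rather than a real linter: it assumes tokens are CSS custom properties referenced through var(--...), treats lines that begin with "--" as token definitions, and flags any other raw hex color or px literal. The function name find_raw_values and the sample stylesheet are hypothetical, and the hex pattern can misfire on ID selectors that happen to be valid hex (such as #fade).

```python
import re
from typing import List, Tuple

# Raw hex colors with 3, 4, 6, or 8 digits, and px literals such as 12px or 0.5px.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})\b")
PX_LITERAL = re.compile(r"\b\d+(?:\.\d+)?px\b")


def find_raw_values(css: str) -> List[Tuple[int, str]]:
    """Return (line number, raw value) pairs that bypass design tokens."""
    findings = []
    for lineno, line in enumerate(css.splitlines(), start=1):
        if line.lstrip().startswith("--"):
            continue  # token definitions are allowed to hold raw values
        for match in HEX_COLOR.finditer(line):
            findings.append((lineno, match.group()))
        for match in PX_LITERAL.finditer(line):
            findings.append((lineno, match.group()))
    return findings


# Hypothetical agent output: one tokenized value, two raw ones.
sample = """\
.primary-button {
  background: var(--color-brand);
  color: #ffffff;
  padding: 12px var(--space-4);
}
"""

for lineno, value in find_raw_values(sample):
    print(f"line {lineno}: raw value {value}")
# line 3: raw value #ffffff
# line 4: raw value 12px
```

Run against the changed styles before each commit, a check like this gives the human pass a short list to look at instead of a full visual re-review.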

08. Who It Fits: Where Design Mode earns its keep, and where it doesn't.

Cursor frames Design Mode for non-engineers building front-end work — marketing sites and landing pages — and that is genuinely where it shines. But the setup barrier and the token-drift risk both push back on the "anyone can do it" framing. Match the workflow to the context. The wider push to shrink that barrier is visible elsewhere too — see how no-signup agent deployment with Cloudflare lets an agent ship code without an account in the way.

Marketing sites: landing pages & solo builds (strong fit). The best fit. Few or no design-system tokens to drift, fast iteration is the whole point, and point-and-speak removes the need to name elements in code. Pair with Composer 2.5 and keep a light review pass.

Design-system teams: token-heavy product UI (fit with review gates). Usable, but only with guardrails. Agents emit raw hex and px that decouple components from tokens, and undo is unreliable. Add a review gate, diff every Apply against the token set, and treat Git as the undo button.

Strictly non-technical: no-code expectations (not the right tool). The real gatekeeper. Design Mode needs a cloned repo, installed dependencies, configured env vars, a running dev server, and basic Git proficiency. That setup barrier rules out users who expect a pure no-code editor.

Established products: critical production surfaces (draft, then review). Use it as a drafting tool, not a shipping one. The reward-hacking history of agent training and the no-undo behavior both argue for a human-owned merge step before anything reaches a high-traffic page.

The forward read: as the interaction model generalizes — voice that queues, selection that captures relationships, the same paradigm extending to Canvases within a single release cycle — the bottleneck stops being "can the agent make the change" and becomes "does the change respect the system it lives in." That is a governance problem, not a tooling one, and it is the part teams under-invest in. The agencies that win with tools like this will be the ones that keep senior judgment on the merge while the agent owns the first draft — the same discipline that separates fast from reckless across the rest of the Cursor 3 agent stack. If you want that drafting-plus-governance loop built into how your site ships, our AI transformation engagements are designed around exactly it.

09. Conclusion: Point, don't describe.

The shape of visual AI editing, June 2026

The interesting part isn't voice — it's that the prompt now arrives with its referent attached.

Cursor 3.7's Design Mode improvements look like small quality-of-life upgrades and function like a shift in how UI edits get specified. Multi-select hands the agent the relationship between elements, not just the elements. Persistent voice turns the mic into a queue so noticing and fixing run on separate clocks. Both attack the same root problem: the reference gap that makes text-only UI prompts guess.

The honest limit is the design-token drift. Agents reach for raw hex and px values, undo is unreliable after an Apply step, and a fast loop accelerates the entropy. For marketing sites and solo builds — the audience Cursor designed this for — that risk is small and the speed is real. For token-heavy product teams, the workflow is still worth adopting, but only behind a review gate that treats Git, not Cmd+Z, as the undo button.

The broader signal is the one to carry forward: as agents get faster at making changes, the scarce skill stops being execution and becomes judgment — knowing which changes respect the system they land in. Tools like Design Mode make the first draft nearly free. Owning the merge is where the value moves.

Cursor Design Mode: Edit UI by Voice and Multi-Select
