Development · Deep Dive · 9 min read · Published May 10, 2026

Cascade agents, Flows, memory across sessions — the agentic editor surface that distinguishes Windsurf 2 from Cursor and Claude Code.

Windsurf 2 Deep Dive: Cascade Agents + Flows 2026

Windsurf 2 is no longer a Cursor lookalike — it's an agentic editor with three first-class surfaces. Cascade brings multi-file edits and tool calls into the editor itself, Flows make agentic workflows repeatable, and Memory persists context across sessions. This guide covers what shipped, how each surface behaves in practice, and where it wins and loses against Cursor and Claude Code.

Digital Applied Team
Senior strategists · Published May 10, 2026
  • Sources: hands-on across 3 repos
  • Core surfaces: 3 (Cascade · Flows · Memory)
  • Flow archetypes: 4 (we kept after testing)
  • Memory persistence: cross-session, scoped per workspace
  • Recommended start: Cascade + 1 Flow, before MCP buildout

Windsurf 2 is the first release where the editor itself feels agentic rather than the assistant beside it. Cascade agents, Flows, and cross-session Memory combine into a surface that does things Cursor and Claude Code can only approximate — and falls short of them in other places. This deep dive is a hands-on read on what Windsurf 2 actually ships, where it wins, and where it still trails.

The reason this matters now: every coding IDE in the agentic wave is converging on the same three primitives — a planning loop, repeatable agentic workflows, and persistent context. Cursor 3 shipped its Agents Window and Design Mode. Claude Code 1.3 doubled down on terminal-native agents and subagents. Windsurf 2 is Codeium's answer in the editor surface — and the choices it makes are different enough to be worth understanding before defaulting your team to one stack.

This guide covers Cascade in depth, the four Flow archetypes that survived our internal testing, how Memory is scoped, where MCP and model routing land, a three-workload head-to-head against Cursor and Claude Code, and the four collaborative workflows that Windsurf 2 genuinely unlocks. Sources: hands-on usage across three production repositories, Windsurf's release notes, and our own benchmark prompts.

Key takeaways
  1. Cascade is the killer feature. Multi-file edits and tool calls happen inside the editor with diff-staging, plan preview, and per-step approval. It is the surface that justifies switching from Cursor for a meaningful slice of work.
  2. Flows make agentic workflows repeatable. A Flow is a named, reusable agentic recipe: a prompt template plus tools and scope. Four archetypes have stuck for us (scaffolder, refactor, audit, and review). The repeatability is what compounds.
  3. Memory across sessions is genuinely useful. Workspace-scoped memory persists across sessions and Cascade calls. Used well it removes the warm-up prompt; used poorly it pollutes context. Treat it as long-lived prompt state, not a knowledge base.
  4. MCP integration is competitive, not best-in-class. Server installs are straightforward, but Cursor still has the cleaner UI surface for managing MCP servers and Claude Code wins on terminal-native MCP. Windsurf is sufficient, but it is not the reason to switch.
  5. Windsurf wins for a specific slice of workloads. It is the strongest choice today for editor-native multi-file refactors and repeatable team workflows. For pure terminal automation, Claude Code still leads. For chat-first exploration, choosing between Windsurf and Cursor remains a coin flip.

01 What's New: Windsurf 2 ships three core surfaces.

The marketing copy around Windsurf 2 is dense; the architectural reality is simpler. Three surfaces define the release. Cascade is the agentic editor pane that plans and executes multi-file edits with tool calls. Flows are reusable agentic recipes you invoke by name. Memory is workspace-scoped context that survives across sessions. Everything else — MCP support, model routing, the Composer-style chat — sits underneath these three.

The distinction matters because the value proposition of Windsurf 2 vs Cursor or Claude Code is not "a better assistant" — it is "an agentic editor surface that removes the prompt rewriting cost of repeatable work." That framing is what makes the rest of this guide useful.

Surface 1: Cascade agents (killer feature)
multi-file edits · tool calls · in-editor

An agentic pane that plans, edits across files, and calls tools, with diff staging and per-step approval. This is the surface that justifies a Windsurf evaluation.

Surface 2: Flows (repeatability layer)
named recipe · prompt + tools + scope

Reusable agentic workflows triggered by intent. Author once, invoke by name. Four archetypes (scaffold, refactor, audit, review) cover most team work.

Surface 3: Memory (context persistence)
workspace-scoped · cross-session

Long-lived context that survives across sessions and Cascade calls. Best treated as durable prompt state, not as a free-form notebook or knowledge base.

The frame to hold in your head
Cascade is the surface. Flows are the scripts. Memory is the state. Every other Windsurf 2 capability — MCP servers, model routing, the Composer-style chat — sits beneath these three primitives.

Behind the surfaces are familiar building blocks: an LLM router that picks between hosted frontier models, an MCP client for tool integration, a memory store, and the editor-host integration that lets Cascade actually stage diffs across files. None of those are novel in isolation. The bet Windsurf is making is on the way they are composed.

02 Cascade Agents: Multi-file edits and tool calls in the editor.

Cascade is the most consequential thing in Windsurf 2. It is an agentic pane that lives next to the editor — not a chat window, not a terminal — and operates with three properties that combine into something genuinely different: it can edit multiple files in one run, it can call tools (MCP servers, shell, web), and it stages those edits as a reviewable diff with per-step approval before anything lands on disk.

The behaviour to test on your own repo: ask Cascade to do a refactor that touches at least four files in different directories. Watch the plan preview, watch which files it decides to read first, then watch the diff-staging UI. The per-step approval flow is the difference between "an agent that can be trusted on production code" and "a chat assistant that occasionally gets it right."

What Cascade does well

  • Plan preview before execution. Cascade shows its intended file list and high-level steps before touching anything. You can edit the plan, prune steps, or restart.
  • Diff-staged edits. All edits land in a staging area first. Approve per-file or per-hunk; reject cleanly without leaving stray state.
  • Tool-call transparency. Every MCP call, shell command, or web fetch is rendered as a card in the transcript with arguments and return values visible.
  • Recovery. A failed step is recoverable without losing the rest of the plan — meaningful when a refactor halfway through hits a type error.
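
The staging behaviour in the list above can be pictured with a small model. This is a minimal sketch and not Windsurf's implementation: `Hunk`, `StagingArea`, and every method name here are hypothetical, invented for illustration. It shows only the property the list describes: proposed edits sit in a staging area, only approved ones are applied, and a rejection leaves no stray state.

```python
from dataclasses import dataclass, field


@dataclass
class Hunk:
    """One proposed change to one file (hypothetical model)."""
    path: str
    old: str
    new: str
    approved: bool = False


@dataclass
class StagingArea:
    """Holds proposed hunks; only approved hunks are ever applied."""
    hunks: list[Hunk] = field(default_factory=list)

    def stage(self, path: str, old: str, new: str) -> None:
        self.hunks.append(Hunk(path, old, new))

    def approve_file(self, path: str) -> None:
        for hunk in self.hunks:
            if hunk.path == path:
                hunk.approved = True

    def reject_file(self, path: str) -> None:
        # Dropping the hunks is the whole rejection: nothing was written yet.
        self.hunks = [h for h in self.hunks if h.path != path]

    def apply(self, files: dict[str, str]) -> dict[str, str]:
        """Return a new file map with approved hunks applied; input is untouched."""
        result = dict(files)
        for hunk in self.hunks:
            if hunk.approved:
                result[hunk.path] = result[hunk.path].replace(hunk.old, hunk.new, 1)
        return result


# Two proposed edits: approve one file, reject the other.
files = {"a.py": "def foo(): pass", "b.py": "foo()"}
staging = StagingArea()
staging.stage("a.py", "foo", "bar")
staging.stage("b.py", "foo", "bar")
staging.approve_file("a.py")
staging.reject_file("b.py")
landed = staging.apply(files)
```

In this toy run only `a.py` changes, and the original `files` map is left as it was, which is the "reject cleanly" behaviour in miniature.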

Where Cascade still trails

  • Long-horizon coherence. On runs longer than roughly twenty steps, the plan drifts. Break large tasks into Flows rather than one mega-Cascade.
  • Test loop integration. Cascade can run tests, but the read-back of failures is less surgical than Claude Code's terminal-native loop. Expect to babysit the failure-fix cycle.
  • Cross-repo work. Single-workspace today. Multi-repo orchestration is still better in Claude Code or an agent SDK.
"Cascade is the first agentic editor surface that we trust on production refactors. The diff-staging UI is the reason."— Internal Windsurf 2 review, two weeks of paired use

One practical detail worth knowing: the model behind Cascade is configurable. We default to the strongest available reasoning model for plan-and-refactor work and route high-frequency, low-risk edits to a cheaper tier — the same split most teams already do across other IDEs. Section 05 covers the routing surface in detail.

03 Flows: Repeatable agentic workflows triggered by intent.

Flows are Windsurf 2's answer to the prompt-rewriting tax that every team pays. A Flow is a named, reusable agentic recipe — a prompt template, a default model, an allowed-tool set, and a scope. You invoke a Flow by name from Cascade, and the editor runs the recipe against the current selection or a named target. Flows are stored as files in the repo, so they travel with the codebase and can be version-controlled.
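
To make the four parts of a Flow concrete, here is a minimal sketch of a recipe as a data structure. The on-disk format of a Flow file is not described here, so this is an assumption-laden model rather than the real schema: the `Flow` class, its field names, the tool names, and the `strong-reasoning` model label are all hypothetical.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Flow:
    """Hypothetical model of a Flow: prompt template, model, tools, scope."""
    name: str
    prompt_template: str
    default_model: str
    allowed_tools: frozenset[str]
    scope: str

    def render(self, **values: str) -> str:
        # Fill the $placeholders; raises KeyError if one is missing.
        return Template(self.prompt_template).substitute(values)

    def may_call(self, tool: str) -> bool:
        return tool in self.allowed_tools


refactor = Flow(
    name="rename-symbol",
    prompt_template="Rename $old to $new everywhere under $target, then run the typecheck.",
    default_model="strong-reasoning",  # placeholder label, not a real model id
    allowed_tools=frozenset({"read_file", "edit_file", "typecheck"}),
    scope="src/",
)
prompt = refactor.render(old="fetchUser", new="loadUser", target=refactor.scope)
```

The point of the structure is that the prompt is written once and only the placeholders change per run, while the tool allow-list and scope travel with the recipe.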

After two weeks of internal testing across three different repos, four Flow archetypes survived and earned regular use. Anything narrower than these became one-off prompts; anything broader collapsed into Cascade itself.

Archetype 01: Scaffolder Flow (1-shot create)
intent → files + tests + docs

Create a new feature scaffold from a one-line intent: component, route, test stub, and docs entry. Best for codebases with a strong house pattern already encoded in a CLAUDE.md or AGENTS.md.

Archetype 02: Refactor Flow (multi-file edit)
target file → spread edits + checks

Pattern-driven refactors (rename, extract, normalise) that touch a known set of files. Couples best with a typecheck-gate so the Flow exits clean.

Archetype 03: Audit Flow (read-only)
scope → report + issues

Read-only Flow that scans a directory or PR for a named class of issues (accessibility, security, dead code) and emits a Markdown report plus inline annotations. No writes.

Archetype 04: Review Flow (PR companion)
diff → comments + verdict

Reviews the current branch diff or staged hunks against a code-style and house-pattern prompt. Outputs structured comments, meant to augment human review, not replace it.

The win from Flows is not any single recipe — it is the removal of the prompt-rewriting tax. The first time someone authors a refactor Flow, the team saves the same cost every subsequent run. Couple that with the file-based storage and the version-controlled review of recipes themselves, and the compounding effect is visible inside two weeks.

The pragmatic posture: ship one Flow per archetype in the first sprint, codify them in a .windsurf/flows/ directory, and treat the Flow library as a shared artefact. Avoid the temptation to author ten Flows on day one — most won't survive contact with real work.

When a Flow earns its place
A Flow is earning its place when the prompt would otherwise be rewritten by hand at least once a week. Below that bar, leave the work in Cascade as a one-off prompt and skip the templating overhead.

04 Memory: Cross-session context persistence.

Memory in Windsurf 2 is workspace-scoped: the editor maintains a persistent context bundle that survives across sessions, re-opens, and Cascade invocations. Used well, Memory removes the warm-up prompt that every IDE assistant has historically required ("this is a Next.js 16 app with App Router, Tailwind v4, no bg-gradient-*, here is our component naming convention…"). Used poorly, it pollutes the context window and confuses the model on adjacent tasks.

The mental model that works: treat Memory as long-lived prompt state, not a knowledge base. Anything you would put in a CLAUDE.md or an AGENTS.md at the repo root belongs in Memory. Anything that changes day-to-day — open PR context, the bug you are chasing — does not.
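
The boundary can be sketched as a simple filter. This is an illustration of the rule of thumb only, not how Windsurf stores or selects Memory: the `Note` type, the `kind` labels, and `belongs_in_memory` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical categories, chosen for this sketch only.
DURABLE_KINDS = frozenset({"stack", "convention", "preference"})


@dataclass(frozen=True)
class Note:
    kind: str
    text: str


def belongs_in_memory(note: Note) -> bool:
    """Durable project facts go to Memory; day-to-day context stays out."""
    return note.kind in DURABLE_KINDS


notes = [
    Note("stack", "Next.js 16 app with App Router, Tailwind v4"),
    Note("convention", "No bg-gradient-* utilities"),
    Note("open_pr", "Context for the PR currently under review"),
    Note("current_bug", "The bug being chased today"),
]
memory = [n.text for n in notes if belongs_in_memory(n)]
```

Only the first two notes survive the filter; the open PR and the current bug are exactly the day-to-day context that should stay in the session prompt.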

Scope: 1 workspace (workspace-bound)
Per-project, not global

Memory is scoped to the workspace, not the editor install. Switching repos resets context, which is the right default and matches how teams actually work.

Persistence: across sessions (durable state)

Survives close-and-reopen and Cascade runs. Effectively a curated system prompt that the editor manages on your behalf, with the option to view and edit raw.

Editability: 1 panel (transparent)
Visible and editable

Memory contents are visible in a dedicated panel. Edit, prune, or wipe at will. The hidden-memory problem that plagued early generations of assistants does not apply.

The boundary that matters: Memory is not a substitute for repo-rooted docs. If your team relies on a CLAUDE.md or AGENTS.md to encode conventions, keep them. Memory layers on top — it captures things that are true of the project but not yet documented, or are personal to the developer (preferred shorthand, current focus area). The two sources of truth should be additive, not competing.

"Memory is at its best when it is the warm-up prompt you no longer have to write. It is at its worst when it becomes a wiki."— Field note from our two-week Windsurf 2 pilot

05 MCP + Routing: Server integration and model picks.

Windsurf 2 supports MCP-server integration and ships a built-in model router that lets you pick a model per Flow or per Cascade run. The MCP story is competitive — install a server, expose it as a tool, Cascade picks it up — and the router is straightforward. Neither is genuinely best in class, but both are sufficient.

For most teams, the right choice today is to default Cascade and the production Flows to a strong reasoning model, route a cheaper-and-faster tier for read-only or high-frequency-low-risk Flows, and bring in a long-context model only for the workloads that actually need it.
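
That routing rule is small enough to write down. The following is a minimal sketch of the decision, not Windsurf's router: `Tier`, `Job`, `pick_tier`, and the tier labels are hypothetical, and a real setup would map each tier to whichever models your plan exposes.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    STRONG_REASONING = "strong-reasoning"
    CHEAP_FAST = "cheap-fast"
    LONG_CONTEXT = "long-context"


@dataclass(frozen=True)
class Job:
    """Hypothetical description of one Cascade run or Flow invocation."""
    kind: str  # e.g. "cascade", "scaffold", "refactor", "audit", "review"
    read_only: bool = False
    high_frequency: bool = False
    low_risk: bool = False
    needs_long_context: bool = False


def pick_tier(job: Job) -> Tier:
    # Long-context is reserved for jobs that actually need it.
    if job.needs_long_context:
        return Tier.LONG_CONTEXT
    # Read-only or high-frequency, low-risk work goes to the cheaper tier.
    if job.read_only or (job.high_frequency and job.low_risk):
        return Tier.CHEAP_FAST
    # Everything else, including default Cascade runs, gets strong reasoning.
    return Tier.STRONG_REASONING


cascade_tier = pick_tier(Job("cascade"))
audit_tier = pick_tier(Job("audit", read_only=True))
repo_audit_tier = pick_tier(Job("audit", read_only=True, needs_long_context=True))
```

Checking the long-context flag first is a design choice of this sketch: a whole-repo audit is read-only, but it still needs the larger window.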

Cascade default (default to top tier)
Strong reasoning model for plan + multi-file work

Cascade earns its keep on multi-file refactors and tool-calling plans. Default it to the strongest available reasoning model on your plan and accept the latency cost; the diff-staging UI compensates.

High-frequency Flows (pick cheap-and-fast)
Cheaper tier for scaffold + audit Flows

Scaffolder Flows and read-only audit Flows do not need frontier reasoning. Route them to a fast, cheaper tier; token spend on these adds up faster than people expect.

Long-context workloads (route by intent)
Reserve long-context models for actual long-context jobs

Whole-repo audits, large refactors across hundreds of files, and multi-document RAG warrant a long-context model. Day-to-day Cascade does not.

MCP servers (one server first)
Install only what earns its place

Start with one: a docs server or a read-only database server. Add more only when a real workflow demands the tool. An over-stuffed MCP surface costs context and approval friction.

The Windsurf MCP UI is clean enough for installation and day-to-day use, but Cursor still has the better surface for managing many MCP servers at once. If your stack centres on a large MCP catalogue, that is one of the few cases where the decision flips against Windsurf. For most teams running two to four servers, the difference is cosmetic.

06 Head-to-Head: Three workloads vs Cursor + Claude Code.

We ran three identical workloads across Windsurf 2, Cursor 3, and Claude Code 1.3 on the same repos with comparable model picks. The chart below summarises a perceived-quality score across each — calibrated by paired review, not a synthetic benchmark. Numbers are illustrative of our experience; your mileage will vary by codebase and prompt style.

Three workloads · perceived quality · Windsurf vs Cursor vs Claude Code
Source: Digital Applied internal benchmark, May 2026

Multi-file refactor (leader: Windsurf)
  • Windsurf: 92. Plan preview + diff-staged edits land cleanly
  • Cursor: 83. Composer ships the edits, less staging discipline
  • Claude Code: 80. Terminal-native, weaker in-editor diff UX

New feature build (leader: Windsurf)
  • Windsurf: 88. Scaffolder Flow + Cascade carries the work
  • Cursor: 87. Composer + Agents Window competitive
  • Claude Code: 84. Stronger when scaffolding is terminal-driven

Bug hunt + fix (leader: Claude Code)
  • Claude Code: 91. Terminal-native test loop is hard to beat
  • Windsurf: 82. Cascade reads + edits, weaker test loop
  • Cursor: 80. Agents Window helps, less surgical than CLI

The shape of the result was consistent across the three repos. Windsurf 2 leads on editor-native multi-file refactors because the diff-staging UI removes the "is this safe to land" friction that Cursor and Claude Code both impose differently. Windsurf 2 is competitive on new feature builds, where Scaffolder Flows close the gap with Cursor's Composer. Claude Code retains its lead on bug-hunt-and-fix because the terminal-native test loop and surgical file edits are hard to displace from an editor surface.
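
The size of each lead is easier to judge as a margin than as three separate scores. This short sketch uses only the perceived-quality scores reported in this section; the `leader_and_margin` helper is ours and purely illustrative, and the scores remain impressions from paired review rather than a benchmark.

```python
# Perceived-quality scores as reported in this section.
scores = {
    "Multi-file refactor": {"Windsurf 2": 92, "Cursor 3": 83, "Claude Code 1.3": 80},
    "New feature build": {"Windsurf 2": 88, "Cursor 3": 87, "Claude Code 1.3": 84},
    "Bug hunt + fix": {"Windsurf 2": 82, "Cursor 3": 80, "Claude Code 1.3": 91},
}


def leader_and_margin(by_tool: dict[str, int]) -> tuple[str, int]:
    """Return the top-scoring tool and its lead over the runner-up."""
    ranked = sorted(by_tool.items(), key=lambda item: item[1], reverse=True)
    (top, top_score), (_, runner_up_score) = ranked[0], ranked[1]
    return top, top_score - runner_up_score


summary = {workload: leader_and_margin(by_tool) for workload, by_tool in scores.items()}
# summary == {
#     "Multi-file refactor": ("Windsurf 2", 9),
#     "New feature build": ("Windsurf 2", 1),
#     "Bug hunt + fix": ("Claude Code 1.3", 9),
# }
```

Read this way, the refactor and bug-hunt results are clear leads of nine points each, while the one-point gap on new feature builds is what "competitive" means here.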

For a deeper look at the Cursor side of the comparison, our Cursor 3 deep dive covers the Agents Window and Design Mode in detail, and the Claude Code 1.3 deep dive covers the terminal-native side. Reading the three together gives you the calibrated picture for a team-wide IDE decision.

07 Unlocked: Four collaborative workflows Windsurf 2 enables.

The point of evaluating a new editor is not the feature list — it is the workflows the editor unlocks that were previously expensive or impractical. Four of these survived our pilot, and all four involve Cascade plus at least one Flow plus Memory in concert.

Workflow 01: Pattern-locked refactor sprints (Cascade + refactor Flow)

Author a refactor Flow that encodes a single pattern (rename, extract, normalise). Run it across a code area in one Cascade session, review diffs, land in one PR. Replaces a week of careful manual work with an afternoon.

Workflow 02: Agentic code review companion (Review Flow + PR loop)

Author a review Flow with your house style and risk rules. Run on every branch before human review. Outputs structured comments, meant to surface issues your reviewers would catch anyway, faster.

Workflow 03: House-pattern scaffolding (Scaffolder Flow + Memory)

Encode your repo's component / route / test conventions in a Scaffolder Flow. Junior engineers ship correctly-shaped code on day one; the convention-drift cost flattens. Memory captures unwritten rules.

Workflow 04: Recurring audit cadence (Audit Flow on cadence)

An audit Flow run on a weekly or pre-release cadence: accessibility, security, dead code, whatever your blind spots are. Read-only Flow, no surprises, a routine the team can trust.

None of these four workflows are exclusive to Windsurf — you can approximate each in Cursor or Claude Code with discipline. What Windsurf 2 does is make the friction low enough that the workflows become routine rather than aspirational. That is the meaningful change, and the reason we ended up keeping Windsurf in the rotation for this specific work even on teams that previously standardised on Cursor.

If you are scoping a Windsurf 2 evaluation for a team, our AI transformation engagements cover exactly this kind of calibrated rollout — IDE assessment, Flow library design, Memory policy, and the training cadence that turns a tool change into a productivity change.

The shape of agentic editors, May 2026

Windsurf 2 is an agentic editor — Cascade and Flows define the surface.

Windsurf 2 is the clearest articulation yet of what an agentic editor actually is. Cascade is the surface that does the work, Flows make the work repeatable, and Memory keeps the context alive across sessions. The three together are different in kind from a chat-first assistant beside the editor — and the practical effect on a team that adopts the pattern is visible inside two weeks.

The honest framing is the right framing. Windsurf 2 wins on editor-native multi-file refactors and on repeatable team workflows, where Cascade plus a small Flow library compounds faster than the alternatives. It is competitive — not dominant — on new feature builds. It trails Claude Code on terminal-native bug-hunt-and-fix, where the CLI's test loop and surgical file edits are still the strongest tool in the category.

For most teams, the right move is not to standardise on a single tool — it is to be deliberate about which workloads run where. Windsurf 2 for refactor sprints and the recurring audit cadence; Cursor 3 for chat-first exploration and the Design Mode work; Claude Code 1.3 for terminal-native automation and the bug-hunt loop. The editor decision stops being binary and becomes a routing problem — which is exactly the kind of problem agentic tooling is now mature enough to solve.

FAQ · Windsurf 2

The questions teams ask before trying Windsurf 2.

Cascade is closest in spirit to Cursor's Composer with the Agents Window pulled into the same surface, with a stronger emphasis on diff-staging and per-step approval before edits land. Composer is faster for chat-first exploration; Cascade is stronger when you need a reviewable plan and surgical multi-file edits in the editor. Claude Code 1.3 sits on the other side of the line — it is terminal-native, so it wins on bug-hunt-and-fix loops and on shell-heavy automation, but it does not match Cascade's editor-pane diff UX. Most teams find that Cascade is the right default for multi-file refactor work and Claude Code remains the right tool for terminal-native automation. Cursor remains a strong third option for chat-first work and for teams already invested in Composer.