The Hermes Agent desktop app is the moment Nous Research's fastest-growing open-source agent stopped requiring a terminal. This is the complete getting-started guide: what Hermes actually is, how to install the desktop on Mac, Windows, and Linux, how its memory and skills compound over time, and the architectural decision that makes the desktop more than a pretty front-end.
We covered the launch news separately in our v0.15.2 desktop launch coverage, and the conceptual origin story in our earlier Hermes v0.10 guide. This post is the evergreen product deep-dive — written for someone who just heard about Hermes and wants to understand the whole thing, not only the latest release note.
The single insight worth holding onto before you read further: most agent tools ship a GUI that drifts apart from the engine underneath it. Hermes did the opposite. The desktop shares the exact same agent core, configuration, API keys, sessions, skills, and memory as the command-line tool. A conversation you start in the desktop resumes in the CLI, and vice versa. Understanding why that matters is the whole point of this guide.
- 01Hermes is a self-improving agent, not a coding copilot.Nous Research positions it as an autonomous agent that lives on your machine or a server, remembers what it learns, auto-generates skills from completed work, and gets faster the longer it runs. It is MIT-licensed and installs with a single pip command.
- 02The desktop is another surface, not a separate app.Same core, same config, same keys, same memory. A session started in the desktop resumes in the CLI or TUI and back. That cross-surface continuity is the architectural decision that sets Hermes apart from tools where the GUI and engine evolved separately.
- 03Install is genuinely simple — with one Linux asterisk.Mac and Windows get a native installer download. Linux still needs the terminal install script with an --include-desktop flag, which undersells the cross-platform headline for the platform that hosts most self-hosted Hermes deployments.
- 04Memory, skills, and GEPA make it compound over time.A four-layer memory stack, 80+ bundled skills across 17 categories, and the GEPA self-improvement loop mean repeated tasks should get faster as the agent accrues procedural knowledge. GEPA's headline speed gains are vendor-stated; we keep both vendor and community numbers together.
- 05Remote backends turn it into a shared team tool.Point the desktop at a Hermes backend on a VPS or behind Tailscale and a whole team gets the visual interface with no local Python install on each client. With xAI SuperGrok OAuth, the effective marginal cost on an existing subscription approaches zero.
01 — The ProductThe agent that grows with you.
Hermes Agent is an open-source, self-improving agent framework from Nous Research, first released on February 25, 2026. Nous positions it deliberately against two more familiar categories: it is not a coding copilot tethered to an IDE, and it is not a thin chatbot wrapper around a single model API. It is an autonomous agent that runs on your own machine — or a server you control — remembers what it learns, and generates new skills from the workflows it completes.
The growth has been fast enough to become part of the story. Hermes reached roughly 99,000 GitHub stars in its first eight weeks, and as of late May 2026 the repository carries approximately 180,000 stars and 30,900 forks, with 321 or more contributors in the v0.15.0 release alone. The "fastest-growing open-source agent of 2026" label is widely cited industry characterization rather than a certified metric — but the underlying trajectory is real and easy to verify on the repository.
Practically, Hermes is MIT-licensed and installs with pip install hermes-agent. No commercial license is required to self-host. That licensing posture, combined with the self-improvement loop, is why it has become a default reference point in the open-agent conversation this year.
"Not a coding copilot tethered to an IDE or a chatbot wrapper around a single API. An autonomous agent that lives on your server, remembers what it learns, and gets more capable the longer it runs."— Nous Research, official Hermes Agent site
That positioning matters because it changes how you should evaluate the desktop app. A copilot lives and dies on its IDE integration. An agent like Hermes is judged on what it accumulates: memory across sessions, a skill library that grows, and a runtime that you can point at any model. The desktop is the surface that finally makes all of that legible to people who never wanted to live in a terminal.
02 — ArchitectureOne core, many surfaces.
Here is the defining design decision, and the part most launch coverage skipped. Hermes Desktop is credibly reported to be built on Electron and React (per Decrypt and MarkTechPost; the stack is not spelled out in the official release notes), but the framework is beside the point. What matters is that the desktop uses exactly the same agent core as the CLI, the TUI, and the messaging gateway. It is not a fork. It is a window onto one shared runtime.
The consequence is continuity. Your configuration, API keys, sessions, installed skills, and accumulated memory are identical across surfaces. Start a task in the desktop, close your laptop, and resume it in the CLI on a server. Every CLI-side improvement — the rebuilt session search, the GEPA-generated skills, the sandbox execution backends — is available in the desktop on day one because there is no second codebase to bring up to parity.
run_agent.py collapsed from 16,083 lines to 3,821, refactored into 14 cohesive modules. That refactor was the prerequisite for reliable cross-surface state sharing, not a coincidence of timing. A clean, modular core is what lets one agent runtime serve a terminal and a window without the two drifting apart.Contrast this with agent ecosystems where the graphical client and the agent engine grew up separately — feature gaps open, behavior diverges, and you end up learning two slightly different tools. Hermes avoids that whole class of problem by construction. If you are weighing it against alternatives, our breakdown of how Hermes benchmarks against OpenClaw and Codex CLI walks through where that architectural choice pays off and where rivals lead.
03 — InstallationInstall on Mac, Windows, and Linux.
The hard prerequisite is short: Python 3.11 or newer. The official docs are blunt about it — "No prerequisites beyond Python 3.11+. Everything else is handled automatically." The hermes postinstall step pulls in Node.js v22, ripgrep, ffmpeg, and a browser for you. One real constraint worth knowing up front: Hermes rejects any model with a context window below 64,000 tokens at startup, because sub-64K windows cannot hold enough working memory for multi-step tool-calling.
Native installer
Download the native installer from the desktop page and run it. Voice mode prompts for microphone access on first use. The smoothest path for non-technical Mac users.
PowerShell installer
Download the native installer, or use the PowerShell path that auto-fetches PortableGit (~50 MB), Python 3.11, Node v22, ripgrep, and ffmpeg via uv. The dashboard /chat terminal pane still needs WSL2 — native Windows has no POSIX PTY equivalent.
Script + flag
Run the CLI install script with the --include-desktop flag. There is no GUI installer — the terminal is still required, which undersells the cross-platform headline on the platform that hosts most self-hosted Hermes deployments.
For the CLI itself, there are two install paths: the package route is pip install hermes-agent followed by hermes postinstall, or you can run the one-line install script from the GitHub raw URL. If you already have a working installation, hermes desktop launches the app straight from your current configuration. Supported operating systems span Linux, macOS, WSL2, native Windows, and even Android via Termux.
--include-desktop flag, with no graphical installer. For the non-technical audience the desktop is meant to serve, that is a material gap — especially because Linux is the dominant platform for the servers most self-hosted Hermes backends run on.04 — First RunThe first five minutes after install.
On the CLI, the fast lane to a working setup is hermes setup --portal, which wires up a Nous Portal account in one step. If you would rather configure manually, the interactive wizard is plain hermes setup, and hermes model handles provider and model selection. Once running, hermes --tui launches the modern terminal UI (recommended over the bare CLI), hermes --continue resumes your last session, and hermes doctor diagnoses problems. Configuration lives at ~/.hermes/config.yaml, and secrets live in ~/.hermes/.env.
In the desktop, the same four areas are where you will spend your time. The chat workspace streams responses with live tool activity in a right-hand preview rail that renders web pages, files, and tool outputs as they happen. A file browser scopes the working directory (set it with hermes desktop --cwd <path> or the HERMES_DESKTOP_CWD environment variable). Voice mode adds microphone input. And a cluster of settings panes covers providers, models, credentials, skill browsing and installation, cron jobs, profile switching, gateway setup, and multi-agent orchestration.
Chat workspace
Responses stream with a right-hand rail that renders live tool activity — web pages, files, and raw tool outputs. For non-CLI users this is the first time they can actually watch what the agent is doing.
File browser
Browse and bound the working directory the agent operates in. Set it via --cwd or the HERMES_DESKTOP_CWD env var so the agent stays inside the folder you intend.
Voice mode
Talk to the agent. macOS prompts for microphone access on first use. Useful for hands-free task kickoff and quick dictated instructions.
Settings & management
Providers, models, and credentials; skill browsing and install; cron job management; profile switching; gateway setup; and multi-agent orchestration — the control surface for everything the agent can do.
One configuration note that trips people up: Nous Research recommends against pointing Hermes Agent at its own Hermes 4 chat models. Those models are tuned for conversation, not the rapid tool-calling an agent runtime leans on. The recommended path is frontier models, reached either through Nous Portal or your own direct API keys — which leads straight into the model-support question below.
05 — Parity MatrixDesktop vs CLI: what reaches full parity.
Because both surfaces sit on one core, feature parity is close to total — but "close to" is not "exactly," and the gaps are precisely the things you want to know before you switch a team over. The matrix below maps the major capabilities across the CLI/TUI and the desktop, with the caveats called out where they exist.
Fully shared
Streaming output, skill install/browse/auto-generate, memory management, session search, scheduling/cron, model selection, and multi-agent Kanban all run on the same core. Available on both CLI/TUI and desktop.
Both surfaces
All five sandbox backends (local, Docker, SSH, Singularity, Modal), the messaging gateway, and the security stack — Bitwarden Secrets Manager and mTLS — are reachable from CLI and desktop alike.
Desktop-forward
Remote backend connectivity is configured cleanly in the desktop (remote URL + credentials), and voice mode with mic input is a desktop affordance. The preview rail makes tool calls visible — a net security benefit for non-CLI users.
Platform caveats
Linux desktop install still requires the terminal (--include-desktop). On Windows, the dashboard /chat terminal pane requires WSL2 — native Windows has no POSIX PTY equivalent. These are the real adoption gaps.
The honest read: there is no capability the desktop locks you out of. The asterisks are about installation and host platform, not about what the agent can do once it is running. If your team is on Mac and Windows with WSL2 available, the desktop is a strict superset of the CLI experience for most people. If you are deploying onto Linux servers, plan on a terminal step regardless of how the headline reads.
06 — CompoundingMemory, skills, and the GEPA loop.
The reason an agent "gets more capable the longer it runs" comes down to three systems working together: a layered memory stack, a growing skill library, and a self-improvement loop called GEPA. Memory uses four layers. Prompt memory holds a small MEMORY.md (about 2,200 characters of agent notes) and USER.md(about 1,375 characters of user profile), injected as a frozen snapshot at session start to preserve prefix caching. A session archive stores history in a local SQLite database with full-text search, so "did we discuss X last week?" recall costs no token budget. Eight optional external providers add semantic search and knowledge-graph capabilities, and the skills layer encodes procedural knowledge as files. Retrieval runs at roughly 10 milliseconds across 10,000 documents.
Skills are the visible half of the compounding. Hermes ships with 80 or more bundled skills across 17 categories — Software Development, MLOps, GitHub, Research, Creative, Productivity, DevOps, and others — synced via hermes update. They use a three-tier progressive-disclosure model so the agent only loads a skill's full content when it actually needs it: a roughly 3K-token index first, full content on demand, and reference files only when called. Skills follow the open agentskills.io standard, so they stay portable across agent harnesses that implement it.
"When your agent remembers everything, you stop repeating yourself."— Future Humanism, community characterization, May 2026
GEPA — Generic Evolution of Prompt Architectures — is the peer-reviewed mechanism underneath the self-improvement claim. The GEPA paper was accepted as an Oral presentation at ICLR 2026, and the implementation lives in a separate Nous Research repository built on DSPy. Mechanically, roughly every 15 completed tool calls the system reads the full execution traces — error messages, profiling data, reasoning chains — and proposes targeted prompt improvements. If a task took 47 tool calls when 12 would have done, GEPA spots the gap and revises the relevant skill, storing what it learns as new skill files.
Our own read on this, having watched a year of agent launches: the compounding story is the part that holds up regardless of which exact percentage you trust. An agent that persists memory, accrues a skill library, and revises its own prompts from execution traces is structurally different from a stateless model call — and that structural difference is what makes "gets faster the longer it runs" plausible even before you adjudicate the benchmark debate. For most teams the right move is to measure it on your own recurring tasks rather than adopt a vendor number as a planning input.
07 — Models & CostModel-agnostic, and the $0 marginal-cost angle.
Hermes is fully model-agnostic. It works with Nous Portal (a proxy exposing 300+ models via OpenRouter), OpenRouter directly, NVIDIA NIM, Hugging Face, the OpenAI API, xAI SuperGrok via OAuth, and any custom OpenAI-compatible endpoint — including a local vLLM server. There is no model lock-in by design, which is what makes the cost math interesting.
The underreported angle: SuperGrok OAuth support (added in v0.14.0, and bringing a 1M-token context window) lets you drive Hermes with an existing ChatGPT or xAI SuperGrok subscription instead of a separate metered API bill. For anyone already paying for a frontier subscription, Hermes' effective marginal cost approaches zero — and that gap widens the moment you run several agents in parallel. That is a structural difference from tools wired to a single metered-API provider.
Via Nous Portal
Nous Portal proxies 300+ models through OpenRouter, and paid tiers add a Tool Gateway (web search, image generation, TTS, cloud browser automation, cloud terminal sandbox). Portal pricing is listed on a Nous Research social post; verify current tiers before budgeting.
Tokens via OAuth
xAI SuperGrok OAuth (v0.14.0) brings a 1M-token context window and lets you reuse an existing subscription instead of paying separate API rates — the core of the $0-marginal-cost argument for multi-agent setups.
Hard requirement
Models below 64,000 tokens are rejected at startup — sub-64K windows can't hold enough working memory for multi-step tool-calling. A real constraint when wiring up smaller local models via vLLM.
08 — Enterprise PatternRemote backends: the team unlock.
The capability with the largest organizational payoff is the one that got the least coverage. The desktop can connect to a Hermes backend running somewhere else — a VPS, a home server, or a node behind Tailscale. You enter the remote URL (the backend listens on port 9119) and authenticate with the username and password set in ~/.hermes/.env on that backend. It is also configurable through the HERMES_DESKTOP_REMOTE_URL environment variable, and boot logs land in the desktop log under your Hermes home directory.
Reframe that and a deployment pattern appears: run one shared Hermes instance on a server, and let every team member connect the desktop app to it. Nobody needs a local Python install. The skill library and memory accrue centrally. And the desktop's preview rail means non-technical staff can finally watch and audit what the agent does — a net security gain over a headless CLI on a box they cannot see. For teams weighing where the workload should physically run, our guide to running AI agents locally for privacy and cost control covers the trade-offs in depth.
"Most agencies run two agents in parallel: Claude Code plus Hermes Agent — Claude Code handles day-to-day codebase work while Hermes handles recurring research and support automation, with a compounding skill library accruing on Hermes."— MindStudio agent comparison guide
That two-agent split is the practical shape we see most often, and the remote-backend pattern is what makes the Hermes half of it scale across a team rather than living on one engineer's laptop. If the model powering that shared instance is a self-hosted open-weight model, our guide to self-hosting open-weight models to power your Hermes instance covers the deployment decisions that follow.
09 — Version HistoryFrom v0.10 to v0.15.2 in one quarter.
Hermes' release cadence is part of what makes it worth tracking. Each named release stacked a headline capability, and the contributor count climbed alongside the star count. The chart below traces the velocity that took the project from launch to a desktop app in roughly three months.
Hermes release velocity · v0.13 → desktop preview
Source: GitHub Releases (NousResearch/hermes-agent)A detail worth keeping straight: the v0.15.2 CLI hotfix (May 29) and the desktop public preview announcement (June 2–3) are separate events. The packaging fix stabilized the Velocity line; the desktop announcement is the marketing moment a few days later. They get conflated in some coverage, but they are not the same release.
run_agent.py in v0.15.0
The core script collapsed from 16,083 to 3,821 lines across 14 cohesive modules — the refactor that made one runtime serving many surfaces practical.
Faster, no LLM
v0.15.0 rebuilt session search with no model call and no token cost — recall across your archive without burning budget. Per the release: 'No LLM, no cost, 4,500x faster.'
Fewer per conversation
Per-conversation function calls dropped from roughly 399K to 213K, with about 19 seconds shaved off cold start and version-check time cut from 701ms to 258ms.
10 — Adoption FitWho should actually run Hermes.
Hermes, OpenClaw, and Codex CLI are best understood as complementary rather than competing. The consensus across independent comparisons: Hermes is the strongest self-improving runtime — lighter and more stable between releases than OpenClaw — while OpenClaw carries the largest community (345K+ stars versus Hermes' roughly 180K) and Codex CLI has the tightest OpenAI integration. The recommended pattern is to run a desk-based coding agent for day-to-day work, Hermes for background scheduled automations and persistent personal memory, and OpenClaw where community-heavy enterprise integrations matter most.
Background scheduled work
Cron-driven research, support triage, and report generation where a compounding skill library and persistent memory pay off over weeks. This is Hermes' core strength.
Remote backend deployment
One Hermes backend on a VPS or behind Tailscale, with the desktop app on every team member's machine — central skills, central memory, visible tool calls, no local Python per client.
Cost-sensitive multi-agent
Already paying for ChatGPT or SuperGrok and want several agents running without a metered API bill stacking up? SuperGrok OAuth makes Hermes' marginal cost approach zero.
Tight in-editor loop
For minute-to-minute codebase work inside an editor, a desk-based coding agent or Codex CLI's OpenAI integration is usually the better default. Run Hermes alongside it, not instead of it.
Looking forward, the cross-surface continuity is what we expect to matter most over the next two quarters. As agents move from single-developer toys to shared team infrastructure, the tools that keep one consistent runtime behind every interface will be far easier to operate, secure, and audit than those juggling a separate GUI and engine. Hermes made that bet early. If it holds, "one core, many surfaces" becomes the pattern others copy — and the teams who standardized on it will have a head start on the skill libraries and memory that take months to accrue. Standing up that kind of evaluation is exactly where our AI and digital transformation engagements begin.
11 — ConclusionA real product, with honest asterisks.
The desktop isn't the story. One shared core behind every surface is.
Hermes Agent Desktop earns the attention not because it added a window on top of a terminal, but because it added that window without forking the engine. The same memory, skills, sessions, and configuration follow you from CLI to GUI to a remote server. That continuity, enabled by a 76% core refactor most coverage treated as a footnote, is the design decision worth copying.
The honest asterisks belong in the same breath. The GEPA self-improvement numbers are vendor-stated and only partially corroborated by community runs — promising, not proven. Linux install still needs the terminal. Windows' in-app terminal pane needs WSL2. And the Nous Portal pricing circulating online traces to a social post rather than an official pricing page, so verify it before you budget. None of these sink the product; all of them belong in an honest evaluation.
For most teams the smart move is to run Hermes on its strengths — background automations, a shared remote instance, and a compounding skill library — alongside whatever coding agent you already trust in the editor. Measure the self-improvement claims on your own recurring work rather than adopting a headline percentage. And if you are pairing it with an existing subscription, the marginal cost may be the most persuasive line on the whole spec sheet.