Hermes Agent v0.10: Self-Improving Open-Source AI Agent
Nous Research's Hermes Agent v0.10 (April 16, 2026) ships 118 skills, three-layer memory, six messaging integrations, and a closed learning loop.
Key Takeaways
Nous Research shipped Hermes Agent v0.10.0 on April 16, 2026 — the release that turned the framework from an interesting experiment into the fastest-growing open-source agent stack of 2026. Seven weeks after its February 25 debut, Hermes Agent crossed 95,600 GitHub stars, growth previously matched only by LangChain and AutoGen combined. The pull is structural: Hermes is the first production-grade agent framework where a closed learning loop is not a feature, it is the runtime.
This guide is the technical reference for evaluating, installing, and productionising Hermes Agent v0.10. It covers the three-layer memory architecture, the closed learning loop that converts sessions into reusable Markdown skills, the 118-skill bundled catalog, the six-channel messaging gateway, Fast Mode priority queuing, and the decision matrix versus OpenClaw and Codex CLI — plus the deployment pattern agencies are using to ship Hermes on client infrastructure under the MIT license.
Why self-improvement matters: stateless agents solve the same problem from scratch every time. Hermes writes a skill document after a 5+ tool-call task, stores it with FTS5 full-text search, and loads it instantly when a similar task fires. Agencies report 40% research-task time cuts after two weeks of runtime — the compounding advantage is real.
Why Hermes Matters in 2026
The agent framework category in 2026 splits into three tiers. Tier 1 is hosted-only (OpenAI Agents, Anthropic Agents): excellent defaults but no self-hosting. Tier 2 is orchestration libraries (LangChain, CrewAI, AutoGen): flexible but stateless per-run by default. Tier 3 is runtime agents that ship with persistent memory, learning, and deployment in the same binary. Until Hermes, Tier 3 was closed-source. Hermes is the first fully MIT-licensed Tier 3 runtime.
| Capability | Hermes Agent | LangChain / CrewAI | Hosted (OpenAI / Anthropic Agents) |
|---|---|---|---|
| Cross-session memory | Built-in, SQLite + FTS5 | DIY — vector store wiring required | Provider-locked, limited depth |
| Skill reuse across runs | Auto-generated Markdown | Manual chain authoring | Custom assistants only |
| Self-hostable (MIT) | Yes | Yes | No |
| Multi-provider routing | 10+ providers incl. OpenRouter | Yes | Single-provider |
| Built-in messaging gateway | Telegram, Discord, Slack, WhatsApp, Signal, CLI | DIY integrations | Platform-specific |
v0.10.0 Release Facts
- **Ship date:** April 16, 2026. Minor version. Additive: no breaking changes for existing Hermes installations, but several features only light up after re-running the install script to pick up updated dependencies.
- **118 bundled skills:** MLOps, GitHub workflows, research pipelines, web scraping, code execution, and agentskills.io-standard skills shipped in the default install. Browse the full set with `hermes skills list`.
- **Memory:** SQLite + FTS5 backs layers 2 and 3; layer 1 is in-process. Retrieval latency stays under 10ms across 10K+ skill documents.
- **Messaging gateway:** all six channels run through a single gateway process. Sessions and skills are shared across channels: start a task in Slack, finish it in Telegram.
- **Fast Mode and dashboard:** toggle Fast Mode via `/fast`. The new dashboard at `localhost:7777` surfaces session history, skill catalog, and gateway config without editing files.
Three-Layer Memory Architecture
The memory system is what makes Hermes feel different in practice. Every agent claims “memory” — most mean a vector store bolted on. Hermes ships three layers that serve different purposes and are backed by different storage:
| Layer | Purpose | Storage | Lifetime |
|---|---|---|---|
| L1 — Session context | Current conversation buffer; tool outputs; scratch. | In-process | Session |
| L2 — Persistent store | Completed task outcomes, generated skill files, user notes. | SQLite + FTS5 | Forever (backed by ~/.hermes/) |
| L3 — User model | Preferences, coding style, timezone, tone, frequent collaborators. | SQLite JSON field | Drift-adjusted across sessions |
Retrieval is FTS5 full-text search plus LLM summarisation. The combination keeps latency at roughly 10ms for 10,000+ skill documents — the inflection point at which most vector-DB architectures start showing tail latency. Nous Research explicitly chose SQLite + FTS5 over pgvector for the embedded-first deployment story; the whole agent ships as files under ~/.hermes/.
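The L2 retrieval path described above maps directly onto SQLite's FTS5 extension. The sketch below is illustrative only: the `skills` table name, its columns, and the query are assumptions, not Hermes internals.

```python
import sqlite3

# Hypothetical shape of the L2 skill store: an FTS5 virtual table
# holding skill names and bodies. Not Hermes's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE skills USING fts5(name, body)")
conn.executemany(
    "INSERT INTO skills (name, body) VALUES (?, ?)",
    [
        ("gsc-weekly-regression-audit", "Pull GSC data and flag ranking drops"),
        ("pr-triage", "Label and prioritise incoming pull requests"),
    ],
)

# bm25() scores matches (lower is better); the best-scoring skill
# would be prepended to the agent's context before planning.
row = conn.execute(
    "SELECT name FROM skills WHERE skills MATCH ? ORDER BY bm25(skills) LIMIT 1",
    ("ranking drops",),
).fetchone()
print(row[0])  # → gsc-weekly-regression-audit
```

Keyword search over a few thousand short documents is exactly the workload FTS5 was built for, which is why the embedded-first choice holds up at this scale.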
Backup pattern: ~/.hermes/ is the entire agent state. Snapshot nightly to an encrypted S3 bucket (or the client's equivalent). Restoring an agent is rsync + restart — no re-indexing, no re-embedding.
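Because the state is plain files, the nightly snapshot can be a dated archive. A minimal local sketch using only the standard library; the encrypt-and-upload step is left as a comment because the S3 tooling is deployment-specific:

```python
import tarfile
import time
from pathlib import Path

def snapshot_agent_state(state_dir: str, backup_dir: str) -> Path:
    """Archive the whole agent state directory, one file per nightly run.

    A local sketch only: in production you would encrypt this archive and
    push it to an S3 bucket (boto3, rclone, or the client's equivalent).
    """
    stamp = time.strftime("%Y%m%d")
    out = Path(backup_dir) / f"hermes-state-{stamp}.tar.gz"
    with tarfile.open(out, "w:gz") as tar:
        # arcname keeps the restore path predictable: untar, rename, restart.
        tar.add(state_dir, arcname=".hermes")
    return out
```

Restore really is the inverse: extract the archive to the home directory and restart the agent.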
The Closed Learning Loop
This is the piece that distinguishes Hermes. After a task crosses a complexity threshold (Nous Research cites “typically five or more tool calls”), the agent writes a Markdown skill document automatically. The skill captures three things: the procedure it followed, known pitfalls it encountered, and verification steps it took. Next time a similar task fires, the skill surfaces through FTS5 and the agent starts from that file rather than reasoning from scratch.
The loop in six steps
- Task enters — user prompt or gateway message.
- Skill search — FTS5 query against the 118 bundled + your created skills; top matches prepended to context.
- Plan + execute — agent drafts a plan, runs tool calls (up to 8 parallel via `ThreadPoolExecutor`).
- Verify — agent runs explicit verification steps (check outputs, compare to expected state).
- Skill generation — if task was complex and new, write a skill document; if the task refined an existing skill, update it.
- Memory update — outcome logged to L2; L3 user model nudged based on preferences surfaced during the run.
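The bounded parallelism in step 3 can be sketched with the standard-library primitive the text names. `run_tool` and the call format here are hypothetical stand-ins for the agent's real dispatcher:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool(call: dict) -> dict:
    # Stand-in for the real dispatcher, which would invoke the named tool.
    return {"tool": call["tool"], "status": "ok"}

def execute_plan(tool_calls: list, max_parallel: int = 8) -> list:
    """Fan independent tool calls out across a bounded thread pool."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() preserves input order, so results line up with the plan.
        return list(pool.map(run_tool, tool_calls))

results = execute_plan([{"tool": "gsc-fetch"}, {"tool": "table-diff"}])
```

A thread pool (rather than asyncio) is a sensible fit when tool calls are a mix of I/O-bound API requests and blocking subprocess work.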
Example skill file
A generated skill is a plain Markdown file. You can read it, edit it, delete it, or commit it to your agency's shared repo. Here is a representative shape:
```markdown
---
name: gsc-weekly-regression-audit
description: Pull GSC data, identify week-over-week ranking drops, flag cause hypotheses
tools: [search-console, gsc-fetch, table-diff]
created: 2026-04-17
updated: 2026-04-18
---

## Procedure

1. Authenticate to Search Console via service account creds stored in Vault
2. Fetch last 14 days of query + page performance; pivot on query
3. Compute delta_position; filter where delta_position < -3 AND impressions > 100
4. For each flagged query, run core-update overlap check (see also: information-gain-audit)
5. Return ranked list with cause hypothesis column

## Pitfalls

- GSC data is 48-hour lagged; always exclude the last two days
- Position averages hide device splits — always segment desktop vs mobile

## Verification

- Spot-check 3 flagged queries against Semrush or Ahrefs
- Confirm at least one hypothesis per query before returning to user
```
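Because skills are plain Markdown with simple frontmatter, an agency's shared-repo tooling can index them in a few lines. The parser below handles only the flat `key: value` shape shown above; it is a sketch for illustration, not Hermes's own loader:

```python
def parse_skill(text: str) -> dict:
    """Split a skill file into frontmatter metadata and Markdown body.

    Assumes the simple `key: value` frontmatter shown in the example;
    nested YAML would need a real YAML parser.
    """
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return {"meta": meta, "body": body.strip()}

skill = parse_skill("""---
name: gsc-weekly-regression-audit
description: Pull GSC data, identify week-over-week ranking drops
---
## Procedure
1. Authenticate to Search Console
""")
```

Tooling like this is how a team lints, reviews, and version-controls its skill library alongside the rest of its code.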
The 118-Skill Bundled Catalog
v0.10.0 triples the out-of-the-box capability surface versus v0.8. The bundled catalog is grouped into functional families:
| Family | Representative skills | Agency use case |
|---|---|---|
| MLOps | Model eval runner, dataset diff, prompt regression | Internal model governance; pre-launch QA |
| GitHub workflows | PR triage, issue labeling, changelog generation | Agency repo hygiene for client projects |
| Research pipelines | Competitive intel pull, citation grader, SERP diff | Weekly competitor monitoring |
| Web scraping | Headless page capture, schema extraction, change detection | Price / messaging / content change tracking |
| Code execution | Sandboxed Python runner, dependency resolver, test runner | Data cleaning, one-off migrations |
| agentskills.io standard | Third-party-compatible skill manifests | Portable skill libraries across teams |
Custom skill rollout: build a private agency skill registry and point Hermes at it during install. Our AI digital transformation team builds and maintains these for agencies — compounding skills are the actual competitive moat.
The Unified Messaging Gateway
One of the most under-sold v0.10 features is the gateway. It is a single long-running Python process that connects Hermes to Telegram, Discord, Slack, WhatsApp, Signal, and the local CLI — simultaneously and from one place. Sessions and skills are shared across channels, so a task started in Slack can be resumed in Telegram without any state transfer.
Channel configuration
Each channel requires an API token in ~/.hermes/config.toml. The dashboard at localhost:7777 now surfaces a form for each — no more manual TOML edits for new channels in v0.10.
```toml
# ~/.hermes/config.toml
[gateway]
enabled = true

[gateway.telegram]
bot_token = "${TELEGRAM_BOT_TOKEN}"
allowed_user_ids = [12345678]

[gateway.slack]
bot_token = "${SLACK_BOT_TOKEN}"
signing_secret = "${SLACK_SIGNING_SECRET}"
default_channel = "#hermes-agent"

[gateway.discord]
bot_token = "${DISCORD_BOT_TOKEN}"
guild_id = 1234567890
```

Launch the gateway with `hermes gateway start`. The process is designed to run as a systemd service or inside Docker — the GitHub repo ships reference unit files for both.
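For systemd deployments, a representative unit might look like the following. This is a hypothetical sketch of the shape only; the user, paths, and hardening options are assumptions, so prefer the reference unit files in the repo.

```ini
# /etc/systemd/system/hermes-gateway.service (illustrative sketch)
[Unit]
Description=Hermes Agent messaging gateway
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=hermes
ExecStart=/home/hermes/.local/bin/hermes gateway start
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Running under systemd gives you automatic restarts and journald logs, which is what you want for a process holding six live channel connections.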
Installation and First Run
Hermes ships an install script that provisions Python 3.11 via uv, creates a virtual environment, installs dependencies, registers the hermes CLI on your PATH, and initialises ~/.hermes/. No sudo required.
macOS / Linux / WSL2
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Verify
hermes --version
# → hermes-agent 0.10.0

# First-run configuration wizard
hermes init

# Install a provider key (pick one or more)
hermes provider add openrouter --key "sk-or-v1-..."
hermes provider add anthropic --key "sk-ant-..."

# Launch interactive session
hermes chat
```

Windows users: native Windows is not supported. Install WSL2 Ubuntu, run the install script inside it, and launch the CLI from the WSL shell. Nous Research has been clear this stance will not change in 2026.
System requirements
- Python 3.11+ (installed by `uv` automatically)
- ~500 MB disk at install; `~/.hermes/` grows with skills and session history
- Network egress to your chosen LLM provider(s)
- Optional: Docker for running the gateway in production
LLM Provider Matrix + Fast Mode
Hermes is deliberately multi-provider. v0.10 supports Nous Portal, OpenRouter (200+ models), NVIDIA NIM (Nemotron), Xiaomi MiMo, z.ai/GLM, Kimi/Moonshot, MiniMax, Hugging Face, OpenAI, and custom endpoints. The routing layer lets you set per-skill provider preferences — run research skills on cheap long-context models and code-execution skills on faster coding models.
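Per-skill routing lives in `~/.hermes/config.toml` alongside the gateway config. The stanza below is a hypothetical illustration of the idea; the key names and model IDs are assumptions, not the documented schema.

```toml
# Illustrative sketch only — consult the project docs for the real schema.
[routing.default]
provider = "openrouter"

[routing.skills.research-pipelines]
provider = "openrouter"   # cheap long-context model for research skills
model = "long-context-budget-model"

[routing.skills.code-execution]
provider = "anthropic"    # faster coding model for sandboxed execution
model = "fast-coding-model"
```

The point of the pattern is cost shaping: route the token-heavy, latency-tolerant work to cheap models and keep the fast lane for interactive coding.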
Fast Mode
Toggle via /fast in the CLI or gateway. Fast Mode reroutes OpenAI and Anthropic requests through priority queues, reducing tail latency on supported models. It does not change the underlying model — only the delivery path — so quality, context windows, and rate limits are identical to the default lane. Use it when a human is waiting; don't use it for background batch work.
Hermes vs. OpenClaw vs. Codex CLI
The three open agent CLIs that matter in April 2026. A fuller benchmark comparison lives in the dedicated OpenClaw vs Hermes vs Codex CLI post; here is the condensed decision matrix.
| Dimension | Hermes Agent | OpenClaw | Codex CLI |
|---|---|---|---|
| Persistent memory | Built-in, three-layer | None | Limited (session only) |
| Skill reuse | Auto-generated Markdown | Per-run only | No |
| Coding throughput | High | Highest (822B OpenRouter tokens) | Very high |
| Multi-provider | 10+ providers | Broad | OpenAI only |
| Messaging channels | 6 built-in | CLI only | CLI only |
| Best for | Cross-session agency workflows | One-shot heavy coding | Polished OpenAI-native setup |
Production Patterns for Agencies
Hermes is MIT-licensed and self-hosts cleanly, which makes it the first framework where “deploy per client” is both technically and legally straightforward. Two patterns we see working:
1. Internal-only agency agent
Deploy Hermes on an agency-owned VPS. Connect the gateway to an internal Slack workspace. Ship custom skills for recurring client deliverables (audits, performance reports, competitive research). Back up ~/.hermes/ nightly to an encrypted bucket.
Why it works: the agency's accumulated skill library becomes a compounding moat. Six months in, new hires onboard in a week because the skills encode the playbook.
2. Per-client dedicated agent
For enterprise clients, deploy a dedicated Hermes inside the client's infrastructure. Gateway connects to their Slack or Teams. Credentials live in their secret store. Hermes never phones home — everything stays under MIT inside the client's network.
Why it works: client data never leaves their boundary. Compliance signs off easily. Agency keeps the skill-authoring IP.
Skill authoring as a service: the real work is building high-quality skills for your domain. Our team designs, tests, and maintains skill libraries for agencies — see AI digital transformation for how we deliver it.
Conclusion
Hermes Agent v0.10 earns its 95.6K stars. It is the first MIT runtime agent where persistent memory, skill reuse, and multi-channel delivery are defaults — not bolt-ons. The three-layer memory architecture is the actual differentiator; the 118 bundled skills, six messaging integrations, browser dashboard, and Fast Mode queues are what make the differentiator feel smooth in daily use.
If you are evaluating agent frameworks in Q2 2026, install Hermes on a VPS, connect it to one Slack channel, and let it accumulate skills for two weeks. The 40% research-task time cut is real. The compounding advantage after three months is what separates agencies who adopted early from those still wiring LangChain from scratch.
Deploy a Self-Improving Agent for Your Team
We design, deploy, and maintain Hermes Agent installations for agencies and in-house teams — from skill authoring to infrastructure.