Hermes Agent v0.10: Self-Improving Open-Source AI Agent
Nous Research's Hermes Agent v0.10 (April 16, 2026) ships 118 skills, three-layer memory, six messaging integrations, and a closed learning loop.
Key Takeaways
Nous Research shipped Hermes Agent v0.10.0 on April 16, 2026 — the release that turned the framework from an interesting experiment into the fastest-growing open-source agent stack of 2026. Seven weeks after its February 25 debut, Hermes Agent crossed 95,600 GitHub stars, growth previously matched only by LangChain and AutoGen combined. The pull is structural: Hermes is the first production-grade agent framework where a closed learning loop is not a feature, it is the runtime.
This guide is the technical reference for evaluating, installing, and productionising Hermes Agent v0.10. It covers the three-layer memory architecture, the closed learning loop that converts sessions into reusable Markdown skills, the 118-skill bundled catalog, the six-channel messaging gateway, Fast Mode priority queuing, and the decision matrix versus OpenClaw and Codex CLI — plus the deployment pattern agencies are using to ship Hermes on client infrastructure under the MIT license.
Why self-improvement matters: stateless agents solve the same problem from scratch every time. Hermes writes a skill document after a 5+ tool-call task, stores it with FTS5 full-text search, and loads it instantly when a similar task fires. Agencies report 40% research-task time cuts after two weeks of runtime — the compounding advantage is real.
Why Hermes Matters in 2026
The agent framework category in 2026 splits into three tiers. Tier 1 is hosted-only (OpenAI Agents, Anthropic Agents): excellent defaults but no self-hosting. Tier 2 is orchestration libraries (LangChain, CrewAI, AutoGen): flexible but stateless per-run by default. Tier 3 is runtime agents that ship with persistent memory, learning, and deployment in the same binary. Until Hermes, Tier 3 was closed-source. Hermes is the first fully MIT-licensed Tier 3 runtime.
| Capability | Hermes Agent | LangChain / CrewAI | Hosted (OpenAI / Anthropic Agents) |
|---|---|---|---|
| Cross-session memory | Built-in, SQLite + FTS5 | DIY — vector store wiring required | Provider-locked, limited depth |
| Skill reuse across runs | Auto-generated Markdown | Manual chain authoring | Custom assistants only |
| Self-hostable (MIT) | Yes | Yes | No |
| Multi-provider routing | 10+ providers incl. OpenRouter | Yes | Single-provider |
| Built-in messaging gateway | Telegram, Discord, Slack, WhatsApp, Signal, CLI | DIY integrations | Platform-specific |
v0.10.0 Release Facts
- **Ship date:** April 16, 2026. Minor version. Additive: no breaking changes for existing Hermes installations, but several features only light up after re-running the install script to pick up updated dependencies.
- **118 bundled skills:** MLOps, GitHub workflows, research pipelines, web scraping, code execution, and agentskills.io-standard skills shipped in the default install. Browse the full set with `hermes skills list`.
- **Memory:** SQLite + FTS5 backs layers 2 and 3; layer 1 is in-process. Retrieval latency stays under 10ms across 10K+ skill documents.
- **Messaging gateway:** all six channels run through a single gateway process. Sessions and skills are shared across channels: start a task in Slack, finish it in Telegram.
- **Fast Mode and dashboard:** toggle Fast Mode via `/fast`. The new dashboard at `localhost:7777` surfaces session history, skill catalog, and gateway config without editing files.
Three-Layer Memory Architecture
The memory system is what makes Hermes feel different in practice. Every agent claims “memory” — most mean a vector store bolted on. Hermes ships three layers that serve different purposes and are backed by different storage:
| Layer | Purpose | Storage | Lifetime |
|---|---|---|---|
| L1 — Session context | Current conversation buffer; tool outputs; scratch. | In-process | Session |
| L2 — Persistent store | Completed task outcomes, generated skill files, user notes. | SQLite + FTS5 | Forever (backed by ~/.hermes/) |
| L3 — User model | Preferences, coding style, timezone, tone, frequent collaborators. | SQLite JSON field | Drift-adjusted across sessions |
Retrieval is FTS5 full-text search plus LLM summarisation. The combination keeps latency at roughly 10ms for 10,000+ skill documents — the inflection point at which most vector-DB architectures start showing tail latency. Nous Research explicitly chose SQLite + FTS5 over pgvector for the embedded-first deployment story; the whole agent ships as files under ~/.hermes/.
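The L2 retrieval path described above maps directly onto SQLite's FTS5 extension. The sketch below is illustrative only: the `skills` table name, its columns, and the query are assumptions, not Hermes internals.

```python
import sqlite3

# Hypothetical shape of the L2 skill store: an FTS5 virtual table
# holding skill names and bodies. Not Hermes's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE skills USING fts5(name, body)")
conn.executemany(
    "INSERT INTO skills (name, body) VALUES (?, ?)",
    [
        ("gsc-weekly-regression-audit", "Pull GSC data and flag ranking drops"),
        ("pr-triage", "Label and prioritise incoming pull requests"),
    ],
)

# bm25() scores matches (lower is better); the best-scoring skill
# would be prepended to the agent's context before planning.
row = conn.execute(
    "SELECT name FROM skills WHERE skills MATCH ? ORDER BY bm25(skills) LIMIT 1",
    ("ranking drops",),
).fetchone()
print(row[0])  # → gsc-weekly-regression-audit
```

Keyword search over a few thousand short documents is exactly the workload FTS5 was built for, which is why the embedded-first choice holds up at this scale.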
Backup pattern: ~/.hermes/ is the entire agent state. Snapshot nightly to an encrypted S3 bucket (or the client's equivalent). Restoring an agent is rsync + restart — no re-indexing, no re-embedding.
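Because the state is plain files, the nightly snapshot can be a dated archive. A minimal local sketch using only the standard library; the encrypt-and-upload step is left as a comment because the S3 tooling is deployment-specific:

```python
import tarfile
import time
from pathlib import Path

def snapshot_agent_state(state_dir: str, backup_dir: str) -> Path:
    """Archive the whole agent state directory, one file per nightly run.

    A local sketch only: in production you would encrypt this archive and
    push it to an S3 bucket (boto3, rclone, or the client's equivalent).
    """
    stamp = time.strftime("%Y%m%d")
    out = Path(backup_dir) / f"hermes-state-{stamp}.tar.gz"
    with tarfile.open(out, "w:gz") as tar:
        # arcname keeps the restore path predictable: untar, rename, restart.
        tar.add(state_dir, arcname=".hermes")
    return out
```

Restore really is the inverse: extract the archive to the home directory and restart the agent.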
The Closed Learning Loop
This is the piece that distinguishes Hermes. After a task crosses a complexity threshold (Nous Research cites “typically five or more tool calls”), the agent writes a Markdown skill document automatically. The skill captures three things: the procedure it followed, known pitfalls it encountered, and verification steps it took. Next time a similar task fires, the skill surfaces through FTS5 and the agent starts from that file rather than reasoning from scratch.
The loop in six steps
- Task enters — user prompt or gateway message.
- Skill search — FTS5 query against the 118 bundled + your created skills; top matches prepended to context.
- Plan + execute — agent drafts a plan, runs tool calls (up to 8 parallel via `ThreadPoolExecutor`).
- Verify — agent runs explicit verification steps (check outputs, compare to expected state).
- Skill generation — if task was complex and new, write a skill document; if the task refined an existing skill, update it.
- Memory update — outcome logged to L2; L3 user model nudged based on preferences surfaced during the run.
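The bounded parallelism in step 3 can be sketched with the standard-library primitive the text names. `run_tool` and the call format here are hypothetical stand-ins for the agent's real dispatcher:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tool(call: dict) -> dict:
    # Stand-in for the real dispatcher, which would invoke the named tool.
    return {"tool": call["tool"], "status": "ok"}

def execute_plan(tool_calls: list, max_parallel: int = 8) -> list:
    """Fan independent tool calls out across a bounded thread pool."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() preserves input order, so results line up with the plan.
        return list(pool.map(run_tool, tool_calls))

results = execute_plan([{"tool": "gsc-fetch"}, {"tool": "table-diff"}])
```

A thread pool (rather than asyncio) is a sensible fit when tool calls are a mix of I/O-bound API requests and blocking subprocess work.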
Example skill file
A generated skill is a plain Markdown file. You can read it, edit it, delete it, or commit it to your agency's shared repo. Here is a representative shape:
```markdown
---
name: gsc-weekly-regression-audit
description: Pull GSC data, identify week-over-week ranking drops, flag cause hypotheses
tools: [search-console, gsc-fetch, table-diff]
created: 2026-04-17
updated: 2026-04-18
---

## Procedure

1. Authenticate to Search Console via service account creds stored in Vault
2. Fetch last 14 days of query + page performance; pivot on query
3. Compute delta_position; filter where delta_position < -3 AND impressions > 100
4. For each flagged query, run core-update overlap check (see also: information-gain-audit)
5. Return ranked list with cause hypothesis column

## Pitfalls

- GSC data is 48-hour lagged; always exclude the last two days
- Position averages hide device splits — always segment desktop vs mobile

## Verification

- Spot-check 3 flagged queries against Semrush or Ahrefs
- Confirm at least one hypothesis per query before returning to user
```
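Because skills are plain Markdown with simple frontmatter, an agency's shared-repo tooling can index them in a few lines. The parser below handles only the flat `key: value` shape shown above; it is a sketch for illustration, not Hermes's own loader:

```python
def parse_skill(text: str) -> dict:
    """Split a skill file into frontmatter metadata and Markdown body.

    Assumes the simple `key: value` frontmatter shown in the example;
    nested YAML would need a real YAML parser.
    """
    _, frontmatter, body = text.split("---", 2)
    meta = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return {"meta": meta, "body": body.strip()}

skill = parse_skill("""---
name: gsc-weekly-regression-audit
description: Pull GSC data, identify week-over-week ranking drops
---
## Procedure
1. Authenticate to Search Console
""")
```

Tooling like this is how a team lints, reviews, and version-controls its skill library alongside the rest of its code.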
The 118-Skill Bundled Catalog
v0.10.0 triples the out-of-the-box capability surface versus v0.8. The bundled catalog is grouped into functional families:
| Family | Representative skills | Agency use case |
|---|---|---|
| MLOps | Model eval runner, dataset diff, prompt regression | Internal model governance; pre-launch QA |
| GitHub workflows | PR triage, issue labeling, changelog generation | Agency repo hygiene for client projects |
| Research pipelines | Competitive intel pull, citation grader, SERP diff | Weekly competitor monitoring |
| Web scraping | Headless page capture, schema extraction, change detection | Price / messaging / content change tracking |
| Code execution | Sandboxed Python runner, dependency resolver, test runner | Data cleaning, one-off migrations |
| agentskills.io standard | Third-party-compatible skill manifests | Portable skill libraries across teams |
Custom skill rollout: build a private agency skill registry and point Hermes at it during install. Our AI digital transformation team builds and maintains these for agencies — compounding skills are the actual competitive moat.
The Unified Messaging Gateway
One of the most under-sold v0.10 features is the gateway. It is a single long-running Python process that connects Hermes to Telegram, Discord, Slack, WhatsApp, Signal, and the local CLI — simultaneously and from one place. Sessions and skills are shared across channels, so a task started in Slack can be resumed in Telegram without any state transfer.
Channel configuration
Each channel requires an API token in ~/.hermes/config.toml. The dashboard at localhost:7777 now surfaces a form for each — no more manual TOML edits for new channels in v0.10.
```toml
# ~/.hermes/config.toml
[gateway]
enabled = true

[gateway.telegram]
bot_token = "${TELEGRAM_BOT_TOKEN}"
allowed_user_ids = [12345678]

[gateway.slack]
bot_token = "${SLACK_BOT_TOKEN}"
signing_secret = "${SLACK_SIGNING_SECRET}"
default_channel = "#hermes-agent"

[gateway.discord]
bot_token = "${DISCORD_BOT_TOKEN}"
guild_id = 1234567890
```

Launch the gateway with `hermes gateway start`. The process is designed to run as a systemd service or inside Docker — the GitHub repo ships reference unit files for both.
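For systemd deployments, a representative unit might look like the following. This is a hypothetical sketch of the shape only; the user, paths, and hardening options are assumptions, so prefer the reference unit files in the repo.

```ini
# /etc/systemd/system/hermes-gateway.service (illustrative sketch)
[Unit]
Description=Hermes Agent messaging gateway
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=hermes
ExecStart=/home/hermes/.local/bin/hermes gateway start
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Running under systemd gives you automatic restarts and journald logs, which is what you want for a process holding six live channel connections.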
Installation and First Run
Hermes ships an install script that provisions Python 3.11 via uv, creates a virtual environment, installs dependencies, registers the hermes CLI on your PATH, and initialises ~/.hermes/. No sudo required.
macOS / Linux / WSL2
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Verify
hermes --version
# → hermes-agent 0.10.0

# First-run configuration wizard
hermes init

# Install a provider key (pick one or more)
hermes provider add openrouter --key "sk-or-v1-..."
hermes provider add anthropic --key "sk-ant-..."

# Launch interactive session
hermes chat
```

Windows users: native Windows is not supported. Install WSL2 Ubuntu, run the install script inside it, and launch the CLI from the WSL shell. Nous Research has been clear this stance will not change in 2026.
System requirements
- Python 3.11+ (installed by `uv` automatically)
- ~500 MB disk at install; `~/.hermes/` grows with skills and session history
- Network egress to your chosen LLM provider(s)
- Optional: Docker for running the gateway in production
LLM Provider Matrix + Fast Mode
Hermes is deliberately multi-provider. v0.10 supports Nous Portal, OpenRouter (200+ models), NVIDIA NIM (Nemotron), Xiaomi MiMo, z.ai/GLM, Kimi/Moonshot, MiniMax, Hugging Face, OpenAI, and custom endpoints. The routing layer lets you set per-skill provider preferences — run research skills on cheap long-context models and code-execution skills on faster coding models.
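Per-skill routing lives in `~/.hermes/config.toml` alongside the gateway config. The stanza below is a hypothetical illustration of the idea; the key names and model IDs are assumptions, not the documented schema.

```toml
# Illustrative sketch only — consult the project docs for the real schema.
[routing.default]
provider = "openrouter"

[routing.skills.research-pipelines]
provider = "openrouter"   # cheap long-context model for research skills
model = "long-context-budget-model"

[routing.skills.code-execution]
provider = "anthropic"    # faster coding model for sandboxed execution
model = "fast-coding-model"
```

The point of the pattern is cost shaping: route the token-heavy, latency-tolerant work to cheap models and keep the fast lane for interactive coding.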
Fast Mode
Toggle via /fast in the CLI or gateway. Fast Mode reroutes OpenAI and Anthropic requests through priority queues, reducing tail latency on supported models. It does not change the underlying model — only the delivery path — so quality, context windows, and rate limits are identical to the default lane. Use it when a human is waiting; don't use it for background batch work.
Hermes vs. OpenClaw vs. Codex CLI
The three open agent CLIs that matter in April 2026. A fuller benchmark comparison lives in the dedicated OpenClaw vs Hermes vs Codex CLI post; here is the condensed decision matrix.
| Dimension | Hermes Agent | OpenClaw | Codex CLI |
|---|---|---|---|
| Persistent memory | Built-in, three-layer | None | Limited (session only) |
| Skill reuse | Auto-generated Markdown | Per-run only | No |
| Coding throughput | High | Highest (822B OpenRouter tokens) | Very high |
| Multi-provider | 10+ providers | Broad | OpenAI only |
| Messaging channels | 6 built-in | CLI only | CLI only |
| Best for | Cross-session agency workflows | One-shot heavy coding | Polished OpenAI-native setup |
Production Patterns for Agencies
Hermes is MIT-licensed and self-hosts cleanly, which makes it the first framework where “deploy per client” is both technically and legally straightforward. Two patterns we see working:
1. Internal-only agency agent
Deploy Hermes on an agency-owned VPS. Connect the gateway to an internal Slack workspace. Ship custom skills for recurring client deliverables (audits, performance reports, competitive research). Back up ~/.hermes/ nightly to an encrypted bucket.
Why it works: the agency's accumulated skill library becomes a compounding moat. Six months in, new hires onboard in a week because the skills encode the playbook.
2. Per-client dedicated agent
For enterprise clients, deploy a dedicated Hermes inside the client's infrastructure. Gateway connects to their Slack or Teams. Credentials live in their secret store. Hermes never phones home — everything stays under MIT inside the client's network.
Why it works: client data never leaves their boundary. Compliance signs off easily. Agency keeps the skill-authoring IP.
Skill authoring as a service: the real work is building high-quality skills for your domain. Our team designs, tests, and maintains skill libraries for agencies — see AI digital transformation for how we deliver it.
Conclusion
Hermes Agent v0.10 earns its 95.6K stars. It is the first MIT runtime agent where persistent memory, skill reuse, and multi-channel delivery are defaults — not bolt-ons. The three-layer memory architecture is the actual differentiator; the 118 bundled skills, six messaging integrations, browser dashboard, and Fast Mode queues are what make the differentiator feel smooth in daily use.
If you are evaluating agent frameworks in Q2 2026, install Hermes on a VPS, connect it to one Slack channel, and let it accumulate skills for two weeks. The 40% research-task time cut is real. The compounding advantage after three months is what separates agencies who adopted early from those still wiring LangChain from scratch.
Deploy a Self-Improving Agent for Your Team
We design, deploy, and maintain Hermes Agent installations for agencies and in-house teams — from skill authoring to infrastructure.