Hermes Agent's Self-Improving Skills: What They Are and Why They Compound

Jul 2, 2026

Problem

I keep hearing that Hermes Agent has a “self-improving skills” feature that Codex does not, and that it is the reason some people run Hermes on top of a $200/mo Codex sub. The Reddit descriptions are vague. The most concrete claim I can find is one user saying the value “compounds more than 40 days.” That sounded like marketing until I dug into what the mechanism actually is.

The architecture I am talking about:

Five layers of an AI coding harness: context manager, tool/permission system, loop/scheduler, provider/model adapter, UI

The relevant layer is the top one, the context manager. A skill in Hermes is a persisted artifact in that layer — a reusable prompt + tool template that the agent invokes by name instead of rebuilding from scratch.

Environment

Hermes Agent (current main branch, Nous Research)
~40 days of regular use to reach the compounding zone
Skill files stored under ~/.hermes/skills/
An LLM good enough that the auto-generated skills are not garbage

What “skills” actually are

A skill is a YAML/markdown file with a name, a trigger description, a prompt template, and a tool whitelist. Here is what Hermes might auto-generate if it watched me do my morning PR digest for a week:

---
name: morning_pr_digest
description: |
  Use when the user asks for a morning GitHub PR digest.
  Trigger phrases: "morning PRs", "what's open", "PR digest".
triggers:
  - "morning pr digest"
  - "what's open in my repos"
tools:
  - github.list_open_prs
  - github.get_pr_diff
  - slack.post_message
prompt: |
  1. List open PRs across the user's watched repos from the last 24h.
  2. Group by repo, sort by age desc.
  3. Post a digest to #engineering with PR title, author, age, and review status.
  4. Do not include merged or draft PRs.
---

The next morning I type “morning PRs” and Hermes already knows the four-step plan. No re-prompt, no copy-paste, no custom slash command. That is the unit of value.
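To make the "invoke by name" idea concrete, here is a minimal sketch of how a loaded skill could be matched against what I type. The `Skill` class, the `find_skill` helper, and the case-insensitive substring rule are my own assumptions for illustration, not Hermes's actual matcher.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Skill:
    """In-memory view of one skill file (field names mirror the YAML above)."""
    name: str
    triggers: List[str]
    tools: List[str] = field(default_factory=list)
    prompt: str = ""


def find_skill(user_input: str, skills: List[Skill]) -> Optional[Skill]:
    """Return the first skill whose trigger phrase appears in the input.

    Assumption: matching is a case-insensitive substring test.
    """
    text = user_input.lower()
    for skill in skills:
        if any(trigger.lower() in text for trigger in skill.triggers):
            return skill
    return None


digest = Skill(
    name="morning_pr_digest",
    triggers=["morning prs", "what's open", "pr digest"],
    tools=["github.list_open_prs", "github.get_pr_diff", "slack.post_message"],
    prompt="List open PRs, group by repo, post a digest to #engineering.",
)

match = find_skill("Morning PRs please", [digest])
print(match.name if match else "no skill matched")  # morning_pr_digest
```

A real matcher would more likely hand the description to the model and let it decide, but the substring version shows why a short phrase is enough once the file exists.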

Three mechanisms that make it compound

Auto-generation. After seeing the same pattern in my prompts and tool calls a few times across a week or two, Hermes writes the skill file on its own and registers it. I do nothing.
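I do not know how Hermes detects a repeated pattern internally. As a rough sketch of the idea, assume each session is reduced to its ordered list of tool calls and a candidate skill is proposed once the same sequence has shown up a few times. The `repeated_patterns` function and the `min_repeats` threshold are hypothetical.

```python
from collections import Counter
from typing import List


def repeated_patterns(sessions: List[List[str]], min_repeats: int = 3) -> List[List[str]]:
    """Return tool-call sequences seen in at least `min_repeats` sessions."""
    counts = Counter(tuple(calls) for calls in sessions)
    return [list(seq) for seq, n in counts.items() if n >= min_repeats]


digest_calls = ["github.list_open_prs", "github.get_pr_diff", "slack.post_message"]
sessions = [
    digest_calls,
    ["github.list_open_prs", "slack.post_message"],  # a one-off variation
    digest_calls,
    digest_calls,
]

candidates = repeated_patterns(sessions)
print(candidates)
# [['github.list_open_prs', 'github.get_pr_diff', 'slack.post_message']]
```

Exact-sequence counting is the crudest possible detector. It only illustrates why a week or two of repetition is needed before anything gets written.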

Self-pruning. Skills that I stop invoking get demoted or removed. The library does not grow unboundedly. Without this, the system becomes a junk drawer (one Reddit user, u/DamonGilbert1024, reported Hermes creates “useless skills” that clutter everything). Pruning is what keeps the library useful.
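A pruning rule can be sketched in a few lines. The two thresholds borrow their names from the config keys shown later in this post; how Hermes actually combines them is an assumption on my part. Here a skill is flagged when it has fewer than `min_invocations_to_keep` invocations inside the last `prune_after_days` days.

```python
from datetime import date, timedelta
from typing import Dict, List


def skills_to_prune(
    invocations: Dict[str, List[date]],
    today: date,
    prune_after_days: int = 30,
    min_invocations_to_keep: int = 2,
) -> List[str]:
    """Return names of skills with too few invocations inside the window."""
    cutoff = today - timedelta(days=prune_after_days)
    stale = []
    for name, dates in invocations.items():
        recent = sum(1 for d in dates if d >= cutoff)
        if recent < min_invocations_to_keep:
            stale.append(name)
    return stale


history = {
    "morning_pr_digest": [date(2026, 6, 30), date(2026, 7, 1)],
    "one_off_rename": [date(2026, 5, 1)],  # hypothetical skill, used once
}

print(skills_to_prune(history, today=date(2026, 7, 2)))  # ['one_off_rename']
```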

/learn command. A user-driven shortcut for the same flow. For example, I type: /learn When I say "wrap PR", I mean: check the diff, run the linter, push to the branch, and post a comment with the test output. Hermes extracts the pattern, writes a skill, and confirms back. This is the on-ramp I used for the first two weeks instead of waiting for the auto-version to fire.

Why the 40-day curve is real

In week one, the library is mostly empty. I have two or three auto-generated skills, and they are obvious enough that I could have written them as a custom slash command in less time. Hermes feels slower than Codex for the same prompt. That is exactly what u/Ray_Smith reported in the same thread.

By week four or five, the library contains enough of my real workflows that the agent can do in one turn what previously required a multi-paragraph prompt. That is the “mind blowing” part u/Honest_Union_8731 was talking about. The slope is steep because every new skill makes the next one easier to write — the agent has more context to copy patterns from.

How to control the pipeline

skills:
  auto_generate: true
  prune_after_days: 30
  min_invocations_to_keep: 2
  require_user_approval: true   # safer, fewer surprises
  learn_command_enabled: true   # exposes /learn
profiles:
  - name: chief_of_staff
    skills: [morning_pr_digest, inbox_triage, daily_brief]
  - name: code_reviewer
    skills: [wrap_pr, lint_diff, post_review]

require_user_approval: true is the toggle that prevents the junk-drawer failure mode. The agent still writes candidate skills, but it has to show me the file before registering it. That cuts the noise dramatically.
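The approval gate amounts to one extra check before registration. The sketch below is hypothetical: `register_candidate`, the `registry` dict, and the `approve` callback (a stand-in for me reading the file and answering yes or no) are illustration names, not Hermes APIs.

```python
from typing import Callable, Dict


def register_candidate(
    registry: Dict[str, str],
    name: str,
    skill_file: str,
    approve: Callable[[str], bool],
    require_user_approval: bool = True,
) -> bool:
    """Add a candidate skill to the registry, asking for approval if required."""
    if require_user_approval and not approve(skill_file):
        return False  # rejected candidates never reach the library
    registry[name] = skill_file
    return True


registry: Dict[str, str] = {}

# Stand-in for a human decision: reject anything that wants to post to Slack.
def cautious_reviewer(text: str) -> bool:
    return "slack.post_message" not in text

accepted = register_candidate(
    registry, "lint_diff", "tools:\n  - linter.run\n", cautious_reviewer
)
rejected = register_candidate(
    registry, "spam_channel", "tools:\n  - slack.post_message\n", cautious_reviewer
)
print(accepted, rejected, sorted(registry))  # True False ['lint_diff']
```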

The profiles block is the second compounding lever. Once the library is large, I can bind subsets of skills to named personas — chief of staff uses the morning and inbox skills, code reviewer uses the PR skills — and Hermes behaves like multiple specialists from one process.
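Binding a profile to a subset of the library is a simple lookup. In this sketch the profile names and skill lists are copied from the config above, while `active_skills` and the rule of silently skipping skills missing from the library are my assumptions.

```python
from typing import Dict, List, Set

PROFILES: Dict[str, List[str]] = {
    "chief_of_staff": ["morning_pr_digest", "inbox_triage", "daily_brief"],
    "code_reviewer": ["wrap_pr", "lint_diff", "post_review"],
}


def active_skills(profile: str, library: Set[str]) -> List[str]:
    """Return the profile's skills that actually exist in the library."""
    return [name for name in PROFILES.get(profile, []) if name in library]


# Skills currently on disk; post_review and daily_brief are not learned yet.
library = {"morning_pr_digest", "inbox_triage", "wrap_pr", "lint_diff"}

print(active_skills("code_reviewer", library))  # ['wrap_pr', 'lint_diff']
```

Keeping the profile as a list of names rather than copies of the skills means a profile improves automatically whenever one of its skills is rewritten.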

Common mistakes

Expecting value in week one. The curve is real; if you bail at day 7 you will conclude the feature is vapor.
Never running /learn explicitly, then complaining the auto-version “did nothing.” The explicit command is the friendlier on-ramp.
Letting bad auto-generated skills pile up. The pruning cadence matters; review weekly.
Treating the skill layer as a replacement for the underlying model. The skill layer wraps a model. A weak model still produces weak skills.
Using Hermes only for coding, then wondering why the library never grows beyond git-related patterns.

Summary

In this post, I explained Hermes Agent’s self-improving skills feature. The key points: skills are persisted prompt + tool templates, the agent writes them for you after observing repeated patterns, the library self-prunes to stay useful, and the /learn command is the explicit on-ramp. Expect a 30–40 day ramp before the value is obvious, run /learn for the first few weeks, and prune aggressively.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit Discussion: Struggling to see the use case for Hermes over just Codex
👨‍💻 Nous Research Hermes Agent repository

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!