Hermes Agent v0.10.0: The Self‑Improving AI Assistant That Grew to 105k Stars
A technical deep dive into the architecture, tooling, and explosive growth of Nous Research's open‑source AI agent — now with 105,885 stars, 15,140 forks, and a built‑in learning loop that rewrites its own skills.
Vijayaragupathy
AI Engineer, ML systems builder, and applied agentic workflow developer

Executive Summary
Picture this: An open‑source AI agent that not only answers your questions but creates new skills from experience, improves them during use, and builds a deepening model of who you are across sessions. That’s Hermes Agent — and in the four months since its public release, it has amassed 105,885 GitHub stars and 15,140 forks, making it one of the fastest‑growing AI infrastructure projects of 2026.
Here’s what the numbers tell us:
- 105,885 stars (GitHub API, retrieved 2026‑04‑20) – a growth rate of ~26k stars/month since launch
- 15,140 forks – active community contributing skills, tools, and integrations
- 5,858 open issues – vibrant, high‑velocity development
- Latest release: v2026.4.16 (Hermes Agent v0.10.0) published 2026‑04‑16
- Supported LLM providers: 200+ models via OpenRouter, plus native NVIDIA NIM, Anthropic, OpenAI, Gemini, Hugging Face, and custom endpoints
- Messaging platforms: Telegram, Discord, Slack, WhatsApp, Signal, Email, Home Assistant — all from a single gateway process
This isn’t just another chatbot wrapper. Hermes Agent is a full‑stack AI assistant runtime with a closed learning loop, scheduled automations, and serverless persistence that can hibernate on a $5 VPS and wake on demand.
If you’re building with AI agents today, you need to understand how Hermes works under the hood — because it’s setting the architectural standard for the next generation of autonomous AI.
1. The Architecture: A Modular, Multi‑Turn Engine
Hermes Agent is built around a modular, iteration‑aware core designed for long‑running tool‑calling sessions. The primary entry point is AIAgent.run_conversation (run_agent.py), which manages an outer loop (one user turn) that can involve dozens of inner tool‑calling iterations.
How Requests Flow
```python
# Simplified flow from run_agent.py
while (api_call_count < self.max_iterations
       and self.iteration_budget.remaining > 0):
    # 1. Build dynamic system prompt (identity, platform hints, project context)
    api_messages = self._prepare_messages(messages, active_system_prompt)
    # 2. Call model via provider-specific adapter
    response = self._interruptible_api_call(api_messages, tools=self.tools)
    # 3. Execute tool calls (often in parallel)
    if response.tool_calls:
        results = self._execute_tool_calls(response.tool_calls)
        messages.extend(results)
    else:
        # 4. Final response reached
        return self._finalize_turn(response, messages)
```

Key architectural components:
- `agent/prompt_builder.py` – Assembles the system prompt by layering `SOUL.md` (identity), platform hints (WhatsApp vs. CLI), environment detection, and project‑specific rules (`AGENTS.md`, `.cursorrules`).
- `agent/context_engine.py` – Handles “context pressure” by summarizing or truncating history when tokens exceed 75% of the model’s window.
- `agent/memory_manager.py` – Two‑tier memory: built‑in (`MEMORY.md`, `USER.md`) plus one external provider (e.g., Honcho for vector‑based recall). Prefetches relevant facts before each turn and injects them as a `<memory-context>` block.
- Provider adapters (`agent/gemini_native_adapter.py`, `agent/anthropic_adapter.py`) – Translate internal OpenAI‑spec messages/tools into native provider schemas, with full SSE streaming support for real‑time TTS and UI feedback.
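To make the “context pressure” idea concrete, here is a minimal sketch of the kind of check `context_engine.py` performs. The helper names (`apply_context_pressure`, `summarize`) and the keep-the-last-six-turns policy are illustrative assumptions, not Hermes’s actual implementation:

```python
# Hypothetical sketch of the 75% "context pressure" check; names and the
# retention policy are assumptions for illustration, not Hermes's real API.
PRESSURE_THRESHOLD = 0.75  # summarize once 75% of the model's window is used

def apply_context_pressure(messages, count_tokens, summarize, window=128_000):
    """Summarize the oldest turns when token usage crosses the threshold."""
    used = sum(count_tokens(m["content"]) for m in messages)
    if used <= window * PRESSURE_THRESHOLD:
        return messages  # plenty of headroom: leave history untouched
    # Keep the system prompt and the most recent turns verbatim;
    # collapse everything in between into a single summary message.
    head, recent = messages[:1], messages[-6:]
    middle = messages[1:-6]
    summary = {"role": "system",
               "content": "Summary of earlier turns: " + summarize(middle)}
    return head + [summary] + recent
```

The key design point is that compression is lossy but targeted: recent turns stay verbatim because they carry the active task state, while older turns degrade gracefully into a summary.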
The Learning Loop: How Hermes Improves Itself
Hermes’s most distinctive feature is its closed learning loop. After complex tasks, the agent can:
- Create new skills – Package successful workflows into reusable `.md` skill files
- Improve existing skills – Refine prompts and steps based on execution outcomes
- Nudge itself to persist knowledge – Periodically review conversations and decide what to commit to long‑term memory
- Search its own past – FTS5 session search with LLM summarization for cross‑session recall
This isn’t theoretical. The skill system is already populated with dozens of curated skills across categories: GitHub, Apple ecosystem, data science, creative writing, devops, and more.
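The cross-session recall piece is easy to picture with SQLite’s built-in FTS5. The table layout below is an assumption for illustration, not Hermes’s actual schema:

```python
# Illustrative FTS5-backed session search using the stdlib sqlite3 module;
# the sessions(session_id, transcript) layout is an assumed schema.
import sqlite3

def build_session_index(sessions):
    """Index (session_id, transcript) pairs in an in-memory FTS5 table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id, transcript)")
    db.executemany("INSERT INTO sessions VALUES (?, ?)", sessions)
    return db

def search_sessions(db, query, limit=5):
    """Full-text search over past sessions, best matches first (FTS5 rank)."""
    rows = db.execute(
        "SELECT session_id FROM sessions WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [r[0] for r in rows]
```

In Hermes the matching transcripts would then be handed to the LLM for summarization, so the agent recalls the gist of an old session rather than replaying it token for token.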
2. The Tool System: 40+ Atomic Actions with RPC Parallelism
Tools in Hermes are atomic functions registered via tools/registry.py. Each tool file calls registry.register() at module level, declaring its schema, handler, toolset membership, and availability check.
Example tool definition (simplified):
```python
# tools/exec_tool.py
from typing import Optional

from tools.registry import registry

@registry.register(
    name="exec",
    description="Execute shell commands",
    parameters={
        "command": {"type": "string", "description": "Shell command to run"},
        "workdir": {"type": "string", "optional": True},
    },
    toolset="core",
)
def exec_tool(command: str, workdir: Optional[str] = None) -> dict:
    """Execute a shell command and return stdout, stderr, exit code."""
    # Security checks: path validation, command allow-lists
    # Execution via selected backend (local, Docker, SSH, etc.)
    return {"stdout": ..., "stderr": ..., "exit_code": ...}
```

Security layers:

- `tools/path_security.py` – Validates file‑system access against allow‑lists
- `tools/url_safety.py` – Checks URLs against blocklists before fetching
- `tools/tirith_security.py` – Command‑line argument validation and sandboxing
- Execution backends – Local, Docker, SSH, Daytona, Singularity, Modal (serverless)
RPC toolset – You can write Python scripts that call tools via RPC, collapsing multi‑step pipelines into zero‑context‑cost turns. This is a game‑changer for automation: the agent orchestrates, but the heavy lifting happens outside the LLM’s token window.
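The zero-context-cost idea is worth spelling out. In the sketch below, a script chains tool calls through a dispatcher so intermediate results never enter the model’s context; the `register`/`call_tool` names are illustrative, and Hermes’s real RPC surface may differ:

```python
# Hypothetical sketch of the RPC-toolset pattern: a script-side dispatcher
# that invokes registered tools by name, so multi-step pipelines run outside
# the LLM's token window. In Hermes, call_tool would cross an RPC boundary.
TOOLS = {}

def register(name):
    """Register a function as a tool callable by name."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

def call_tool(name, **params):
    """Dispatch a tool call; only the final result returns to the agent."""
    return TOOLS[name](**params)

@register("word_count")
def word_count(text: str) -> dict:
    return {"count": len(text.split())}

# A pipeline step whose intermediate data never touches the model's context:
result = call_tool("word_count", text="tools stay outside the token window")
```

The agent writes (or reuses) the script once; from the model’s point of view, the whole pipeline collapses into a single tool call and a single result.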
3. The Skill System: Procedural Memory That Grows
Skills are curated procedural knowledge stored as Markdown files with metadata. They differ from tools: a tool is a function; a skill is a guide for when and how to use tools.
Example skill structure (skills/apple/apple‑reminders/SKILL.md):
```markdown
---
name: apple-reminders
description: Manage Apple Reminders via remindctl CLI (list, add, complete, delete).
version: 1.0.0
platforms: [macos]
prerequisites:
  commands: [remindctl]
---
# Apple Reminders

Use `remindctl` to manage Apple Reminders directly from the terminal…
```

When the agent encounters a task that matches a skill’s triggers, it loads the skill’s instructions into context, ensuring consistent, platform‑aware execution. Skills are compatible with the agentskills.io open standard, meaning they can be shared across Hermes instances and other compatible agents.
The skill hub (tools/skills_hub.py) provides discovery, installation, and version management — essentially a package manager for agent capabilities.
4. The Gateway: One Process, Seven Platforms
Hermes’s gateway (gateway/) is a single persistent process that routes messages between the agent core and Telegram, Discord, Slack, WhatsApp, Signal, Email, and Home Assistant. It handles:
- Cross‑platform conversation continuity – Start a task on Telegram, continue on Discord
- Voice‑memo transcription – Send a voice note, get a text response
- Scheduled delivery – Cron‑triggered messages to any platform
- Platform‑specific UI hints – The agent knows whether you’re on a small phone screen or a desktop CLI
This unified gateway is why Hermes feels like a single assistant that lives where you do, not seven different bots.
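A minimal sketch of that routing idea: every platform adapter normalizes its payload into one shared envelope before it reaches the agent core. The `Envelope` shape, field names, and adapter payloads below are illustrative assumptions, not Hermes’s gateway code:

```python
# Hypothetical single-gateway normalization: one process, many platforms,
# one message shape. Field names and payload formats are assumptions.
from dataclasses import dataclass

@dataclass
class Envelope:
    platform: str   # "telegram", "discord", "slack", ...
    user_id: str    # stable identity, so conversations continue cross-platform
    text: str
    ui_hint: str    # e.g. "mobile" vs "desktop", consumed by the prompt builder

def route(raw, platform):
    """Normalize a platform-specific payload into the shared envelope."""
    adapters = {
        "telegram": lambda m: Envelope("telegram", str(m["from"]["id"]), m["text"], "mobile"),
        "discord":  lambda m: Envelope("discord", m["author_id"], m["content"], "desktop"),
    }
    return adapters[platform](raw)
```

Because identity and conversation state key off the envelope rather than the platform, “start on Telegram, continue on Discord” falls out of the design almost for free.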
5. The Deployment Story: $5 VPS to GPU Cluster
Hermes runs on six terminal backends:
| Backend | Use Case | Cost When Idle |
|---|---|---|
| Local | Your laptop | – |
| Docker | Isolated tool execution | – |
| SSH | Remote servers | VPS pricing |
| Daytona | Serverless persistence | ~$0 (hibernates) |
| Singularity | HPC / research clusters | Cluster pricing |
| Modal | Serverless GPU | ~$0 (hibernates) |
The Daytona and Modal integrations are particularly innovative: your agent’s entire environment hibernates when idle and wakes on demand, costing nearly nothing between sessions. You can run a persistent AI assistant on a $5/month VPS — a radically affordable proposition for individuals and small teams.
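The hibernate-on-idle mechanism is conceptually just a watchdog timer. The sketch below shows the shape of it; the class, thresholds, and callbacks are assumptions, not the actual Daytona/Modal integration code:

```python
# Illustrative idle-hibernation watchdog for serverless backends;
# names and the 300-second default are assumptions for illustration.
import time

class IdleWatchdog:
    """Hibernate after a quiet period; wake on the next inbound message."""

    def __init__(self, hibernate, wake, idle_after=300.0):
        self.hibernate, self.wake = hibernate, wake
        self.idle_after = idle_after
        self.last_activity = time.monotonic()
        self.sleeping = False

    def on_message(self):
        if self.sleeping:
            self.wake()          # cold-start the environment on demand
            self.sleeping = False
        self.last_activity = time.monotonic()

    def tick(self, now=None):
        now = time.monotonic() if now is None else now
        if not self.sleeping and now - self.last_activity >= self.idle_after:
            self.hibernate()     # idle cost drops to ~$0 from here
            self.sleeping = True
```

The trade-off is cold-start latency on the first message after a quiet spell, in exchange for near-zero idle cost.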
6. The Research Pipeline: Training the Next Generation
Hermes isn’t just a product; it’s a research platform. The tinker‑atropos/ directory contains:
- Batch trajectory generation – Mass‑produce tool‑calling examples for fine‑tuning
- Atropos RL environments – Reinforcement learning for better tool‑use policies
- Trajectory compression – Distill long sessions into concise training examples
This pipeline feeds back into Nous Research’s model‑training efforts, creating a virtuous cycle: better models make Hermes smarter, and Hermes’s trajectories train better models.
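Trajectory compression, in particular, is easy to picture: keep the user’s goal, the tool calls that succeeded, and the final answer, and drop the retries and dead ends. The record format below is an assumption for illustration, not the `tinker-atropos/` format:

```python
# Hypothetical sketch of trajectory compression: distill a raw session
# into a compact training example. The turn/record shapes are assumed.
def compress_trajectory(turns):
    """Reduce a session to (goal, useful tool calls, final answer)."""
    goal = next(t["content"] for t in turns if t["role"] == "user")
    useful = [t["name"] for t in turns if t["role"] == "tool" and t.get("ok")]
    final = next(t["content"] for t in reversed(turns) if t["role"] == "assistant")
    return {"goal": goal, "tools": useful, "answer": final}
```

Failed tool calls are deliberately excluded here; depending on the training objective, a real pipeline might instead keep them as negative examples.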
Actionable Takeaways
If you’re evaluating AI agent frameworks, here’s what to do next:
- Install and try – `curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash`
- Run the setup wizard – `hermes setup` (configures model, tools, gateway in one flow)
- Explore skills – `/skills` in the CLI, or browse the `skills/` directory
- Test the learning loop – Give it a multi‑step task, then check if a new skill appears
- Deploy somewhere cheap – Try Daytona or Modal for serverless persistence
For developers: The codebase is a masterclass in modular agent architecture. Study agent/run_agent.py for the core loop, tools/registry.py for tool registration, and gateway/ for multi‑platform messaging.
For researchers: The trajectory‑generation tools are ready‑to‑use. If you’re working on tool‑calling or RL‑for‑agents, Hermes provides a production‑grade environment with real‑world complexity.
Conclusion
Hermes Agent represents a paradigm shift from single‑task chatbots to long‑term, self‑improving AI assistants. Its architecture — modular, memory‑aware, platform‑agnostic — is becoming the de facto standard for serious agent projects.
The numbers don’t lie: 105,885 stars in four months is a community shouting that this is the direction the industry is moving. Whether you’re an individual looking for a personal AI assistant, a team building internal automation, or a researcher pushing the boundaries of agent capabilities, Hermes deserves your attention.
One question to leave you with: If your current AI assistant can’t create its own skills, search its past conversations, or run on a $5 VPS, what are you really paying for?
Sources & Acknowledgments
This analysis synthesizes:
- GitHub Repository: NousResearch/hermes‑agent – source code, README, directory structure
- GitHub API: Repository statistics (stars, forks, issues) retrieved 2026‑04‑20
- Release Data: Latest release v2026.4.16 (v0.10.0) published 2026‑04‑16
- Documentation: hermes‑agent.nousresearch.com/docs – official guides and reference
- Code Analysis: Direct examination of `agent/`, `tools/`, `skills/`, `gateway/` directories
- Community Resources: awesome‑hermes‑agent curated list of skills and integrations
Data Points Cited:
- Stars: 105,885 – GitHub API /repos/NousResearch/hermes‑agent
- Forks: 15,140 – same endpoint
- Open issues: 5,858 – same endpoint
- Latest release: v2026.4.16 – GitHub API /repos/NousResearch/hermes‑agent/releases/latest
- Supported providers: 200+ models via OpenRouter – OpenRouter models list
- Skill count: Dozens across categories – directory listing of `skills/`
- Tool count: 40+ – file count in `tools/`, excluding subdirectories
All data accurate as of 2026‑04‑20. Hermes Agent evolves rapidly; check the repository for the latest.