Skip to content

What I'd Steal From Hermes for the Next Orbit

Published: at 10:30 PM

Table of contents

Open Table of contents

Where we left off

The last post ended on a suspended thought: maintaining taste across projects isn’t an audit problem, it’s an infrastructure problem. A single recorded reference gets you one refactor. What I actually want is a system where every audit deposits something reusable, and next quarter’s refactor starts warmer than this one.

Orbit was my first attempt at a stateful agent harness — for bugs, not UI. It worked well enough that I know the shape is right. Reading Hermes’s source this week, I saw a more developed version of the same shape, and figured out exactly what I’d port.

What Orbit got right

Orbit’s core insight was that state has to live in files, not in agent brains. .orbit/tasks.json held every task’s full lifecycle — repo, issue, PR url, verify status, rework attempts — and each agent read and wrote to it. The agents were fungible. The state was not.

That single decision solved most of the pathologies I hit with single-agent scripts. No context pollution between tasks. Independent verifier runs that actually disagreed with the hunter. The ability to kill and restart any stage without losing progress.

It wasn’t novel. Ramp and Stripe had both published versions of the same idea. But it was load-bearing. Without it, everything else I tried to stack on top was noise.

For a bug-hunter, that was enough. For a UI audit workflow, it isn’t.

What Hermes has that Orbit doesn’t

Hermes is the same idea, grown up. Four pieces I didn’t know I needed until I read the code.

What I'd steal from Hermes — Orbit's foundations vs. the four modules to port

Agent loop with four modules: adapter, skills, memory, compressor

The first is the provider adapter layer. Orbit is Claude-only. Hermes has a dedicated abstraction that normalizes Anthropic, Bedrock, Gemini, Copilot-ACP and OpenAI-compatible endpoints into one internal message format. For a bug-hunter that looks like vanity. For an audit tool it isn’t — different providers have different vision capabilities, different context windows, different per-token economics. You want to audit cheap and verify expensive, and the adapter is the thing that lets you swap mid-pipeline without rewriting the loop.

The second is self-creating skills. Orbit tasks are one-shot — the agent finishes a PR and the knowledge evaporates. Hermes writes skills to disk during a session, and those skills reload into context the next time. The UI audit equivalent is obvious once you see it: every refactor produces a pattern-map, and the pattern-map survives into the next audit. Taste stops being something I reconstruct from screenshots each time and starts being something the system accumulates.

The third is structured memory. Not a chatlog. Hermes splits memory into three layers — a static preference file, an FTS5-indexed history of past sessions, and a dialectic user model via Honcho. For audits that maps cleanly onto per-project skeletons, per-competitor taste notes, per-pattern libraries. Each layer has its own access semantics. You don’t cram everything into a vector store and pray.

The fourth is context compression. Orbit kills an agent when it runs long; it isn’t built to think in one thread for hours. Hermes has a dedicated compressor that keeps a single conversation alive across long tool-heavy investigations. For audit work, that’s the difference between a 20-turn session and an all-day one where you’re walking through hundreds of extracted frames.

Where Hermes is overkill

Hermes ships ten-plus messaging platforms, six terminal backends, a full TUI, cron scheduling, voice memo transcription. I don’t need any of that. The honest skeleton of Hermes is maybe 15% of the repo — the agent loop, the adapters, the compressor, the skill plumbing, the memory layers. Everything else is horizontal surface for reach.

Hermes repo: the 15% I'd steal vs. the 85% I'd skip

That’s actually useful. It tells me which files to read and which to ignore. I don’t want to adopt Hermes. I want to steal four modules.

What the next Orbit looks like

The audit harness I’d build is narrower than either project. A thin agent loop, a provider adapter with vision-capability routing, a skill system where every completed audit writes back to a shared pattern library, and a compressor tuned for image-heavy sessions. Nothing else. Not a platform, a harness.

The point isn’t to rebuild Hermes for a smaller domain. It’s that the shape of a useful agent system, once you see it, keeps repeating: state in files, provider abstraction, accumulating skills, bounded memory. Orbit has two of those. Hermes has all four. The audit workflow I want needs all four.

Whether I extend Orbit, fork a minimal Hermes, or write a fresh harness from scratch, TBD. But at least now I know what it has to contain.

The hot take

Hermes is a generalist — a universal agent, built so any domain can bolt on. Orbit is mine, and it’s narrow on purpose. It does one thing: pick up GitHub issues and ship fixes.

But take Orbit and keep expanding its universe — more domains, more tool surfaces, more providers, more persistence — and the limit of that expansion is Hermes. Orbit scaled to infinity is Hermes.

That’s the claim: Hermes is the terminal form of an agent orchestrator. Every domain-specific agent that grows long enough eventually rediscovers the same skeleton — state in files, provider abstraction, accumulating skills, bounded memory. Hermes is what you get when someone has already walked the whole road and left the scaffolding behind for you to reuse.

Which means the interesting question isn’t whether to extend Orbit or adopt Hermes. It’s whether to stay narrow — knowing any expansion drifts toward Hermes — or start from Hermes and cut down, accepting the overkill as the price of standing on the right asymptote.