Harness Architecture Comparison

Key dimensions

The pages in this wiki differ most clearly along five architectural axes: session container, memory substrate, work representation, evaluation loop, and deployment surface.

Comparison table

| System | Session container | Memory substrate | Work representation | Evaluation style | Surface model |
| --- | --- | --- | --- | --- | --- |
| codex-cli | Threads / turns / items via codex-app-server plus cloud-delegated tasks | Session history plus repo docs and shared client state | Plans, docs, tool state, worktrees, automations, plugins | Self-review, GitHub auto-review, and enforced repo checks | CLI, IDE, web, app, cloud, and SDK/Slack control paths |
| claude-code | Fresh sessions with resumable artifacts, custom subagents, and experimental agent teams across separate sessions | CLAUDE.md, auto memory, feature lists, progress logs | Sprint contracts, subagents, agent teams, scheduled tasks, and explicit pass/fail features | Separate evaluator plus CI/review integrations and hooks | Terminal, IDE, desktop, browser, and remote-control surfaces |
| hermes-agent | Persistent multi-platform conversations and gateway-backed sessions | Searchable memory, skills, user modeling, API-backed reuse | Tasks, skills, cron jobs, profiles | Tool-driven verification and memory reuse | CLI, messaging, MCP, and OpenAI-compatible HTTP frontends |
| memento-skills | Persistent local sessions plus per-user IM sessions and stateful prompts | Structured markdown skills, local/vector/db skill stores, and layered runtime configuration | Retrieved skills, generated skills, reflection-driven rewrites, skill market downloads | Reflection loop plus static and execution-oriented skill verification | CLI, desktop GUI, local sandbox, and IM gateway surfaces |
| gas-town | Swarm sessions across named roles | Beads in Git / Dolt | Beads, epics, molecules, formulas, wisps | Human plus role-based oversight | tmux-style orchestrator/factory |
| gas-city | Modular orchestration nodes | Beads plus Wasteland federation | Builder primitives and wanted-board exchange | Federated trust and validator roles | Custom topologies over shared protocols |
| openclaw | Persistent service runtime | Workspace files, long-lived agent state, and layered skills | Embedded runtime plus ecosystem skills and integrations | Less explicit in current corpus | Cross-channel background service with a single main workspace |
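
The comparison above can also be treated as data: one record per system, one field per axis. The following is a minimal sketch of that idea, not part of any of the harnesses described. The names `HarnessProfile`, `PROFILES`, `axis_view`, and `differing_axes` are hypothetical and introduced only for illustration; the cell values are copied from the gas-town and gas-city rows.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HarnessProfile:
    """One row of the comparison table (hypothetical record type)."""
    system: str
    session_container: str
    memory_substrate: str
    work_representation: str
    evaluation_style: str
    surface_model: str


# Two rows transcribed from the table; other systems would be added the same way.
PROFILES = {
    "gas-town": HarnessProfile(
        system="gas-town",
        session_container="Swarm sessions across named roles",
        memory_substrate="Beads in Git / Dolt",
        work_representation="Beads, epics, molecules, formulas, wisps",
        evaluation_style="Human plus role-based oversight",
        surface_model="tmux-style orchestrator/factory",
    ),
    "gas-city": HarnessProfile(
        system="gas-city",
        session_container="Modular orchestration nodes",
        memory_substrate="Beads plus Wasteland federation",
        work_representation="Builder primitives and wanted-board exchange",
        evaluation_style="Federated trust and validator roles",
        surface_model="Custom topologies over shared protocols",
    ),
}


def axis_view(axis: str) -> dict[str, str]:
    """Return one column of the table: system name -> value on the given axis."""
    return {name: getattr(profile, axis) for name, profile in PROFILES.items()}


def differing_axes(a: HarnessProfile, b: HarnessProfile) -> list[str]:
    """List the axes (excluding the system name) on which two profiles differ."""
    return [
        f.name
        for f in fields(HarnessProfile)
        if f.name != "system" and getattr(a, f.name) != getattr(b, f.name)
    ]


if __name__ == "__main__":
    print(axis_view("memory_substrate"))
    print(differing_axes(PROFILES["gas-town"], PROFILES["gas-city"]))
```

Reading a single column with `axis_view` makes it easier to compare where each system keeps its state without scanning entire rows. The comparison in `differing_axes` is plain string equality, so it only flags textual differences between cells, not architectural similarity.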

Main architectural lesson

The important divergence is not “which model is best” but where state lives and how work is represented. Codex externalizes protocol boundaries, Claude externalizes handoff artifacts, Hermes externalizes personal memory and skills, Memento-Skills externalizes learning itself into a writable skill library, and Gas Town externalizes the work graph.

Read this page after agent-harness-anatomy and alongside orchestration-topologies, memory-persistence, and work-management-primitives. It is also the factual substrate for new-harness-design-notes.