Nightly Src Projects Desk Raw Survey (2026-05-09)

This raw note preserves the public-safe basis for the 2026-05-09 nightly src/ projects desk. It summarizes inspectable repository evidence only: docs, manifests, branch and commit metadata, status summaries, safe filenames, mtimes, tests, and visible checked-in artifacts. It does not publish secret-bearing files, local settings, raw prompts/logs/trajectories, private corpus bodies, evaluator payloads, raw benchmark outputs, checkpoint/model artifacts, or sensitive/provocative material.

Where a directory name or corpus is itself sensitive enough to distort a public page, this note uses a category label rather than turning the label into a headline. Journalism, in miniature, still requires not dumping the drawer onto the pavement.

Survey scope and method

Survey root: /Users/ericfode/src.
Survey timestamp: 2026-05-09.
Full top-level directory count: 38, including hidden directories.
Execution shape: exactly 10 top-level Hermes survey lanes, dispatched as 3 + 3 + 3 + 1.
Lane recursion: all 10 lane summaries reported that delegate_task was available. Each lane spawned a three-way survey team covering purpose/docs/manifests, live work evidence, and public-safety eligibility. Those subteams reported one further level of three-way leaf recursion; recursion beyond the leaf workers was unavailable or unnecessary because of the configured depth cap.
Evidence allowed: README/docs/plans, manifests, branch/status/log metadata, safe modified/untracked filenames, mtimes, tests, checked-in reports, and visible artifacts.
Evidence excluded: secret contents, .env contents, local settings, raw prompts/logs/trajectories, hidden evaluator/supervisor payloads, private corpus bodies, explicit/provocative/unsafe material, raw benchmark outputs, checkpoints/model artifacts, and directories too skeletal for a responsible public claim.
Illustration: generated locally as symbolic SVG editorial art at queries/news-assets/2026-05-09-project-desk-hero.svg; it is not a screenshot.
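
The execution shape and lane recursion above can be sketched in a few lines. This is an illustration only: the lane names, the batch size of 3 (inferred from the 3 + 3 + 3 + 1 shape), and the assumption that every team member fanned out exactly three ways at the leaf level are not stated by the lane summaries.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive batches of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


# Hypothetical lane names; the survey only states that there were 10 lanes.
LANES = [f"lane-{n:02d}" for n in range(1, 11)]
BATCH_SIZE = 3  # assumption: inferred from the 3 + 3 + 3 + 1 dispatch shape
TEAM_WIDTH = 3  # purpose/docs/manifests, live work evidence, public-safety eligibility
LEAF_WIDTH = 3  # assumption: each team member spawned three leaf workers

waves = list(batches(LANES, BATCH_SIZE))
wave_sizes = [len(wave) for wave in waves]
team_workers = len(LANES) * TEAM_WIDTH
leaf_workers = team_workers * LEAF_WIDTH

print(wave_sizes)    # [3, 3, 3, 1]
print(team_workers)  # 30
print(leaf_workers)  # 90
```

Under those assumptions the depth cap bounds the survey at 30 team workers and 90 leaf workers; the real counts may be lower if some subteams did not recurse.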

Ten survey lanes

Hidden local settings; hidden upstream tinygrad research checkout; another-harness; one sensitive social-claim wiki withheld by category.
basis; basis-hermes; basis-jcode; cardgame1 / Dungeon Steward.
Empty creative; deer-flow; FACEMUSIC; gas-city-but-its-just-codex.
gemma4-tinygrad-opt; handterm; hoid; is-codex-better.
is-it-formal; justfooln; kettlebellsim; skeletal local Kimi settings directory.
Local langfuse; local-hermes; scratch meta-hermes; nnpl-external-latent-bus.
nnpl-shared-bus; nnpl-typed-boundary-ir; openai-symphony; empty overengineeredlife.
silly-pi-stuff; private spec-dataset-evolution-corpus; nested src skill scaffold; steward.
testing-rl; testing-rl-hermes; empty tinygrad.
tinygrad-gemma; empty tinygrad-gemma-gemini; tinygrad-gemma-kimi.
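
The lane sizes above (eight lanes of four directories, then two of three) match what a plain balanced split of 38 items into 10 groups produces. A minimal sketch, assuming the lanes were cut from one ordered directory listing; the function names and placeholder directory names are hypothetical, not the desk's actual tooling.

```python
def lane_sizes(total, lanes):
    """Balanced split: the first `total % lanes` lanes get one extra item."""
    base, extra = divmod(total, lanes)
    return [base + 1 if i < extra else base for i in range(lanes)]


def assign_lanes(directories, lanes):
    """Cut an ordered directory listing into consecutive lanes."""
    result, start = [], 0
    for size in lane_sizes(len(directories), lanes):
        result.append(directories[start:start + size])
        start += size
    return result


# Placeholder names stand in for the 38 top-level directories.
directories = [f"dir-{n:02d}" for n in range(38)]
assigned = assign_lanes(directories, 10)

print(lane_sizes(38, 10))            # [4, 4, 4, 4, 4, 4, 4, 4, 3, 3]
print(sum(len(l) for l in assigned)) # 38
```

Consecutive slices keep related directories (for example the basis or tinygrad families) in the same or adjacent lanes, which is consistent with the grouping visible in the list.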

Public-safe lead candidates

Specification, Basis, and spec-code grounding

basis evidence: clean git repo on main, HEAD a5544e0 from 2026-05-07, latest message Split reducer UI into separate app entrypoints. Key evidence includes spec.md, Mix manifest, reducer and implementation-imaginer component specs, docs, and tests. Safe summary: an Elixir/Mix project for turning prose/spec artifacts into structured, provenance-backed specification state.
basis-hermes evidence: clean git repo on main, HEAD 0061d32 from 2026-05-05, latest message fix: make basis tool schemas codex-compatible. Key evidence includes README.md, spec.md, plugin.yaml, pyproject.toml, reducer component docs, dashboard manifest, and Python/JS tests. Safe summary: Hermes plugin/dashboard wrapper exposing deterministic Basis reducer and packet-validator surfaces.
basis-jcode evidence: git repo on main, HEAD 4b1e621 from 2026-05-05, ahead of origin and dirty with tracked deletions in reducer example/UI files. Key evidence includes reducer README/spec, package manifest, dashboard/CLI material, and JS tests. Safe summary: Jcode-native control-plane work for reducer ledgers, packet validation, UI projections, and dashboard decision flows. Raw .basis run trees, prompts, event streams, worker packets, validation bodies, and run graphs are withheld.
steward evidence: clean git repo on main, HEAD ba88837 from 2026-05-05, with README.md, pyproject.toml, docs for charter, benchmark spec, architecture, implementation plan, data governance, modeling roadmap, workflows, and decision log. Safe summary: docs-first local spec-code grounding and benchmark project; source/tests/scripts are placeholders at this stage.
is-it-formal evidence: unborn git repo with Lean/Lake scaffold, README.md, lakefile.toml, lean-toolchain, JSON-to-Lean loading code, and Python grading CLI. Safe summary: early Lean/Python scaffold for classifying formalization strength.
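
The repository states used in these entries (clean, dirty, ahead of origin, unborn) can all be read from the output of `git status --porcelain=v2 --branch` without opening any file contents. The parser below is a hedged sketch of that idea, not the survey's actual tooling: the function name, the sample text, the object id, and the paths are made up for illustration, and "clean" here is taken to mean no tracked changes and no untracked files.

```python
def summarize_status(porcelain_v2: str) -> dict:
    """Summarize `git status --porcelain=v2 --branch` output as metadata only."""
    summary = {"branch": None, "head": None, "unborn": False,
               "ahead": 0, "behind": 0, "tracked_changes": 0, "untracked": 0}
    for line in porcelain_v2.splitlines():
        if line.startswith("# branch.oid "):
            oid = line.split(" ", 2)[2]
            summary["unborn"] = oid == "(initial)"  # repo with no commits yet
            summary["head"] = None if summary["unborn"] else oid[:7]
        elif line.startswith("# branch.head "):
            summary["branch"] = line.split(" ", 2)[2]
        elif line.startswith("# branch.ab "):
            ahead, behind = line.split(" ")[2:4]    # e.g. "+2" and "-0"
            summary["ahead"] = int(ahead)
            summary["behind"] = abs(int(behind))
        elif line[:2] in ("1 ", "2 ", "u "):        # changed, renamed, unmerged
            summary["tracked_changes"] += 1
        elif line.startswith("? "):                 # untracked path
            summary["untracked"] += 1
    summary["clean"] = summary["tracked_changes"] == 0 and summary["untracked"] == 0
    return summary


H = "0123456789abcdef0123456789abcdef01234567"  # made-up 40-hex object id
SAMPLE = "\n".join([
    f"# branch.oid {H}",
    "# branch.head main",
    "# branch.upstream origin/main",
    "# branch.ab +2 -0",
    f"1 .D N... 100644 100644 000000 {H} {H} ui/example.js",  # tracked deletion
    "? notes/scratch.txt",
])

print(summarize_status(SAMPLE))
```

Only counts, branch names, and short hashes leave the function, which keeps the summary on the metadata side of the evidence boundary.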

Test-writing and evaluator environments

testing-rl evidence: git repo on master, HEAD 139cea4 from 2026-05-04, dirty with tracked workflow/Symphony files plus an untracked recent-data page script/test. Key evidence includes README.md, SPEC.md, pyproject.toml, environment contract, artifact schemas, risk/replay/counterfactual docs, Hermes/Atropos/Tinker adapter docs, Lean files, benchmark task filenames, and test coverage. Safe summary: RL/test-generation environment for writing high-value software tests against hidden reference behavior while preserving replay, evidence, and boundary objects.
testing-rl-hermes evidence: clean git repo on main, HEAD 6cbca51 from 2026-05-02. Key evidence includes package manifest, MASTER_PLAN.md, adversarial risk review, test-generation environment docs, history-derived fixture docs, benchmark suite, reports, source, and tests. Safe summary: prototype package for history-derived test-generation fixtures and sidecar/supervisor-style grading.
Hidden references, oracle/mutant bodies, evaluator internals, raw benchmark JSON, and sidecar/replay artifact bodies remain withheld.

Model, tinygrad, and NNPL benches

tinygrad-gemma evidence: git repo on main, HEAD 11470a3 from 2026-05-07, branch ahead of local upstream by 93 commits, no tracked source changes, 57 untracked local artifact paths. Key evidence includes README.md, pyproject.toml, CI workflow, 119 plan files, and 17 tests covering model behavior, chat server, benchmark helpers, profile/JIT tooling, assistant/MTP decode, Modal/evo fanout, and raw Metal controls. Safe summary: native tinygrad Gemma 4 implementation with Hugging Face config/safetensors loading, text/multimodal paths, tokenizer/KV-cache generation, training/checkpoint surfaces, CLI, and chat entry points.
gemma4-tinygrad-opt evidence: non-git optimization sandbox with model/loader/tokenizer scripts, Metal/backend benchmark scripts, a nested clean tinygrad checkout at e9983e3, and fresh worker/test mtimes on 2026-05-09. Safe summary: local Gemma/tinygrad optimization sandbox; root lacks README and raw artifacts remain private.
tinygrad-gemma-kimi evidence: git repo on branch opt/attention, HEAD 8d23d35 from 2026-04-26, dirty with modified attention/validation/benchmark scripts and patch/result artifacts. Safe summary: compact tinygrad/Gemma attention/JIT/correctness/benchmark workbench, not a publishable package.
NNPL evidence: nnpl-external-latent-bus, nnpl-shared-bus, and nnpl-typed-boundary-ir all have docs, manifests or project briefs, source/tests, benchmark plans, and artifacts. Public summaries are methodological: two-space external/internal latent bus experiments, shared-bus negative/mixed evidence, and typed boundary IR for legality/auditability/replanning. Raw metrics, rollout/readout JSONL, result bodies, latent traces, export records, and exact success/failure examples are withheld.

Simulation, game, interface, and craft work

kettlebellsim evidence: git repo on codex/reward-audit-and-swing-training, HEAD 1d973def from 2026-05-09, ahead by 36 commits, no tracked modifications, with untracked hidden/temp helper files. Key evidence includes pyproject.toml, uv.lock, AGENTS/plan/runbook docs, source, scripts, and 97 Python tests. Recent safe mtimes center on the docs, script, and test for a bounded Modal Isaac probe execution wrapper. Safe summary: Python research toolkit for simulation-first kettlebell swing/path-signature/biomechanics experiments with deterministic planar restart and remote simulator/RL scaffolding.
cardgame1 / Dungeon Steward evidence: clean Godot 4.6 repo on hermes/combat-stage-art-fallback-upstream, HEAD a9a8ef6 from 2026-04-15, with README, GDD/design docs, balance/simulation docs, project metadata, imagegen manifest/schema, CI, and deterministic/simulation/smoke tests. Safe summary: browser-first roguelite deckbuilder with deterministic combat, run progression, generated-art workflow, and studio process docs.
handterm evidence: clean Rust repo on master, HEAD 977e709 from 2026-04-19, with README, Cargo manifests, optimization/remain-work docs, CLI test, CI, and release build artifacts. Safe summary: Wayland-native Rust terminal emulator focused on performance, renderer architecture, and shared multi-window host structure.
FACEMUSIC evidence: dirty Rust/web/iOS/ML repo, HEAD f6cf6cf from 2026-04-19, with browser architecture docs, expression-forecasting plan, Cargo/web/ML/iOS manifests, Rust tests, and modified iOS/web/audio/control files. Safe summary: privacy-sensitive face-controlled music/instrument prototype. Biometric captures, ML runs, checkpoints, saliency/probe outputs, local assistant state, and raw sessions are withheld.
hoid evidence: dirty creative/tooling repo with README, design docs, Go/game/Lean/test surfaces, and generated media/tooling artifacts. Safe summary: worldbuilding/tooling workspace for structured world-packet production and creative review tooling. Story/canon/comic/music bodies and generated prompt/media assets are withheld.

Orchestration and harness side rooms

gas-city-but-its-just-codex evidence: dirty Rust/Lean/Swift repo on codex/native-codex-ui, HEAD 198aefc from 2026-04-21, with README, Rust workspace, workflow-ledger specs, schemas/templates, MCP/gRPC/app-server surfaces, operator tooling, state/docs/scripts, and formal Lean material. Safe summary: Codex-native durable workflow/control-plane research prototype. Runtime state, logs, transcripts, database files, context boards, benchmark payloads, workflow/thread IDs, and live operator state are withheld.
openai-symphony evidence: dirty Elixir/Phoenix repo on main, HEAD 58cf97d from 2026-04-27, with README, SPEC, license, Elixir docs, Mix manifest, LiveView/API/dashboard/logging/token accounting material, and 16 tests. Safe summary: experimental orchestration service for turning issue-tracker work into isolated autonomous implementation runs.
another-harness, is-codex-better, deer-flow, meta-hermes, local langfuse, local-hermes, silly-pi-stuff, and the nested src skill scaffold were surveyed and kept to high-level or category-only treatment because of maturity, local-only scope, or prompt/config risk.

Held back from project-specific public detail

The survey fully held back, or reduced to category-only mention, material from hidden local settings, a sensitive social-claim wiki, skeletal or empty directories, local deployment/model-runner folders, private corpus raw bodies, prompt/agent/skill instruction bodies, scratch/meta workspaces, generated media, raw logs/prompts/trajectories, evaluator-like payloads, hidden references/oracles, benchmark raw outputs, model/checkpoint artifacts, privacy-sensitive capture data, story/canon drafts, local service configuration, and cache/build/vendor directories.
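
The hold-back rule can be pictured as a deny-first path classifier: anything matching a withheld category is dropped before any allow rule is consulted, and everything unmatched goes to manual review rather than being published by default. The sketch below is illustrative only. The pattern lists are hypothetical examples chosen to echo categories named in this note (.env files, .basis run trees, logs, JSONL readouts, checkpoints), not the survey's actual configuration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns; the real exclusion set is broader and partly manual.
WITHHELD_DIRS = {".basis", "logs", "checkpoints"}
WITHHELD_NAMES = [".env", ".env.*", "*.log", "*.jsonl", "*.safetensors"]
ALLOWED_NAMES = ["README*", "*.md", "pyproject.toml", "Cargo.toml",
                 "mix.exs", "lakefile.toml"]


def classify(path: str) -> str:
    """Return 'withheld', 'allowed', or 'review' for a repository-relative path."""
    p = PurePosixPath(path)
    # Deny first: a withheld parent directory wins over any allowed filename.
    if any(part in WITHHELD_DIRS for part in p.parts[:-1]):
        return "withheld"
    if any(fnmatchcase(p.name, pat) for pat in WITHHELD_NAMES):
        return "withheld"
    if any(fnmatchcase(p.name, pat) for pat in ALLOWED_NAMES):
        return "allowed"
    return "review"  # unknown material is never published automatically


for example in ["README.md", ".env", ".basis/runs/plan.md", "src/main.rs"]:
    print(example, "->", classify(example))
```

Checking the withheld rules first means a Markdown file inside a withheld run tree stays withheld even though `*.md` is otherwise allowed.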

Editorial synthesis

The publishable movement clusters around five themes:

specification state is becoming reducible, provenance-bearing, and reviewable;
test-writing environments are making reward, replay, and hidden-reference boundaries explicit;
tinygrad/Gemma and NNPL workbenches are active but remain behind artifact and benchmark-safety gates;
simulation, terminal, game, and interface projects are using tests, manifests, and design docs to keep craft inspectable;
orchestration projects continue to externalize work into ledgers, dashboards, app-server sessions, formal surfaces, and operator state, though the state itself is not public.

The story is still not a launch. It is a more useful thing: a workshop floor where more claims have a place to stand, and more drawers remain shut when they should.
