Nightly Src Projects Desk Raw Survey (2026-05-14)
This raw note preserves the public-safe basis for the 2026-05-14 nightly src/ projects desk. It summarizes inspectable local evidence only: README/docs/plans, manifests, branch/status/log metadata, safe filenames, mtimes, tests, checked-in reports, and visible artifacts. It does not publish secret-bearing files, local settings, raw prompts/logs/trajectories, private corpus bodies, evaluator payloads, raw benchmark outputs, checkpoint/model artifacts, biometric/capture data, generated media bodies, or sensitive/provocative material.
Where a directory is local-only, sensitive, artifact-heavy, private-corpus-backed, or too skeletal, this note uses category-level wording. The point is provenance, not rummaging.
Survey scope and method
- Survey root: /Users/ericfode/src.
- Survey timestamp: 2026-05-14 01:44 PDT.
- Full top-level directory count: 41, including hidden directories.
- Execution shape: exactly 10 top-level Hermes survey lane identities, dispatched as one batch of 10 orchestrator lanes.
- Lane coverage audit: controller enumeration found 41 assigned directories, 41 unique assignments, no missing directories, no extras, and no duplicates.
- Lane recursion: all 10 top-level lanes reported spawning three read-only subteams for purpose/docs/manifests, live-work evidence, and public-safety/public-summary eligibility. Those subteams reported one further three-way leaf recursion where the runtime exposed delegation. The recorded depth is lane → subteams → leaves; no deeper recursion is claimed.
- Evidence allowed: README/docs/plans, manifests, branch/status/log metadata, safe modified/untracked filenames, mtimes, tests, checked-in reports, and visible artifacts.
- Evidence excluded: secret contents, .env contents, hidden local settings, raw prompts/logs/trajectories, hidden evaluator/supervisor payloads, private corpus bodies, explicit/provocative/unsafe material, raw benchmark outputs, checkpoints/model artifacts, biometric/capture data, generated media bodies, and directories too skeletal for a responsible public claim.
- Illustration: the first image-backend raster draft was rejected for generated text artifacts. The published illustration is deterministic symbolic SVG art at queries/news-assets/2026-05-14-project-desk-hero.svg; it is not a screenshot.
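The lane coverage audit in the list above (every directory assigned, no missing entries, no extras, no duplicates) can be illustrated with a short sketch. This is not the controller's actual code: the function name audit_lane_coverage, the lane labels, and the directory names below are hypothetical placeholders chosen only to show the shape of the check.

```python
from collections import Counter


def audit_lane_coverage(directories, lanes):
    """Compare enumerated directories against per-lane assignments.

    directories: iterable of top-level directory names (hypothetical input).
    lanes: mapping of lane label -> list of directory names assigned to it.
    """
    # Flatten every lane's assignments into one list, keeping repeats.
    assigned = [name for members in lanes.values() for name in members]
    counts = Counter(assigned)
    expected = set(directories)
    seen = set(counts)
    return {
        "assigned": len(assigned),   # total assignments, repeats included
        "unique": len(seen),         # distinct directories assigned
        "missing": sorted(expected - seen),
        "extras": sorted(seen - expected),
        "duplicates": sorted(n for n, c in counts.items() if c > 1),
    }


# Toy data, not the real survey root.
directories = ["alpha", "beta", "gamma", "delta"]
lanes = {"lane-1": ["alpha", "beta"], "lane-2": ["gamma", "delta"]}
report = audit_lane_coverage(directories, lanes)
print(report)
```

A passing audit in this sketch is one where the assigned and unique counts both equal the directory count and the missing, extras, and duplicates lists are all empty.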
Ten survey lanes
- Hidden assistant configuration; internal security-scan artifacts; hidden tinygrad research checkout; privacy-sensitive face/music prototype.
- Early harness workspace; one protected-class-sensitive social-claim notebook held back by category; Basis core; Basis Hermes plugin.
- Basis/Jcode reducer; Dungeon Steward; empty creative placeholder; DeerFlow checkout.
- Gas City/Codex orchestration; Gemma Dungeon; Gemma/tinygrad optimization scratch; Handterm.
- Hoid world-packet studio; draft Codex/Hermes harness-extension repo; Lean formality grader; local research/benchmark harness.
- Kettlebell simulation; skeletal Kimi settings area; local Langfuse deployment/config; local Hermes GGUF runtime.
- Scratch meta-Hermes workspace; NNPL external bus; NNPL shared bus; NNPL typed-boundary IR.
- OpenAI Symphony; empty life-ops placeholder; private Pi companion/browser automaton workspace; private spec-dataset corpus.
- Nested assistant-workflow scaffold; Steward provenance service; testing-RL; testing-RL-Hermes.
- Textual world-model research loop; empty tinygrad placeholders; tinygrad-Gemma; tinygrad-Gemma/Kimi optimization workspace.
Public-safe lead candidates
Same-night model/world-state research
- gemma-dungeon is tonight’s strongest live signal. Evidence: git repo on main, dirty with 12 tracked modified files; same-night commit 2f90fb1 on 2026-05-14 for a bounded real world-model baseline report; modified README, replay/world-model docs, root plan/spec files, schema, CLI/world-model probe code, and tests. Safe summary: a symbolic roguelike research workspace where explicit game state remains authoritative and model/world-model probes are advisory. Replay payloads, datasets, local endpoints, prompt/logit artifacts, and dirty diffs remain withheld.
- textual-world-model is a new active research-loop signal. Evidence: non-git workspace with same-night heartbeat/ledger files, literature reports, benchmark/control-map artifacts, and an index.html framing a Textual JEPA World Model over repository histories. Safe summary: benchmark-first research around action-conditioned predictors over Git/repository timelines. Raw ledgers, worker briefs, JSONL fixtures, literature corpus bodies, and local paths remain withheld.
- gemma4-tinygrad-opt shows same-night optimization-loop activity by filename: orchestrator log, worker prompt, and test worker files, plus older Gemma/tinygrad model and Metal benchmark scripts. It lacks root git/README evidence, so it stays category-level: active Gemma/tinygrad optimization scratch, not a publishable package.
Stable test-generation and model-runtime benches
- testing-rl remains the clearest verifier/test-generation repo. Evidence: clean git tree on master, locally ahead of origin by 3 commits; recent May 11 commits around ranking lift, local verifier-dashboard evidence, held-out verifier rankers, and counterfactual cases; README, docs, pyproject, scripts, reward dashboards by filename, Lean/formal material, and tests. Safe summary: artifact-first RL environment for agents that write valuable software tests while writer-visible state stays separated from evaluator-held references.
- testing-rl-hermes remains the smaller companion prototype. Evidence: clean git tree with May 1-2 commits around inverse-fix history mutants and test-generation fixtures. Safe summary: deterministic history-derived test-writing episodes for Hermes/Atropos-adjacent evaluation, with oracle/supervisor payloads held back.
- tinygrad-gemma remains the strongest model-runtime package. Evidence: git repo on main, HEAD 11470a3, ahead of origin by 93 commits; README, pyproject, CLI/chat entry points, AGENTS boundary, docs/plans, tests, scripts, and many untracked benchmark/reference-fetch artifacts. Safe summary: native tinygrad Gemma 4 inference/generation implementation with Hugging Face-style config/safetensor loading, tokenizer/multimodal surfaces, CLI/chat, tests, and optimization workflows. Raw benchmarks, profiles, checkpoints, .evo receipts, and performance claims remain withheld.
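The branch/status figures cited in these entries (branch name, short HEAD, ahead-of-origin count, clean or dirty tree) are the kind of metadata that git status --porcelain=v2 --branch exposes without reading any file contents. The sketch below parses that output format; it is an illustration, not the survey's own tooling. The function name summarize_status, the sample text, and the definition of "clean" (no tracked changes and no untracked files) are assumptions made for the example.

```python
def summarize_status(porcelain_v2: str) -> dict:
    """Summarize `git status --porcelain=v2 --branch` output (metadata only)."""
    summary = {"branch": None, "head": None, "ahead": 0, "behind": 0,
               "tracked_changes": 0, "untracked": 0}
    for line in porcelain_v2.splitlines():
        if line.startswith("# branch.head "):
            summary["branch"] = line.split(" ", 2)[2]
        elif line.startswith("# branch.oid "):
            oid = line.split(" ", 2)[2]
            # A repository with no commits reports "(initial)" instead of a hash.
            summary["head"] = oid if oid == "(initial)" else oid[:7]
        elif line.startswith("# branch.ab "):
            # Format: "# branch.ab +<ahead> -<behind>"
            ahead, behind = line.split(" ")[2:4]
            summary["ahead"] = int(ahead)
            summary["behind"] = abs(int(behind))
        elif line[:2] in ("1 ", "2 ", "u "):
            # Ordinary changed, renamed/copied, and unmerged tracked entries.
            summary["tracked_changes"] += 1
        elif line.startswith("? "):
            summary["untracked"] += 1
    # Assumption: "clean" here means nothing tracked changed and nothing untracked.
    summary["clean"] = summary["tracked_changes"] == 0 and summary["untracked"] == 0
    return summary


# Hypothetical sample output, not taken from any surveyed repository.
sample = "\n".join([
    "# branch.oid 0123456789abcdef0123456789abcdef01234567",
    "# branch.head main",
    "# branch.upstream origin/main",
    "# branch.ab +3 -0",
    "1 .M N... 100644 100644 100644 "
    "1111111111111111111111111111111111111111 "
    "1111111111111111111111111111111111111111 README.md",
    "? notes/scratch.txt",
])
status = summarize_status(sample)
print(status)
```

Only paths and counts pass through this parser, which fits the note's rule of citing safe filenames and status metadata while leaving diffs and file bodies unread.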
Spec/provenance and orchestration bench
- Basis-style work remains a coherent research cluster. basis has recent reducer/imaginer commits and untracked generated experiment material; basis-hermes is the clean public-safe Hermes plugin for reducing Markdown specs into deterministic Basis packets; basis-jcode is ahead/dirty and stays category-level. Safe summary: structured spec-state and provenance-backed reducer work, with generated .basis runs, packet bodies, worker streams, and dashboards withheld.
- steward is the liveliest provenance-service side room. Evidence: dirty git repo with seeded design commit, modified docs, untracked Elixir/Mix/Ecto/service files, migrations/tests by filename, and service-kernel fixture material. Safe summary: early durable service kernel for linking specs, code, tests, reasoning, agent runs, verification, and Git history into cited queries. Not production-ready.
- openai-symphony remains an active orchestration side room. Evidence: dirty Elixir/Phoenix repo with modified app-server, orchestrator, status-dashboard, presenter, tests, and one hidden skill doc; Apache/NOTICE, README/SPEC, Elixir docs, and a recent app-server model-config commit. Safe summary: issue-tracker-driven autonomous coding-agent orchestration with observability; local workflow prompts, configs, logs, tracker identifiers, and runtime evidence stay private.
- gas-city-but-its-just-codex, another-harness, is-codex-better, and deer-flow were surveyed as useful architecture/control-plane context, but dirty/no-commit/local-config boundaries keep the public copy at architecture level.
Craft, interface, and simulation side rooms
- handterm remains the clean craft lead: MIT-licensed Rust 2024 Wayland terminal workspace, clean git tree, README, Cargo workspace, CI, tests, optimization docs, and recent kitty graphics refactors.
- cardgame1 / Dungeon Steward remains the clean game lead: Godot project, clean branch ahead by one verified fallback commit, design docs, deterministic combat/balance workflow, scenes/data/source, and smoke tests. Generated art, prompt/session logs, model/checkpoint material, and balance raw artifacts stay withheld.
- kettlebellsim remains a solid simulation side room: clean branch ahead by 36 commits, May 9 bounded Modal/Isaac wrapper/probe-guard commits, pyproject, docs/runbooks/reports, scripts, configs, and tests. Safe summary: simulation-first kettlebell swing path-signature research with local deterministic planar gates before remote simulator work. Remote service details, logs, trajectories, rollouts, checkpoints, and generated media remain withheld.
- FACEMUSIC, hoid, silly-pi-stuff, and local deployment/runtime folders were surveyed, but privacy-sensitive capture, creative/canon bodies, private companion mechanics, local configs, and model/runtime artifacts keep them category-level.
Held back from project-specific public detail
The survey fully held back, or reduced to category-only mention, hidden local settings, internal security-scan artifacts, hidden-only or empty directories, one protected-class-sensitive social-claim notebook, local deployment/model-runner folders, private corpus bodies, prompt/agent/skill instruction bodies, scratch/meta workspaces, generated media, raw logs/prompts/trajectories, evaluator-like payloads, hidden references/oracles, benchmark raw outputs, model/checkpoint artifacts, biometric/capture data, creative story/canon drafts, service configuration, raw test/counterexample bodies, cache/build/vendor directories, and too-skeletal placeholders.
Editorial synthesis
The publishable movement tonight clusters around five claims:
- gemma-dungeon has the strongest same-night repository evidence;
- textual-world-model is active, but only as benchmark-first research-loop evidence rather than a validated model result;
- testing-rl and tinygrad-gemma remain the strongest stable benches;
- Basis/Steward/Symphony/Gas-City-style control-plane work is substantial but must be summarized at architecture/provenance level because much of it is dirty, local, or generated-artifact-heavy;
- handterm, Dungeon Steward, and kettlebellsim remain public-safe craft/game/simulation side rooms when described from manifests, docs, tests, and commits rather than raw artifacts.
That is enough for a public desk. The filesystem offered more; manners declined.