Nightly Src Projects Desk Raw Survey (2026-05-31)

This raw note preserves the public-safe basis for the 2026-05-31 nightly src/ projects desk. It uses inspectable local evidence only: README/docs/plans, manifests, branch/status/log metadata, safe modified/untracked filenames, mtimes, tests, checked-in reports, and visible artifacts. It does not publish secret-bearing files, .env contents, hidden local settings, raw prompts/logs/trajectories, private corpus bodies, evaluator/oracle payloads, raw benchmark bodies, checkpoint/model artifacts, biometric/capture data, generated media bodies, or explicit/provocative material.

Where a directory is local-only, sensitive, artifact-heavy, private-corpus-backed, or too skeletal, this note uses category-level wording. A filesystem is not a press release; this is one of its more civilized properties.

Survey scope and method

  • Survey root: /Users/ericfode/src.
  • Survey timestamp: 2026-05-31 01:35 PDT.
  • Full top-level directory count: 50, including hidden directories.
  • Execution shape: exactly 10 top-level Hermes survey lane identities, dispatched as a single batch of 10 orchestrator lanes.
  • Lane coverage audit: controller enumeration found 50 assigned directories, 50 unique assignments, no missing directories, no extras, and no duplicates.
  • Lane recursion: all 10 lanes completed. Each lane reported spawning three read-only evidence subteams for purpose/docs/manifests, live-work evidence, and public-safety/public-summary eligibility; each subteam reported a further three-way leaf probe. The recorded depth is lane → subteams → leaves; no deeper recursion is claimed.
  • Evidence allowed: README/docs/plans, manifests, branch/status/log metadata, safe modified/untracked filenames, mtimes, tests, checked-in reports, and visible artifacts.
  • Evidence excluded: secret contents, .env contents, hidden local settings, raw prompts/logs/trajectories, hidden evaluator/supervisor payloads, private corpus bodies, explicit/provocative/unsafe material, raw benchmark bodies, checkpoints/model artifacts, biometric/capture data, generated media bodies, and directories too skeletal for responsible public claims.
  • Illustration: raster image generation was attempted but unavailable because the configured backend lacked FAL_KEY. The final editorial illustration was generated locally as deterministic symbolic SVG at queries/news-assets/2026-05-31-project-desk-hero.svg. It is an illustration, not a screenshot.
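
The allowed/excluded evidence split above can be sketched as a default-deny filename filter. This is a minimal illustration, not the survey's actual tooling: the glob patterns, the classify function, and the project-a sample paths are all hypothetical, since this note records evidence categories rather than file patterns.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns: the note names evidence categories, not globs.
EXCLUDED_PATTERNS = (".env", ".env.*", "*.ckpt", "*.safetensors", "*.log", "*.jsonl")
ALLOWED_PATTERNS = ("README*", "LICENSE*", "pyproject.toml", "mix.exs", "Cargo.toml")


def classify(path: str) -> str:
    """Return 'excluded', 'allowed', or 'needs-review' for one relative path."""
    name = PurePosixPath(path).name
    # Exclusions are checked first so a sensitive file never becomes evidence.
    if any(fnmatchcase(name, pattern) for pattern in EXCLUDED_PATTERNS):
        return "excluded"
    if any(fnmatchcase(name, pattern) for pattern in ALLOWED_PATTERNS):
        return "allowed"
    # Anything unrecognized is held for a human decision, not published.
    return "needs-review"


# Hypothetical sample paths, not files observed in the survey.
for sample in ("project-a/README.md", "project-a/.env", "project-a/notes.txt"):
    print(sample, "->", classify(sample))
```

Checking exclusions before allowances means an ambiguous file fails closed, which matches the note's habit of falling back to category-level wording when a directory is sensitive or skeletal.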

Ten survey lanes

The exact top-level lane count was 10. One provocative/protected-class-sensitive directory in lane 01 is intentionally withheld by name; it was surveyed and counted, but not publicized.

  1. .claude; .socket-dev-scan; .tinygrad_research; another-harness; one sensitive social-claim directory withheld by name.
  2. basis; basis-hermes; basis-jcode; cardgame1; creative.
  3. deer-flow; FACEMUSIC; gas-city-but-its-just-codex; gemma-dungeon; gemma4-tinygrad-opt.
  4. handterm; hoid; iii-wiki; is-codex-better; is-it-formal.
  5. jepa-expriments; jepa-lang; jepa-poker; justfooln; kettlebellsim.
  6. kimi-tests; langfuse; llama.cpp; local-hermes; meta-hermes.
  7. nnpl-external-latent-bus; nnpl-shared-bus; nnpl-typed-boundary-ir; openai-symphony; overengineeredlife.
  8. parenting-bookshelf-compass; quiz; silly-pi-stuff; spec-dataset-evolution-corpus; src.
  9. steward; testing-rl; testing-rl-hermes; textual-world-model; tinygrad.
  10. tinygrad-gemma; tinygrad-gemma-gemini; tinygrad-gemma-kimi; unconventional-jepa-lab; word-games.
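
The lane coverage audit reported under survey scope (50 assigned, 50 unique, no missing, no extras, no duplicates) can be re-derived from the lane table above. The sketch below is an assumption about how such a check might look, not the controller's actual code: the audit function, the LANES mapping, and the WITHHELD placeholder are hypothetical names, and the expected set is rebuilt from the table itself because this sketch does not read the survey root.

```python
from collections import Counter

# Placeholder for the one directory withheld by name in lane 01.
WITHHELD = "<withheld-sensitive-directory>"

# Lane table transcribed from this note.
LANES = {
    1: [".claude", ".socket-dev-scan", ".tinygrad_research", "another-harness", WITHHELD],
    2: ["basis", "basis-hermes", "basis-jcode", "cardgame1", "creative"],
    3: ["deer-flow", "FACEMUSIC", "gas-city-but-its-just-codex", "gemma-dungeon",
        "gemma4-tinygrad-opt"],
    4: ["handterm", "hoid", "iii-wiki", "is-codex-better", "is-it-formal"],
    5: ["jepa-expriments", "jepa-lang", "jepa-poker", "justfooln", "kettlebellsim"],
    6: ["kimi-tests", "langfuse", "llama.cpp", "local-hermes", "meta-hermes"],
    7: ["nnpl-external-latent-bus", "nnpl-shared-bus", "nnpl-typed-boundary-ir",
        "openai-symphony", "overengineeredlife"],
    8: ["parenting-bookshelf-compass", "quiz", "silly-pi-stuff",
        "spec-dataset-evolution-corpus", "src"],
    9: ["steward", "testing-rl", "testing-rl-hermes", "textual-world-model", "tinygrad"],
    10: ["tinygrad-gemma", "tinygrad-gemma-gemini", "tinygrad-gemma-kimi",
         "unconventional-jepa-lab", "word-games"],
}


def audit(lanes, expected):
    """Compare lane assignments against the expected set of directory names."""
    assigned = [name for names in lanes.values() for name in names]
    counts = Counter(assigned)
    return {
        "assigned": len(assigned),
        "unique": len(counts),
        "duplicates": sorted(name for name, n in counts.items() if n > 1),
        "missing": sorted(set(expected) - set(counts)),
        "extras": sorted(set(counts) - set(expected)),
    }


# Stand-in for a real listing of the survey root, which this sketch does not read.
expected = {name for names in LANES.values() for name in names}
report = audit(LANES, expected)
print(report["assigned"], report["unique"], report["duplicates"],
      report["missing"], report["extras"])
# prints: 50 50 [] [] []
```

With a real directory listing passed as expected, the missing and extras fields would carry the meaningful part of the check; here only the counts and the duplicate scan are independently verified.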

Public-safe lead candidates

Game, symbolic-world, verifier, and simulation work

  • gemma-dungeon is tonight’s cleanest same-night research/game lead. Evidence: README, goal/spec/implementation docs, pyproject, schemas, tests, clean main, and 2026-05-30/31 commits around verified eval/train-gap and sweep-best status-token work. Safe summary: embedding-native roguelike/world-model research where symbolic game state remains authoritative and model-facing projections are auditable. Hold back prompts, replay/example JSON bodies, generated packs, private corpora, and model/checkpoint artifacts.
  • cardgame1 / Dungeon Steward remains a solid game-craft lead. Evidence: Godot project, README, MIT license, design/docs/data, branch hermes/combat-stage-art-fallback-upstream ahead by one commit, recent work around combat-stage art fallback, map hover legality, and authored floor-one map layout. Safe summary: browser-first fantasy roguelite deckbuilder prototype with deterministic combat/map/reward systems and generated-art fallback handling. Hold back .beads, agent state, raw JSONL balance outputs, imagegen prompts/inputs/outputs, generated media, and imported/generated Godot artifacts.
  • testing-rl remains the stable verifier/test-generation bench. Evidence: README/SPEC/WORKFLOW, pyproject, docs/formal material, branch master ahead of origin by three commits, and clean status with recent commits around verifier dashboard evidence, held-out verifier ranking, live rewards dashboard, and counterfactual case breakdowns. Safe summary: an RL/test-generation environment with evaluator-held references and local verifier evidence. No training-victory claim is supported by the inspected evidence.
  • testing-rl-hermes is a sibling sanitized bench for deterministic history-derived test-generation fixtures and grading logic. Evidence: docs, benchmark fixtures, clean main, and recent commits adding inverse-fix history mutants and materialized history commits as fixtures. Hold back hidden reference/mutant/oracle trees and generated reports.
  • kettlebellsim is the clean simulation lead. Evidence: clean branch codex/reward-audit-and-swing-training, ahead by 36 commits, pyproject, docs/configs/scripts/tests, and visible work on bounded Modal Isaac probe execution wrappers/guards plus planar local-to-remote restart validation. Safe summary: deterministic local restart and validation before bounded remote simulator/RL execution.

Orchestration, provenance, and formal/spec work

  • The Basis cluster remains coherent. basis shows active Elixir/BEAM spec-basis rewrite/reducer work, branch ahead by one, and a 2026-05-24 imaginer workflow commit; basis-hermes is the clean plugin/dashboard slice exposing deterministic reducer and packet validation tools; basis-jcode is useful but artifact-heavy, ahead by 10 with .basis ledgers, prompts, NDJSON, dashboard outputs, and tracked deletions. Safe summary: structured spec-state custody and deterministic/provenance-backed reduction, not raw packet publication.
  • another-harness stays a high-level side room: it is a Lean-backed Codex/Hermes harness experimentation repo, but it has no commits yet and consists of untracked project content. Publish only the curated purpose, not state, prompts, logs, or operational internals.
  • is-it-formal is publishable as a narrow project summary: a Lean/Python scaffold for grading formalization strength across domains. Caveat: no visible license and no commits yet, so do not imply public code-release status.
  • openai-symphony has a clear public concept (isolated autonomous implementation runs managed by an Elixir reference service and dashboard), but locally modified app-server/orchestrator/dashboard/test files, logs, prompts, and token/accounting operational details keep it side-room only.
  • steward is promising but held back: local evidence supports an Elixir/Postgres semantic provenance/query service for agentic software work, yet the dirty/untracked service tree, private-corpus references, and local config surfaces require redaction, and there is no visible license.

JEPA, NNPL, tinygrad/Gemma, systems craft, and humane artifacts

  • jepa-lang is the clean small IR/replay artifact: README, pyproject, docs, tests, and source files support a deterministic typed-operation IR with replayable traces and evidence receipts.
  • jepa-poker is public-safe at high level: JEPA/world-model experiments for imperfect-information poker, visibly oriented around Kuhn/Leduc/player benchmarking. Hold back experiment ledgers, match/hand outputs, and raw benchmark artifacts.
  • unconventional-jepa-lab is a strong research-bench summary candidate: branch ahead by one, dirty lane packet/evidence manifests, README/mission/gates/lane docs, and explicit falsification gates. Redact private/local-path context, .beads, .codex, .gascity, and raw packet details.
  • The NNPL trio is useful as concept-level research context: external latent bus, shared bus with a documented negative v0 result, and typed-boundary IR. All three require run/result/export artifact redaction.
  • tinygrad-gemma is technically rich but artifact-heavy: native tinygrad Gemma 4 runtime/experimentation surfaces, branch ahead by 93, untracked benchmark/reference-fetch artifacts, and checkpoint/evolution-state boundaries. Publish only sanitized summaries, not performance claims or repo snapshots.
  • tinygrad-gemma-kimi is a separate optimization scratch repo on opt/attention with dirty files, patch/reject/test/base artifacts, and result JSONs; summarize only as experimental tinygrad/Gemma optimization work.
  • word-games has pivoted toward Story JEPA / character-interiority modeling. It is interesting, but generated runs/checkpoints/metrics and the missing license keep it a curated side room.
  • handterm remains a clean systems-craft note: MIT-licensed Rust/Wayland terminal work with CPU/GPU components and clean upstream-tracking status.
  • parenting-bookshelf-compass is a clean humane/static artifact: README, index.html, clean main, and a recent publish commit support a non-diagnostic parenting-books compass quiz summary.

Held back from project-specific public detail

The survey fully held back, or reduced to category-only mention, hidden local assistant/settings directories, security/dependency scan artifacts, empty or too-skeletal directories and placeholders, one provocative/protected-class-sensitive social-claim notebook, local deployment/model-runner folders, private corpus bodies, prompt/agent/skill instruction bodies, scratch/meta workspaces, generated media, raw logs/prompts/trajectories, evaluator/oracle payloads, raw benchmark outputs, model/checkpoint artifacts, biometric/capture data, creative/canon/world-packet drafts, service configuration, raw test/counterexample bodies, local .env-style material, cache/build/vendor directories, and dirty patch/reject variants.

Specific category-level handling: FACEMUSIC has meaningful face-control music evidence, but its camera/facial-capture and generated/model material make owner review appropriate; spec-dataset-evolution-corpus is explicitly private and remains unpublished; llama.cpp and .tinygrad_research are recognized as public upstream/reference substrates rather than local original leads; langfuse, local-hermes, kimi-tests, quiz, overengineeredlife, creative, the empty tinygrad directory, tinygrad-gemma-gemini, and the nested src directory were not promoted into public claims.

Editorial synthesis

The publishable movement tonight clusters around six claims:

  1. gemma-dungeon is the clean same-night lead; cardgame1 carries the game-craft line.
  2. testing-rl and testing-rl-hermes remain the verifier/test-generation bench.
  3. kettlebellsim is the clearest simulation-validation lead.
  4. The Basis cluster, another-harness, is-it-formal, openai-symphony, and steward form the formal/spec/provenance/control-plane room, but most of it remains side-room material under redaction.
  5. jepa-lang, jepa-poker, unconventional-jepa-lab, the NNPL trio, tinygrad-gemma, tinygrad-gemma-kimi, and word-games belong on the research bench, with narrower claims than their artifact directories might invite.
  6. handterm and parenting-bookshelf-compass are the tidy public-safe side notes.