ArXiv Round Two: Formal Semantics for Agent Harnesses

Goal

Re-run arXiv research with a sharper target: not generic “agent papers,” but mathematical ideas that can actually guide a formal theory of agent harnesses.

What I was looking for

  1. Formal methods that address the intent gap in AI coding systems.
  2. Epistemic update machinery that can justify richer belief layers without discarding simpler quotients.
  3. Partial-order or pomset semantics that treat concurrent and branching harness work honestly.

Most relevant papers by thread

1. Formal methods for agentic coding

  • Zhang et al. (2024), The Fusion of Large Language Models and Formal Methods for Trustworthy AI Agents
    • Best as a roadmap paper.
    • Main use: legitimizes the fusion of LLM flexibility with formal-method guarantees rather than treating them as opposing paradigms.
  • Lahiri (2026), Intent Formalization
    • Best statement of the coding-agent problem.
    • Main use: reframes reliability as a specification problem, not merely a generation problem.

2. Epistemic and probabilistic updates

  • Conradie et al. (2016), Probabilistic Epistemic Updates on Algebras
    • Most relevant to the current bounded-confidence direction.
    • Main use: updates as algebraic transformations, which is congenial to quotient maps and forgetful semantics.
  • Kishida (2017), Categories for Dynamic Epistemic Logic
    • Best structural paper in this batch.
    • Main use: encourages thinking in families of related models and update maps, not isolated state mutations.

3. Partial-order traces and concurrency

  • Wang (2026), Structural Operational Semantics for True Concurrency
    • Best direct bridge from ordinary SOS to true-concurrency semantics.
    • Main use: suggests a path from serial event traces to pomset-like harness transitions.
  • Edixhoven et al. (2022), Branching Pomsets for Choreographies
    • Best paper here for multi-agent coordination with explicit branching.
    • Main use: shows that concurrency alone is insufficient; choices need compact semantic representation too.

Provisional synthesis

The most promising shape for a harness theory now looks like this:

That is much cleaner than trying to squeeze every concern into one giant state machine. It suggests a tower of related semantics rather than a single overburdened object.

What seems most actionable for the current repo

  1. Treat intent surfaces and acceptance checks as first-class formal objects, not just review habits.
  2. Keep pushing on commuting forgetful maps between richer epistemic layers and simpler quotients.
  3. Treat the current scheduler layer as a coarse shadow of a future partial-order trace semantics.
  4. Resist fake Bayesian grandeur: modest bounded-confidence layers are acceptable if their quotient behaviour is explicit and useful.
  • More on dynamic / probabilistic epistemic logic with explicit update operators.
  • More on event structures and realizability conditions for choreographies.
  • More on specification validation and user-in-the-loop intent formalization for coding agents.

This note distills into formal-methods-for-agent-harnesses, probabilistic-epistemic-updates, and partial-order-trace-semantics, and it should be read beside new-harness-design-notes and harness-engineering.