WorfEval

Overview

WorfEval is the evaluation layer paired with WorfBench. It scores generated workflows on graph structure, subsequence alignment, and downstream usefulness. It is the scoring apparatus that makes the benchmark more than polite scenery.

Why it matters

Two workflows can differ superficially yet preserve the same useful substructure: swapping two independent steps, for example, changes the surface order but not what gets done. WorfEval tries to score that middle ground instead of reducing everything to exact-match puritanism.
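To make the gap between exact match and partial credit concrete, here is a minimal sketch that scores a predicted step sequence against a gold one by longest common subsequence. It is an illustration under assumptions, not WorfEval's actual implementation: the function names `lcs_length` and `chain_f1`, the example step names, and the use of exact string equality to match steps are all hypothetical choices made for this note.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    # dp[i][j] holds the LCS length of a[:i] and b[:j].
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def chain_f1(predicted, gold):
    """F1 over steps that appear in the same relative order in both lists.

    Assumption: steps match only when their labels are identical strings.
    """
    if not predicted or not gold:
        return 0.0
    matched = lcs_length(predicted, gold)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical workflows: same steps, two of them swapped.
gold = ["find_recipe", "buy_ingredients", "preheat_oven", "bake"]
predicted = ["find_recipe", "preheat_oven", "buy_ingredients", "bake"]

exact_match = predicted == gold        # False
partial = chain_f1(predicted, gold)    # 0.75
```

Exact match gives this prediction nothing, while the subsequence score credits the three steps that keep their relative order. Real workflow text would need a softer notion of step equivalence than string equality.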

Distinctive trait

WorfEval scores workflows from multiple views: structure, partial alignment, and practical downstream usefulness all count.
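One way to picture multi-view scoring is a report that keeps each view as its own number rather than folding them into one. The sketch below is an assumption-laden illustration, not the WorfEval protocol: `WorkflowScore`, `set_f1`, and `score_workflow` are hypothetical names, edge overlap stands in for the structure view, node overlap is a crude stand-in for partial alignment, and a boolean task outcome stands in for downstream usefulness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowScore:
    structure: float   # overlap of dependency edges
    alignment: float   # overlap of steps, ignoring order (crude proxy)
    downstream: float  # 1.0 if the workflow led to task success


def set_f1(predicted, gold):
    """F1 between two sets; 0.0 when they share nothing."""
    shared = len(predicted & gold)
    if shared == 0:
        return 0.0
    precision = shared / len(predicted)
    recall = shared / len(gold)
    return 2 * precision * recall / (precision + recall)


def score_workflow(pred_edges, gold_edges, task_succeeded):
    """Score a workflow given as (source, target) edges, one view at a time."""
    pred_nodes = {node for edge in pred_edges for node in edge}
    gold_nodes = {node for edge in gold_edges for node in edge}
    return WorkflowScore(
        structure=set_f1(set(pred_edges), set(gold_edges)),
        alignment=set_f1(pred_nodes, gold_nodes),
        downstream=1.0 if task_succeeded else 0.0,
    )


# Hypothetical example: gold has two parallel branches, the prediction is a chain.
gold_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
pred_edges = [("a", "b"), ("b", "c"), ("c", "d")]

score = score_workflow(pred_edges, gold_edges, task_succeeded=True)
```

Here the prediction covers every step and still completes the task, yet its structure score is only 4/7 because it flattens the parallel branches. Keeping the views separate makes that disagreement visible instead of averaging it away.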

Relationships

Read WorfEval with WorfBench, RobustFlow, and evaluation-and-review-loops. It also complements procedure-oriented evaluation in SOPBench.