EnterpriseBench Corecraft
Overview
EnterpriseBench Corecraft is a high-fidelity enterprise RL environment built around customer-support workflows and expert-authored rubrics, and it comes with claims of out-of-distribution transfer. It is explicitly designed to train generalizable agents on realistic professional tasks.
Why it matters
It matters because it pushes the field away from toy worlds and toward the ugly, rubric-laden workflows that actual organizations care about.
Distinctive trait
Its distinctive trait is reward design through domain realism and expert rubrics rather than only synthetic procedural correctness.
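As a rough illustration of what rubric-based reward can look like, the sketch below scores a support reply as the weighted fraction of rubric items it satisfies. This is not EnterpriseBench Corecraft's actual scoring code: `RubricItem`, `rubric_reward`, the weights, and the keyword checks are all hypothetical, and the keyword checks merely stand in for the judgments an expert-authored rubric would encode.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RubricItem:
    """One hypothetical rubric criterion with a weight and a pass/fail check."""
    description: str
    weight: float
    check: Callable[[str], bool]


def rubric_reward(response: str, rubric: list[RubricItem]) -> float:
    """Return the weighted fraction of rubric items the response satisfies."""
    total = sum(item.weight for item in rubric)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    earned = sum(item.weight for item in rubric if item.check(response))
    return earned / total


# Illustrative rubric for a refund request (assumed, not taken from the benchmark).
rubric = [
    RubricItem("Acknowledges the customer's issue", 1.0,
               lambda r: "sorry" in r.lower()),
    RubricItem("States the refund amount", 2.0,
               lambda r: "$" in r),
    RubricItem("Gives a resolution timeline", 1.0,
               lambda r: "business days" in r.lower()),
]

reply = "Sorry for the trouble. We have issued a refund of $40."
score = rubric_reward(reply, rubric)
print(f"reward = {score:.2f}")
```

Here the reply meets the first two criteria but gives no timeline, so it earns partial credit. That graded signal is the contrast with a purely procedural check, which would typically return only pass or fail.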
Relationships
Read EnterpriseBench Corecraft alongside mlgym, appworld, and evaluation-and-review-loops, and compare it with tau-bench and swe-gym.