EnterpriseBench Corecraft

Overview

EnterpriseBench Corecraft is a high-fidelity enterprise RL environment built around customer-support workflows and expert-authored rubrics, and it is presented with claims of out-of-distribution transfer. It is explicitly designed to train generalizable agents on realistic professional tasks.

Why it matters

It pushes the field away from toy worlds and toward the ugly, rubric-laden workflows that actual organizations care about.

Distinctive trait

Its distinctive trait is reward design through domain realism and expert rubrics rather than only synthetic procedural correctness.
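
To make the idea of a rubric-based reward concrete, here is a minimal sketch in Python. It is not Corecraft's actual reward code: the names RubricItem and rubric_reward, the example criteria, and the keyword checks are all hypothetical, and a real expert rubric would likely rely on human or model graders rather than string matching. The sketch only shows the general shape, in which each criterion carries a weight and the reward is the weighted fraction of criteria satisfied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RubricItem:
    """One rubric criterion (hypothetical structure, for illustration only)."""
    description: str
    weight: float
    check: Callable[[str], bool]  # returns True when the reply satisfies the criterion


def rubric_reward(reply: str, rubric: list[RubricItem]) -> float:
    """Return the weighted fraction of rubric criteria that the reply satisfies."""
    total = sum(item.weight for item in rubric)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    earned = sum(item.weight for item in rubric if item.check(reply))
    return earned / total


# Assumed example criteria for a customer-support reply; keyword checks stand in
# for whatever grading an expert-authored rubric would actually use.
rubric = [
    RubricItem("Confirms the order number", 1.0, lambda r: "order #" in r.lower()),
    RubricItem("States the refund policy", 2.0, lambda r: "refund" in r.lower()),
    RubricItem("Avoids guaranteeing delivery", 1.0, lambda r: "guarantee" not in r.lower()),
]

reply = "I found order #1042. Under our refund policy you are eligible for a full refund."
score = rubric_reward(reply, rubric)
print(score)
```

The contrast with a purely procedural check is that the reward is graded across several weighted criteria instead of being a single pass/fail signal on whether the final state is correct.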

Relationships

Read EnterpriseBench Corecraft alongside mlgym, appworld, and evaluation-and-review-loops, and compare it with tau-bench and swe-gym.