EnterpriseBench Corecraft: Training Generalizable Agents on High-Fidelity RL Environments

Source: arXiv
Authors: Sushant Mehta, Logan Ritchie, Suhaas Garre, Ian Niebres, Nick Heiner, Edwin Chen
Date: 2026-02-18
Primary category: cs.AI
All categories: cs.AI, cs.LG

Abstract

EnterpriseBench Corecraft is a high-fidelity enterprise reinforcement learning (RL) environment that combines expert-authored rubrics with realistic workflows, and the paper makes explicit claims about out-of-distribution transfer. It is notable for treating environment realism and rubric quality as the key determinants of whether agent training generalizes.