AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
Source: arXiv
Authors: Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal
Date: 2024-07-26
Primary category: cs.SE
All categories: cs.SE, cs.AI, cs.CL, cs.LG
Abstract
AppWorld is a high-fidelity multi-app environment that exposes hundreds of APIs and evaluates rich, interactive coding agents with state-based unit tests. It is arguably the cleanest bridge currently available between tool-using agents, executable environments, and RL-style training loops.
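To make "state-based unit-test evaluation" concrete, here is a minimal sketch of the general idea: the checker inspects the environment state after the agent has run, rather than comparing the agent's code or output text against a reference. This is an illustration only and not AppWorld's actual API; the `evaluate` function, the flat dictionary state, and keys such as `alice.balance` are all hypothetical.

```python
import copy


def evaluate(state_before, state_after, expected_changes, allowed_keys):
    """Hypothetical state-based check: return (passed, failures).

    Only the final state is inspected, so any agent program that reaches
    the right end state passes, however it got there.
    """
    failures = []
    # 1. Every required change must be present in the final state.
    for key, expected in expected_changes.items():
        actual = state_after.get(key)
        if actual != expected:
            failures.append(f"expected {key}={expected!r}, got {actual!r}")
    # 2. Nothing outside the allowed keys may differ (collateral damage).
    for key in sorted(set(state_before) | set(state_after)):
        if key in allowed_keys:
            continue
        if state_before.get(key) != state_after.get(key):
            failures.append(f"unexpected change to {key}")
    return (not failures, failures)


# Toy world state (assumed layout): Alice should send Bob 10 units.
before = {"alice.balance": 100, "bob.balance": 20, "alice.playlist": ["song_a"]}
expected = {"alice.balance": 90, "bob.balance": 30}
allowed = set(expected)

# A correct run: only the two balances change.
good = copy.deepcopy(before)
good["alice.balance"] = 90
good["bob.balance"] = 30
good_passed, good_failures = evaluate(before, good, expected, allowed)

# A run that completes the task but also wipes an unrelated playlist.
bad = copy.deepcopy(good)
bad["alice.playlist"] = []
bad_passed, bad_failures = evaluate(before, bad, expected, allowed)
```

In this sketch the first run passes and the second fails on the playlist change, which shows why checking state can catch side effects that an output-matching check would miss.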