Reinforcement Learning for Long-Horizon Interactive LLM Agents
Source: arXiv
Authors: Kevin Chen, Marco Cusumano-Towner, Brody Huval, Aleksei Petrenko, Jackson Hamburger, Vladlen Koltun, Philipp Krähenbühl
Date: 2025-02-03
Primary category: cs.LG
All categories: cs.LG, cs.AI
Abstract
This paper trains interactive digital agents with reinforcement learning (RL) directly inside a stateful multi-app environment and shows that RL improves long-horizon behavior on the AppWorld benchmark. It is one of the strongest demonstrations that environments built as evaluation harnesses can also serve as training gyms.