OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Source: arXiv
Authors: Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, et al.
Date: 2024-04-11
Primary category: cs.AI
All categories: cs.AI, cs.CL
Abstract
OSWorld provides a scalable, real computer environment for multimodal agents. It spans Ubuntu, Windows, and macOS, and supports task setup, execution-based evaluation, and open-ended workflows that cross application boundaries. It is among the most important gym-like substrates for developing and evaluating general computer-use agents.
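
To make "gym-like" and "execution-based evaluation" concrete, here is a minimal sketch of the pattern in Python. It is not the OSWorld API: every name below (Task, ToyDesktopEnv, run_episode, scripted_policy) is hypothetical, and a plain dictionary stands in for the real operating system state. The point it illustrates is that a task bundles a setup step with a checker, and the score comes from inspecting the final state rather than from matching a reference sequence of actions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# Hypothetical names throughout; this is a toy, not the OSWorld API.
State = Dict[str, str]           # toy "file system": path -> file contents
Action = Tuple[str, str, str]    # (verb, path, text)


@dataclass
class Task:
    instruction: str
    setup: Callable[[State], None]     # prepares the initial state
    checker: Callable[[State], bool]   # inspects the final state


@dataclass
class ToyDesktopEnv:
    task: Task
    state: State = field(default_factory=dict)

    def reset(self) -> Tuple[str, State]:
        """Run the task's setup and return the instruction and first observation."""
        self.state = {}
        self.task.setup(self.state)
        return self.task.instruction, dict(self.state)

    def step(self, action: Action) -> State:
        """Apply one action and return a copy of the new state as the observation."""
        verb, path, text = action
        if verb == "write":
            self.state[path] = text
        elif verb == "delete":
            self.state.pop(path, None)
        else:
            raise ValueError(f"unknown action verb: {verb}")
        return dict(self.state)

    def evaluate(self) -> float:
        """Execution-based scoring: check the resulting state, not the action trace."""
        return 1.0 if self.task.checker(self.state) else 0.0


def run_episode(env: ToyDesktopEnv,
                policy: Callable[[str, State], Optional[Action]],
                max_steps: int = 5) -> float:
    """Standard agent loop: reset, act until the policy stops, then score."""
    instruction, obs = env.reset()
    for _ in range(max_steps):
        action = policy(instruction, obs)
        if action is None:   # the policy signals it is done
            break
        obs = env.step(action)
    return env.evaluate()


# Example task: rename notes.txt to todo.txt, keeping its contents.
rename_task = Task(
    instruction="Rename notes.txt to todo.txt",
    setup=lambda s: s.update({"notes.txt": "buy milk"}),
    checker=lambda s: s.get("todo.txt") == "buy milk" and "notes.txt" not in s,
)


def scripted_policy(instruction: str, obs: State) -> Optional[Action]:
    """A hand-written stand-in for an agent: copy the file, then delete the original."""
    if "notes.txt" in obs and "todo.txt" not in obs:
        return ("write", "todo.txt", obs["notes.txt"])
    if "notes.txt" in obs:
        return ("delete", "notes.txt", "")
    return None


score = run_episode(ToyDesktopEnv(rename_task), scripted_policy)
```

Because the checker looks only at the end state, any action sequence that leaves todo.txt with the original contents and removes notes.txt would score 1.0, which is what allows open-ended tasks with more than one valid solution path.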