WorkArena

Overview

WorkArena is a benchmark of common knowledge-work tasks performed in enterprise software, built on top of the BrowserGym substrate. Its tasks run on the ServiceNow platform and cover the kinds of routine operational work that browser agents would need to automate in real organizations, such as filling in forms, filtering lists, and ordering items from a service catalog.

Why it matters

WorkArena matters because knowledge-work automation is a more serious target for harnesses than merely proving that an agent can book a flight badly with confidence.

Distinctive trait

Its distinctive trait is a focus on everyday enterprise work rather than internet miscellany, so its failure modes are more relevant to operator-facing harnesses.

Relationships

Read WorkArena alongside browsergym, workarena-plus-plus, webarena, and the enterprise-task section of rl-gyms-and-executable-environments-for-ai-harnesses.