AutoDSPy: Automating Modular Prompt Design with Reinforcement Learning for Small and Large Language Models
Source: ACL Anthology
DOI: 10.18653/v1/2025.emnlp-industry.192
Authors: Nafew Azim, Abrar Ur Alam, Hasan Bin Omar, Abdullah Mohammad Muntasir Adnan Jami, Jawad Ibn Ahad, Muhammad Rafsan Kabir, Md. Ismail Hossain, Fuad Rahman, Mohammad Ruhul Amin, Shafin Rahman, Nabeel Mohammed
Date: 2025
Abstract / key passage
AutoDSPy frames DSPy pipeline construction as a reinforcement-learning problem. Rather than tuning a single prompt or optimizing a program whose structure has already been declared by hand, it uses an RL-trained policy to select the reasoning modules, input/output signatures, and execution strategies that make up the pipeline, thereby automating the design of modular LM workflows.
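The core idea can be illustrated with a toy sketch: a policy samples a pipeline configuration (module type, signature, execution strategy), receives a scalar reward for the resulting pipeline, and reinforces the choices that scored well. This is a minimal REINFORCE-flavored bandit sketch, not the paper's actual method; the option names, reward function, and update rule are all illustrative assumptions.

```python
import random

# Hypothetical action spaces, loosely mirroring the kinds of choices
# AutoDSPy makes (module type, I/O signature, execution strategy).
# These names are illustrative, not the paper's actual search space.
MODULES = ["Predict", "ChainOfThought", "ReAct"]
SIGNATURES = ["question -> answer", "context, question -> answer"]
STRATEGIES = ["sequential", "best_of_n"]


class PipelinePolicy:
    """Keeps a preference score per option and samples pipelines
    with probability increasing in accumulated score."""

    def __init__(self):
        self.scores = {opt: 0.0 for opt in MODULES + SIGNATURES + STRATEGIES}

    def _sample(self, options, rng):
        # Weight each option by 1 + its (clipped) preference score.
        weights = [1.0 + max(0.0, self.scores[o]) for o in options]
        return rng.choices(options, weights=weights, k=1)[0]

    def sample_pipeline(self, rng):
        return (self._sample(MODULES, rng),
                self._sample(SIGNATURES, rng),
                self._sample(STRATEGIES, rng))

    def update(self, pipeline, reward, lr=0.5):
        # Bandit-style update: reinforce every option chosen this episode.
        for choice in pipeline:
            self.scores[choice] += lr * reward


def evaluate(pipeline):
    # Stand-in reward (a real system would run the pipeline on a dev set):
    # here we simply pretend ChainOfThought + best_of_n works best.
    module, _signature, strategy = pipeline
    return (1.0 if module == "ChainOfThought" else 0.2) * \
           (1.0 if strategy == "best_of_n" else 0.5)


rng = random.Random(0)
policy = PipelinePolicy()
for _ in range(200):
    pipe = policy.sample_pipeline(rng)
    policy.update(pipe, evaluate(pipe))

best_module = max(MODULES, key=lambda m: policy.scores[m])
print(best_module)
```

The point of the sketch is the shape of the loop, sample a discrete pipeline configuration, score it, reinforce its components, rather than any particular policy parameterization; the paper's policy and reward are considerably richer.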
Harness takeaway
AutoDSPy matters because it moves optimization closer to what long-lived harnesses actually operate on: modular, inspectable workflow artifacts rather than a single opaque prompt string. It is therefore best read as part of the RL-over-prompt-programs line of work, not merely as another prompt-tuning paper.