AutoDSPy: Automating Modular Prompt Design with Reinforcement Learning for Small and Large Language Models

Source: ACL Anthology
DOI: 10.18653/v1/2025.emnlp-industry.192
Authors: Nafew Azim, Abrar Ur Alam, Hasan Bin Omar, Abdullah Mohammad Muntasir Adnan Jami, Jawad Ibn Ahad, Muhammad Rafsan Kabir, Md. Ismail Hossain, Fuad Rahman, Mohammad Ruhul Amin, Shafin Rahman, Nabeel Mohammed
Date: 2025

Abstract / key passage

AutoDSPy frames DSPy pipeline construction as a reinforcement-learning problem. Rather than tuning a single prompt or optimizing one already-declared program, it trains an RL policy to select reasoning modules, input/output signatures, and execution strategies, thereby automating the design of modular LM workflows.
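To make the "RL policy over pipeline configurations" idea concrete, here is a minimal sketch of that training loop: a tabular softmax policy chooses a (module, strategy) pair from a small discrete action space and is updated with REINFORCE against a scalar reward. The module/strategy names, the action space, and the toy reward are all illustrative assumptions, not AutoDSPy's actual design; in the real system the reward would come from running the assembled DSPy pipeline on a dev set.

```python
import math
import random

random.seed(0)

# Hypothetical action space (assumption for illustration, not AutoDSPy's real one):
# the policy picks one reasoning module and one execution strategy.
MODULES = ["Predict", "ChainOfThought", "ReAct"]
STRATEGIES = ["single_pass", "self_consistency"]
ACTIONS = [(m, s) for m in MODULES for s in STRATEGIES]


def softmax(logits):
    mx = max(logits)
    exps = [math.exp(x - mx) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


class PipelinePolicy:
    """Tabular softmax policy over pipeline configurations, trained with REINFORCE."""

    def __init__(self, n_actions, lr=0.5):
        self.logits = [0.0] * n_actions
        self.lr = lr

    def sample(self):
        # Draw one configuration index from the softmax distribution.
        probs = softmax(self.logits)
        r, acc = random.random(), 0.0
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                return i
        return len(probs) - 1

    def update(self, action, reward, baseline=0.0):
        # REINFORCE: grad of log pi(a) w.r.t. logits is one_hot(a) - probs.
        probs = softmax(self.logits)
        adv = reward - baseline
        for i in range(len(self.logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            self.logits[i] += self.lr * adv * grad


def toy_reward(action_idx):
    # Stand-in for evaluating the assembled pipeline on a dev set.
    module, strategy = ACTIONS[action_idx]
    return 1.0 if (module == "ChainOfThought" and strategy == "self_consistency") else 0.2


policy = PipelinePolicy(len(ACTIONS))
for _ in range(300):
    a = policy.sample()
    policy.update(a, toy_reward(a), baseline=0.5)

best = max(range(len(ACTIONS)), key=lambda i: policy.logits[i])
print(ACTIONS[best])
```

The key design point this sketch shares with the paper's framing is that the *unit of search* is a whole pipeline configuration, not a prompt string: swapping the toy reward for a dev-set metric over executed DSPy programs is what turns this into workflow-level optimization.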

Harness takeaway

AutoDSPy matters because it moves optimization closer to what long-lived harnesses actually operate on: modular, inspectable workflow artifacts rather than one opaque prompt string. It is therefore best read as the RL-over-prompt-programs branch of the optimization literature, not merely as another prompt-tuning paper.