Mid-training with Self-Generated Data Improves Reinforcement Learning in Language Models

Aswin RRV, Jacob Dineen, Divij Handa, Mihir Parmar, Ben Zhou, Chitta Baral, Swaroop Mishra

Pending ICML 2026