DeepYard

RODS

RL-powered data synthesis for training multi-turn tool-use agents at capability boundaries

Open Source, Free

About

RODS (Reward-Driven Online Data Synthesis) is a research framework that addresses data depletion in reinforcement learning for tool-using agents. It continuously generates high-quality training samples at the agent's capability frontier, improving sample efficiency by 2-3x compared to static datasets. Designed for researchers building multi-turn agents that need to learn complex tool interactions through RL, RODS dynamically adapts training data as agents improve.
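
To make the idea of a "capability frontier" concrete, here is a minimal sketch. It is not RODS's actual API or algorithm. It assumes the frontier can be approximated as the set of tasks the agent solves some of the time but not reliably. The function names, the 0.2 and 0.8 thresholds, and the moving-average update are all hypothetical choices made for illustration.

```python
def frontier_tasks(success_rates, low=0.2, high=0.8):
    """Return the task ids whose empirical success rate lies in [low, high].

    Assumption: tasks the agent almost always solves (rate > high) or almost
    never solves (rate < low) carry little learning signal, so they are skipped.
    """
    return [task for task, rate in success_rates.items() if low <= rate <= high]


def update_rate(old_rate, reward, alpha=0.1):
    """Update a task's success estimate with an exponential moving average.

    `reward` is 1.0 for a successful rollout and 0.0 for a failed one
    (a hypothetical binary reward, chosen for simplicity).
    """
    return (1 - alpha) * old_rate + alpha * reward


# Hypothetical running success estimates for three tasks.
rates = {"easy": 0.95, "medium": 0.5, "hard": 0.05}
selected = frontier_tasks(rates)
```

As the agent improves, a task's estimate drifts upward through `update_rate` and eventually leaves the band, at which point a system of this kind would need to synthesize harder tasks to refill it.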

Details

Type
Integrations
Language

Tags

tool-use, multi-agent, autonomous, framework, open-source, python