DeepYard

SimLab

Adaptive simulation platform for building, evaluating, and refining long-horizon AI agents

Open Source, Free

About

SimLab is the data layer for adaptively composing RL simulations and for evaluating and refining agents. It provides a self-serve platform for developing long-horizon AI agents through simulation composition, task generation, and automated evaluation. Users can browse environments and compose them from tool servers and scenario templates, run agents against tasks with multiple LLM providers, generate custom tasks with built-in pipelines, and evaluate agent performance with verifiers and reward models. SimLab scales execution through Daytona for remote sandbox operations, and it is agnostic to the toolset, agent harness, and sandbox used.

Details

Type: simulation, evaluation, agent-platform
Deployment: local, cloud
Supported Models: gpt-4, claude, fireworks, custom

Tags

open-source, simulation, evaluation, reinforcement-learning, agent-development, task-generation, sandbox