DeepYard

AgentDS

Benchmark framework measuring AI agent performance vs human experts on data science tasks

Free

About

AgentDS is an academic research framework for evaluating AI agent capabilities in domain-specific data science workflows. It provides standardized benchmarks and metrics to assess agent performance against human experts, with a focus on the effectiveness of human-AI collaboration. It is designed for researchers studying autonomous agents in data analysis, modeling, and interpretation tasks.
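To make the idea of expert-relative metrics concrete, the short Python sketch below shows one way an agent's per-task scores could be normalized against human-expert baselines. The function and variable names are purely illustrative assumptions and are not part of the AgentDS API.

from statistics import mean

def relative_performance(agent_scores, expert_scores):
    # Hypothetical helper (not from AgentDS): mean ratio of agent score to
    # human-expert score across the same benchmark tasks. A value of 1.0
    # means parity with the experts; higher means the agent did better on
    # average (higher scores are assumed to be better).
    ratios = [a / e for a, e in zip(agent_scores, expert_scores) if e > 0]
    return mean(ratios)

# Example: per-task accuracy on three benchmark tasks (made-up numbers).
agent = [0.81, 0.74, 0.90]
expert = [0.88, 0.79, 0.85]
print(f"relative performance: {relative_performance(agent, expert):.2f}")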

Details

Type:
Integrations:
Language: Python

Tags

evaluation, autonomous, open-source, framework, python