SWE-Explore

Benchmark for evaluating coding agents' repository exploration and code understanding abilities

Open Source, Free

About

A research benchmark that measures the fine-grained capabilities of coding agents during repository exploration. Rather than reporting a single binary pass/fail result, as traditional benchmarks do, it evaluates specific skills: repository understanding, context retrieval, code localization, and bug diagnosis. It is designed to help researchers and developers assess how well AI agents navigate and comprehend codebases.

Also in Dev Tools

Crawl4AI

Firecrawl

Headroom Context Optimization