LH-Bench
Evaluation framework for measuring long-horizon agent workflows on enterprise tasks
Open Source · Free
About
LH-Bench is a research framework designed to evaluate autonomous agents on complex, multi-step enterprise workflows. Unlike traditional benchmarks that use binary pass/fail metrics, it assesses intermediate artifacts, multi-tool coordination, and alignment with organizational goals. It is particularly useful for testing agents that handle subjective business tasks requiring multiple interactions and quality judgments over extended timeframes.
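LH-Bench's own API is not documented on this page, so the sketch below is purely hypothetical: it illustrates the scoring idea described above, where a workflow run is graded on intermediate-artifact quality, tool coordination, and goal alignment rather than a single pass/fail bit. All names, fields, and weights (`WorkflowRun`, `score_run`, the 0.4/0.3/0.3 split) are illustrative assumptions, not LH-Bench's actual interface.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of multi-dimensional workflow scoring.
# Names, fields, and weights are assumptions for illustration only.

@dataclass
class WorkflowRun:
    task_id: str
    artifact_scores: list[float]                   # quality of each intermediate artifact, 0..1
    tools_used: set[str] = field(default_factory=set)
    tools_expected: set[str] = field(default_factory=set)
    goal_alignment: float = 0.0                    # judged alignment with the business goal, 0..1

def score_run(run: WorkflowRun,
              weights: tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """Combine artifact quality, tool coverage, and goal alignment into one 0..1 score."""
    artifact = (sum(run.artifact_scores) / len(run.artifact_scores)
                if run.artifact_scores else 0.0)
    coverage = (len(run.tools_used & run.tools_expected) / len(run.tools_expected)
                if run.tools_expected else 1.0)
    w_art, w_cov, w_goal = weights
    return w_art * artifact + w_cov * coverage + w_goal * run.goal_alignment

if __name__ == "__main__":
    run = WorkflowRun(
        task_id="quarterly-report",
        artifact_scores=[0.8, 0.6, 0.9],           # e.g. outline, draft, final report
        tools_used={"crm", "spreadsheet"},
        tools_expected={"crm", "spreadsheet", "email"},
        goal_alignment=0.7,
    )
    print(f"{run.task_id}: {score_run(run):.2f}")
```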
Details
| Type | |
| Integrations | |
| Language | |
Tags
evaluation · autonomous · multi-agent · tool-use · open-source · framework
Quick Info
- Organization: Research Project
- Pricing: open-source
- Free Tier: Yes
- Updated: Mar 25, 2026
Also in Dev Tools
Crawl4AI
Open-source web crawler optimized for LLMs and AI agents — 62K+ stars
OSS · Free
unclecode
63.1K · 72 today
Firecrawl
Web scraping API built for LLMs — turn any website into LLM-ready data — 89K+ stars
OSS · Freemium
Mendable
102.1K · 138 today
Headroom Context Optimization
Reduce LLM API costs by 50-90% through advanced context compression
OSS · Free
Shubham Saboo
104.2K · 74 today