DeepYard

promptfoo vs RAG Failure Diagnostics Clinic

Side-by-side comparison with live GitHub signals. Last updated May 16, 2026.

promptfoo

Test and evaluate LLM prompts and agents — 11K+ stars

OSS, Free
21.3K stars, last commit today, 278 contributors

RAG Failure Diagnostics Clinic

Diagnose and fix common RAG pipeline failure modes

OSS, Free
110.5K stars, last commit 7d ago, 77 contributors
Metric        promptfoo      RAG Failure Diagnostics Clinic
GitHub Stars  21.3K          110.5K
Contributors  278            77
Last Commit   May 16, 2026   May 9, 2026
Open Issues2688
License       open-source    open-source
Pricing       open-source    open-source
Free Tier     Yes            Yes
Category      dev-tools      dev-tools
Trending      No             No

Shared Tags

evaluation

Only in promptfoo

testing, red-teaming, security, ci-cd, open-source

Only in RAG Failure Diagnostics Clinic

rag, debugging, diagnostics, python

About promptfoo

promptfoo is an open-source tool for testing, evaluating, and red-teaming LLM applications. It runs automated evaluations across multiple models and prompts, compares outputs side-by-side, detects regressions, and tests for security vulnerabilities. It supports custom assertions, CI/CD integration, and model-graded evaluations.
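
As a rough illustration of what assertion-based evaluation across several models looks like, here is a minimal Python sketch. It does not use promptfoo's own API or configuration format; the names model_a, model_b, contains, max_words, and evaluate are hypothetical, and the two "models" are hard-coded stubs standing in for real LLM calls.

```python
from typing import Callable, Dict, List

# An assertion takes a model output and reports pass/fail.
Assertion = Callable[[str], bool]


def model_a(prompt: str) -> str:
    """Hypothetical stub standing in for a real model call."""
    return "Paris is the capital of France."


def model_b(prompt: str) -> str:
    """Hypothetical stub standing in for a second model."""
    return "I am not sure."


def contains(substring: str) -> Assertion:
    """Pass when the output mentions the substring (case-insensitive)."""
    return lambda output: substring.lower() in output.lower()


def max_words(limit: int) -> Assertion:
    """Pass when the output has at most `limit` whitespace-separated words."""
    return lambda output: len(output.split()) <= limit


def evaluate(
    models: Dict[str, Callable[[str], str]],
    prompt: str,
    assertions: List[Assertion],
) -> Dict[str, bool]:
    """Run one prompt through every model and check all assertions."""
    results = {}
    for name, model in models.items():
        output = model(prompt)
        results[name] = all(check(output) for check in assertions)
    return results


results = evaluate(
    {"model_a": model_a, "model_b": model_b},
    "What is the capital of France?",
    [contains("Paris"), max_words(20)],
)
print(results)
```

Running the same assertions before and after a prompt or model change, and comparing the pass/fail maps, is one simple way to surface a regression; a failing entry could also be used to fail a CI job.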

About RAG Failure Diagnostics Clinic

A diagnostic tool that identifies why RAG pipelines produce poor results. It tests for common failure modes: irrelevant retrieval, missing context, hallucination over context, chunking issues, and embedding quality problems. It provides a structured report with specific fix recommendations for each detected issue, and is intended for debugging production RAG systems. Part of the awesome-llm-apps collection.
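
To make three of the failure modes above concrete, here is a toy Python sketch. It is not the tool's implementation: the diagnose function, its thresholds, and the token-overlap heuristic are all assumptions chosen for illustration, and a real diagnostic would more likely rely on embeddings or model-graded checks.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a: str, b: str) -> float:
    """Fraction of the tokens in `a` that also appear in `b`."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta) if ta else 0.0


def diagnose(
    question: str,
    chunks: List[str],
    answer: str,
    min_relevance: float = 0.3,  # assumed threshold, not from the tool
    min_grounding: float = 0.5,  # assumed threshold, not from the tool
) -> List[str]:
    """Return the names of the failure modes this toy heuristic detects."""
    if not chunks:
        return ["missing context"]
    issues = []
    # Irrelevant retrieval: no chunk shares enough tokens with the question.
    if max(overlap(question, chunk) for chunk in chunks) < min_relevance:
        issues.append("irrelevant retrieval")
    # Hallucination over context: the answer is mostly unsupported by the chunks.
    if overlap(answer, " ".join(chunks)) < min_grounding:
        issues.append("hallucination over context")
    return issues


bad = diagnose(
    "What is the refund window for orders?",
    ["Our office is closed on public holidays."],
    "Refunds are accepted within 30 days.",
)
good = diagnose(
    "What is the refund window?",
    ["The refund window is 30 days."],
    "The refund window is 30 days.",
)
print(bad, good)
```

Token overlap is a crude proxy that misses paraphrases, so the thresholds here would need tuning or replacement on real data.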