DeepYard

promptfoo

Test and evaluate LLM prompts and agents — 11K+ stars

Open Source, Free

About

promptfoo is an open-source tool for testing, evaluating, and red-teaming LLM applications. Run automated evaluations across multiple models and prompts, compare outputs side-by-side, detect regressions, and test for security vulnerabilities. Supports custom assertions, CI/CD integration, and model-graded evaluations.
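As a rough illustration of what "evaluations across multiple models and prompts" with "custom assertions" means in practice, the sketch below runs every prompt against every provider and records which assertions fail. It does not use promptfoo's API: the `Provider`, `Assertion`, `TestCase`, and `Result` types, the `render` and `runEval` helpers, and the two stub providers are all hypothetical names invented for this example, and the stubs return canned strings instead of calling a real model.

```typescript
// Hypothetical types for illustration only; these are NOT promptfoo's API.
type Provider = { id: string; call: (prompt: string) => string };
type Assertion = { name: string; check: (output: string) => boolean };
type TestCase = { vars: Record<string, string>; assertions: Assertion[] };
type Result = { provider: string; prompt: string; output: string; failed: string[] };

// Fill {{name}} placeholders in a prompt template with test variables.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_m: string, key: string) => vars[key] ?? "");
}

// Run every (provider, prompt, test) combination and collect failed assertion names.
function runEval(prompts: string[], providers: Provider[], tests: TestCase[]): Result[] {
  const results: Result[] = [];
  for (const provider of providers) {
    for (const template of prompts) {
      for (const test of tests) {
        const prompt = render(template, test.vars);
        const output = provider.call(prompt);
        const failed = test.assertions
          .filter((a) => !a.check(output))
          .map((a) => a.name);
        results.push({ provider: provider.id, prompt, output, failed });
      }
    }
  }
  return results;
}

// Stub providers standing in for real models (assumption: no network calls).
const providers: Provider[] = [
  { id: "stub-echo", call: (p) => `Answer: ${p}` },
  { id: "stub-upper", call: (p) => p.toUpperCase() },
];

const results = runEval(
  ["Translate {{word}} to French"],
  providers,
  [
    {
      vars: { word: "hello" },
      assertions: [
        { name: "contains-word", check: (o) => o.includes("hello") },
        { name: "not-empty", check: (o) => o.length > 0 },
      ],
    },
  ],
);

// Print a side-by-side summary: one line per provider/prompt pair.
for (const r of results) {
  const status = r.failed.length === 0 ? "PASS" : `FAIL (${r.failed.join(", ")})`;
  console.log(`${r.provider}: ${status}`);
}
```

Comparing the `failed` lists between two runs of the same test set is one simple way to spot a regression after a prompt or model change. In promptfoo itself, prompts, providers, and tests are declared in a configuration file rather than wired up by hand like this.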

Details

Type: evaluation, testing, red-teaming
Integrations: openai, anthropic, gemini, any-llm
Language: typescript

Tags

evaluation, testing, red-teaming, security, ci-cd, open-source