Promptfoo Evaluation Pipeline

How a YAML config drives test cases through multiple LLM providers and assertion grading

The pipeline has five stages: a YAML config defines the prompts, providers, and test cases (CONFIG); each test case fans out to every configured provider (PROVIDERS); each provider returns an LLM response, which is cached locally (INFERENCE); assertions grade every output (GRADING); and results are shown in the web UI or CLI (OUTPUT).

A minimal `promptfooconfig.yaml`:

```yaml
prompts:
  - "Translate: {{input}}"
providers:
  - openai:gpt-4o
  - anthropic:claude
  - ollama:llama3
tests:
  - vars:
      input: "Hello world"
    assert:
      - type: contains
        value: "Hola"
      - type: llm-rubric
        value: "Is accurate"
```

Here one prompt template and one test case fan out to three providers (GPT-4o via the openai provider, Claude via the anthropic provider, Llama 3 via the ollama provider), producing three responses. Each response is cached locally, then graded against both assertions: a deterministic `contains` check and a model-graded `llm-rubric` check.
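The fan-out described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not promptfoo's implementation (promptfoo is a Node.js tool); `call_provider` and `render` are hypothetical stubs, and only the deterministic `contains` assertion type is modeled:

```python
# Sketch of the eval fan-out: every test case runs against every
# provider, and every output is graded by every assertion.
PROVIDERS = ["openai:gpt-4o", "anthropic:claude", "ollama:llama3"]
TESTS = [{"vars": {"input": "Hello world"},
          "assert": [{"type": "contains", "value": "Hola"}]}]
PROMPT = "Translate: {{input}}"

def render(template: str, variables: dict) -> str:
    # Minimal stand-in for the {{var}} templating promptfoo uses.
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template

def call_provider(provider: str, prompt: str) -> str:
    # Hypothetical stub; a real run would hit the provider's API.
    return "Hola mundo"

def grade(assertion: dict, output: str) -> bool:
    # Only the deterministic 'contains' type is sketched here;
    # 'llm-rubric' would require a second LLM call to do the grading.
    if assertion["type"] == "contains":
        return assertion["value"] in output
    raise NotImplementedError(assertion["type"])

results = []
for test in TESTS:
    prompt = render(PROMPT, test["vars"])
    for provider in PROVIDERS:
        output = call_provider(provider, prompt)
        passed = all(grade(a, output) for a in test["assert"])
        results.append((provider, passed))

print(results)  # one (provider, pass/fail) row per provider x test pair
```

With one test case and three providers this yields three graded rows, which is exactly the 1 × 3 matrix the diagram shows.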
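The "cached locally" step can be sketched as a content-addressed lookup keyed on the provider and rendered prompt, so a repeated eval run skips identical API calls. The key scheme and in-memory store below are assumptions for illustration; promptfoo's real cache persists to disk:

```python
import hashlib
import json

class ResponseCache:
    """Local LLM response cache keyed on (provider, prompt).

    Assumption: a plain in-memory dict stands in for promptfoo's
    on-disk cache; the lookup idea is the same.
    """

    def __init__(self):
        self._store = {}

    def _key(self, provider: str, prompt: str) -> str:
        raw = json.dumps([provider, prompt], sort_keys=True)
        return hashlib.sha256(raw.encode()).hexdigest()

    def get_or_call(self, provider, prompt, call):
        key = self._key(provider, prompt)
        if key not in self._store:
            # Cache miss: perform the (expensive) provider call once.
            self._store[key] = call(provider, prompt)
        return self._store[key]

calls = []
def fake_call(provider, prompt):
    calls.append(provider)  # count real provider invocations
    return f"response from {provider}"

cache = ResponseCache()
first = cache.get_or_call("openai:gpt-4o", "Translate: Hello world", fake_call)
second = cache.get_or_call("openai:gpt-4o", "Translate: Hello world", fake_call)
print(len(calls))  # prints 1: the second lookup is served from cache
```

Because grading runs on cached outputs, re-running an eval after editing only the assertions does not re-trigger inference.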