Token-cost and latency reduction playbook for a funnel analysis prompt running on DeepSeek-V3, judged by tool-call accuracy.
Token-cost and latency reduction playbook for a funnel analysis prompt running on o1-mini, judged by regex match checks.
Token-cost and latency reduction playbook for a funnel analysis prompt running on o3-mini, judged by BERTScore.
Token-cost and latency reduction playbook for a funnel analysis prompt running on Claude 3.7 Sonnet, judged by LLM-as-judge.
Token-cost and latency reduction playbook for a funnel analysis prompt running on Llama 3.3 70B, judged by BLEU/ROUGE.
Token-cost and latency reduction playbook for a funnel analysis prompt running on o1-mini, judged by human pairwise comparison.
Token-cost and latency reduction playbook for a funnel analysis prompt running on o3-mini, judged by rubric scoring.
Token-cost and latency reduction playbook for a funnel analysis prompt running on Claude 3.7 Sonnet, judged by G-Eval.
Token-cost and latency reduction playbook for a funnel analysis prompt running on Llama 3.3 70B, judged by promptfoo assertions.
Token-cost and latency reduction playbook for a funnel analysis prompt running on o3-mini, judged by tool-call accuracy.
Token-cost and latency reduction playbook for a funnel analysis prompt running on Claude Sonnet 4.5, judged by regex match checks.
Token-cost and latency reduction playbook for a funnel analysis prompt running on Gemini 2.0 Flash, judged by factuality with retrieval.