Token-cost and latency reduction playbook for an academic grading prompt running on Gemini 2.5 Pro, judged by rubric scoring.
Token-cost and latency reduction playbook for an academic grading prompt running on DeepSeek-V3, judged by rubric scoring.
Token-cost and latency reduction playbook for an academic grading prompt running on Llama 3.1 405B, judged by G-Eval.
Token-cost and latency reduction playbook for an academic grading prompt running on Mistral Small 3, judged by G-Eval.
Token-cost and latency reduction playbook for an academic grading prompt running on o1, judged by TruLens feedback functions.
Token-cost and latency reduction playbook for an academic grading prompt running on o3, judged by TruLens feedback functions.
Token-cost and latency reduction playbook for an academic grading prompt running on Grok 3, judged by DeepEval metrics.
Token-cost and latency reduction playbook for an academic grading prompt running on GPT-4o, judged by promptfoo assertions.
Token-cost and latency reduction playbook for an academic grading prompt running on Claude 3.5 Sonnet, judged by promptfoo assertions.
Token-cost and latency reduction playbook for an academic grading prompt running on Claude 4 Sonnet, judged by embedding distance.
Token-cost and latency reduction playbook for an academic grading prompt running on Claude Opus 4.5, judged by embedding distance.
Token-cost and latency reduction playbook for an academic grading prompt running on Gemini 2.5 Pro, judged by tool-call accuracy.