Use manual grid search over temperature and system prompt to optimize a threat modeling prompt on Qwen 2.5 72B against inter-judge agreement without regressing safety.
Use manual grid search over temperature and system prompt to optimize a funnel analysis prompt on GPT-4.1 against inter-judge agreement without regressing safety.
Use manual grid search over temperature and system prompt to optimize a multi-hop QA prompt on Claude 3.5 Sonnet against token cost without regressing safety.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Claude Sonnet 4, judged by exact match.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Claude Opus 4.5, judged by exact match.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Gemini 2.0 Flash, judged by BLEU/ROUGE.
Token-cost and latency reduction playbook for a schema migration planning prompt running on DeepSeek-R1, judged by BLEU/ROUGE.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Llama 3.1 405B, judged by semantic similarity.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Mistral Small 3, judged by semantic similarity.
Token-cost and latency reduction playbook for a schema migration planning prompt running on o1, judged by human pairwise comparison.
Token-cost and latency reduction playbook for a schema migration planning prompt running on o3, judged by rubric scoring.
Token-cost and latency reduction playbook for a schema migration planning prompt running on Command R+, judged by rubric scoring.