Diagnose why a Least-to-Most prompt is failing on technical spec writing with Claude Haiku 4 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on data pipeline debugging with GPT-4o and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on A/B test interpretation with Qwen 2.5 72B and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on resume screening with Gemini 2.5 Pro and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on math word problems with GPT-4o-mini and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on SQL query writing with o1 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on medical triage with DeepSeek-V3 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on research synthesis with Claude 3.5 Sonnet and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on legal brief summarization with o1-mini and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on API design decisions with DeepSeek-R1 and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on incident post-mortems with Claude 3.7 Sonnet and produce a fix plan.
Diagnose why a Least-to-Most prompt is failing on technical spec writing with o3-mini and produce a fix plan.