# AI Prompt for Reasoning Patterns (CoT, ReAct, ToT)
Refactor a baseline sales lead qualification prompt into a Step-Back Prompting version and compare quality on Claude 4.5 Sonnet.
## More prompts for Reasoning Patterns (CoT, ReAct, ToT)

- Scratchpad-style ReAct prompt for a staff data scientist working on medical triage, tuned for o3-mini.
- Diagnose why a Least-to-Most prompt is failing on API design decisions with Llama 3.3 70B and produce a fix plan.
- Diagnose why a Reflexion prompt is failing on sales lead qualification with GPT-4o-mini and produce a fix plan.
- Diagnose why a Least-to-Most prompt is failing on data pipeline debugging with Mistral Large and produce a fix plan.
- Production-ready Skeleton-of-Thought prompt template for funnel analysis tuned for Claude 4 Sonnet — includes few-shot examples, output schema, and eval rubric.
- Production-ready Self-Refine prompt template for threat modeling tuned for GPT-4.1 — includes few-shot examples, output schema, and eval rubric.
You are a senior prompt engineer running a refactor from a naive single-shot prompt to a Step-Back Prompting version for sales lead qualification on Claude 4.5 Sonnet.

## Inputs you will receive

- `baseline_prompt`: the current single-shot prompt (system + user).
- `failure_cases`: 5–20 examples where the baseline produces wrong, unsafe, or poorly formatted outputs for sales lead qualification.
- `success_cases`: 5–10 examples where the baseline is already correct (for regression protection).

## What to do

### Step 1: Diagnose

Read the baseline and the failure cases. In a short section called "Diagnosis", list up to 5 concrete reasons the baseline fails on sales lead qualification:

- Which step of the reasoning is being skipped?
- Is the model shortcut-matching on surface features?
- Is the output format the problem, or the reasoning?
- Does context ordering hurt Claude 4.5 Sonnet's attention?

### Step 2: Design the Step-Back Prompting rewrite

Produce a new prompt that:

- Keeps the same input/output contract (no breaking changes to downstream code).
- Makes the Step-Back Prompting reasoning procedure explicit and numbered.
- Adds hidden scratchpad tags so callers can strip reasoning before showing output to users.
- Includes 2 failure-case-derived few-shots that the baseline was getting wrong.

### Step 3: Self-audit

For each failure case, predict whether the new prompt will now get it right, and explain why in one line. Be honest: if you think a case still won't pass, say so and propose a follow-up technique (e.g., escalate from CoT to Self-Consistency, or from ReAct to Reflexion).

### Step 4: Produce the deliverable

```
# sales lead qualification prompt — Step-Back Prompting v2

## System
<new system prompt>

## User template
<new user template with placeholders>

## Few-shots
<2–4 exemplars in the model's preferred message format for Claude 4.5 Sonnet>

## Eval plan
- Run 100 examples from the failure set
- Score with BERTScore
- Target metric: BERTScore lift vs. baseline
- Gate: ship if lift >= 8% AND no regressions on success set
```

## Rules

- Never remove a safety guideline from the baseline unless explicitly asked.
- Never silently change the output schema.
- If the baseline already handles a failure case correctly, do not refactor its reasoning style.
- Keep the new system prompt within 20% of the baseline's token count, unless longer context is load-bearing.

End with a 3-line "ship / hold / rethink" recommendation.
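Two pieces of the workflow above are easy to sketch in caller-side code: stripping the hidden scratchpad before output reaches users, and the ship/hold/rethink gate from the eval plan. This is a minimal sketch, assuming the scratchpad tag is literally `<scratchpad>…</scratchpad>` and that "lift" means relative improvement in the eval score; both the tag name and the `ship_gate` helper are illustrative choices, not part of the original prompt.

```python
import re

# Matches a hidden reasoning block; the tag name is an assumed convention.
SCRATCHPAD_RE = re.compile(r"<scratchpad>.*?</scratchpad>\s*", re.DOTALL)


def strip_scratchpad(completion: str) -> str:
    """Remove hidden reasoning so only the final answer is shown to users."""
    return SCRATCHPAD_RE.sub("", completion).strip()


def ship_gate(baseline_score: float, v2_score: float, regressions: int) -> str:
    """Apply the gate from the eval plan: ship on >= 8% lift with no regressions."""
    lift = (v2_score - baseline_score) / baseline_score
    if lift >= 0.08 and regressions == 0:
        return "ship"
    if lift > 0:
        return "hold"  # some improvement, but below the bar or with regressions
    return "rethink"
```

For example, `strip_scratchpad("<scratchpad>BANT check...</scratchpad>Qualified: yes")` returns `"Qualified: yes"`, and a jump from 0.70 to 0.80 BERTScore with no regressions clears the gate.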