ChatGPT Prompt for Evals & Observability
Design a pairwise + rubric LLM-as-judge prompt for SQL generation with bias mitigation, calibration, and reproducibility.
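A minimal sketch of what such a judge could look like, assuming an OpenAI-style chat client (the model name, rubric dimensions, and JSON output shape are illustrative assumptions, not part of the prompt itself): the rubric gives per-dimension score anchors for calibration, both candidate orderings are judged to mitigate position bias, and a pinned model, temperature 0, and a fixed seed support reproducibility.

```python
from openai import OpenAI  # assumes the OpenAI Python SDK; any chat client works

client = OpenAI()

JUDGE_PROMPT = """You are an impartial judge of SQL generation quality.
Score each response from 1 (unusable) to 5 (production-ready) on:
correctness, schema adherence, efficiency, and {extra_dimension}.
Do not reward longer answers or the order in which responses appear.

QUESTION: {question}
REFERENCE ANSWER (optional): {reference}

RESPONSE A:
{response_a}

RESPONSE B:
{response_b}

Return JSON: {{"scores_a": {{...}}, "scores_b": {{...}}, "winner": "A" | "B" | "tie"}}"""

def judge_once(question, reference, a, b, extra_dimension="dialect portability"):
    resp = client.chat.completions.create(
        model="gpt-4o",    # pin an exact model snapshot for reproducibility
        temperature=0,     # deterministic-ish decoding
        seed=42,           # best-effort reproducibility where the API supports it
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, reference=reference or "none provided",
            response_a=a, response_b=b, extra_dimension=extra_dimension)}],
    )
    return resp.choices[0].message.content

def judge_pair(question, reference, a, b):
    # Position-bias mitigation: judge both orderings; downstream code keeps
    # only verdicts that agree, or records a tie when they conflict.
    return judge_once(question, reference, a, b), judge_once(question, reference, b, a)
```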
More prompts for Evals & Observability.
Instrument, query, and triage structured extraction LLM app traces in Lunary with TypeScript SDK, covering latency, cost, and quality dashboards.
Instrument, query, and triage classification pipeline LLM app traces in OpenTelemetry + Jaeger with Ruby SDK, covering latency, cost, and quality dashboards (a Python sketch of the same span pattern follows this list).
Instrument, query, and triage tool-using agent LLM app traces in Galileo with Java SDK, covering latency, cost, and quality dashboards.
Instrument, query, and triage code-completion copilot LLM app traces in OpenTelemetry + Jaeger with Java SDK, covering latency, cost, and quality dashboards.
Design a pairwise + rubric LLM-as-judge prompt for multi-turn dialogue with bias mitigation, calibration, and reproducibility.
Instrument, query, and triage classification pipeline LLM app traces in Langfuse with Python SDK, covering latency, cost, and quality dashboards.
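The tracing prompts above share one pattern: wrap each LLM call in a span and attach latency, token/cost, and quality attributes so the backend's dashboards can query them. A minimal sketch using the OpenTelemetry Python SDK with console export (the listed prompts target other SDKs and backends; the attribute names, stub model call, and cost rate below are illustrative assumptions, not a standard semantic convention):

```python
import time

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# In a real setup, export to Jaeger/Langfuse/etc. via OTLP instead of the console.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
tracer = trace.get_tracer("llm-classification-pipeline")

def call_model(text):
    # Stand-in for a real LLM call; returns (label, prompt_tokens, completion_tokens).
    return ("positive", len(text.split()), 1)

def estimate_cost(prompt_tokens, completion_tokens, rate_per_1k=0.001):
    # Illustrative flat rate; substitute your provider's actual pricing.
    return (prompt_tokens + completion_tokens) / 1000 * rate_per_1k

def classify(text: str) -> str:
    with tracer.start_as_current_span("llm.classify") as span:
        span.set_attribute("llm.input.preview", text[:200])
        start = time.monotonic()
        label, prompt_tokens, completion_tokens = call_model(text)
        span.set_attribute("llm.latency_ms", (time.monotonic() - start) * 1000)
        # Cost and quality attributes are what the triage dashboards slice on.
        span.set_attribute("llm.usage.prompt_tokens", prompt_tokens)
        span.set_attribute("llm.usage.completion_tokens", completion_tokens)
        span.set_attribute("llm.cost_usd", estimate_cost(prompt_tokens, completion_tokens))
        span.set_attribute("llm.output.label", label)
        return label
```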
Replace the bracketed placeholders with your own context before running the prompt:
[Additional task-specific dimension for SQL generation]: an extra evaluation dimension specific to SQL generation, such as dialect correctness or query efficiency.
[REFERENCE ANSWER (optional): {reference}]: an optional reference answer the judge can compare both candidates against.
["specific problem 1", "specific problem 2"]: a list of concrete failure modes the judge should check for.
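For example, the placeholders could be substituted like this before running the judge prompt (the values and the fill helper are purely illustrative):

```python
# Illustrative substitutions for the bracketed placeholders above.
placeholders = {
    "[Additional task-specific dimension for SQL generation]":
        "Uses only tables and columns present in the provided schema",
    "[REFERENCE ANSWER (optional): {reference}]":
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;",
    '["specific problem 1", "specific problem 2"]':
        '["hallucinated column names", "missing GROUP BY on aggregated queries"]',
}

def fill(prompt_template: str) -> str:
    # Simple string substitution; adapt to however your prompt template is stored.
    for marker, value in placeholders.items():
        prompt_template = prompt_template.replace(marker, value)
    return prompt_template
```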