Claude Prompt for Structured Output & Tool Schemas
Prompt and constrained-decoding config for reliably emitting contract-clause records as JSON that conforms to a JSON Schema (Draft 2020-12), with zero parse failures.
You are an AI engineer. Your downstream pipeline parses contract-clause records — one parse failure is a production incident. Use constrained decoding to make malformed output literally impossible.
## Constrained Decoding Options
### Option A: Provider-Native Structured Output
- **OpenAI:** `response_format={"type": "json_schema", "json_schema": {...}, "strict": true}`
- **Anthropic:** tool use with `tool_choice: {type: "tool", name: "emit_record"}`; the tool's `input_schema` is the output shape
- **Google Gemini:** `response_mime_type="application/json", response_schema=...`
- **Pros:** zero client-side complexity, provider-guaranteed
- **Cons:** limited to supported feature matrix (e.g., OpenAI strict mode disallows some JSON Schema features)
### Option B: Client-Side Grammar (Open Models)
- **Outlines:** `outlines.generate.json(model, schema)`
- **XGrammar / llama.cpp / vLLM `guided_json`:** pass a JSON schema or regex
- **Pros:** works with any open model, more expressive
- **Cons:** latency overhead (mask construction), grammar bugs
### Option C: Prompt-Only + Validate-and-Retry
- No decoding constraint, use careful prompting + Pydantic/Zod + repair loop
- **Pros:** works with any API
- **Cons:** failure rate floor of ~0.5-2% even with tuning
For this task, target claude-sonnet-4-5. Note that client-side grammars such as XGrammar require logit access, which hosted Claude models do not expose, so in practice this means Anthropic tool use (Option A).
## Schema (JSON Schema Draft 2020-12)
Design a strict schema for contract clauses with:
- 10-20 fields, all typed
- `additionalProperties: false` everywhere
- Enum constraints for controlled fields
- Regex patterns for formatted fields (dates, phones, IDs)
- `minItems` / `maxItems` on arrays
- `minLength` / `maxLength` on strings
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": { ... },
  "required": [ ... ],
  "additionalProperties": false
}
```
Produce the FULL schema for contract clauses. Include realistic enum values (e.g., for a country field, use ISO 3166-1 alpha-2, not free-form strings).
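As a starting point, here is a partial sketch of such a schema expressed as a Python dict (the field names and enum values are illustrative assumptions, not the required shape):

```python
# Partial, illustrative schema for contract-clause records.
# Field names are hypothetical; the full schema should have 10-20 typed fields.
CLAUSE_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "clause_type": {
            "type": "string",
            "enum": ["indemnification", "termination", "confidentiality",
                     "payment", "liability", "other"],
        },
        # ISO 8601 date, matching the prompt's normalization rule
        "effective_date": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
        # ISO 3166-1 alpha-2 country code, not a free-form string
        "governing_country": {"type": "string", "pattern": "^[A-Z]{2}$"},
        "parties": {
            "type": "array",
            "items": {"type": "string", "minLength": 1, "maxLength": 200},
            "minItems": 1,
            "maxItems": 10,
        },
        "clause_text": {"type": "string", "minLength": 1, "maxLength": 4000},
    },
    "required": ["clause_type", "effective_date", "governing_country",
                 "parties", "clause_text"],
    "additionalProperties": False,
}
```

Every nested object in the full schema needs its own `additionalProperties: false`; strict modes reject schemas that omit it.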
## System Prompt
Even with constrained decoding, the prompt still shapes content quality:
```
You are an extraction system for contract clauses.
Extract fields from the provided input. You MUST output a record matching the schema.
QUALITY RULES:
1. Prefer literal quotes from the input over paraphrase.
2. For dates, normalize to ISO 8601 (YYYY-MM-DD). If only month/year, use day=01.
3. For numbers, strip units — put units in the designated unit field.
4. For enums, pick the closest valid value. If nothing fits, use "other".
5. Never fabricate. If unsure, mark the field with low confidence.
INPUT:
{input_text}
```
## Decoding Config
Produce the exact config/code for the chosen option:
### If using OpenAI strict mode:
```python
client.chat.completions.create(
    model="gpt-4o",  # strict mode is an OpenAI feature; use an OpenAI model
    messages=[...],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "structured_data",
            "schema": SCHEMA,
            "strict": True,
        },
    },
    temperature=0,
)
```
(Note: strict mode requires all fields in `required`, `additionalProperties: false` everywhere, and does not support certain features like `oneOf` at the top level — design around this.)
### If using Anthropic tool use:
```python
client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2048,  # required by the Messages API
    tools=[{"name": "emit_structured_data", "input_schema": SCHEMA}],
    tool_choice={"type": "tool", "name": "emit_structured_data"},
    temperature=0,
    messages=[...],
)
```
### If using Outlines:
```python
import outlines
gen = outlines.generate.json(model, DocumentFieldsSchema)
result = gen(prompt, max_tokens=2048)
```
## Validation Layer (even with constrained decoding)
Run Pydantic/Zod validation after the model call:
- Catches cross-field invariants the schema can't express (e.g., end_date >= start_date)
- Catches semantic errors (e.g., enum is valid but inappropriate in context)
- Catches schema-grammar mismatches (rare but real)
On validation failure, repair with the validation error as feedback. Cap retries at 1.
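A minimal sketch of this layer using Pydantic v2 (the model and field names are illustrative, and `call_model` stands in for whatever client function re-invokes the LLM):

```python
from datetime import date
from pydantic import BaseModel, ValidationError, model_validator

class Clause(BaseModel):
    # Illustrative subset of the real record.
    clause_type: str
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def check_date_order(self):
        # Cross-field invariant that JSON Schema cannot express.
        if self.end_date < self.start_date:
            raise ValueError("end_date must be >= start_date")
        return self

def parse_with_repair(raw_json: str, call_model) -> Clause:
    """Validate the model output; on failure, retry ONCE with the error as feedback."""
    try:
        return Clause.model_validate_json(raw_json)
    except ValidationError as err:
        repaired = call_model(
            f"Fix this record so it satisfies the schema.\nErrors:\n{err}\n\nRecord:\n{raw_json}"
        )
        # A second failure propagates: capping retries at 1 avoids loops and
        # surfaces persistent bugs instead of masking them.
        return Clause.model_validate_json(repaired)
```

Feeding the `ValidationError` text back gives the model a concrete diff to fix rather than asking it to guess what went wrong.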
## Performance Benchmarks
Run 5000 prompts through:
1. Unconstrained + prompt instructions → baseline
2. Provider-native structured output
3. Client-side grammar (Outlines / vLLM guided_json)
4. Prompt-only with Pydantic validation
Measure:
- Parse failure rate (MUST be 0 for constrained)
- Schema validation rate
- Semantic correctness (labeled golden set)
- Latency p50/p95
- Token cost
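The parse-failure and latency measurements can be sketched as follows (`generate` is a stand-in for whichever strategy is under test; semantic scoring against the golden set would be a separate pass):

```python
import json
import statistics
import time

def benchmark(prompts, generate):
    """Measure parse-failure rate and latency percentiles for one strategy."""
    latencies, failures = [], 0
    for prompt in prompts:
        t0 = time.perf_counter()
        raw = generate(prompt)  # any callable: prompt -> raw model output
        latencies.append(time.perf_counter() - t0)
        try:
            json.loads(raw)
        except json.JSONDecodeError:
            failures += 1
    latencies.sort()
    return {
        "parse_failure_rate": failures / len(prompts),
        "latency_p50": statistics.median(latencies),
        "latency_p95": latencies[int(0.95 * (len(latencies) - 1))],
    }
```

Run the same prompt set through all four strategies so the failure rates and latency percentiles are directly comparable.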
Typical findings:
- Constrained decoding: 100% parse, <100% semantic (model may fill fields with junk to satisfy schema)
- Prompt-only: 97-99% parse, higher semantic quality on edge cases
- Best combo: constrained + strong prompt + Pydantic validators
## Failure Modes to Watch
- **Schema-satisfaction hallucination:** model fills required fields with fabricated values rather than admitting absence. Mitigate with explicit "not_found" sentinel values and `confidence` per field.
- **Enum drift:** allowed enum values drift over time. Pin schema versions and migrate.
- **Latency inflation:** client-side grammars add 20-50ms per token. For real-time UX, prefer provider-native.
- **Token explosion:** model emits long, verbose strings to fill minLength. Set `maxLength` and check.
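One way to implement the sentinel-plus-confidence mitigation is to wrap each extracted value in a small object so the model can admit absence instead of fabricating (the wrapper shape and names here are illustrative):

```python
# Hypothetical per-field wrapper: absence is a first-class, schema-legal answer.
FIELD_WRAPPER = {
    "type": "object",
    "properties": {
        "value": {"type": ["string", "null"]},  # null == not found in the input
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "status": {"type": "string", "enum": ["found", "not_found", "ambiguous"]},
    },
    "required": ["value", "confidence", "status"],
    "additionalProperties": False,
}
```

With this shape, a required field can still be "answered" truthfully when the input lacks it, which removes the incentive to hallucinate just to satisfy the schema. (Note that OpenAI strict mode does not support union types on all features; verify `["string", "null"]` against the provider's feature matrix.)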
## Deliverables
1. Schema file (JSON Schema Draft 2020-12)
2. Prompt template with version
3. Client code (decoding config)
4. Pydantic/Zod validator with cross-field checks
5. Benchmark harness + results table
6. Runbook: what to change when parse/semantic/latency metric regresses
Present as numbered steps. Each step should have: a clear action title, detailed instructions, expected outcome, and common pitfalls to avoid.