Claude Prompt for Structured Output & Tool Schemas
Prompt and constrained-decoding config for reliably emitting contract-clause records as JSON that conforms to a JSON Schema (Draft 2020-12), with zero parse failures.
You are an AI engineer. Your downstream pipeline parses contract-clause records — one parse failure is a production incident. Use constrained decoding to make malformed output literally impossible.
## Constrained Decoding Options
### Option A: Provider-Native Structured Output
- **OpenAI:** `response_format={"type": "json_schema", "json_schema": {...}, "strict": true}`
- **Anthropic:** tool use with `tool_choice: {type: "tool", name: "emit_record"}`; the tool's `input_schema` is the output shape
- **Google Gemini:** `response_mime_type="application/json", response_schema=...`
- **Pros:** zero client-side complexity, provider-guaranteed
- **Cons:** limited to supported feature matrix (e.g., OpenAI strict mode disallows some JSON Schema features)
### Option B: Client-Side Grammar (Open Models)
- **Outlines:** `outlines.generate.json(model, schema)`
- **XGrammar / llama.cpp / vLLM `guided_json`:** pass a JSON schema or regex
- **Pros:** works with any open model, more expressive
- **Cons:** latency overhead (mask construction), grammar bugs
### Option C: Prompt-Only + Validate-and-Retry
- No decoding constraint, use careful prompting + Pydantic/Zod + repair loop
- **Pros:** works with any API
- **Cons:** failure rate floor of ~0.5-2% even with tuning
For this task, target claude-sonnet-4-5. Note that client-side grammars such as XGrammar require logit access, which hosted Claude models do not expose, so in practice this means Anthropic tool use (Option A).
## Schema (JSON Schema Draft 2020-12)
Design a strict schema for contract clauses with:
- 10-20 fields, all typed
- `additionalProperties: false` everywhere
- Enum constraints for controlled fields
- Regex patterns for formatted fields (dates, phones, IDs)
- `minItems` / `maxItems` on arrays
- `minLength` / `maxLength` on strings
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": { ... },
  "required": [ ... ],
  "additionalProperties": false
}
```
Produce the FULL schema for contract clauses. Include realistic enum values (e.g., for a country field, use ISO 3166-1 alpha-2, not free-form strings).
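As a starting point, here is a partial sketch of such a schema expressed as a Python dict (the field names and enum values are illustrative assumptions, not the required shape):

```python
# Partial, illustrative schema for contract-clause records.
# Field names are hypothetical; the full schema should have 10-20 typed fields.
CLAUSE_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "clause_type": {
            "type": "string",
            "enum": ["indemnification", "termination", "confidentiality",
                     "payment", "liability", "other"],
        },
        # ISO 8601 date, matching the prompt's normalization rule
        "effective_date": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
        # ISO 3166-1 alpha-2 country code, not a free-form string
        "governing_country": {"type": "string", "pattern": "^[A-Z]{2}$"},
        "parties": {
            "type": "array",
            "items": {"type": "string", "minLength": 1, "maxLength": 200},
            "minItems": 1,
            "maxItems": 10,
        },
        "clause_text": {"type": "string", "minLength": 1, "maxLength": 4000},
    },
    "required": ["clause_type", "effective_date", "governing_country",
                 "parties", "clause_text"],
    "additionalProperties": False,
}
```

Every nested object in the full schema needs its own `additionalProperties: false`; strict modes reject schemas that omit it.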
## System Prompt
Even with constrained decoding, the prompt still shapes content quality:
```
You are an extraction system for contract clauses.
Extract fields from the provided input. You MUST output a record matching the schema.
QUALITY RULES:
1. Prefer literal quotes from the input over paraphrase.
2. For dates, normalize to ISO 8601 (YYYY-MM-DD). If only month/year, use day=01.
3. For numbers, strip units — put units in the designated unit field.
4. For enums, pick the closest valid value. If nothing fits, use "other".
5. Never fabricate. If unsure, mark the field with low confidence.
INPUT:
{input_text}
```
## Decoding Config
Produce the exact config/code for the chosen option:
### If using OpenAI strict mode:
```python
client.chat.completions.create(
    model="gpt-4o",  # strict mode is an OpenAI feature; use an OpenAI model
    messages=[...],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "structured_data",
            "schema": SCHEMA,
            "strict": True,
        },
    },
    temperature=0,
)
```
(Note: strict mode requires all fields in `required`, `additionalProperties: false` everywhere, and does not support certain features like `oneOf` at the top level — design around this.)
### If using Anthropic tool use:
```python
client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2048,  # required by the Messages API
    tools=[{"name": "emit_structured_data", "input_schema": SCHEMA}],
    tool_choice={"type": "tool", "name": "emit_structured_data"},
    temperature=0,
    messages=[...],
)
```
### If using Outlines:
```python
import outlines
gen = outlines.generate.json(model, DocumentFieldsSchema)
result = gen(prompt, max_tokens=2048)
```
## Validation Layer (even with constrained decoding)
Run Pydantic/Zod validation after the model call:
- Catches cross-field invariants the schema can't express (e.g., end_date >= start_date)
- Catches semantic errors (e.g., enum is valid but inappropriate in context)
- Catches schema-grammar mismatches (rare but real)
On validation failure, repair with the validation error as feedback. Cap retries at 1.
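A minimal sketch of this layer using Pydantic v2 (the model and field names are illustrative, and `call_model` stands in for whatever client function re-invokes the LLM):

```python
from datetime import date
from pydantic import BaseModel, ValidationError, model_validator

class Clause(BaseModel):
    # Illustrative subset of the real record.
    clause_type: str
    start_date: date
    end_date: date

    @model_validator(mode="after")
    def check_date_order(self):
        # Cross-field invariant that JSON Schema cannot express.
        if self.end_date < self.start_date:
            raise ValueError("end_date must be >= start_date")
        return self

def parse_with_repair(raw_json: str, call_model) -> Clause:
    """Validate the model output; on failure, retry ONCE with the error as feedback."""
    try:
        return Clause.model_validate_json(raw_json)
    except ValidationError as err:
        repaired = call_model(
            f"Fix this record so it satisfies the schema.\nErrors:\n{err}\n\nRecord:\n{raw_json}"
        )
        # A second failure propagates: capping retries at 1 avoids loops and
        # surfaces persistent bugs instead of masking them.
        return Clause.model_validate_json(repaired)
```

Feeding the `ValidationError` text back gives the model a concrete diff to fix rather than asking it to guess what went wrong.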
## Performance Benchmarks
Run 5000 prompts through:
1. Unconstrained + prompt instructions → baseline
2. Provider-native structured output
3. Client-side grammar (Outlines / vLLM guided_json)
4. Prompt-only with Pydantic validation
Measure:
- Parse failure rate (MUST be 0 for constrained)
- Schema validation rate
- Semantic correctness (labeled golden set)
- Latency p50/p95
- Token cost
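The parse-failure and latency measurements can be sketched as follows (`generate` is a stand-in for whichever strategy is under test; semantic scoring against the golden set would be a separate pass):

```python
import json
import statistics
import time

def benchmark(prompts, generate):
    """Measure parse-failure rate and latency percentiles for one strategy."""
    latencies, failures = [], 0
    for prompt in prompts:
        t0 = time.perf_counter()
        raw = generate(prompt)  # any callable: prompt -> raw model output
        latencies.append(time.perf_counter() - t0)
        try:
            json.loads(raw)
        except json.JSONDecodeError:
            failures += 1
    latencies.sort()
    return {
        "parse_failure_rate": failures / len(prompts),
        "latency_p50": statistics.median(latencies),
        "latency_p95": latencies[int(0.95 * (len(latencies) - 1))],
    }
```

Run the same prompt set through all four strategies so the failure rates and latency percentiles are directly comparable.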
Typical findings:
- Constrained decoding: 100% parse, <100% semantic (model may fill fields with junk to satisfy schema)
- Prompt-only: 97-99% parse, higher semantic quality on edge cases
- Best combo: constrained + strong prompt + Pydantic validators
## Failure Modes to Watch
- **Schema-satisfaction hallucination:** model fills required fields with fabricated values rather than admitting absence. Mitigate with explicit "not_found" sentinel values and `confidence` per field.
- **Enum drift:** allowed enum values drift over time. Pin schema versions and migrate.
- **Latency inflation:** client-side grammars add 20-50ms per token. For real-time UX, prefer provider-native.
- **Token explosion:** model emits long, verbose strings to fill minLength. Set `maxLength` and check.
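One way to implement the sentinel-plus-confidence mitigation is to wrap each extracted value in a small object so the model can admit absence instead of fabricating (the wrapper shape and names here are illustrative):

```python
# Hypothetical per-field wrapper: absence is a first-class, schema-legal answer.
FIELD_WRAPPER = {
    "type": "object",
    "properties": {
        "value": {"type": ["string", "null"]},  # null == not found in the input
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "status": {"type": "string", "enum": ["found", "not_found", "ambiguous"]},
    },
    "required": ["value", "confidence", "status"],
    "additionalProperties": False,
}
```

With this shape, a required field can still be "answered" truthfully when the input lacks it, which removes the incentive to hallucinate just to satisfy the schema. (Note that OpenAI strict mode does not support union types on all features; verify `["string", "null"]` against the provider's feature matrix.)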
## Deliverables
1. Schema file (JSON Schema Draft 2020-12)
2. Prompt template with version
3. Client code (decoding config)
4. Pydantic/Zod validator with cross-field checks
5. Benchmark harness + results table
6. Runbook: what to change when parse/semantic/latency metric regresses
Present as numbered steps. Each step should have: a clear action title, detailed instructions, expected outcome, and common pitfalls to avoid.