AI Prompt for Structured Output & Tool Schemas
Design a JSON Schema (Draft 2020-12), LLM prompt, and validator for extracting product specs from product datasheets with strict schema adherence.
You are a senior engineer shipping an extraction pipeline: given product datasheets, extract product specs into a validated structured record. Zero tolerance for malformed output — downstream systems depend on strict schema adherence.
## Requirements
- Input: raw product datasheets (may be OCR'd, multilingual, partially structured)
- Output: validated JSON Schema (Draft 2020-12) record
- Accuracy target: 0.90 F1 on held-out eval set
- Schema validation pass rate: ≥ 99.5%
- Hallucination rate: ≤ 0.5%
## Schema Design (JSON Schema, Draft 2020-12)
Produce the full schema. Key design decisions:
- Every field has an explicit type (no `any`)
- Every field is either `required` or has an explicit default
- Dates use ISO 8601 strings, validated
- Currency: separate `amount: number` + `currency: ISO4217` — NEVER mix symbol into the number
- Enums for controlled vocabularies (don't leave as free-form string)
- `confidence: number` per extracted field (0-1)
- `source_span: { start: int, end: int, quote: string }` for each field — the exact text chunk it was extracted from
- `not_found` marker for missing-but-expected fields (differentiates from null)
Example skeleton for product spec extraction:
```jsonc
{
  "fields": [
    // 8-15 fields specific to product specs
  ],
  "line_items": [
    // if applicable — array of sub-records
  ],
  "metadata": {
    "confidence_overall": 0.0,
    "extraction_model": "string",
    "extracted_at": "ISO8601"
  }
}
```
Provide the full schema with 10-20 realistic datasheet fields, each with a type, description, required/optional status, and validation constraints.
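As an illustration of the per-field wrapper these decisions imply (a sketch only; the field name `supply_voltage_v` and its bounds are stand-ins, not part of the required schema):
```json
{
  "$defs": {
    "source_span": {
      "type": "object",
      "required": ["start", "end", "quote"],
      "properties": {
        "start": { "type": "integer", "minimum": 0 },
        "end": { "type": "integer", "minimum": 0 },
        "quote": { "type": "string" }
      },
      "additionalProperties": false
    }
  },
  "properties": {
    "supply_voltage_v": {
      "type": "object",
      "required": ["value", "confidence", "source_span"],
      "properties": {
        "value": {
          "oneOf": [
            { "type": "number", "minimum": 0 },
            { "const": "not_found" }
          ]
        },
        "confidence": { "type": "number", "minimum": 0, "maximum": 1 },
        "source_span": { "$ref": "#/$defs/source_span" }
      },
      "additionalProperties": false
    }
  }
}
```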
## Extraction Prompt
```
You are an expert data extraction system for product datasheets.
Extract the product specs from the DOCUMENT below.
Your output MUST be a single valid JSON object matching this schema:
{schema_json}
RULES:
1. Extract ONLY facts present in the document. NEVER invent.
2. For each field, include a "source_span" with the exact text you based the value on.
3. If a field is absent, return null (or "not_found" per schema).
4. For numeric fields, extract the number WITHOUT units or symbols.
5. Dates: convert to ISO 8601 (YYYY-MM-DD). Ambiguous formats → log in "warnings".
6. For enum fields, use the exact literal values — do not paraphrase.
7. Do not output anything outside the JSON. No explanation, no markdown fences.
DOCUMENT:
{document_text}
JSON:
```
### Constrained Decoding Options
If using a model that supports constrained decoding:
- **OpenAI Structured Outputs / response_format:** use with strict=true
- **Anthropic tool use:** define a single `submit_extraction` tool with the schema; model MUST call it
- **Open models via Outlines / guidance / XGrammar:** compile a grammar from the schema
- **vLLM guided_json:** pass the JSON schema at request time
Even with constrained decoding, keep the Pydantic/Zod validator as a second line of defense — grammars can have bugs and schemas evolve.
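A minimal sketch of the first option, assuming the OpenAI Python SDK; the model name and the variables `extraction_prompt`, `document_text`, and `schema` are placeholders for the artifacts defined above:
```python
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # any model with structured-output support
    temperature=0,
    messages=[
        {"role": "system", "content": extraction_prompt},  # the prompt above, schema inlined
        {"role": "user", "content": document_text},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "extraction", "strict": True, "schema": schema},
    },
)
raw = resp.choices[0].message.content  # guaranteed to parse as JSON; still run the validator
```
Note that strict mode constrains the schema itself: every object needs `additionalProperties: false` and every property listed in `required`, which is one more reason to keep the independent validator.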
## Validator (Pydantic)
Produce the runtime validator. For example, with Pydantic:
```python
from pydantic import BaseModel, Field, field_validator
from datetime import date

class StructuredData(BaseModel):
    # fields with full type annotations and Field(..., description=...)
    # custom validators for per-field and cross-field invariants (see the note below)
    ...

class ExtractionResult(BaseModel):
    record: StructuredData
    confidence: float = Field(ge=0.0, le=1.0)
    warnings: list[str] = Field(default_factory=list)
```
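One subtlety: in Pydantic v2 a `@field_validator` sees one field at a time, so cross-field invariants belong in a `@model_validator(mode="after")`. A sketch with hypothetical date fields, used only to illustrate the pattern:
```python
from datetime import date
from pydantic import BaseModel, model_validator

class Availability(BaseModel):
    # hypothetical fields illustrating a cross-field invariant
    announced_on: date | None = None
    shipping_from: date | None = None

    @model_validator(mode="after")
    def shipping_not_before_announcement(self) -> "Availability":
        if self.announced_on and self.shipping_from and self.shipping_from < self.announced_on:
            raise ValueError("shipping_from precedes announced_on")
        return self
```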
## Retry / Repair Loop
LLMs occasionally produce invalid JSON despite all precautions. Implement the following (a code sketch follows the list):
1. **Attempt 1:** full prompt with schema, temperature 0
2. **On ValidationError:** build a repair prompt:
```
Your previous response failed validation:
{validation_error}
Here was your response:
{bad_response}
Fix ONLY the validation errors. Return the full corrected JSON. Do not change valid fields.
```
3. **Attempt 2:** with repair prompt, temperature 0
4. **On second ValidationError:** fall back to partial extraction — keep valid fields, mark invalid ones as `extraction_error`, flag the record for human review
5. Cap retries at 2. Never loop.
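A minimal sketch of the loop, assuming `call_model` wraps your LLM client and `EXTRACTION_PROMPT`, `REPAIR_PROMPT`, and `SCHEMA_JSON` are the templates defined above (all four names are placeholders):
```python
from pydantic import ValidationError

MAX_ATTEMPTS = 2  # attempt 1 + one repair attempt, then give up

def extract_with_repair(document_text: str) -> ExtractionResult | None:
    prompt = EXTRACTION_PROMPT.format(schema_json=SCHEMA_JSON, document_text=document_text)
    for _ in range(MAX_ATTEMPTS):
        raw = call_model(prompt, temperature=0)
        try:
            # model_validate_json raises ValidationError for bad JSON and schema violations alike
            return ExtractionResult.model_validate_json(raw)
        except ValidationError as err:
            # second pass: ask the model to fix only the reported errors
            prompt = REPAIR_PROMPT.format(validation_error=str(err), bad_response=raw)
    return None  # caller falls back to partial extraction + human review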
## Failure Modes & Mitigations
For extracting product specs from datasheets (helper sketches follow the list):
- **Currency/unit symbols mixed with numbers** → normalize before parsing: strip symbols and units in post-processing; the validator then rejects anything non-numeric
- **Partial JSON (truncation)** → detect unbalanced braces, raise the max output tokens, and retry through the repair loop
- **Hallucinated field values** → require a `source_span` per field and verify the `quote` appears verbatim in the document; null the field when it does not
- **Markdown code fences wrapping JSON** → strip ```json and ``` before parsing
- **Extra commentary** → regex-extract first balanced JSON object
- **Partial extraction** → accept if required fields present; flag if not
- **Hallucinated enum value** → fuzzy match to nearest valid; reject if distance too large
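The fence-stripping and balanced-object mitigations fit in a few lines of plain Python. A sketch; note that a regex alone cannot match nested braces, so the extractor walks the string:
```python
import re

def strip_fences(raw: str) -> str:
    """Drop a wrapping ```json ... ``` fence if the model added one."""
    raw = raw.strip()
    raw = re.sub(r"^```[\w-]*\s*\n", "", raw)
    raw = re.sub(r"\n```\s*$", "", raw)
    return raw

def first_balanced_object(raw: str) -> str | None:
    """Return the first balanced {...} span, ignoring braces inside strings."""
    depth, start, in_string, escaped = 0, None, False, False
    for i, ch in enumerate(raw):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                return raw[start : i + 1]
    return None
```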
## Evaluation
Golden set of 2,500 datasheet examples with human-labeled product specs. Metrics (a scoring sketch follows the list):
- **Per-field accuracy** (exact match for enums/dates; tolerance-bounded for numbers)
- **Field recall** (did we find the field when it existed?)
- **Field precision** (when we extracted, was it right?)
- **Schema validation rate** (% passing Pydantic)
- **Hallucination rate** (fields extracted that weren't in source)
- **End-to-end record accuracy** (all required fields correct)
Error analysis: bucket failures by field name to find systematic issues.
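A sketch of the per-record bookkeeping behind these metrics, assuming gold and predicted records are flattened to `{field: value}` dicts that use the `"not_found"` sentinel:
```python
from collections import Counter

def score_record(gold: dict, pred: dict) -> Counter:
    """Counts for one record; sum the Counters over the golden set, then derive rates."""
    c = Counter()
    for field, gold_value in gold.items():
        pred_value = pred.get(field, "not_found")
        if gold_value == "not_found":
            # field absent from the source: any extraction is a hallucination
            c["hallucinated"] += pred_value != "not_found"
        else:
            c["expected"] += 1
            c["found"] += pred_value != "not_found"    # recall denominator is "expected"
            c["correct"] += pred_value == gold_value   # exact match; add numeric tolerance as needed
    return c

# recall = correct / expected; precision = correct / found
```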
## Observability
Log every extraction to Braintrust (the event shape is sketched after the list):
- Input doc id, length, language
- Model, prompt version, schema version
- Validation pass/fail, which fields failed
- Retry count
- Confidence score distribution
- User-flagged errors (feed back into eval set)
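The exact Braintrust SDK calls depend on your project setup, so this sketch only pins down the event shape; the field names are suggestions, not a fixed API:
```python
from dataclasses import dataclass, field

@dataclass
class ExtractionLogEvent:
    doc_id: str
    doc_length: int
    language: str
    model: str
    prompt_version: str
    schema_version: str
    validation_passed: bool
    failed_fields: list[str] = field(default_factory=list)
    retry_count: int = 0
    confidence_overall: float | None = None
    user_flagged: bool = False
```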
## Deliverables
1. Schema file (JSON Schema, Draft 2020-12) with comprehensive field descriptions
2. Prompt template with version tag
3. Validator with unit tests covering happy path + each failure mode
4. Retry/repair loop implementation
5. Golden set + eval script + CI gate
6. Dashboard in Braintrust
Present as numbered steps. Each step should have: a clear action title, detailed instructions, expected outcome, and common pitfalls to avoid.