Layered defense design for a customer support agent deployment against PDF/OCR-layer injection attacks, using output schema enforcement on Claude 3.5 Sonnet.
Layered defense design for a customer support agent deployment against context window overflow attack attacks, using spotlighting (delimiter marking) on o1-mini.
Layered defense design for a customer support agent deployment against direct prompt injection attacks, using spotlighting (delimiter marking) on DeepSeek-R1.
Layered defense design for a customer support agent deployment against jailbreak prefix attacks, using input sanitization on Claude 3.7 Sonnet.
Layered defense design for a customer support agent deployment against role-play jailbreak attacks, using input sanitization on o3-mini.
Layered defense design for a customer support agent deployment against multi-turn manipulation attacks, using output content filter on Llama 3.3 70B.
Layered defense design for a customer support agent deployment against data exfiltration via summaries attacks, using output content filter on Claude 4 Sonnet.
Layered defense design for a customer support agent deployment against system prompt extraction attacks, using dual-LLM architecture on Grok 3.
Layered defense design for a customer support agent deployment against payload smuggling in code blocks attacks, using dual-LLM architecture on Llama 3.1 405B.
Layered defense design for a customer support agent deployment against markdown image exfiltration attacks, using constitutional AI critique on Claude Opus 4.5.
Layered defense design for a customer support agent deployment against instruction smuggling in URLs attacks, using constitutional AI critique on Command R+.
Layered defense design for a customer support agent deployment against PDF/OCR-layer injection attacks, using canary tokens in system prompt on Mistral Small 3.