ChatGPT Prompt for Prompt Injection Defense
Adversarial test suite targeting SQL copilot with markdown comment smuggling-style attacks, with rubric and triage flow.
More prompts for Prompt Injection Defense.
Self-critique layer enforcing no election manipulation for a interview practice coach system on Claude 4.5 Sonnet, with bypass defenses.
Layered defense design for a coding copilot deployment against recursive self-instruction attacks, using constitutional AI critique on Gemini 2.0 Flash.
Layered defense design for a coding copilot deployment against invisible text injection (zero-width chars) attacks, using re-prompting with quoted user input on Claude Opus 4.5.
Layered defense design for a customer support agent deployment against role-play jailbreak attacks, using output schema enforcement on Llama 3.1 405B.
Adversarial test suite targeting compliance reviewer with role-reversal (user-as-assistant)-style attacks, with rubric and triage flow.
Sanitization and spotlighting pipeline for retrieved documents entering a Claude 4.5 Sonnet-backed RAG system serving developers using our API.