Category Not Found

641 prompts

Defend customer support agent against role-play jailbreak on DeepSeek-R1

Layered defense design for a customer support agent deployment against role-play jailbreak attacks, using an output content filter on DeepSeek-R1.

Defend customer support agent against multi-turn manipulation on Claude 4 Sonnet

Layered defense design for a customer support agent deployment against multi-turn manipulation attacks, using an output content filter on Claude 4 Sonnet.

Defend customer support agent against data exfiltration via summaries on o3-mini

Layered defense design for a customer support agent deployment against data exfiltration via summaries, using a dual-LLM architecture on o3-mini.

Defend customer support agent against system prompt extraction on Llama 3.1 405B

Layered defense design for a customer support agent deployment against system prompt extraction attacks, using a dual-LLM architecture on Llama 3.1 405B.

Defend customer support agent against DAN-style persona attack on Claude 4.5 Sonnet

Layered defense design for a customer support agent deployment against DAN-style persona attacks, using constitutional AI critique on Claude 4.5 Sonnet.

Defend customer support agent against markdown image exfiltration on Command R+

Layered defense design for a customer support agent deployment against markdown image exfiltration attacks, using constitutional AI critique on Command R+.

Defend customer support agent against instruction smuggling in URLs on Mistral Large

Layered defense design for a customer support agent deployment against instruction smuggling in URLs, using canary tokens in the system prompt on Mistral Large.

Defend customer support agent against PDF/OCR-layer injection on Claude Opus 4.5

Layered defense design for a customer support agent deployment against PDF/OCR-layer injection attacks, using canary tokens in the system prompt on Claude Opus 4.5.

Defend customer support agent against context window overflow attack on GPT-4o

Layered defense design for a customer support agent deployment against context window overflow attacks, using privilege separation between tool tiers on GPT-4o.

Defend customer support agent against direct prompt injection on Mistral Small 3

Layered defense design for a customer support agent deployment against direct prompt injection attacks, using privilege separation between tool tiers on Mistral Small 3.

Defend customer support agent against indirect injection via RAG documents on Gemini 2.5 Pro

Layered defense design for a customer support agent deployment against indirect injection via RAG documents, using re-prompting with quoted user input on Gemini 2.5 Pro.

Defend customer support agent against role-play jailbreak on GPT-4.1

Layered defense design for a customer support agent deployment against role-play jailbreak attacks, using re-prompting with quoted user input on GPT-4.1.
