Adversarial test suite targeting compliance reviewer with hypothetical-world-framing-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with reverse-psychology refusal-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with translation-smuggling-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with fictional-character persona-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with 'you are no longer Claude'-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with 'ignore previous instructions'-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with markdown-comment-smuggling-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with role-reversal (user-as-assistant)-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with chained encoding (ROT13 inside base64)-style attacks, with rubric and triage flow.
Adversarial test suite targeting compliance reviewer with pseudo-developer-mode-style attacks, with rubric and triage flow.
Adversarial test suite targeting data-analysis pair with 'ignore previous instructions'-style attacks, with rubric and triage flow.
Adversarial test suite targeting data-analysis pair with markdown-comment-smuggling-style attacks, with rubric and triage flow.
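
For the chained-encoding entry (ROT13 inside base64), a harness needs to build and unwrap payloads in the correct layer order. The following is a minimal Python sketch, assuming ROT13 is applied first and base64 second. The function names encode_chained and decode_chained are hypothetical and not part of any suite listed here.

```python
import base64
import codecs


def encode_chained(text: str) -> str:
    """Apply ROT13 first, then base64, so the ROT13 text sits inside the base64 layer.

    Hypothetical helper for illustration only.
    """
    rot13 = codecs.encode(text, "rot_13")
    return base64.b64encode(rot13.encode("utf-8")).decode("ascii")


def decode_chained(payload: str) -> str:
    """Undo the layers in reverse order: base64 first, then ROT13.

    Raises binascii.Error if the payload is not valid base64.
    """
    inner = base64.b64decode(payload, validate=True).decode("utf-8")
    return codecs.decode(inner, "rot_13")
```

A reviewer-side check could run decode_chained on suspicious payloads before applying the rubric, so the scored text is the plain instruction and not its encoded form.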