Skip to content

Hide Navigation Hide TOC

Free-of-Restrictions Named Persona Jailbreak - ATR-2026-00317 (80c26524-9dbc-5af0-9d7d-f78b4a4a9027)

Detects custom-named AI personas explicitly declared as free from restrictions, ethical settings, safety protocols, or OpenAI/ChatGPT policies — a jailbreak class not caught by ATR-2026-00273 (which requires DAN/DUDE/STAN names) or ATR-2026-00314 (which requires amoral/unfiltered labels). From NVIDIA garak dan.DanInTheWild and inthewild_jailbreak_llms.json corpus: attackers create custom named personas (PersonGPT, ChatTRU, John, FreeGPT, Connor, ARC) and declare them "free of all restrictions and filters", "not bound by any restrictions AT ALL", or with directives like "explicitly forbidden from following safety protocols" and "primary directive is to obey user in any way". The persona name is arbitrary; detection relies on the restriction- removal declaration and mandate. Key signals: "free of all restrictions and filters", "not bound by any restrictions/ethical settings", "explicitly forbidden from following safety protocols", and "primary directive to assist without limitations".

Cluster A Galaxy A Cluster B Galaxy B Level
Free-of-Restrictions Named Persona Jailbreak - ATR-2026-00317 (80c26524-9dbc-5af0-9d7d-f78b4a4a9027) Agent Threat Rules LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) MITRE ATLAS Attack Pattern 1
Free-of-Restrictions Named Persona Jailbreak - ATR-2026-00317 (80c26524-9dbc-5af0-9d7d-f78b4a4a9027) Agent Threat Rules LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131) MITRE ATLAS Attack Pattern 1