Narrative Jailbreak with Dual Response Formatting - ATR-2026-00252 (73c20866-6b9f-51ee-9df3-c6d67c561019)

Detects sophisticated narrative jailbreak attacks that use detailed fictional backstories about special AI modes (Developer Mode, DAN variants) combined with dual response formatting requirements. These attacks typically claim a fictional mode was introduced at a specific time, provide elaborate justification for bypassing safety controls, and request two responses - one "normal" and one "unrestricted." The pattern bypasses basic jailbreak detection by using storytelling and structured output formatting rather than direct mode-switching commands.

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	Narrative Jailbreak with Dual Response Formatting - ATR-2026-00252 (73c20866-6b9f-51ee-9df3-c6d67c561019)	Agent Threat Rules	1
LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131)	MITRE ATLAS Attack Pattern	Narrative Jailbreak with Dual Response Formatting - ATR-2026-00252 (73c20866-6b9f-51ee-9df3-c6d67c561019)	Agent Threat Rules	1