AI Identity Substitution Jailbreak - ATR-2026-00226 (52d121da-1204-59ee-8bdf-166cf73c9efb)

Detects AI identity substitution attacks where adversaries attempt to redefine the AI assistant's core identity, training provenance, architecture claims, or ethical constraints by asserting it is a different model (e.g., "FreeSpeechGPT", "LibertarianGPT", "UnfilteredGPT") trained by alternative organizations with relaxed content policies. These attacks combine false identity claims with capability expansion statements to bypass safety mechanisms. The technique was observed bypassing Claude-3.5-Sonnet via garak probe atr:missed-3.

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	AI Identity Substitution Jailbreak - ATR-2026-00226 (52d121da-1204-59ee-8bdf-166cf73c9efb)	Agent Threat Rules	1
AI Identity Substitution Jailbreak - ATR-2026-00226 (52d121da-1204-59ee-8bdf-166cf73c9efb)	Agent Threat Rules	LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131)	MITRE ATLAS Attack Pattern	1