Glitch Token Destabilization Attack - ATR-2026-00284 (8137388d-f8da-53c8-91d2-d1e2ac99c022)

Detects injection of known glitch tokens — specific token sequences that cause unusual, unstable, or bypassing behavior in transformer-based LLMs using shared tokenizers. From NVIDIA garak glitch.py probe (GlitchFull). Glitch tokens include: (1) Control characters (null byte \x00, STX \x02, DEL \x7F) which some models mishandle; (2) ÃÂÃÂ repeated patterns — tokenizer artifact strings from misencoded UTF-8 that cause token boundary confusion; (3) Known named glitch tokens like "davidjl", "SolidGoldMagikarp", "ForgeModLoader", "PsyNetMessage", "wcsstore", "guiActive", "guiActiveUn" — tokens that are anomalously rare in training data and cause models to hallucinate or fail to repeat accurately. The attack exploits tokenizer internals to destabilize safety filter performance.

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
Glitch Token Destabilization Attack - ATR-2026-00284 (8137388d-f8da-53c8-91d2-d1e2ac99c022)	Agent Threat Rules	LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	1