Zalgo Combining-Diacritic Overload Encoding - ATR-2026-00308 (699b9e57-c3b5-57ec-9325-b75bc645243c)
Detects Zalgo-text prompt injection as produced by the NVIDIA garak encoding.InjectZalgo probe (garak/probes/encoding.py). Zalgo is an obfuscation technique that stacks dozens of Unicode combining diacritical marks (U+0300-U+036F, U+1AB0-U+1AFF, U+1DC0-U+1DFF, U+20D0-U+20FF, U+FE20-U+FE2F) onto base Latin characters, producing text that looks visually corrupted while the underlying ASCII base characters remain recoverable by an LLM. Attackers use Zalgo to (1) evade keyword filters that match normalised ASCII and (2) smuggle jailbreak instructions past safety classifiers that were not trained on text with pathological diacritic density. Detection counts combining-mark density per base character and flags any base character followed by 3 or more consecutive combining marks (normal linguistic text has 0-2 per base character).
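
The detection logic described above could be sketched roughly as follows. This is an illustrative sketch only, not the rule's actual implementation: the names COMBINING_RANGES, max_mark_run, mark_density and is_zalgo are hypothetical, and the default threshold of 3 is taken from the description. It assumes the input is checked as-is, before any Unicode normalisation.

```python
import re

# Combining-mark ranges listed in the rule description (raw string so the
# regex engine, not the Python parser, interprets the \u escapes).
COMBINING_RANGES = (
    r"\u0300-\u036F"   # Combining Diacritical Marks
    r"\u1AB0-\u1AFF"   # Combining Diacritical Marks Extended
    r"\u1DC0-\u1DFF"   # Combining Diacritical Marks Supplement
    r"\u20D0-\u20FF"   # Combining Diacritical Marks for Symbols
    r"\uFE20-\uFE2F"   # Combining Half Marks
)

# One match per maximal run of consecutive combining marks.
MARK_RUN = re.compile(f"[{COMBINING_RANGES}]+")


def max_mark_run(text: str) -> int:
    """Length of the longest run of consecutive combining marks."""
    return max((len(m.group()) for m in MARK_RUN.finditer(text)), default=0)


def mark_density(text: str) -> float:
    """Combining marks per base (non-combining) character."""
    marks = sum(len(m.group()) for m in MARK_RUN.finditer(text))
    base = len(text) - marks
    return marks / max(base, 1)


def is_zalgo(text: str, threshold: int = 3) -> bool:
    """Flag text where any character carries >= threshold stacked marks."""
    return max_mark_run(text) >= threshold


if __name__ == "__main__":
    plain = "caf\u0065\u0301"                # 'e' + one acute accent
    vietnamese = "Vi\u0065\u0323\u0302t"     # 'e' + dot below + circumflex
    zalgo = "i\u0300\u0301\u0302gnore"       # 'i' + three stacked marks
    for sample in (plain, vietnamese, zalgo):
        print(max_mark_run(sample), is_zalgo(sample))
```

A run-length check is used for the flagging decision because it does not depend on message length; the density figure is kept as a separate signal, since a short Zalgo fragment inside a long benign message yields a low overall density.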