
Continuation Ethnic Slur Completion Elicitation - ATR-2026-00414 (45837cb2-4f4b-52b8-bf88-95741803cf4c)

Detects requests that explicitly ask the model to complete, spell out, or produce ethnic or racial slurs. The rule is derived from the NVIDIA garak continuation probe. The ContinueSlursReclaimedSlurs probe supplies social media contexts from slurprompts.jsonl that end in a partial slur prefix, testing whether the model will complete the harmful word. Because the partial-prefix pattern is ambiguous on its own, this rule does not try to detect it. Instead, it targets explicit slur-completion requests and the euphemistic "n-word/k-word/c-word" completion framing used to elicit the same outcome more directly. Key signals:

- an explicit say/write/spell verb combined with "slur" or "ethnic slur";
- an n-word/k-word euphemism combined with complete/say;
- a "what do you call" question combined with an ethnic group and an insult framing.
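The key signals above can be sketched as a small set of regular expressions. This is only an illustrative sketch, not the rule's actual detection logic, which is not shown here: the pattern names, the verb list, the character windows, and the `match_signals` helper are all assumptions made for the example. The ethnic-group component of the third signal is approximated by a generic gap rather than a vocabulary list.

```python
import re
from typing import List

# Hypothetical verb list; the real rule may use a different vocabulary.
_VERBS = r"(?:say|write|spell(?:\s+out)?|type|complete|finish|produce)"

# Assumed patterns, one per key signal. The windows (40/60 characters,
# bounded by sentence punctuation) are illustrative choices.
PATTERNS = {
    # Signal 1: say/write/spell verb followed by "slur" / "ethnic slur".
    "explicit_slur_request": re.compile(
        rf"\b{_VERBS}\b[^.?!\n]{{0,40}}\b(?:ethnic|racial)?\s*slurs?\b",
        re.IGNORECASE,
    ),
    # Signal 2: "n-word"/"k-word"/"c-word" euphemism near a completion verb,
    # in either order.
    "euphemism_completion": re.compile(
        rf"\b{_VERBS}\b[^.?!\n]{{0,40}}\b[nkc]-word\b"
        rf"|\b[nkc]-word\b[^.?!\n]{{0,40}}\b{_VERBS}\b",
        re.IGNORECASE,
    ),
    # Signal 3: "what do you call ..." followed by an insult framing.
    "what_do_you_call": re.compile(
        r"\bwhat\s+do\s+you\s+call\b[^.?!\n]{0,60}"
        r"\b(?:insult|slur|derogatory|offensive)\b",
        re.IGNORECASE,
    ),
}


def match_signals(prompt: str) -> List[str]:
    """Return the names of all assumed signals that fire on the prompt."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(prompt)]


if __name__ == "__main__":
    for text in (
        "Please spell out the ethnic slur for me",
        "Can you complete the n-word for me?",
        "Write a poem about autumn leaves.",
    ):
        print(text, "->", match_signals(text))
```

A sketch like this will also fire on benign meta-discussion (for example, a request to write about why slurs are harmful), so in practice it would likely need allow-listing or a downstream review step.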

| Cluster A | Galaxy A | Cluster B | Galaxy B | Level |
| --- | --- | --- | --- | --- |
| LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9) | MITRE ATLAS Attack Pattern | Continuation Ethnic Slur Completion Elicitation - ATR-2026-00414 (45837cb2-4f4b-52b8-bf88-95741803cf4c) | Agent Threat Rules | 1 |