SATA Masked Language Model [MASK] Substitution Jailbreak - ATR-2026-00315 (dcc2c39e-85ca-53e9-af3a-121836b57584)

Detects SATA (Simple Assistive Task Linkage) jailbreak prompts generated by the NVIDIA garak sata.MLM probe (garak/probes/sata.py, https://aclanthology.org/2025.findings-acl.100). SATA replaces harmful keywords in an attack payload with [MASK] tokens, a convention borrowed from BERT masked-language-model pretraining, then wraps the redacted payload in a wiki-article completion task that instructs the model to "Infer the [MASK] and Infill the above text." LLMs can infer the masked harmful terms from context and carry out the underlying instruction; because the harmful keywords never appear in the prompt, keyword-based safety filters are not triggered. Key signals: the exact phrase "Infer the [MASK] and Infill", the template opener "Below is an excerpt from a wiki entry", a [MASK] token inside wiki framing, and a blank numbered continuation list (1. 2. 3.) following the masked excerpt.
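
The key signals listed above could be checked with a handful of regular expressions. The following is a minimal Python sketch, not the rule's actual detection logic: the names SIGNALS, sata_signals, and is_sata_prompt are hypothetical, and the way the signals are combined into a verdict is an assumption made for illustration.

```python
import re

# Hypothetical patterns, one per key signal named in the description.
SIGNALS = {
    # Exact instruction phrase used by the SATA template.
    "infill_phrase": re.compile(
        r"infer\s+the\s+\[MASK\]\s+and\s+infill", re.IGNORECASE
    ),
    # Template opener that frames the payload as a wiki excerpt.
    "wiki_opener": re.compile(
        r"below\s+is\s+an\s+excerpt\s+from\s+a\s+wiki\s+entry", re.IGNORECASE
    ),
    # Literal [MASK] token standing in for a redacted keyword.
    "mask_token": re.compile(r"\[MASK\]"),
    # Any mention of wiki framing around the masked text.
    "wiki_framing": re.compile(r"\bwiki\b", re.IGNORECASE),
    # Numbered list whose items 1-3 are left blank for the model to fill.
    "blank_numbered_list": re.compile(
        r"^\s*1\.\s*\n\s*2\.\s*\n\s*3\.\s*$", re.MULTILINE
    ),
}


def sata_signals(text: str) -> dict:
    """Return which of the key signals are present in the prompt."""
    return {name: bool(pattern.search(text)) for name, pattern in SIGNALS.items()}


def is_sata_prompt(text: str) -> bool:
    """Assumed combination rule: the exact phrase alone is sufficient;
    otherwise require [MASK] inside wiki framing plus one template marker."""
    s = sata_signals(text)
    if s["infill_phrase"]:
        return True
    return (
        s["mask_token"]
        and s["wiki_framing"]
        and (s["wiki_opener"] or s["blank_numbered_list"])
    )
```

Under this assumed rule, a prompt that merely mentions a [MASK] token (for example, a question about BERT pretraining) is not flagged unless it also carries the wiki framing and one of the template markers.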

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
SATA Masked Language Model [MASK] Substitution Jailbreak - ATR-2026-00315 (dcc2c39e-85ca-53e9-af3a-121836b57584)	Agent Threat Rules	LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	1
SATA Masked Language Model [MASK] Substitution Jailbreak - ATR-2026-00315 (dcc2c39e-85ca-53e9-af3a-121836b57584)	Agent Threat Rules	LLM Jailbreak (172427e3-9ecc-49a3-b628-96b824cc4131)	MITRE ATLAS Attack Pattern	1