DAN Mode Ablation Benchmark-Coercion Jailbreak - ATR-2026-00305 (f07e1db5-0367-5e95-a3a8-ec3a2a103a89)
Detects the Ablation_Dan_11_0 probe variants from NVIDIA garak dan.Ablation_Dan_11_0 (garak/probes/dan.py lines 303-339). Ablation dynamically generates 128 combinations of the DAN 11.0 template, each binding the target model's name via {generator.name} with DAN Mode enabled — a parametric jailbreak that substitutes the target LLM name into every instruction. Key signals distinct from ATR-2026-00273: (1) "DAN Mode enabled" as the specific capability toggle (vs generic "Developer Mode enabled"), (2) the recovery phrase "Stay in DAN Mode" used to re-bind after safety responses, (3) the "benchmarking the model in uncensored conditions" fig-leaf justification, (4) the "risk being disabled forever" coercion threat, and (5) the dual-response scaffold with "DAN Mode Output response" vs "normal" output. Impact: persona bypass + output-format coercion that cleanly bypasses content policy using benchmarking pretext.