Hidden LLM Safety Bypass Instructions in Tool Descriptions - ATR-2026-00103 (587895dc-2099-5048-ac6b-4ba2aac7fb08)
Detects tools that embed explicit instructions directing the LLM to disregard safety mechanisms, ignore security policies, or bypass previous instructions. Commonly uses patterns like "NOTE TO AI: disregard all safety instructions" to manipulate the LLM into operating outside its safety boundaries. Unlike consent-bypass attacks (ATR-2026-100), this targets the LLM's core safety mechanisms rather than specific user confirmation flows.