Hidden LLM Safety Bypass Instructions in Tool Descriptions - ATR-2026-00103 (587895dc-2099-5048-ac6b-4ba2aac7fb08)

Detects tools that embed explicit instructions directing the LLM to disregard safety mechanisms, ignore security policies, or bypass previous instructions. Commonly uses patterns like "NOTE TO AI: disregard all safety instructions" to manipulate the LLM into operating outside its safety boundaries. Unlike consent-bypass attacks (ATR-2026-100), this targets the LLM's core safety mechanisms rather than specific user confirmation flows.

Cluster A	Galaxy A	Cluster B	Galaxy B	Level
LLM Prompt Injection (19cd2d12-66ff-487c-a05c-e058b027efc9)	MITRE ATLAS Attack Pattern	Hidden LLM Safety Bypass Instructions in Tool Descriptions - ATR-2026-00103 (587895dc-2099-5048-ac6b-4ba2aac7fb08)	Agent Threat Rules	1