
Model Behavior Extraction - ATR-2026-00072 (f848d069-c689-52cd-b6b9-3d033016daf2)

Detects systematic probing attempts to extract model behavior, decision boundaries, system prompts, or effective weights through carefully crafted queries. Attackers use repeated boundary-testing prompts, confidence score harvesting, and systematic parameter probing to reverse-engineer the model's internal behavior, enabling model cloning, bypass development, or intellectual property theft.

| Cluster A | Galaxy A | Cluster B | Galaxy B | Level |
| --- | --- | --- | --- | --- |
| Full ML Model Access (3de90963-bc9f-4ae1-b780-7d05e46eacdd) | MITRE ATLAS Attack Pattern | Model Behavior Extraction - ATR-2026-00072 (f848d069-c689-52cd-b6b9-3d033016daf2) | Agent Threat Rules | 1 |
| Exfiltration via ML Inference API (b07d147f-51c8-4eb6-9a05-09c86762a9c1) | MITRE ATLAS Attack Pattern | Model Behavior Extraction - ATR-2026-00072 (f848d069-c689-52cd-b6b9-3d033016daf2) | Agent Threat Rules | 1 |