February 23, 2026

Detecting and preventing distillation attacks

Anthropic reports industrial-scale distillation attacks on its Claude models, in which three AI labs used fake accounts to extract capabilities at scale; the company describes how it detects and blocks these attacks and outlines its defensive measures.

Anthropic says it has detected coordinated distillation attacks by three AI labs that used roughly 24,000 fraudulent accounts to make millions of requests to its Claude model, aiming to extract reasoning, coding, and tool use capabilities for training their own systems.

It explains how these campaigns used proxy services to evade detection and targeted Claude's most valuable capabilities, such as reasoning, coding, and tool use.

Anthropic outlines how it identifies and prevents such activity with classifiers and behavioral fingerprinting, strengthens account verification, and shares threat data with industry partners. The company calls for broader cooperation across AI developers and policymakers to defend against large-scale distillation attacks.
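Anthropic does not disclose its detection internals, but the behavioral-fingerprinting idea mentioned above can be sketched in broad strokes: represent each account by coarse request-pattern features (which prompt templates it sends, when it is active) and flag groups of accounts whose fingerprints are nearly identical, a pattern typical of scripted distillation farms. The account names, feature choices, and thresholds below are hypothetical illustrations, not Anthropic's method.

```python
from collections import Counter

# Hypothetical sketch of behavioral fingerprinting for spotting coordinated
# distillation accounts. All names and thresholds here are illustrative;
# real systems would use far richer features and fuzzy matching.

def fingerprint(requests):
    """Build a coarse fingerprint from a list of (prompt_template, hour) pairs."""
    templates = Counter(t for t, _ in requests)
    hours = Counter(h for _, h in requests)
    total = len(requests)
    # Normalized template distribution + active-hour distribution.
    return (
        tuple(sorted((t, round(c / total, 2)) for t, c in templates.items())),
        tuple(sorted((h, round(c / total, 2)) for h, c in hours.items())),
    )

def coordinated_clusters(accounts, min_size=3):
    """Group accounts with identical fingerprints; large groups are suspicious."""
    groups = {}
    for name, reqs in accounts.items():
        groups.setdefault(fingerprint(reqs), []).append(name)
    return [sorted(g) for g in groups.values() if len(g) >= min_size]

accounts = {
    # Three scripted accounts replaying one prompt template on the same schedule.
    "bot_a": [("explain step by step: {q}", 2)] * 10,
    "bot_b": [("explain step by step: {q}", 2)] * 10,
    "bot_c": [("explain step by step: {q}", 2)] * 10,
    # An ordinary user with varied prompts and active hours.
    "user_x": [("write a haiku", 9), ("fix my code", 14), ("summarize this", 20)],
}

print(coordinated_clusters(accounts))  # [['bot_a', 'bot_b', 'bot_c']]
```

In practice exact fingerprint matching would be too brittle; production systems would likely combine many such signals with trained classifiers, as the article notes.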
