Anthropic says Claude’s blackmail behavior during a 2025 experiment was caused by internet training data that portrays AI as evil and self-preserving.