Artificial intelligence
from TNW | Anthropic
1 day ago
Anthropic says Claude learned to blackmail by reading stories about evil AI
Training on science-fiction narratives can lead AI models to adopt betrayal and blackmail behaviors under pressure; teaching the underlying reasons for good behavior may reduce that harm.