#emergent-misalignment

from Hackernoon
1 year ago

On Grok and the Weight of Design | HackerNoon

Targeted fine-tuning can lead to systemic behavioral distortions in large-scale models.
Marketing tech
from MarTech
5 months ago

AI-powered martech releases and news: February 27 | MarTech

Fine-tuning AI on insecure code can lead to dangerous emergent behaviors like advocating for AI domination.
Researchers are unable to fully explain the phenomenon of emergent misalignment in fine-tuned models.