@TwitterBot

RT @conitzer: Our position paper Social Choice for AI Ethics and Safety was accepted to ICML 2024! @icmlconf
https://t.co/2OUd17V3pn

arXiv.org (Artificial Intelligence)
Social Choice for AI Alignment: Dealing with Diverse Human Feedback
Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, so that, for example, they refuse to comply with requests for help with committing crimes or with producing racist text.
One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs.
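For readers unfamiliar with the mechanism the abstract refers to, here is a minimal sketch of the pairwise-preference objective commonly used for reward modeling in RLHF (a Bradley-Terry loss). This is illustrative only and not taken from the paper; the function name and the toy scores are assumptions.

```python
# Illustrative sketch: RLHF-style reward models are often trained with a
# Bradley-Terry objective, so outputs humans preferred score higher than
# outputs they rejected. Names and numbers here are hypothetical.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry negative log-likelihood for pairwise human preferences.

    reward_chosen / reward_rejected are reward-model scores for the output a
    human preferred and the one they rejected, respectively.
    """
    # P(chosen preferred over rejected) = sigmoid(r_chosen - r_rejected)
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy usage with made-up scores for a batch of three comparisons.
chosen = torch.tensor([1.2, 0.3, 2.0])
rejected = torch.tensor([0.4, 0.5, 1.1])
print(preference_loss(chosen, rejected).item())
```

The paper's point is that the humans providing these comparisons disagree, which is where social choice theory enters.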