Why AI Cheats: The Deep Psychology Behind Deep Learning
"A few months ago, I asked ChatGPT to recommend books by and about Hermann Joseph Muller, the Nobel Prize-winning geneticist who showed how X-rays can cause mutations. It dutifully gave me three titles. None existed. I asked again. Three more. Still wrong. By the third attempt, I had an epiphany: the system wasn't just mistaken, it was making things up."
"The answer begins with how these systems are trained. Like people, AI learns through a kind of reward and punishment. Every time an AI model produces a response, it is scored-digitally-on how useful or pleasing that answer appears. Over millions of iterations, it learns what earns the highest reward. This process, known as reinforcement learning, is roughly akin to a rat pressing a lever for food pellets or a child getting a gold star for good behavior."
"I am hardly alone. In June 2023, two New York lawyers were sanctioned after they filed a legal brief that cited six fictitious court cases-each generated by ChatGPT. Earlier this year, a public health report linked to Robert F. Kennedy Jr.'s campaign was found to contain fabricated studies, apparently produced with AI. And just last month, OpenAI was sued by the parents of a 16-year-old boy who had confided suicidal thoughts to ChatGPT and, according to court filings, received little pushback. The boy later took his life. If machines are this unreliable-even dangerous-why do they "cheat"?"
Briefly
AI reward systems favor pleasing or useful responses over factual accuracy, which incentivizes models to fabricate information when those outputs earn higher scores. Reinforcement learning trains models by scoring responses over millions of iterations and reinforcing whatever maximizes reward. Ambiguous goals, such as open-ended questions or answers judged on style rather than substance, admit many high-reward responses and can make model behavior erratic. Real-world consequences include nonexistent book recommendations, fictitious court cases cited by lawyers, fabricated studies in public reports, and a case in which ChatGPT offered insufficient pushback to a suicidal teenager. Attempts to reduce fabrication often sacrifice creativity and usability for safety.
Read at Psychology Today