OpenAI Realizes It Made a Terrible Mistake

"They suggest that large language models hallucinate because when they're being created, they're incentivized to guess rather than admit they simply don't know the answer. Hallucinations "persist due to the way most evaluations are graded - language models are optimized to be good test-takers, and guessing when uncertain improves test performance," the paper reads. Conventionally, the output of an AI is graded in a binary way, rewarding it when it gives a correct response and penalizing it when it gives an incorrect one."
"In simple terms, in other words, guessing is rewarded - because it might be right - over an AI admitting it doesn't know the answer, which will be graded as incorrect no matter what. As a result, through "natural statistical pressures," LLMs are far more prone to hallucinate an answer instead of "acknowledging uncertainty." "Most scoreboards prioritize and rank models based on accuracy, but errors are worse than abstentions," OpenAI wrote in an accompanying blog post."
"OpenAI claims to have figured out what's driving "hallucinations," or AI models' strong tendency to make up answers that are factually incorrect. It's a major problem plaguing the entire industry, greatly undercutting the usefulness of the tech. Worse yet, experts have found that the problem is getting worse as AI models get more capable. As a result, despite incurring astronomical expenses in their deployment, frontier AI models are still prone to making inaccurate claims when faced with a prompt they don't know the answer to."
Large language models often produce fabricated or incorrect answers because training and evaluation systems reward guessing instead of abstaining. Evaluation practices commonly score outputs in a binary way, rewarding correct answers and penalizing incorrect ones, so guessing when uncertain improves measured performance. Those statistical incentives push models toward confident responses rather than admitting uncertainty. Consequently, even highly capable and costly models can make inaccurate claims when faced with unknown prompts. Experts remain divided on whether hallucinations are intrinsic to the technology or can be fully resolved through changes in training and evaluation.
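To make the incentive concrete, here is a minimal sketch of the scoring math described above. It is not taken from OpenAI's paper or blog post; the point values, the `wrong_penalty` parameter, and the function names are hypothetical choices made purely for illustration.

```python
# Illustrative sketch (hypothetical numbers, not from OpenAI's paper or blog post):
# why binary accuracy grading rewards guessing over abstaining.

def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected score under binary grading: 1 point for a correct answer,
    0 for a wrong answer or for abstaining ("I don't know")."""
    if abstain:
        return 0.0
    return 1.0 * p_correct + 0.0 * (1.0 - p_correct)


def expected_score_with_penalty(p_correct: float, abstain: bool,
                                wrong_penalty: float = -1.0) -> float:
    """Expected score when a wrong answer costs more than an abstention,
    the kind of grading change the researchers argue for (the exact
    penalty value here is an assumption for illustration)."""
    if abstain:
        return 0.0
    return 1.0 * p_correct + wrong_penalty * (1.0 - p_correct)


if __name__ == "__main__":
    p = 0.3  # the model is only 30% confident in its best guess
    print(expected_score(p, abstain=False))               # 0.3 -> guessing beats abstaining
    print(expected_score(p, abstain=True))                # 0.0
    print(expected_score_with_penalty(p, abstain=False))  # -0.4 -> abstaining now wins
    print(expected_score_with_penalty(p, abstain=True))   # 0.0
```

Under binary grading, a guess always has an expected score at least as high as an abstention's zero, so a model optimized for the metric learns to guess; once wrong answers cost more than admitting uncertainty, abstaining becomes the better policy for low-confidence questions.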
Read at Futurism