Can researchers stop AI making up citations?

"When the company OpenAI released GPT-5, a suite of large language models (LLMs), last month, it said it had reduced the frequency of fake citations and other kinds of 'hallucination', as well as 'deceptions', whereby an AI claims to have performed a task it hasn't. With GPT-5, OpenAI, based in San Francisco, California, is bucking an industry-wise trend, because newer AI models designed to mimic human reasoning tend to generate more hallucinations than do their predecessors."
"Hallucinations are a result of the fundamental way in which LLMs work. As statistical machines, the models make predictions by generalizing on the basis of learnt associations, leading them to produce answers that are plausible, but sometimes wrong. Another issue is that, similar to a student scoring points for guessing on a multiple choice exam, during training LLMs get rewarded for having a go rather than acknowledging their uncertainty, according to a"
""For most cases of hallucination, the rate has dropped to a level" that seems to be "acceptable to users", says Tianyang Xu, an AI researcher at Purdue University in West Lafayette, Indiana. But in particularly technical fields, such as law and mathematics, GPT-5 is still likely to struggle, she says. And despite the improvements in hallucination rate, users quickly found that the model errs in basic tasks, such as creating an illustrated timeline of US presidents."
GPT-5 shows a reduced frequency of fake citations, hallucinations and deceptions compared with earlier models, and it outperformed its predecessors on a citation-focused benchmark. Newer models designed to mimic human reasoning have generally tended to hallucinate more than their predecessors, so GPT-5 bucks that industry trend. Hallucinations remain inevitable because LLMs predict plausible answers by generalizing from learned associations rather than by verifying facts, and training incentives favour attempting an answer over expressing uncertainty, which encourages confident errors. The improvements have lowered hallucination rates to levels some users find acceptable, but GPT-5 still struggles in technical domains such as law and mathematics and can make basic factual errors.
Read at Nature