We show that LLMs (Gemma 3, GPT-4o, and o1-preview) exhibit a pronounced choice-supportive bias that reinforces their confidence in their initial answer, resulting in a marked resistance to changing their minds.
The research shows that large language models consistently advise women to ask for lower salaries than men with identical qualifications; in some fields, the advised salaries differed by as much as $120K a year between genders.
"Nvidia's multi-million-token context window is an impressive engineering milestone, but for most companies, it's a solution in search of a problem," said Wyatt Mayham, CEO and cofounder at Northwest AI Consulting. "Yes, it tackles a real limitation in existing models like long-context reasoning and quadratic scaling, but there's a gap between what's technically possible and what's actually useful."
Fine-tuning large language models requires huge amounts of GPU memory, which restricts the size of models practitioners can tune; QDyLoRA addresses this by combining quantization with dynamic low-rank adaptation, as sketched below.
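For intuition, here is a minimal PyTorch sketch of the dynamic low-rank idea, assuming the DyLoRA-style trick of sampling a truncation rank at each training step so one run covers a whole range of ranks. The 4-bit quantization half of QDyLoRA is omitted, and names such as DynamicLoRALinear, max_rank, and alpha are illustrative, not QDyLoRA's actual API.

```python
import torch
import torch.nn as nn

class DynamicLoRALinear(nn.Module):
    """Frozen base linear layer plus a low-rank update whose effective
    rank is sampled per forward pass, so one training run covers many ranks."""

    def __init__(self, base: nn.Linear, max_rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the adapter factors are trained
        self.max_rank = max_rank
        self.scale = alpha / max_rank
        # Full-size adapter factors; a sampled prefix of rows/columns is used.
        self.A = nn.Parameter(torch.randn(max_rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, max_rank))

    def forward(self, x: torch.Tensor, rank: int | None = None) -> torch.Tensor:
        # Sample an effective rank when none is given (training);
        # pass a fixed rank at inference to deploy at that size.
        r = rank if rank is not None else int(torch.randint(1, self.max_rank + 1, (1,)))
        update = (x @ self.A[:r].T) @ self.B[:, :r].T
        return self.base(x) + self.scale * update

layer = DynamicLoRALinear(nn.Linear(768, 768), max_rank=8)
y = layer(torch.randn(2, 768))                # rank sampled automatically
y_fixed = layer(torch.randn(2, 768), rank=4)  # evaluate at a chosen rank
```

Because only the prefix A[:r], B[:, :r] is used each step, the same trained adapter can later be truncated to any rank up to max_rank without retraining.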
This study examines the critical issue of bias amplification in large language models (LLMs), demonstrating its impact primarily through the lens of political bias in U.S. media.
The findings revealed a significant positive correlation between M-IoU scores and the ratings from both individual coders, supporting the reliability of the M-IoU metric for evaluating praise.
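For context, a minimal sketch of plain token-level intersection-over-union between a predicted and an annotated span; M-IoU is presumably a modification of this quantity, and the study's exact modification is not reproduced here. The function name token_iou is illustrative.

```python
def token_iou(pred: set[int], gold: set[int]) -> float:
    """IoU between predicted and gold token-index spans (1.0 = exact match)."""
    if not pred and not gold:
        return 1.0  # both empty: treat as perfect agreement
    return len(pred & gold) / len(pred | gold)

# Example: predicted praise span covers tokens 3-7, annotated span covers 4-8:
# 4 overlapping tokens out of a union of 6 gives IoU ~= 0.667.
print(token_iou(set(range(3, 8)), set(range(4, 9))))
```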