Artificial intelligence, from Ars Technica, 3 months ago
New study accuses LM Arena of gaming its popular AI benchmark
LM Arena's ranking may favor large companies due to unfair testing practices, raising concerns about its reliability in assessing AI chatbots.
Artificial intelligence, from TechCrunch, 3 months ago
Meta's vanilla Maverick AI model ranks below rivals on a popular chat benchmark
The incident prompted the maintainers of LM Arena to apologize, change their policies, and score the unmodified, vanilla Maverick.
Marketing tech, from TechCrunch, 3 months ago
Meta's benchmarks for its new AI models are a bit misleading
Meta's Maverick AI model exhibits significant differences between its experimental and publicly available versions.