Artificial intelligence
fromMaggieappleton
3 months agoHumanity's Last Exam
Humanity's Last Exam is a new benchmark designed to provide a more rigorous measure of AI model capabilities compared to existing tests.
In evaluating Chameleon, we focus on tasks requiring text generation conditioned on images, particularly image captioning and visual question-answering, with results grouped by task specificity.