Gemini Flash model gets visual reasoning capability
Agentic Vision lets Gemini 3 Flash combine iterative visual reasoning with code execution, so the model actively inspects an image step by step instead of interpreting it in a single pass.
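The idea of stepwise, tool-driven inspection can be illustrated with a toy loop. The sketch below is not Gemini's API or Google's implementation: the image is a plain grid of numbers, and every name in it (`crop_sum`, `split`, `inspect`) is hypothetical. A simple "pick the brightest region" rule stands in for the model's decision at each step, purely to show the observe, decide, zoom cycle.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def crop_sum(image: Image, box: Box) -> int:
    """Tool step: run code on a region of the image (here, sum its pixel values)."""
    top, left, bottom, right = box
    return sum(sum(row[left:right]) for row in image[top:bottom])


def split(box: Box) -> List[Box]:
    """Split a box into its non-empty quadrants."""
    top, left, bottom, right = box
    mid_r = (top + bottom + 1) // 2
    mid_c = (left + right + 1) // 2
    quads = [
        (top, left, mid_r, mid_c),
        (top, mid_c, mid_r, right),
        (mid_r, left, bottom, mid_c),
        (mid_r, mid_c, bottom, right),
    ]
    return [q for q in quads if q[0] < q[2] and q[1] < q[3]]


def inspect(image: Image, max_steps: int = 32) -> Tuple[Box, List[Box]]:
    """Zoom in step by step until a single pixel remains; return it with the trace."""
    box: Box = (0, 0, len(image), len(image[0]))
    trace = [box]
    for _ in range(max_steps):
        top, left, bottom, right = box
        if bottom - top == 1 and right - left == 1:
            break  # nothing left to zoom into
        # Observe each candidate region with the tool, then decide where to look next.
        # Assumption: this greedy rule is a stand-in for a model's choice.
        box = max(split(box), key=lambda q: crop_sum(image, q))
        trace.append(box)
    return box, trace


if __name__ == "__main__":
    # A 4x4 "image" with one bright pixel at row 2, column 3.
    image = [
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 9],
        [0, 0, 0, 0],
    ]
    final_box, steps = inspect(image)
    print(final_box, steps)
```

Each entry in the returned trace is one inspection step, which is the sense in which the process is stepwise: the next region examined depends on what the previous tool call revealed.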
A tiny, specialized model (TRM) trained on limited data outperformed some large language models on visual logic puzzles, suggesting that effective reasoning does not necessarily require a very large model.
Baidu Releases Open AI Model Claiming to Outperform GPT-5, Raising Pressure on U.S. Tech Rivals - TipRanks.com
Baidu released ERNIE-4.5-VL-28B-A3B-Thinking, an open-source multimodal model that the company claims offers stronger visual reasoning and faster, more memory-efficient performance than leading models.