AI · The Decoder · 1h ago
New math benchmark reveals AI models confidently solve problems that have no solution

A consortium of 64 mathematicians built SOOHAK, a new AI benchmark with 439 handwritten tasks, including 99 that are deliberately unsolvable. Google's Gemini 3 Pro leads on research-level problems at 30 percent. But no model cracks 50 percent on spotting broken tasks.
Source: The Decoder