xAnts

OpenAI's DeepResearch can complete 26% of 'Humanity's Last Exam' — a benchmark for the frontier of human knowledge

2025-02-12, Fortune on MSN.com

OpenAI's o1 and DeepSeek's R1 models, which previously sat atop the leaderboard, could only get through roughly 9% of the exam. ...
