Don’t panic: ‘Humanity’s last exam’ has begun

When artificial intelligence systems began acing long-standing academic assessments, researchers realized they had a problem: the tests were too easy. Popular evaluations, such as the Massive Multitask Language Understanding (MMLU) exam, once considered formidable, are no longer challenging enough to meaningfully test advanced AI systems.
