AI Glossary
What is Humanity's Last Exam?
Overview
Humanity's Last Exam (HLE) is an expert-level benchmark for testing frontier AI systems across a broad range of academic and professional subjects. Many standard benchmarks are now saturated, meaning top models score close to the maximum, so labs report results on harder exams like HLE to show whether models can answer questions that still challenge specialists.
Why it matters
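As frontier models approach the score ceilings of older benchmarks, those benchmarks stop distinguishing one model from another. Harder expert-level tests restore that headroom and help separate memorized competence from robust reasoning.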
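One way to picture the saturation problem is to check whether the gap between two models' scores is larger than the sampling noise of the benchmark itself. The sketch below is illustrative only: the function names and all scores are hypothetical (they are not real HLE or other benchmark results), and it assumes each question is graded simply right or wrong and uses a basic normal-approximation margin.

```python
import math


def accuracy_with_stderr(num_correct, num_questions):
    """Return accuracy and its binomial standard error."""
    acc = num_correct / num_questions
    stderr = math.sqrt(acc * (1 - acc) / num_questions)
    return acc, stderr


def distinguishable(correct_a, correct_b, num_questions, z=1.96):
    """Rough check: is the accuracy gap larger than a ~95% noise margin?

    Assumes both models answered the same number of questions and
    treats the two scores as independent (a simplification).
    """
    acc_a, se_a = accuracy_with_stderr(correct_a, num_questions)
    acc_b, se_b = accuracy_with_stderr(correct_b, num_questions)
    gap = abs(acc_a - acc_b)
    margin = z * math.sqrt(se_a ** 2 + se_b ** 2)
    return gap > margin


# Hypothetical saturated benchmark: both models are near the ceiling.
saturated = distinguishable(970, 980, 1000)

# Hypothetical hard benchmark: scores are low, but far apart.
hard = distinguishable(100, 250, 1000)

print("saturated benchmark separates the models:", saturated)
print("hard benchmark separates the models:", hard)
```

With these made-up numbers, the one-point gap near the ceiling falls inside the noise margin, while the larger gap on the harder test does not. That is the sense in which a harder exam gives more room to compare models.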
Where it appears in AI research
- Frontier model launch reports
- AI evaluation research
- Reasoning benchmark comparisons
- Safety and capability discussions