AI Glossary
What is GPQA?
Overview
GPQA (Graduate-Level Google-Proof Q&A) is a benchmark of graduate-level, multiple-choice science questions written by domain experts to test difficult expert reasoning. It matters because a strong GPQA score suggests a model can handle specialized physics, chemistry, and biology questions that are hard to solve by web search or simple pattern matching.
Why it matters
GPQA is often used to compare frontier models on deep reasoning rather than broad factual recall.
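Because GPQA questions are four-option multiple choice, a reported score is usually plain accuracy, and random guessing lands near 25%. The sketch below shows how such a score could be computed. The `Item` class, the question IDs, and the predictions are hypothetical placeholders for illustration, not real GPQA data or an official scoring script.

```python
from dataclasses import dataclass


@dataclass
class Item:
    question_id: str  # hypothetical identifier, not a real GPQA ID
    correct: str      # gold answer letter, one of "A".."D"


def accuracy(items, predictions):
    """Return the fraction of items whose predicted letter matches the gold letter.

    A missing prediction counts as incorrect.
    """
    if not items:
        raise ValueError("no items to score")
    hits = sum(1 for it in items if predictions.get(it.question_id) == it.correct)
    return hits / len(items)


# Expected accuracy of uniform random guessing over four options.
CHANCE = 1 / 4

# Made-up example data: four questions, three answered correctly.
items = [Item("q1", "B"), Item("q2", "D"), Item("q3", "A"), Item("q4", "C")]
predictions = {"q1": "B", "q2": "D", "q3": "C", "q4": "C"}

score = accuracy(items, predictions)
print(f"accuracy={score:.2f}, chance={CHANCE:.2f}")
```

Comparing a score against the 25% chance level, rather than against zero, gives a fairer picture of how much a model actually gains over guessing.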
Where it appears in AI research
- Model release scorecards
- Reasoning model evaluations
- Science QA research
- Benchmark saturation debates