OralAgent: Integrating Reasoning, Tools, and Knowledge for Interactive Dental Image Analysis

arXiv cs.CL·Jing Hao, Siyuan Dai, Yongxin Zhang, Yuci Liang, Jiamin Wu, Jiahao Bao, Yuxuan Fan, Zanting Ye, Yanpeng Sun, Xinyu Zhang, Ming Hu, Liang Zhan, James Kit Hon Tsoi, Linlin Shen, Junjun He, Kuo Feng Hung

5/28/2026

Quick Take

OralAgent is presented as the first dental-specialized AI agent, unifying multimodal reasoning, tool-based decision-making, and knowledge-grounded retrieval in one automated framework. It integrates 22 visual analysis tools and 368 dental textbooks, and the authors report state-of-the-art results on the MMOral-Uni, MMOral-OPG, and OralQA-ZH benchmarks.

Key Points

OralAgent integrates 22 visual analysis tools for comprehensive dental image analysis.
It leverages 368 classical dental textbooks for knowledge-grounded retrieval.
It achieves state-of-the-art performance on the MMOral-Uni, MMOral-OPG, and OralQA-ZH benchmarks.
It introduces OralCorpus, a bilingual text resource with 134.8M tokens curated for retrieval-augmented generation (RAG).
It constructs OralQA-ZH, a Chinese multiple-choice benchmark with 798 items across eleven oral subspecialties.
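
A multiple-choice benchmark organized by subspecialty, as OralQA-ZH is described above, is typically scored with per-category and overall accuracy. The sketch below shows one way to compute that. The function name, the record layout, and the subspecialty labels in the usage note are hypothetical; the source does not describe the paper's scoring procedure or list the eleven subspecialties.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (subspecialty, predicted_option, correct_option).
Record = Tuple[str, str, str]

def accuracy_by_subspecialty(records: Iterable[Record]) -> Dict[str, float]:
    """Return accuracy per subspecialty plus an 'overall' entry."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for subspecialty, predicted, gold in records:
        totals[subspecialty] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        hits[subspecialty] += int(predicted.strip().upper() == gold.strip().upper())
    report = {name: hits[name] / totals[name] for name in totals}
    answered = sum(totals.values())
    report["overall"] = sum(hits.values()) / answered if answered else 0.0
    return report
```

For example, calling it on three records tagged "endodontics", "endodontics", and "orthodontics" (illustrative labels only) yields one accuracy per label and an overall figure weighted by item count.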

Article Content

From source RSS / original summary

arXiv:2605.27378v1 Announce Type: new

Abstract: Dental image analysis plays a pivotal role in supporting accurate diagnosis and treatment planning in oral healthcare. Although recent advances have produced dental AI models for specific tasks and individual imaging modalities, their isolated designs limit practical use in real-world clinical workflows.

In this paper, we present OralAgent, the first dental-specialized AI agent that unifies multimodal reasoning, tool-based decision-making, and knowledge-grounded retrieval within an end-to-end automated framework. It integrates 22 visual analysis tools and 368 widely-used classical dental textbooks, enabling autonomous reasoning, planning, tool use, knowledge retrieval, and multi-step workflow execution.

Furthermore, we introduce OralCorpus, a large-scale, high-quality bilingual textual resource containing 134.8M tokens curated for dental retrieval-augmented generation (RAG). To evaluate models' multidisciplinary dental knowledge, we construct OralQA-ZH, a Chinese multiple-choice question benchmark consisting of 798 items across eleven oral subspecialties. Extensive experiments demonstrate that OralAgent achieves state-of-the-art performance on the MMOral-Uni, MMOral-OPG, and OralQA-ZH benchmarks, highlighting its effectiveness, interpretability, and adaptability in real-world clinical settings.
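
To make the combination of tool use and knowledge retrieval concrete, here is a minimal sketch of an agent loop that dispatches image-analysis tools and looks up textbook passages. It is an illustration under stated assumptions, not the authors' implementation: every class, function, and tool name is hypothetical, the plan is fixed rather than chosen by a model, and keyword overlap stands in for whatever retriever the paper uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout; nothing here is taken from the OralAgent repository.

@dataclass
class Step:
    kind: str    # "tool" or "retrieve"
    target: str  # tool name for "tool", query text for "retrieve"

@dataclass
class Trace:
    question: str
    observations: List[str] = field(default_factory=list)

def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy keyword-overlap ranking standing in for a real retriever."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(passage.lower().split())), passage) for passage in corpus]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in matches[:k]]

def run_agent(question: str, image_path: str, plan: List[Step],
              tools: Dict[str, Callable[[str], str]], corpus: List[str]) -> Trace:
    """Execute a fixed plan; a real agent would let a model choose each step."""
    trace = Trace(question)
    for step in plan:
        if step.kind == "tool":
            result = tools[step.target](image_path)
            trace.observations.append(f"{step.target}: {result}")
        elif step.kind == "retrieve":
            for passage in retrieve(step.target, corpus):
                trace.observations.append(f"retrieved: {passage}")
        else:
            raise ValueError(f"unknown step kind: {step.kind}")
    return trace
```

The returned trace keeps each tool output and retrieved passage in order, which is one simple way to expose the intermediate steps behind a final answer.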

The code and models are publicly available at https://github.com/isjinghao/OralAgent.


