RareDxR1: Autonomous Medical Reasoning for Rare Disease Diagnosis Beyond Human Annotation
Quick Answer
RareDxR1 is a novel end-to-end large language model for rare disease diagnosis, achieving state-of-the-art accuracy without human annotation.
Quick Take
RareDxR1 combines Reflection-Enhanced Reasoning Sampling (RERS) with dual-level curriculum reinforcement learning to improve diagnostic reasoning from unstructured clinical notes.
Key Points
- Introduces RareDxR1, an AI model for open-domain rare disease diagnosis.
- Employs Reflection-Enhanced Reasoning Sampling to synthesize expert diagnostic paths.
- Achieves state-of-the-art accuracy across various benchmarks.
- Utilizes dual-level curriculum reinforcement learning for improved training.
- Code and dataset will be publicly available for further research.
Article Content
From source RSS / original summary
arXiv:2607.00147v1 Announce Type: new
Abstract: Rare disease differential diagnosis is a critical yet arduous clinical task, requiring physicians to identify precise phenotypes from complex, unstructured patient symptoms and execute intricate reasoning within a vast search space.
However, existing AI approaches typically rely on pipeline-based phenotype extraction or retrieval-augmented generation (RAG), which suffer from critical information loss due to predefined ontologies, retrieval bottlenecks, and a lack of diagnostic logic. To address these challenges, we introduce RareDxR1, an end-to-end reasoning-centric large language model designed for open-domain rare disease diagnosis directly from unstructured clinical notes.
We design a progressive end-to-end training framework by synergizing knowledge internalization with autonomous evolutionary learning, thereby bypassing reliance on structured phenotypes and closed-set decision-making. To overcome the limitations of RAG and phenotype restriction, we enable the deep internalization of fragmented rare-disease knowledge directly into the model's parameters.
Moreover, to bridge the gap between model generation and expert reasoning, we propose Reflection-Enhanced Reasoning Sampling (RERS), a strategy that synthesizes expert-level diagnostic trajectories by learning from failures without human annotation. Additionally, we propose a dual-level curriculum reinforcement learning approach for gradually mastering rare disease diagnosis.
Experimental results demonstrate that RareDxR1 achieves state-of-the-art accuracy across different benchmarks, marking a significant breakthrough in open-domain rare disease diagnosis. Our code and dataset will be publicly available.