Hybrid-IR: Dual-Path Hybrid Retrieval with Iterative Reasoning for Complex Medical Question Answering
Quick Answer
The Hybrid-IR framework introduces a dual-path retrieval mechanism for complex medical question answering, combining graph-based and dense retrieval methods.
Quick Take
Hybrid-IR pairs graph-based retrieval, which explores structured knowledge, with dense retrieval, which handles fine-grained semantic matching. An iterative retrieve-reason loop progressively refines the reasoning trajectory, and the authors report that experiments on three widely used medical QA benchmarks demonstrate the framework's effectiveness.
Key Points
- Hybrid-IR integrates graph-based and dense retrieval for enhanced medical QA.
- An iterative retrieve-reason loop progressively refines the reasoning trajectory.
- According to the authors, experiments on three widely used medical QA benchmarks demonstrate its effectiveness.
- Addresses two limitations of existing RAG methods: reliance on a single retrieval path and static retrieval strategies.
- Targets LLM hallucinations and outdated knowledge in medical QA.
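The dual-path idea in the points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy corpus, the entity-to-document graph, the bag-of-words stand-in for a dense encoder, and the names `dense_retrieve`, `graph_retrieve`, and `hybrid_retrieve` are all hypothetical. The summary does not say how Hybrid-IR merges the two paths, so a simple union is assumed here.

```python
import math
from collections import Counter

# Toy corpus and knowledge graph; all contents are hypothetical illustrations.
DOCS = {
    "d1": "metformin lowers blood glucose in type 2 diabetes",
    "d2": "lactic acidosis is a rare adverse effect of metformin",
    "d3": "renal impairment raises the risk of lactic acidosis",
}
GRAPH = {  # entity -> ids of documents linked to that entity
    "metformin": ["d1", "d2"],
    "lactic acidosis": ["d2", "d3"],
    "renal impairment": ["d3"],
}

def embed(text):
    """Stand-in for a dense encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def dense_retrieve(query, k=2):
    """Semantic-matching path: top-k documents by cosine similarity."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(DOCS[d])), reverse=True)
    return ranked[:k]

def graph_retrieve(query):
    """Structured path: documents linked to any entity named in the query."""
    q = query.lower()
    hits = []
    for entity, doc_ids in GRAPH.items():
        if entity in q:
            hits.extend(d for d in doc_ids if d not in hits)
    return hits

def hybrid_retrieve(query, k=2):
    """Assumed merge rule: dense results first, then unseen graph results."""
    merged = dense_retrieve(query, k)
    merged += [d for d in graph_retrieve(query) if d not in merged]
    return merged
```

In this toy setup, the graph path surfaces a document linked through an entity even when the similarity ranking alone leaves it outside the top-k, which is the kind of complementarity the two paths are meant to provide.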
Paper Resources
Abstract: Large language models (LLMs) have shown promising performance across a wide range of biomedical applications, including medical question answering (QA), yet they remain prone to hallucinations and outdated knowledge. Although retrieval-augmented generation (RAG) can alleviate this issue by incorporating external documents, there still exist two fundamental limitations. First, medical knowledge is often fragmented across documents, while most RAG methods rely on a single retrieval path, which makes it challenging to jointly preserve fine-grained semantic information and structured global associations. Second, static retrieval strategies are typically insufficient to support deep reasoning that is important in complex medical QA. In this paper, we present a dual-path retrieval framework with an iterative retrieval-reasoning mechanism termed "Hybrid-IR" for complex medical QA. The proposed Hybrid-IR integrates graph-based retrieval for exploration of structured knowledge and dense retrieval for fine-grained semantic matching. Moreover, the reasoning trajectory can be progressively refined through an iterative retrieve-reason loop. Experiments on three widely used medical QA benchmarks demonstrate the effectiveness of our Hybrid-IR.
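The iterative retrieve-reason loop described in the abstract might look roughly like the sketch below. Everything here is an assumption made for illustration: the `EVIDENCE` lookup stands in for a real retriever, `reason` is a rule-based stub where an LLM would normally sit, and the stopping rule (answer found, no follow-up query, or `max_rounds` reached) is not taken from the paper.

```python
# Hypothetical evidence store: maps a query to the passage a retriever would return.
EVIDENCE = {
    "patient on metformin with vomiting and fast breathing":
        "Metformin can cause lactic acidosis.",
    "lactic acidosis risk factors":
        "Renal impairment raises lactic acidosis risk.",
}

def retrieve(query):
    """Stub retriever: return the stored passage for a query, if any."""
    passage = EVIDENCE.get(query)
    return [passage] if passage else []

def reason(question, evidence):
    """Stub reasoner returning (answer, follow_up_query).

    A real system would prompt an LLM here; these rules only mimic
    'answer if evidence suffices, otherwise ask for more'.
    """
    text = " ".join(evidence).lower()
    if "renal impairment" in text:
        return "Suspect metformin-associated lactic acidosis; check renal function.", None
    if "lactic acidosis" in text:
        return None, "lactic acidosis risk factors"
    return None, None

def answer_iteratively(question, max_rounds=3):
    """Alternate retrieval and reasoning until an answer or a dead end."""
    evidence, query = [], question
    for round_no in range(1, max_rounds + 1):
        for passage in retrieve(query):
            if passage not in evidence:
                evidence.append(passage)
        answer, follow_up = reason(question, evidence)
        if answer is not None or follow_up is None:
            return answer, evidence, round_no
        query = follow_up  # refine the next retrieval using the reasoning step
    return None, evidence, max_rounds
```

Calling `answer_iteratively("patient on metformin with vomiting and fast breathing")` needs two rounds in this toy setup: the first retrieval only supports a follow-up query, and the second supplies the evidence the stub reasoner needs to answer. A static, single-shot retriever would stop after the first passage.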
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2606.25338 [cs.CL] (or arXiv:2606.25338v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2606.25338 (arXiv-issued DOI via DataCite, pending registration)
Submission history
From: Jiahui Zhang
[v1] Wed, 24 Jun 2026 03:09:50 UTC (1,593 KB)
Originally published at arxiv.org