M3Net: A Macro-to-Meso-to-Micro Clinical-inspired Hierarchical 3D Network for Pulmonary Nodule Classification
Quick Take
M3Net is a hierarchical 3D network for improved pulmonary nodule classification using multi-scale contextual information.
Key Points
- Integrates fine-grained and global anatomical information.
- Achieves state-of-the-art accuracy on LIDC-IDRI and USTC-FHLN datasets.
- Code available on GitHub for further research.
Abstract: The accurate classification of benign and malignant pulmonary nodules in CT scans is critical for early lung cancer screening, yet remains challenging due to the multi-scale and heterogeneous nature of pulmonary nodules. While deep learning offers potential for auxiliary diagnosis, most existing models act as "black boxes", lacking the transparency and explainability required for trustworthy clinical integration. To address this issue, we propose M3Net, a novel 3D network for pulmonary nodule classification inspired by the hierarchical diagnostic workflow of radiologists, which integrates multi-scale contextual information from fine-grained structures to global anatomical relationships. Our framework constructs a progressive multi-scale input, from fine-grained nodule structures to local semantics and global spatial relationships. M3Net employs scale-specific encoders and ensures cross-scale semantic consistency through latent space projection and mutual information maximization. Extensive experiments on the public LIDC-IDRI dataset and a self-collected clinical dataset (USTC-FHLN) demonstrate that our method achieves state-of-the-art performance, with accuracies of 86.96% and 84.24% respectively, outperforming the best baseline by 3.26% and 2.17%. The results validate that M3Net provides a more robust and clinically relevant solution for pulmonary nodule classification. The code is available at this https URL.
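The macro-to-meso-to-micro pipeline described above can be illustrated with a minimal numpy sketch. This is NOT the authors' implementation: the crop sizes, latent dimension, random-linear "encoders", and mean fusion are all illustrative stand-ins for the paper's 3D CNN encoders, latent-space projection, and mutual-information objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def crop_center(vol, size):
    """Extract a centered cubic crop of side `size` from a 3D volume."""
    c = [d // 2 for d in vol.shape]
    h = size // 2
    return vol[c[0]-h:c[0]+h, c[1]-h:c[1]+h, c[2]-h:c[2]+h]

def downsample(vol, out=8):
    """Naive stride-based resampling to an (out, out, out) cube."""
    s = vol.shape[0] // out
    return vol[::s, ::s, ::s][:out, :out, :out]

class ScaleEncoder:
    """Stand-in for a scale-specific 3D encoder: flatten + linear map
    into a shared latent space (the paper uses learned 3D CNNs)."""
    def __init__(self, in_dim, latent_dim=16):
        self.W = rng.standard_normal((in_dim, latent_dim)) * 0.01
    def __call__(self, x):
        return x.reshape(-1) @ self.W

# A 64^3 CT patch centered on the nodule (synthetic data here).
volume = rng.standard_normal((64, 64, 64))

# Progressive multi-scale input: micro (nodule structure), meso (local
# semantics), macro (global context), each resampled to a common 8^3 grid.
scales = {"micro": 16, "meso": 32, "macro": 64}
crops = {k: downsample(crop_center(volume, s)) for k, s in scales.items()}

# Scale-specific encoders projecting into one shared latent space;
# cross-scale consistency would be enforced here at training time
# (the paper maximizes mutual information between these latents).
encoders = {k: ScaleEncoder(8 ** 3) for k in scales}
latents = {k: enc(crops[k]) for k, enc in encoders.items()}

# Fuse the aligned latents (mean) and apply a benign/malignant head.
fused = np.mean(list(latents.values()), axis=0)
head = rng.standard_normal((16, 2)) * 0.01
pred = int(np.argmax(fused @ head))  # 0 = benign, 1 = malignant
```

The nesting of the three crops around one center is the key structural idea: each scale sees the same nodule with progressively more surrounding anatomy, mirroring a radiologist zooming out from lesion texture to lobe-level context.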
| Comments: | Published in Information Fusion (2026), 15 pages, 5 figures |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
| Cite as: | arXiv:2605.12570 [cs.CV] (or arXiv:2605.12570v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.12570 |
| Journal reference: | Information Fusion, 2026 |
| Related DOI: | https://doi.org/10.1016/j.inffus.2026.104334 |
Submission history
From: Qiankun Li
[v1] Tue, 12 May 2026 10:16:41 UTC (2,343 KB)
— Originally published at arxiv.org