LRMIL: Efficient Low-Resolution Multiple Instance Learning via High-Resolution Knowledge Distillation for Whole Slide Image Classification
Quick Answer
The proposed LRMIL framework enhances whole slide image classification by efficiently transferring high-resolution knowledge to low-resolution representations, significantly reducing computational costs while outperforming existing MIL methods on multiple benchmarks.
Key Points
- LRMIL uses a two-stage distillation strategy for knowledge transfer.
- It aligns low-resolution patch embeddings with high-resolution representations.
- The framework operates solely on low-resolution patches during inference.
- Extensive experiments show LRMIL outperforms state-of-the-art MIL methods.
- LRMIL offers a scalable solution for clinical pathology analysis.
Article Content
From source RSS / original summary
arXiv:2606.06864v1. Abstract: Multiple instance learning (MIL) has become a standard paradigm for whole slide image (WSI) analysis in digital pathology, as it enables slide-level prediction without dense annotations. Existing MIL methods typically rely on exhaustive extraction and encoding of high-resolution patches.
However, this practice suffers from two critical limitations in real-world clinical settings: it struggles to capture global visual cues at lower magnifications, and incurs substantial computational overhead due to the massive number of high-resolution patches per slide. To address these limitations, we propose an efficient low-resolution multiple instance learning (LRMIL) framework that transfers high-resolution knowledge to low-resolution representations. LRMIL adopts a two-stage distillation strategy.
First, patch-level cross-resolution distillation aligns low-resolution patch embeddings with high-resolution representations. Second, slide-level knowledge distillation trains a low-resolution student MIL model under both slide-level supervision and teacher guidance. At inference time, LRMIL operates exclusively on low-resolution patches, substantially reducing data preprocessing and computational cost.
Extensive experiments on multiple WSI benchmarks demonstrate that LRMIL consistently outperforms state-of-the-art MIL methods while achieving more efficient inference. These results highlight LRMIL as a practical and scalable solution for WSI analysis in clinical pathology.
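The abstract names the two distillation stages but does not give the loss functions, so the following is only a minimal sketch of what such objectives commonly look like, not the authors' implementation. It assumes stage 1 is a mean squared error between each low-resolution patch embedding and a precomputed high-resolution target embedding, and stage 2 is slide-level cross-entropy mixed with a temperature-scaled KL term against the teacher's logits. The function names and the `alpha` and `temperature` parameters are hypothetical.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def patch_distillation_loss(low_res_embeddings, high_res_targets):
    """Stage 1 (assumed form): average MSE between each low-resolution
    patch embedding and its high-resolution target embedding."""
    losses = [mse(low, high)
              for low, high in zip(low_res_embeddings, high_res_targets)]
    return sum(losses) / len(losses)


def slide_distillation_loss(student_logits, teacher_logits, label,
                            alpha=0.5, temperature=2.0):
    """Stage 2 (assumed form): slide-level cross-entropy on the true label
    plus KL(teacher || student) on temperature-softened predictions."""
    # Supervised term: cross-entropy of the student against the slide label.
    cross_entropy = -math.log(softmax(student_logits)[label])
    # Teacher-guidance term: KL divergence between softened distributions.
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)
    # The temperature**2 factor is the usual gradient-scale correction.
    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl


# Toy usage with made-up numbers: two patches, two slide classes.
stage1 = patch_distillation_loss([[1.0, 2.0], [0.0, 0.0]],
                                 [[1.0, 4.0], [0.0, 0.0]])
stage2 = slide_distillation_loss([0.0, 0.0], [2.0, -1.0], label=0)
```

In this sketch the teacher appears only inside the training losses, which matches the abstract's statement that inference runs on low-resolution patches alone. How a low-resolution patch is paired with a high-resolution target (for example, by pooling the embeddings of the high-resolution patches it covers) is left open here because the abstract does not specify it.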