Enhancing Layer Interaction Using Key-Correlated Layer Attention
Quick Answer
This paper proposes Key-Correlated Layer Attention (KCLA), a layer attention mechanism whose computational cost grows linearly with network depth while preserving dynamic information updates.
Quick Take
KCLA builds on the authors' observation that Key representations in layer attention exhibit high cosine similarity. The method is derived directly from the definition of layer attention, keeps long-range cross-layer connections, and is reported to perform well on image recognition, object detection, and medical image segmentation.
Key Points
- KCLA reduces computational complexity from quadratic to linear with respect to network depth.
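- It preserves dynamic information updates, unlike Recurrent Layer Attention (RLA) and linear attention mechanisms, which the authors describe as limited by static updates and weak long-range cross-layer dependency modeling.
- Empirical results show KCLA performs well in image recognition, object detection, and medical image segmentation.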
- The code for KCLA is publicly available for further research and application.
- KCLA maintains fixed spatial (memory) complexity, independent of network depth.
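The quadratic-versus-linear claim in the list above can be made concrete with a short count. This is only a sketch under stated assumptions: the symbols L (number of layers) and c (a constant amount of cross-layer work per layer) are introduced here for illustration and do not come from the paper, and layer t is assumed to compute one attention score for each of the layers 1 through t.

```latex
% Assumption: layer t computes one score per layer it attends to (layers 1..t).
\[
  \sum_{t=1}^{L} t \;=\; \frac{L(L+1)}{2} \;=\; O(L^{2})
\]
% A mechanism doing a constant amount c of cross-layer work per layer costs instead:
\[
  \sum_{t=1}^{L} c \;=\; cL \;=\; O(L)
\]
```

For L = 50, for instance, the first sum gives 1275 score computations, while the second grows only in proportion to 50.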
Abstract: Recent advances in network architecture design have introduced layer attention to enhance inter-layer interactions. In such frameworks, each layer queries all preceding layers to establish cross-layer connections. However, layer attention results in quadratic computational complexity with respect to network depth. To mitigate this issue, prior works have proposed Recurrent Layer Attention (RLA) and linear attention mechanisms, which suffer from static information updates and limited long-range cross-layer dependency modeling. To overcome these limitations, we propose Key-Correlated Layer Attention (KCLA), inspired by our observation that Key representations in layer attention exhibit high cosine similarity. KCLA achieves linear computational complexity while preserving dynamic information updates, directly derived from the foundational definition of layer attention. Furthermore, KCLA maintains long-range cross-layer connections and features a fixed spatial complexity, independent of network depth. Empirical evaluations demonstrate that KCLA delivers good performance across diverse tasks, including image recognition, object detection, and medical image segmentation. The code is publicly available at this https URL.
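To make the baseline concrete, here is a minimal sketch of plain layer attention as the abstract describes it, where each layer's query attends over the keys and values stored for earlier layers. It is not the authors' KCLA implementation, which the abstract does not specify. The function names (`layer_attention`, `run_stack`, `mean_pairwise_cosine`), the use of scaled dot-product scores, the choice to let a layer attend to its own key, and the random vectors standing in for learned projections are all assumptions made for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Cosine similarity between two non-zero vectors.
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attention(query, keys, values):
    # Scaled dot-product attention of one layer's query over the stored layers.
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def run_stack(num_layers, dim, seed=0):
    # Hypothetical toy stack: random vectors stand in for learned Q/K/V.
    rng = random.Random(seed)
    keys, values, outputs = [], [], []
    for _ in range(num_layers):
        q, k, v = ([rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(3))
        keys.append(k)    # stored state grows by one key/value pair per layer
        values.append(v)
        outputs.append(layer_attention(q, keys, values))
    return outputs, keys


def mean_pairwise_cosine(keys):
    # Average cosine similarity over all distinct pairs of layer keys.
    pairs = [(i, j) for i in range(len(keys)) for j in range(i + 1, len(keys))]
    return sum(cosine(keys[i], keys[j]) for i, j in pairs) / len(pairs)


outputs, keys = run_stack(num_layers=6, dim=4)
print(len(keys), mean_pairwise_cosine(keys))
```

In this baseline the list of stored keys and values grows with depth, which is the cost the abstract says KCLA avoids. The random keys here will not reproduce the high cosine similarity the authors report; `mean_pairwise_cosine` only shows how such a statistic could be measured on keys taken from a trained network.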
| Subjects: | Computer Vision and Pattern Recognition (cs.CV) |
| Cite as: | arXiv:2606.28405 [cs.CV] |
| | (or arXiv:2606.28405v1 [cs.CV] for this version) |
| | https://doi.org/10.48550/arXiv.2606.28405 (arXiv-issued DOI via DataCite) |
Submission history
From: Tao He
[v1] Wed, 24 Jun 2026 13:54:52 UTC (449 KB)
— Originally published at arxiv.org