Attention-Guided Autoencoder Fusion for Insulator Defect Detection Using UAV Transmission-Line Imaging
Quick Answer
The AE-YOLO framework improves insulator defect detection in UAV imagery, reaching 95.10% mAP@0.5 with an EfficientNetV2 backbone.
Quick Take
AE-YOLO surpasses the strongest YOLO-family baseline by 5.0 points in mAP@0.5 and 6.7 points in recall. It targets class imbalance and small defect instances, and improves localization accuracy.
Key Points
- Integrates lightweight bottleneck autoencoders into the FPN-PAN neck to preserve anomaly-sensitive information during multi-scale feature fusion.
- Employs Convolutional Block Attention Modules to enhance feature discrimination.
- Achieves 96.40% precision and 93.80% recall on the Insulator-Defect Detection dataset.
- Introduces a variance-maximizing autoencoder regularization strategy for diverse representations.
- Uses Weighted Boxes Fusion at inference to combine predictions from YOLOv8, YOLOv10, and YOLO11.
Article Content
From source RSS / original summary
arXiv:2606.06536v1. Abstract: Automated defect detection in high-voltage transmission-line insulators remains challenging due to severe class imbalance, large scale variation, and the small spatial extent of defect instances in Unmanned Aerial Vehicle (UAV) imagery. To address these challenges, this paper proposes AE-YOLO, an Attention-Guided AutoEncoder-Enhanced YOLO framework for robust insulator defect detection.
The architecture integrates lightweight bottleneck autoencoders within a Feature Pyramid Network-Path Aggregation Network (FPN-PAN) neck. This preserves anomaly-sensitive information during multi-scale feature fusion. Convolutional Block Attention Modules (CBAM) are used throughout the backbone, enhancing feature discrimination and suppressing background interference.
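To make the attention step concrete, here is a minimal sketch of the channel-attention half of a CBAM block in plain Python. It is not the paper's implementation: the nested-list feature map, the bias-free shared MLP, and the hand-picked weights `w1` and `w2` are all assumptions for illustration, and CBAM's spatial-attention branch is omitted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def shared_mlp(vec, w1, w2):
    """Two-layer MLP (C -> C/r -> C) with ReLU, shared by both pooled vectors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_attention(fmap, w1, w2):
    """Return one attention weight in (0, 1) per channel.

    fmap: list of C channels, each an H x W nested list of floats.
    w1:   (C/r) x C weight matrix; w2: C x (C/r) weight matrix (no biases).
    """
    avg_pool = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    max_pool = [max(map(max, ch)) for ch in fmap]
    from_avg = shared_mlp(avg_pool, w1, w2)
    from_max = shared_mlp(max_pool, w1, w2)
    return [sigmoid(a + m) for a, m in zip(from_avg, from_max)]

def reweight(fmap, weights):
    """Scale every value in a channel by that channel's attention weight."""
    return [[[v * w for v in row] for row in ch] for ch, w in zip(fmap, weights)]

# Toy 2-channel, 2x2 feature map with untrained, hand-picked weights (assumed).
fmap = [[[1.0, 1.0], [1.0, 1.0]],
        [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, -1.0]]      # reduces 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]    # expands back to 2 channels
weights = channel_attention(fmap, w1, w2)
refined = reweight(fmap, weights)
```

In this toy case the active channel receives a weight near 0.88 and the empty channel a weight near 0.12, which is the kind of background suppression the attention modules are meant to provide.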
The framework also introduces a variance-maximizing autoencoder regularization strategy, which encourages diverse, defect-discriminative latent representations. The network is trained with a unified objective that combines focal loss, Complete IoU (CIoU) loss, and autoencoder regularization to address foreground-background imbalance and improve localization accuracy. During inference, Weighted Boxes Fusion (WBF) combines predictions from YOLOv8, YOLOv10, and YOLO11.
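The unified training objective can be sketched per sample as follows. Focal loss and CIoU follow their standard published definitions. The abstract does not give the form of the autoencoder term, so `unified_objective`, the weights `lambda_box`, `lambda_ae`, and `beta`, and the "reconstruction error minus latent variance" regularizer are assumptions made only for illustration.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a predicted probability p in (0, 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def ciou_loss(pred, gt, eps=1e-9):
    """1 - CIoU: IoU penalised by centre distance and aspect-ratio mismatch."""
    i = iou(pred, gt)
    # Squared distance between box centres.
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term v and its trade-off weight.
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    trade_off = v / (1.0 - i + v + eps)
    return 1.0 - i + rho2 / c2 + trade_off * v

def latent_variance(latents):
    """Mean per-dimension variance over a batch of latent vectors."""
    n, d = len(latents), len(latents[0])
    total = 0.0
    for j in range(d):
        col = [z[j] for z in latents]
        mu = sum(col) / n
        total += sum((x - mu) ** 2 for x in col) / n
    return total / d

def unified_objective(p, y, pred_box, gt_box, recon_error, latents,
                      lambda_box=1.0, lambda_ae=0.1, beta=0.01):
    """Hypothetical weighted sum; subtracting variance rewards diverse latents."""
    ae_term = recon_error - beta * latent_variance(latents)
    return focal_loss(p, y) + lambda_box * ciou_loss(pred_box, gt_box) + lambda_ae * ae_term
```

The focal term shrinks the contribution of easy examples (a confident correct prediction with `p = 0.9` costs far less than a miss with `p = 0.1`), which is how it counters foreground-background imbalance. The CIoU term still produces a useful gradient for non-overlapping boxes because of its centre-distance penalty.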
An autoencoder-guided confidence boosting mechanism improves sensitivity to rare defect categories. Experiments on the Insulator-Defect Detection dataset show that AE-YOLO with an EfficientNetV2 backbone achieves 95.10% mAP@0.5, 96.40% precision, and 93.80% recall. This performance surpasses the strongest YOLO-family baseline by 5.0 points in mAP@0.5 and 6.7 points in recall. These results confirm the effectiveness and adaptability of the framework.
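The Weighted Boxes Fusion step used at inference can be illustrated with a simplified, stdlib-only sketch of the general WBF procedure. The box tuple layout `(x1, y1, x2, y2, score, label)`, the `iou_thr` value, and the example detections are assumptions; the autoencoder-guided confidence boosting is not modelled because the abstract does not describe how it works.

```python
def iou(a, b):
    """IoU of two boxes whose first four fields are (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def fuse(members):
    """Confidence-weighted average of coordinates; score is the mean confidence."""
    total = sum(m[4] for m in members)
    coords = [sum(m[k] * m[4] for m in members) / total for k in range(4)]
    return (*coords, total / len(members), members[0][5])

def weighted_boxes_fusion(model_preds, iou_thr=0.55):
    """model_preds: one list per model of (x1, y1, x2, y2, score, label) tuples."""
    n_models = len(model_preds)
    boxes = sorted((b for preds in model_preds for b in preds),
                   key=lambda b: b[4], reverse=True)
    clusters = []  # each cluster keeps its member boxes and the current fused box
    for box in boxes:
        best, best_iou = None, iou_thr
        for c in clusters:
            if c["fused"][5] != box[5]:
                continue  # only fuse boxes of the same class
            overlap = iou(c["fused"], box)
            if overlap > best_iou:
                best, best_iou = c, overlap
        if best is None:
            clusters.append({"members": [box], "fused": box})
        else:
            best["members"].append(box)
            best["fused"] = fuse(best["members"])
    results = []
    for c in clusters:
        x1, y1, x2, y2, score, label = c["fused"]
        # Down-weight boxes that only a few of the models agreed on.
        score *= min(len(c["members"]), n_models) / n_models
        results.append((x1, y1, x2, y2, score, label))
    return sorted(results, key=lambda b: b[4], reverse=True)

# Made-up detections standing in for three detectors' outputs on one image.
preds = [
    [(10, 10, 50, 50, 0.9, 0), (100, 100, 120, 120, 0.6, 1)],
    [(12, 12, 52, 52, 0.8, 0)],
    [(11, 9, 49, 51, 0.7, 0)],
]
fused = weighted_boxes_fusion(preds)
```

Unlike non-maximum suppression, which keeps one box per cluster and discards the rest, this procedure averages all overlapping boxes, so each detector contributes to the final coordinates. In the example the three agreeing boxes merge into one detection, while the box seen by a single model survives with its confidence cut to a third.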
The model is a practical and scalable solution for UAV-based transmission-line inspection and defect monitoring.