Attention-Guided Autoencoder Fusion for Insulator Defect Detection Using UAV Transmission-Line Imaging

arXiv cs.CV·Malak Allam, Khaled Shaban, Ali Hamdi

6/8/2026 · ~2 min read

Quick Answer

The AE-YOLO framework improves insulator defect detection in UAV imagery, reaching 95.10% mAP@0.5 with an EfficientNetV2 backbone.

Quick Take

With an EfficientNetV2 backbone, AE-YOLO reaches 95.10% mAP@0.5 on UAV insulator imagery. It surpasses the strongest YOLO-family baseline by 5.0 points in mAP@0.5 and 6.7 points in recall, while targeting class imbalance and localization accuracy.

Key Points

Integrates lightweight bottleneck autoencoders into the FPN-PAN neck for multi-scale feature fusion.
Applies Convolutional Block Attention Modules (CBAM) throughout the backbone to sharpen feature discrimination and suppress background interference.
Achieves 95.10% mAP@0.5, 96.40% precision, and 93.80% recall on the Insulator-Defect Detection dataset.
Introduces a variance-maximizing autoencoder regularization strategy that encourages diverse latent representations.
Uses Weighted Boxes Fusion (WBF) at inference to combine predictions from YOLOv8, YOLOv10, and YOLO11.
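
The last point names Weighted Boxes Fusion. As a rough illustration of the general WBF idea (not the paper's implementation), the sketch below fuses single-class detections from several models: boxes are pooled, sorted by confidence, clustered by IoU, averaged with confidence weights, and then down-weighted when only some models agree. The function names, the 0.55 IoU threshold, and the sample boxes are illustrative assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_cluster(members):
    """Confidence-weighted box average; fused score is the mean confidence."""
    total = sum(score for _, score in members)
    box = tuple(sum(score * b[k] for b, score in members) / total for k in range(4))
    return box, total / len(members)


def weighted_boxes_fusion(detections_per_model, iou_thr=0.55):
    """detections_per_model: one list per model of (box, score) pairs, single class."""
    n_models = len(detections_per_model)
    pool = sorted(
        (d for dets in detections_per_model for d in dets),
        key=lambda d: d[1],
        reverse=True,
    )
    clusters, fused = [], []
    for box, score in pool:
        # Find the fused box that overlaps this detection the most.
        best, best_iou = -1, iou_thr
        for i, (fbox, _) in enumerate(fused):
            overlap = iou(box, fbox)
            if overlap > best_iou:
                best, best_iou = i, overlap
        if best < 0:
            clusters.append([(box, score)])
            fused.append((box, score))
        else:
            clusters[best].append((box, score))
            fused[best] = fuse_cluster(clusters[best])
    # Down-weight boxes that only a subset of the models predicted.
    out = [
        (fbox, fscore * min(len(members), n_models) / n_models)
        for (fbox, fscore), members in zip(fused, clusters)
    ]
    return sorted(out, key=lambda d: d[1], reverse=True)


# Hypothetical detections from three detectors (values are made up).
result = weighted_boxes_fusion([
    [((10, 10, 50, 50), 0.9)],
    [((12, 12, 52, 52), 0.8)],
    [((100, 100, 140, 140), 0.6)],
])
for box, score in result:
    print([round(v, 2) for v in box], round(score, 3))
```

Unlike non-maximum suppression, which discards all but the top-scoring box in a cluster, this scheme lets every overlapping prediction contribute to the fused coordinates.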

Article Content

From source RSS / original summary

arXiv:2606.06536v1 · Announce Type: new

Abstract: Automated defect detection in high-voltage transmission-line insulators remains challenging due to severe class imbalance, large scale variation, and the small spatial extent of defect instances in Unmanned Aerial Vehicle (UAV) imagery. To address these challenges, this paper proposes AE-YOLO, an Attention-Guided AutoEncoder-Enhanced YOLO framework for robust insulator defect detection.

The architecture integrates lightweight bottleneck autoencoders within a Feature Pyramid Network-Path Aggregation Network (FPN-PAN) neck. This preserves anomaly-sensitive information during multi-scale feature fusion. Convolutional Block Attention Modules (CBAM) are used throughout the backbone, enhancing feature discrimination and suppressing background interference.
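
To make the attention step more concrete, here is a minimal pure-Python sketch of the channel-attention half of a standard CBAM block (the spatial-attention half is omitted for brevity). It pools each channel by average and by maximum, passes both descriptors through a shared two-layer MLP, and rescales the channels with a sigmoid gate. The tensor sizes, reduction ratio, and random weights are assumptions for illustration and do not come from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP shared by both pooled descriptors: C -> C/r (ReLU) -> C."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(fmap, w1, w2):
    """fmap is a list of C channels, each an H x W nested list.

    Returns the reweighted feature map and the per-channel weights.
    """
    flat = [[v for row in channel for v in row] for channel in fmap]
    avg_desc = [sum(c) / len(c) for c in flat]   # global average pooling
    max_desc = [max(c) for c in flat]            # global max pooling
    a = shared_mlp(avg_desc, w1, w2)
    m = shared_mlp(max_desc, w1, w2)
    weights = [sigmoid(x + y) for x, y in zip(a, m)]
    out = [[[v * wt for v in row] for row in channel]
           for channel, wt in zip(fmap, weights)]
    return out, weights


# Toy feature map and randomly initialised MLP weights (illustrative only).
random.seed(0)
C, H, W, r = 4, 3, 3, 2
fmap = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]  # C -> C/r
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]  # C/r -> C
out, weights = channel_attention(fmap, w1, w2)
print([round(w, 3) for w in weights])
```

In a trained network the gate would learn to raise channels that respond to defect cues and lower those dominated by background clutter; here the weights are random, so only the mechanics are shown.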

The framework also introduces a variance-maximizing autoencoder regularization strategy, which encourages diverse, defect-discriminative latent representations. The network trains using a unified objective that combines focal loss, Complete IoU (CIoU) loss, and autoencoder regularization to address foreground-background imbalance and improve localization accuracy. During inference, Weighted Boxes Fusion (WBF) combines predictions from YOLOv8, YOLOv10, and YOLO11.
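
The unified objective can be sketched as a weighted sum of three terms. The focal and CIoU terms below follow their standard published definitions. The abstract does not give the exact form of the variance-maximizing regularizer, so the version here (negative mean per-dimension variance of the latent codes) is an assumed stand-in, and the weights `w_box` and `w_ae` are hypothetical.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)


def ciou_loss(pred, gt, eps=1e-9):
    """1 - CIoU for boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (wp * hp + wg * hg - inter + eps)
    # Squared centre distance, normalised by the enclosing box diagonal.
    rho2 = ((pred[0] + pred[2] - gt[0] - gt[2]) ** 2
            + (pred[1] + pred[3] - gt[1] - gt[3]) ** 2) / 4.0
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term.
    v = (4.0 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - (iou - rho2 / c2 - alpha * v)


def variance_regularizer(latents):
    """ASSUMED form: negative mean per-dimension variance over a batch of latents.

    Minimising this value pushes the latent codes apart (higher variance).
    """
    n, d = len(latents), len(latents[0])
    total = 0.0
    for k in range(d):
        col = [z[k] for z in latents]
        mu = sum(col) / n
        total += sum((x - mu) ** 2 for x in col) / n
    return -total / d


def total_loss(p, y, pred_box, gt_box, latents, w_box=1.0, w_ae=0.1):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (focal_loss(p, y)
            + w_box * ciou_loss(pred_box, gt_box)
            + w_ae * variance_regularizer(latents))


print(total_loss(0.7, 1, (10, 10, 50, 50), (12, 12, 52, 52),
                 [[0.0, 0.0], [2.0, 2.0]]))
```

The focal term shrinks the contribution of easy, confidently classified examples, which is how it counters foreground-background imbalance, while the CIoU term penalizes overlap, centre offset, and aspect-ratio mismatch together.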

An autoencoder-guided confidence boosting mechanism improves sensitivity to rare defect categories. Experiments on the Insulator-Defect Detection dataset show that AE-YOLO with an EfficientNetV2 backbone achieves 95.10% mAP@0.5, 96.40% precision, and 93.80% recall. This performance surpasses the strongest YOLO-family baseline by 5.0 points in mAP@0.5 and 6.7 points in recall. These results confirm the effectiveness and adaptability of the framework.
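
The reported figures imply a few values that the abstract does not state directly. The short calculation below derives them. The baseline numbers are back-computed from the stated margins and the F1 score is computed from the reported precision and recall, so all three are derived estimates, not results quoted from the paper.

```python
# Reported in the abstract (percent).
map50, precision, recall = 95.10, 96.40, 93.80
gain_map50, gain_recall = 5.0, 6.7  # margins over the strongest YOLO-family baseline

# Implied baseline values (derived, subject to rounding of the reported margins).
baseline_map50 = map50 - gain_map50
baseline_recall = recall - gain_recall

# F1 is the harmonic mean of precision and recall (derived, not reported).
f1 = 2 * precision * recall / (precision + recall)

print(round(baseline_map50, 2))   # 90.1
print(round(baseline_recall, 2))  # 87.1
print(round(f1, 2))               # 95.08
```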

The model is a practical and scalable solution for UAV-based transmission-line inspection and defect monitoring.

