SoccerNet 2026 Player-Centric Ball Action Spotting: Per-Player Attention with Agreement-Based Ensembling
Quick Answer
The SoccerNet 2026 submission introduces a two-stage pipeline for player-centric ball action spotting and raises the challenge Macro-F1 from a baseline of 48.6 to 58.94.
Quick Take
The pipeline pairs a Track-Aware Action Detector (TAAD), extended with a temporal transformer, with a Denoising Sequence Transduction (DST) transformer that uses a two-stage per-player attention mechanism. A Weighted Event Fusion ensemble of four model variants applies agreement filtering to suppress single-model false positives while preserving recall. The final system reaches a challenge Macro-F1 of 58.94, up from a baseline of 48.6.
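To make the headline metric concrete, here is a minimal sketch of Macro-F1 as the unweighted mean of per-class F1 scores. It assumes per-class true-positive, false-positive, and false-negative counts are already available; how the challenge matches predicted events to ground truth is not described here, and the class name `pass` and all counts are hypothetical.

```python
def macro_f1(counts):
    """Unweighted mean of per-class F1.

    counts maps a class name to a (tp, fp, fn) tuple.
    """
    scores = []
    for tp, fp, fn in counts.values():
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); define it as 0 when the class is empty.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical counts, not taken from the paper.
before = {"pass": (80, 40, 20), "tackle": (5, 10, 5)}
# Same recall, but 30 fewer false positives on the frequent class.
after = {"pass": (80, 10, 20), "tackle": (5, 10, 5)}

print(round(macro_f1(before), 4), round(macro_f1(after), 4))
```

Because every class contributes equally to the average, a rare class such as tackle carries as much weight as a frequent one, which is one plausible reason for handling it separately in the ensemble.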
Key Points
- Introduced Track-Aware Action Detector (TAAD) for per-player action logits.
- Enhanced TAAD with a temporal transformer for cross-frame context.
- Improved validation Macro-F1 by 1.87% with spatial-first attention ordering (cross-player attention before temporal attention).
- Used a Weighted Event Fusion ensemble with agreement filtering to suppress single-model false positives.
- Final system improved challenge Macro-F1 from 48.6 to 58.94.
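The attention ordering mentioned in the key points can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the attention here has no learned projections, residual connections, or normalization, and the tensor sizes are illustrative assumptions. It only shows what "cross-player attention before temporal attention" means on a tensor of shape (frames, players, features), and that the two orderings give different outputs.

```python
import numpy as np


def self_attention(x):
    """Parameter-free scaled dot-product self-attention over the second-to-last axis."""
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x


def spatial_attention(x):
    # x: (T, P, D). Within each frame, every player attends to all players.
    return self_attention(x)


def temporal_attention(x):
    # For each player, every frame attends to all frames of that player's track.
    xt = np.swapaxes(x, 0, 1)  # (P, T, D)
    return np.swapaxes(self_attention(xt), 0, 1)


def spatial_first(x):
    return temporal_attention(spatial_attention(x))


def temporal_first(x):
    return spatial_attention(temporal_attention(x))


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 22, 16))  # 8 frames, 22 players, 16 features (assumed sizes)
out_sf = spatial_first(x)
out_tf = temporal_first(x)
print(out_sf.shape, np.allclose(out_sf, out_tf))
```

Since the softmax makes each step nonlinear, the two orderings are not interchangeable, so the order in which they are applied is a genuine design choice.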
Paper Resources
Abstract: We present our submission to the SoccerNet 2026 Player-Centric Ball Action Spotting challenge, which uses a two-stage pipeline: a Track-Aware Action Detector (TAAD) produces per-player action logits from broadcast video, and a Denoising Sequence Transduction (DST) transformer converts game-state features and TAAD logits into structured event sequences. We improve the TAAD with a temporal transformer that adds cross-frame context, alongside several training fixes. For the DST stage, we introduce a two-stage per-player attention mechanism operating on game-state features, and show that a spatial-first attention ordering (cross-player attention before temporal attention) improves validation Macro-F1 by 1.87%. To exploit architectural diversity, we train four model variants and combine them with a Weighted Event Fusion ensemble that applies agreement filtering to suppress single-model false positives while preserving recall, plus a dedicated exception for the rare tackle class. Our final system improves the challenge Macro-F1 from a baseline of 48.6 to 58.94.
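The agreement-filtering idea in the abstract can be sketched as follows. This is a hedged illustration, not the paper's Weighted Event Fusion: the event format `(frame, player_id, label, score)`, the frame tolerance `tol`, the `min_agree` threshold, the scoring rule, and the reading of the tackle exception as "keep tackle events even when only one model predicts them" are all assumptions, as are the `pass` and `shot` labels in the example.

```python
from collections import defaultdict


def fuse_events(model_events, weights, tol=5, min_agree=2, exempt=("tackle",)):
    """Fuse per-model event lists; each event is (frame, player_id, label, score)."""
    # Only events with the same player and label are allowed to merge.
    buckets = defaultdict(list)
    for m, events in enumerate(model_events):
        for frame, player, label, score in events:
            buckets[(player, label)].append((frame, m, score))

    fused = []
    for (player, label), items in buckets.items():
        items.sort()
        # Greedy temporal clustering: chain events at most `tol` frames apart.
        clusters, cluster = [], [items[0]]
        for item in items[1:]:
            if item[0] - cluster[-1][0] <= tol:
                cluster.append(item)
            else:
                clusters.append(cluster)
                cluster = [item]
        clusters.append(cluster)

        for c in clusters:
            # Keep each model's highest-scoring event in the cluster.
            best = {}
            for frame, m, score in c:
                if m not in best or score > best[m][1]:
                    best[m] = (frame, score)
            if len(best) < min_agree and label not in exempt:
                continue  # single-model detection, treated as a likely false positive
            total_w = sum(weights[m] for m in best)
            frame = round(sum(weights[m] * f for m, (f, _) in best.items()) / total_w)
            # Dividing by the weight of all models rewards broader agreement.
            score = sum(weights[m] * s for m, (_, s) in best.items()) / sum(weights)
            fused.append((frame, player, label, score))
    return sorted(fused)


preds = [
    [(100, 7, "pass", 0.9), (240, 3, "tackle", 0.6)],  # model 0
    [(102, 7, "pass", 0.8), (400, 9, "shot", 0.7)],    # model 1
    [(101, 7, "pass", 0.7)],                           # model 2
    [],                                                # model 3
]
fused = fuse_events(preds, weights=[1.0, 1.0, 1.0, 1.0])
for event in fused:
    print(event)
```

In this example the pass event predicted by three models is merged into one event, the shot predicted by a single model is dropped, and the single-model tackle survives because of the exemption.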
Comments: 2 pages, 1 figure, 2 tables. SoccerNet 2026 challenge technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2606.28389 [cs.CV] (or arXiv:2606.28389v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2606.28389 (arXiv-issued DOI via DataCite)
Submission history
From: Faisal Altawijri
[v1] Tue, 23 Jun 2026 09:16:04 UTC (5 KB)
— Originally published at arxiv.org