SoccerNet 2026 Player-Centric Ball Action Spotting: Per-Player Attention with Agreement-Based Ensembling

arXiv cs.CV·Faisal Altawijri, Ismail Mathkour


·~2 min·6/30/2026·en·0

Quick Answer

This submission to the SoccerNet 2026 Player-Centric Ball Action Spotting challenge introduces a two-stage pipeline that raises the challenge Macro-F1 from a baseline of 48.6 to 58.94, a gain of 10.34 points.

Quick Take

The pipeline pairs a Track-Aware Action Detector (TAAD), extended with a temporal transformer for cross-frame context, with a Denoising Sequence Transduction (DST) transformer that uses a two-stage per-player attention mechanism over game-state features. A Weighted Event Fusion ensemble of four model variants applies agreement filtering to suppress single-model false positives while preserving recall.
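To make the link between false positives, recall, and the headline metric concrete, here is a minimal sketch of Macro-F1 as the unweighted mean of per-class F1. The class names and counts are hypothetical, and the official challenge metric may match predictions to ground truth differently (for example with a temporal tolerance), so treat this as an illustration rather than the challenge's evaluation code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts: dict) -> float:
    """Unweighted mean of per-class F1, so a rare class weighs as much as a common one."""
    return sum(f1_score(*c) for c in counts.values()) / len(counts)


# Hypothetical (tp, fp, fn) counts per class, for illustration only.
before = {"pass": (80, 40, 20), "tackle": (5, 5, 5)}
# Same true positives and false negatives (recall unchanged),
# but half of the "pass" false positives removed.
after = {"pass": (80, 20, 20), "tackle": (5, 5, 5)}

print(f"before: {macro_f1(before):.4f}  after: {macro_f1(after):.4f}")
```

Because every class contributes equally to the mean, removing false positives at constant recall raises the score, and a rare class such as tackle can move the metric as much as a frequent one.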

Key Points

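Uses a Track-Aware Action Detector (TAAD) to produce per-player action logits from broadcast video.
Improves the TAAD with a temporal transformer for cross-frame context, alongside several training fixes.
Improves validation Macro-F1 by 1.87% with spatial-first attention ordering (cross-player attention before temporal attention) in the DST stage.
Combines four model variants with a Weighted Event Fusion ensemble whose agreement filtering suppresses single-model false positives, with a dedicated exception for the rare tackle class.
Raises the challenge Macro-F1 from a baseline of 48.6 to 58.94.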
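The agreement-filtering idea in the points above can be sketched as follows. This is not the authors' implementation: the `Event` fields, the frame tolerance `tol`, the `min_agree` threshold, the greedy clustering, and the score normalization are all assumptions made for illustration. Only the general behavior (drop events backed by a single model, except for the tackle class) comes from the summary. The sketch also assumes positive scores and weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    frame: int
    player: int
    action: str
    score: float


def fuse_events(model_events, weights, tol=5, min_agree=2, exempt=("tackle",)):
    """Greedy weighted fusion of per-model event lists with an agreement filter."""
    # Tag each event with its source model, then sort so that events for the
    # same player and action become neighbors ordered by frame.
    tagged = sorted(
        ((ev, m) for m, events in enumerate(model_events) for ev in events),
        key=lambda t: (t[0].player, t[0].action, t[0].frame),
    )
    clusters = []
    for ev, m in tagged:
        last = clusters[-1] if clusters else None
        if (last and last[0][0].player == ev.player
                and last[0][0].action == ev.action
                and ev.frame - last[-1][0].frame <= tol):
            last.append((ev, m))  # close enough in time: same underlying event
        else:
            clusters.append([(ev, m)])

    fused = []
    for cluster in clusters:
        first = cluster[0][0]
        supporters = {m for _, m in cluster}
        if len(supporters) < min_agree and first.action not in exempt:
            continue  # single-model detection: treated as a likely false positive
        mass = sum(weights[m] * ev.score for ev, m in cluster)
        # Fused frame is the weight- and score-weighted mean of member frames.
        frame = round(sum(weights[m] * ev.score * ev.frame for ev, m in cluster) / mass)
        fused.append(Event(frame, first.player, first.action, mass / sum(weights)))
    return fused


# Hypothetical predictions from two models.
model_a = [Event(100, 7, "pass", 0.9), Event(300, 3, "shot", 0.6)]
model_b = [Event(102, 7, "pass", 0.8), Event(500, 4, "tackle", 0.7)]
fused = fuse_events([model_a, model_b], weights=[1.0, 1.0])
for ev in fused:
    print(ev)
```

In this toy run the pass is kept because both models agree within the tolerance, the shot is dropped because only one model predicts it, and the tackle survives through the exemption even though only one model predicts it.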


[Submitted on 23 Jun 2026]


Abstract: We present our submission to the SoccerNet 2026 Player-Centric Ball Action Spotting challenge, which uses a two-stage pipeline: a Track-Aware Action Detector (TAAD) produces per-player action logits from broadcast video, and a Denoising Sequence Transduction (DST) transformer converts game-state features and TAAD logits into structured event sequences. We improve the TAAD with a temporal transformer that adds cross-frame context, alongside several training fixes. For the DST stage, we introduce a two-stage per-player attention mechanism operating on game-state features, and show that a spatial-first attention ordering (cross-player attention before temporal attention) improves validation Macro-F1 by 1.87%. To exploit architectural diversity, we train four model variants and combine them with a Weighted Event Fusion ensemble that applies agreement filtering to suppress single-model false positives while preserving recall, plus a dedicated exception for the rare tackle class. Our final system improves the challenge Macro-F1 from a baseline of 48.6 to 58.94.
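The spatial-first ordering described in the abstract can be illustrated with a toy, parameter-free version of attention over a features array indexed as `[frame][player][feature]`. This is a sketch under strong assumptions: a real DST block would presumably use learned query/key/value projections, multiple heads, residual connections, and normalization, none of which are specified here. All function names are hypothetical.

```python
import math


def self_attention(vectors):
    """Parameter-free scaled dot-product self-attention over a list of vectors."""
    dim = len(vectors[0])
    out = []
    for q in vectors:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(l - peak) for l in logits]
        total = sum(exps)
        out.append([sum(e / total * v[d] for e, v in zip(exps, vectors))
                    for d in range(dim)])
    return out


def spatial_attention(x):
    """Cross-player attention: within each frame, players attend to each other."""
    return [self_attention(frame) for frame in x]


def temporal_attention(x):
    """Temporal attention: each player's track attends across frames."""
    n_players = len(x[0])
    tracks = [self_attention([frame[p] for frame in x]) for p in range(n_players)]
    return [[tracks[p][t] for p in range(n_players)] for t in range(len(x))]


def spatial_first_block(x):
    """Spatial-first ordering: cross-player attention, then temporal attention."""
    return temporal_attention(spatial_attention(x))


def temporal_first_block(x):
    """The alternative ordering, shown for comparison."""
    return spatial_attention(temporal_attention(x))


# Hypothetical game-state features: 3 frames, 2 players, 2 features each.
x = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
y = spatial_first_block(x)
print(len(y), len(y[0]), len(y[0][0]))
```

Both orderings preserve the `[frame][player][feature]` shape. They differ in what the second stage sees: in the spatial-first block, temporal attention operates on player features that have already been mixed with the other players in the same frame.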

Comments:	2 pages, 1 figure, 2 tables. SoccerNet 2026 challenge technical report
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2606.28389 [cs.CV]
	(or arXiv:2606.28389v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2606.28389 arXiv-issued DOI via DataCite

Submission history

From: Faisal Altawijri
[v1] Tue, 23 Jun 2026 09:16:04 UTC (5 KB)

— Originally published at arxiv.org
