Explaining RhythmFormer: A Systematic XAI Analysis of Periodic Sparse Attention for Remote Photoplethysmography

arXiv cs.CV·Louis Chen, Torbjörn E. M. Nordling

6/15/2026

Quick Take

This systematic analysis of RhythmFormer brings quantitative metrics to explainable AI (XAI) for remote photoplethysmography (rPPG): a skin coverage metric and a SaCo faithfulness coefficient adapted to regression. On UBFC-rPPG, Beyond Intuition attains the highest median refined skin coverage (0.83) and faithfulness (F = 0.92) among the four attribution methods evaluated. The work targets the reliance of existing rPPG XAI on qualitative heatmap inspection, moving toward auditable evidence for clinical heart rate estimation.

Key Points

Adapted four attribution methods for RhythmFormer's bi-level routing attention.
Introduced a skin coverage metric to quantify attribution mass on skin regions.
Beyond Intuition achieved the highest median refined skin coverage: 0.83, versus 0.57 for vanilla rollout.
Quantified a multi-hop leakage effect under sparse top-k routing, in which rollout and flow restore connections that individual layers set to zero.
Noted that validation across diverse datasets and model variants is still needed.


Article Content

From source RSS / original summary

arXiv:2606.13839v1 Announce Type: new Abstract: Remote photoplethysmography (rPPG) transformers achieve low heart-rate error on benchmarks, yet their decisions remain opaque, a growing concern as rPPG moves toward clinical heart rate estimation. Existing rPPG XAI is dominated by qualitative heatmap inspection without quantitative faithfulness metrics or physiology-grounded validation, leaving a gap between visual plausibility and auditable evidence. We address this gap.

First, we adapt four attribution methods (raw attention, rollout, flow, Beyond Intuition) to RhythmFormer's bi-level routing attention with top-$k$ selection. Second, we introduce a skin coverage metric quantifying how much attribution mass falls on skin regions. Third, we adapt the SaCo faithfulness coefficient from its original classification setting to rPPG regression by using the MAE between original and perturbed predicted rPPG waveforms as the perturbation impact.
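The two quantities described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `skin_coverage` and `perturbation_impact` are hypothetical, the skin mask is assumed to be a binary map of the same shape as the attribution map, and clipping negative attributions to zero is an assumption of this sketch. The paper's "refined" coverage variant and the full SaCo coefficient are not reproduced here.

```python
def skin_coverage(attribution, skin_mask):
    """Return the share of attribution mass that falls on skin pixels.

    attribution: 2D list of floats (one value per pixel or patch).
    skin_mask:   2D list of 0/1 flags with the same shape (1 = skin).
    Negative attributions are clipped to zero (an assumption of this sketch).
    """
    total = 0.0
    on_skin = 0.0
    for attr_row, mask_row in zip(attribution, skin_mask):
        for value, is_skin in zip(attr_row, mask_row):
            mass = max(value, 0.0)
            total += mass
            if is_skin:
                on_skin += mass
    # An all-zero map carries no mass, so report zero coverage.
    return on_skin / total if total > 0 else 0.0


def perturbation_impact(original_wave, perturbed_wave):
    """Mean absolute error between two predicted rPPG waveforms."""
    if len(original_wave) != len(perturbed_wave) or not original_wave:
        raise ValueError("waveforms must be non-empty and of equal length")
    diffs = (abs(o - p) for o, p in zip(original_wave, perturbed_wave))
    return sum(diffs) / len(original_wave)


# Toy 2x2 attribution map with skin on the anti-diagonal.
coverage = skin_coverage([[1.0, 3.0], [3.0, 1.0]], [[0, 1], [1, 0]])
impact = perturbation_impact([0.0, 1.0, 0.0, -1.0], [0.0, 0.5, 0.0, -0.5])
print(coverage, impact)  # 0.75 0.25
```

In this toy case, 6 of the 8 units of attribution mass sit on skin, so coverage is 0.75. A larger MAE after masking a region indicates that the region mattered more to the predicted waveform.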

Applying these tools, we quantify a multi-hop leakage effect under sparse top-$k$ routing: attention rollout and flow almost completely restore the connections that individual refined-attention layers explicitly set to zero. Beyond Intuition mitigates this via its value-projection-weighted rollout and gradient-supported mask, attaining the highest median refined skin coverage ($0.83$ vs. $0.57$ for vanilla rollout) and faithfulness ($F=0.92$) among the evaluated methods on UBFC-rPPG.
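The leakage effect can be seen in a toy setting. The sketch below is an illustration under stated assumptions, not the paper's measurement: it uses plain per-row top-$k$ selection rather than RhythmFormer's bi-level routing, a hypothetical 4-token attention matrix, and the common rollout form that averages each layer with the identity (0.5·A + 0.5·I) before multiplying across layers. All function names are hypothetical.

```python
def top_k_rows(attn, k):
    """Keep the k largest weights in each row, zero the rest, renormalise."""
    sparse = []
    for row in attn:
        keep = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
        kept_sum = sum(row[j] for j in keep)
        sparse.append([row[j] / kept_sum if j in keep else 0.0
                       for j in range(len(row))])
    return sparse


def matmul(a, b):
    """Plain-Python matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def rollout(layers):
    """Attention rollout: multiply the residual-averaged layers, 0.5*A + 0.5*I."""
    n = len(layers[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        result = matmul(mixed, result)
    return result


def count_zeros(matrix):
    """Count entries that are exactly zero, i.e. connections cut by top-k."""
    return sum(1 for row in matrix for value in row if value == 0.0)


# Toy 4-token attention: token i favours itself, then its next neighbour.
n_tokens = 4
dense = [[0.0] * n_tokens for _ in range(n_tokens)]
for i in range(n_tokens):
    for offset, weight in enumerate([0.4, 0.3, 0.2, 0.1]):
        dense[i][(i + offset) % n_tokens] = weight

sparse_layer = top_k_rows(dense, k=2)   # each row keeps 2 of its 4 entries
rolled = rollout([sparse_layer] * 3)    # three identical sparse layers

print(count_zeros(sparse_layer))  # 8 zeroed connections in each layer
print(count_zeros(rolled))        # 0: every zeroed connection is restored
```

Each layer lets a token reach only itself and its next neighbour, yet three multiplied layers connect every token to every other through intermediate hops. That is the sense in which rollout can undo the sparsity that top-$k$ selection imposes on any single layer.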

Validation across diverse datasets and model variants is needed. A case study on a low-SaCo outlier further shows all four methods recovering consistently once an artefactual region is replaced, suggesting consistent SaCo behavior across attribution families in this illustrative case. Together, these metrics move XAI for rPPG toward auditable numerical evidence about spatial alignment and perturbation faithfulness, i.e., trustworthy rPPG XAI.

