Explaining RhythmFormer: A Systematic XAI Analysis of Periodic Sparse Attention for Remote Photoplethysmography
Quick Take
This XAI analysis of RhythmFormer introduces quantitative metrics for evaluating attribution methods in remote photoplethysmography (rPPG). On UBFC-rPPG, Beyond Intuition attains the highest median refined skin coverage (0.83) and faithfulness (F = 0.92) among the four evaluated methods. The work targets the gap left by existing, largely qualitative heatmap analyses and supplies auditable numerical evidence as rPPG moves toward clinical heart rate estimation.
Key Points
- Adapted four attribution methods for RhythmFormer's bi-level routing attention.
- Introduced a skin coverage metric to quantify attribution mass on skin regions.
- Beyond Intuition attained the highest median refined skin coverage: 0.83, versus 0.57 for vanilla rollout.
- Quantified a multi-hop leakage effect under sparse top-k routing: rollout and flow restore connections that individual layers set to zero.
- Noted that validation across diverse datasets and model variants is still needed.
Article Content
From source RSS / original summary (arXiv:2606.13839v1). Abstract: Remote photoplethysmography (rPPG) transformers achieve low heart-rate error on benchmarks, yet their decisions remain opaque, a growing concern as rPPG moves toward clinical heart rate estimation. Existing rPPG XAI is dominated by qualitative heatmap inspection without quantitative faithfulness metrics or physiology-grounded validation, leaving a gap between visual plausibility and auditable evidence. We address this gap.
First, we adapt four attribution methods (raw attention, rollout, flow, Beyond Intuition) to RhythmFormer's bi-level routing attention with top-$k$ selection. Second, we introduce a skin coverage metric quantifying how much attribution mass falls on skin regions. Third, we adapt the SaCo faithfulness coefficient from its original classification setting to rPPG regression by using the MAE between original and perturbed predicted rPPG waveforms as the perturbation impact.
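The two quantities described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `skin_coverage` and `perturbation_impact`, the nested-list input format, and the clipping of negative attributions to zero are all assumptions made for this example. The abstract only states that skin coverage measures how much attribution mass falls on skin regions and that the perturbation impact is the MAE between original and perturbed predicted waveforms.

```python
def skin_coverage(attribution, skin_mask):
    """Fraction of total attribution mass that falls on skin pixels.

    attribution: 2D list of floats (hypothetical per-pixel attribution map).
    skin_mask:   2D list of 0/1 flags of the same shape (1 = skin).
    """
    total = 0.0
    on_skin = 0.0
    for attr_row, mask_row in zip(attribution, skin_mask):
        for value, is_skin in zip(attr_row, mask_row):
            value = max(value, 0.0)  # assumption: negative attributions are clipped
            total += value
            if is_skin:
                on_skin += value
    # An all-zero map carries no mass, so report zero coverage.
    return on_skin / total if total > 0 else 0.0


def perturbation_impact(original_wave, perturbed_wave):
    """Mean absolute error between original and perturbed predicted waveforms."""
    if not original_wave or len(original_wave) != len(perturbed_wave):
        raise ValueError("waveforms must be non-empty and of equal length")
    diffs = (abs(o - p) for o, p in zip(original_wave, perturbed_wave))
    return sum(diffs) / len(original_wave)


# Toy inputs: 7 of 10 units of attribution mass lie on skin pixels.
coverage = skin_coverage([[1.0, 4.0], [3.0, 2.0]], [[0, 1], [1, 0]])
impact = perturbation_impact([0.0, 1.0, 0.0], [0.5, 0.5, 0.5])
```

In a faithfulness evaluation of this kind, a larger `perturbation_impact` after removing a high-attribution region is the behavior a faithful attribution method is expected to produce.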
Applying these tools, we quantify a multi-hop leakage effect under sparse top-$k$ routing: attention rollout and flow almost completely restore the connections that individual refined-attention layers explicitly set to zero. Beyond Intuition mitigates this via its value-projection-weighted rollout and gradient-supported mask, attaining the highest median refined skin coverage ($0.83$ vs. $0.57$ for vanilla rollout) and faithfulness ($F=0.92$) among the evaluated methods on UBFC-rPPG.
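The leakage effect can be reproduced on a toy example. The sketch below uses the common attention rollout formulation (each layer's attention averaged with the identity to model the residual path, then multiplied across layers). The ring-shaped routing pattern in `ring_topk_attention` is a hypothetical stand-in for sparse top-k selection, not RhythmFormer's bi-level routing, and all function names are illustrative.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add_residual(attn):
    """Average a row-stochastic attention matrix with the identity."""
    n = len(attn)
    return [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def rollout(layers):
    """Attention rollout: product of residual-adjusted layers, first layer rightmost."""
    n = len(layers[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        result = matmul(add_residual(attn), result)
    return result


def ring_topk_attention(n, k):
    """Toy sparse layer: token i attends uniformly to itself and the next k-1 tokens."""
    layer = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for step in range(k):
            layer[i][(i + step) % n] = 1.0 / k
    return layer


def nonzero_fraction(matrix):
    """Share of entries that are non-zero."""
    entries = [v for row in matrix for v in row]
    return sum(1 for v in entries if v != 0.0) / len(entries)


layer = ring_topk_attention(n=6, k=2)      # each token keeps 2 of 6 connections
rolled = rollout([layer] * 5)              # five identical sparse layers
print(nonzero_fraction(layer), nonzero_fraction(rolled))
```

In this toy setting each layer keeps only a third of its connections, yet the five-layer rollout has no zero entries: paths through intermediate tokens reconnect pairs that every individual layer masked out.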
Validation across diverse datasets and model variants is needed. A case study on a low-SaCo outlier further shows all four methods recovering consistently once an artefactual region is replaced, suggesting consistent SaCo behavior across attribution families in this illustrative case. Together, these metrics move XAI for rPPG toward auditable numerical evidence about spatial alignment and perturbation faithfulness, i.e., trustworthy rPPG XAI.