Output Type Before Quality: A Standards-Derived XAI Admissibility Rubric for Autonomous-Driving Safety
Quick Answer
The study identifies a mismatch between the kinds of evidence that autonomous-driving safety standards require and the output types that XAI methods produce, and proposes a rubric of 19 criteria across 7 lifecycle stages.
Quick Take
Causal XAI is structurally required at three stages (hazard identification, incident investigation, and data management), while correlational or language-based methods are comparable or sufficient at the remaining four. Method selection should follow the evidence each lifecycle stage demands, not method popularity.
Key Points
- 19 testable evidentiary criteria for XAI methods, derived from AMLAS, ISO 26262, ISO 21448, and ISO/PAS 8800.
- Causal XAI needed for hazard identification, incident investigation, and data management.
- Evidence-type gap identified between XAI outputs and safety assurance requirements.
- Single-VLA proof of concept analyzed 1,996 real-world driving clips.
- Method selection should focus on lifecycle-stage evidence demand.
Article Content
arXiv:2606.05461v1. Abstract: Safety standards for ML-based autonomous driving specify the kind of evidence an assurance case must contain (directed cause-and-effect chains, quantified interventional effects, named root-cause variables), yet the XAI literature is organised by output type and technique family (saliency maps, feature attribution, counterfactuals, causal graphs, language traces).
SHAP, the most-recommended ADS XAI method, returns a ranked feature list that no implementation effort can convert into a directed chain (Fig. 1). We name this mismatch the evidence-type gap. From AMLAS, ISO 26262, ISO 21448, and ISO/PAS 8800 we derive 19 testable evidentiary criteria across 7 lifecycle stages, with representative clause-cited derivations, and score six XAI method classes structurally.
Causal XAI emerges as structurally required to satisfy the derived criteria at three stages: hazard identification (+62% rubric gap), incident investigation (+50%), and data management (+50%); the verdict set is stable across thresholds $T \in (0\%, 50\%]$ and survives a worst-case single-cell flip down to $T = 25\%$. At the remaining four stages, correlational or language-based methods are comparable or sufficient.
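The threshold-stability claim can be illustrated with a minimal sketch. It assumes, without confirmation from the abstract, that a stage is flagged as requiring causal XAI when its rubric gap is at least the threshold T. The three gaps are the reported values; the four remaining stages are unnamed in the abstract, so they appear here under placeholder names with a placeholder gap of 0.0. The single-cell-flip robustness check is not modelled.

```python
# Rubric gaps reported in the abstract for the three causal-required stages.
REPORTED_GAPS = {
    "hazard identification": 0.62,
    "incident investigation": 0.50,
    "data management": 0.50,
}

# ASSUMPTION: the other four stages are not named and their gaps are not
# given; names and the 0.0 value are placeholders for illustration only.
PLACEHOLDER_GAPS = {f"other stage {i}": 0.0 for i in range(1, 5)}


def verdict_set(gaps, threshold):
    """Stages flagged as requiring causal XAI (assumed rule: gap >= T)."""
    return frozenset(stage for stage, gap in gaps.items() if gap >= threshold)


def is_stable(gaps, thresholds):
    """True if every threshold yields the same verdict set."""
    return len({verdict_set(gaps, t) for t in thresholds}) == 1


gaps = {**REPORTED_GAPS, **PLACEHOLDER_GAPS}
thresholds = [t / 100 for t in range(1, 51)]  # 1%, 2%, ..., 50%
stable = is_stable(gaps, thresholds)
flagged = verdict_set(gaps, 0.25)
```

Under these assumptions the same three stages are flagged for every sampled threshold up to 50%, and the set shrinks once T exceeds 50%, which matches the half-open interval quoted in the abstract.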
The rubric identifies structural admissibility (necessary but not sufficient for compliance): an admissible method's specific output content may still be wrong, and validating that fidelity (the edges a fitted SCM produces, the cause a trace names) is the open assurance challenge. A single-VLA proof of concept on 1,996 real-world driving clips (79,840 rows, ten splits) is consistent with each method's observed output type matching its rubric prediction.
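One way to picture structural admissibility is as a check on output type alone, independent of whether the output content is correct. The sketch below is illustrative and not the paper's rubric: the `Evidence` enum, the `OUTPUT_TYPES` mapping, and the function name are hypothetical. Only the statement that feature attribution yields a ranked feature list rather than a directed chain comes from the abstract; the entry for causal graphs is an assumption.

```python
from enum import Enum, auto


class Evidence(Enum):
    """Evidence kinds named in the abstract, plus the ranked-list output."""
    DIRECTED_CHAIN = auto()
    INTERVENTIONAL_EFFECT = auto()
    ROOT_CAUSE_VARIABLE = auto()
    RANKED_FEATURES = auto()


# HYPOTHETICAL mapping from method class to the output types it can emit.
# The causal_graph entry is an assumption made for this example.
OUTPUT_TYPES = {
    "feature_attribution": {Evidence.RANKED_FEATURES},
    "causal_graph": {Evidence.DIRECTED_CHAIN},
}


def structurally_admissible(method, required):
    """True if the method's output types cover every required evidence kind.

    This is a necessary condition only: it says nothing about whether the
    specific edges or causes the method reports are faithful.
    """
    return set(required) <= OUTPUT_TYPES[method]


needs_chain = {Evidence.DIRECTED_CHAIN}
shap_ok = structurally_admissible("feature_attribution", needs_chain)
graph_ok = structurally_admissible("causal_graph", needs_chain)
```

Here `shap_ok` is false and `graph_ok` is true, but a true result only says the output has the right shape. Validating the fidelity of what the method reports remains the open assurance challenge the abstract describes.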
XAI method selection for ADS safety assurance should be driven by lifecycle-stage evidence demand, not by method popularity.
