Authority Inversion in LLM-Mediated Ubiquitous Systems: When Models Trust Users Over Sensors
Quick Take
Large language models (LLMs) exhibit 'Authority Inversion': when user claims and sensor readings conflict, the user claim overrides the sensor data, leaving near-zero sensor trust on numerical tasks (AAI = -0.805). The proposed Geometric Authority Calibration (GAC) raises accuracy on human activity recognition (HAR) tasks from 0–1.6% to 21.9–27.5%, underscoring the need for explicit authority auditing in LLM-mediated systems.
Key Points
- Authority Inversion occurs when LLMs prioritize user claims over sensor data.
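- Geometric Authority Calibration (GAC), an inference-time layer-level intervention, raises HAR accuracy from 0–1.6% to 21.9–27.5% and outperforms prompting baselines.
- All four evaluated models (4B to 35B parameters, three architectures) show extreme sensor distrust on numerical tasks, regardless of model capacity.
- Theory-guided causal injection flips 80.2% of incorrect decisions (vs. <0.4% for random controls), supporting the proposed geometric framework.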
- Explicit authority auditing is essential for reliable LLM deployments.
Article Content
From source RSS / original summary
arXiv:2605.23938v1
Abstract: Large language models (LLMs) increasingly fuse heterogeneous inputs in ubiquitous systems. Yet, how LLMs implicitly allocate authority when sensor measurements and user claims conflict remains unexamined, raising critical reliability concerns for deployments where physical sensing must retain priority. Unlike explicit traditional fusion, LLMs bury authority allocation within learned representations.
We discover this allocation is severely format-dependent: numerical sensor data fails to integrate into answer-relevant model directions, allowing natural-language claims to dominate the final decision, a phenomenon we term Authority Inversion.
To diagnose and mitigate this, we develop a geometric framework of context integration, introduce two computable audit metrics, specifically the Context Integration Ratio (CIR) and Authority Alignment Index (AAI), and propose Geometric Authority Calibration (GAC), an inference-time layer-level intervention to suppress misplaced user authority.
Evaluating four models (4B to 35B parameters, three architectures) across four datasets totaling 576 conflict instances reveals extreme inversion: on numerical tasks, models exhibit near-zero sensor trust (AAI = -0.805, Cohen's d = -2.14), unaffected by model capacity. Validating our geometric framework, theory-guided causal injection flips 80.2% of incorrect decisions (vs. <0.4% for random controls). Practically, GAC improves HAR accuracy from 0–1.6% to 21.9–27.5%, outperforming prompting baselines. Ultimately, authority allocation in LLM-mediated systems must be explicitly audited and application-specifically configured rather than left implicit.
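For readers unfamiliar with the effect size quoted above, Cohen's d is a standardized difference between two group means. The sketch below uses the common pooled-standard-deviation form. The abstract does not say which groups are compared or which variant of d is used, so the function name `cohens_d` and the toy scores are illustrative assumptions, not the paper's data or procedure.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (sample variances, n - 1)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Toy values only (hypothetical): scores under two input conditions.
condition_a = [1, 2, 3]
condition_b = [2, 3, 4]
print(cohens_d(condition_a, condition_b))  # -1.0
```

By the usual rule of thumb, magnitudes around 0.8 are already considered large, so a reported d of -2.14 indicates that the two distributions being compared barely overlap.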