When Should Agent Trust Be Conditional? Characterizing and Attacking Skill-Conditional Reputation in Agent Swarms
Quick Answer
This study introduces skill-conditional trust R(i | k) for heterogeneous LLM agents and finds that conditioning pays off only in a specific regime: high agent heterogeneity, sparse per-skill evidence, and correlated skills.
Quick Take
Skill-conditional trust R(i | k) scores an agent per skill rather than with one global number, and it borrows evidence across skills to stay data-efficient. That borrowing is also an attack surface: an attacker with cheap evidence in one skill and none in the target skill can hijack the conditional router, driving routing regret from 0 to 0.94 on a pool that the zero-cost Conditional Information Value Test (CIVT) rates GREEN. The contaminated ungated trust verdict reads -0.06 instead of the honest +0.19.
Key Points
- Conditional trust wins only when agents are highly heterogeneous, per-skill evidence is sparse, and skills are correlated.
- Attackers can hijack the conditional router, causing routing regret to spike.
- The study uses a benchmark of 14 heterogeneous AppWorld agents.
- Cross-skill evidence borrowing can lead to trust contamination.
- A zero-evidence gate limits but does not eliminate the attack risk.
Article Content
From source RSS / original summary: arXiv:2606.14200v1. Abstract: Open platforms increasingly route tasks among heterogeneous LLM agents--differing in base model, scaffold, and tool stack--whose competence varies sharply by skill: an agent excellent at one skill may be useless at another. The standard reputation approach summarizes each agent by a single global trust score, but that scalar is the wrong object here, because routing every task to the globally most-trusted agent leaves the value of specialization unclaimed.
We study skill-conditional trust R(i | k)--the trust to place in agent i for a task requiring skill k, rather than one score per agent--and pose three falsifiable questions: when is conditioning worth it, how much cross-skill evidence should be borrowed, and whether that borrowing is safe.
A controlled phase-diagram analysis answers the first two: conditional trust wins only in a specific regime--high agent heterogeneity, sparse per-skill evidence, and correlated skills--and the coupling strength beta that buys this data efficiency is dual-use, because the same cross-skill borrowing is also a laundering channel.
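The abstract does not give the estimator itself, so the following is only a minimal sketch of what cross-skill borrowing with a coupling strength beta could look like. It assumes a Beta-Bernoulli posterior mean in which evidence from other skills is added as pseudo-counts scaled by beta. The names `Evidence`, `conditional_trust`, `prior_a`, `prior_b`, and the skill labels are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Observed outcomes for one agent on one skill (hypothetical container)."""
    successes: float = 0.0
    trials: float = 0.0


def conditional_trust(evidence, skill, beta, prior_a=1.0, prior_b=1.0):
    """Posterior-mean trust for `skill` given one agent's per-skill evidence.

    Assumption (not from the paper): evidence on the requested skill counts
    fully, evidence on every other skill is discounted by `beta` in [0, 1].
    beta = 0 gives a purely per-skill score, beta = 1 pools everything into
    a single global score.
    """
    s = prior_a
    n = prior_a + prior_b
    for k, ev in evidence.items():
        weight = 1.0 if k == skill else beta
        s += weight * ev.successes
        n += weight * ev.trials
    return s / n


# One agent with plenty of evidence on one skill and little on another.
agent = {
    "skill_a": Evidence(successes=8, trials=10),
    "skill_b": Evidence(successes=1, trials=2),
}

no_borrowing = conditional_trust(agent, "skill_b", beta=0.0)
half_borrowing = conditional_trust(agent, "skill_b", beta=0.5)
full_pooling = conditional_trust(agent, "skill_b", beta=1.0)
```

In this toy setup, trust in `skill_b` rises from 0.5 with no borrowing to 6/9 at beta = 0.5 and 10/14 at beta = 1, even though no new `skill_b` evidence was observed. That is the data-efficiency benefit when skills are correlated, and the same mechanism is what makes beta dual-use.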
On a public benchmark of 14 genuinely heterogeneous AppWorld agents, real pools land inside the beneficial regime--a small but genuine gain, with the per-skill best agent genuinely changing across skills. We then show that an attacker with cheap evidence in one skill and none in a target skill hijacks the conditional router, driving routing regret from 0 to 0.94 on a pool our zero-cost Conditional Information Value Test (CIVT) rates GREEN--while the ungated trust verdict it contaminates reads -0.06 instead of the honest +0.19.
A zero-evidence gate bounds the attack but does not eliminate it; we characterize the residual cost under an explicit budget. We do not claim Sybil-resistance--we quantify the trade-off.
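To make the attack and the gate concrete, here is a toy sketch under stated assumptions. It uses a hypothetical pseudo-count estimator (other skills' evidence discounted by beta), defines regret as the best true success rate on the skill minus the chosen agent's true rate, and models the zero-evidence gate as "no borrowing for an agent with no evidence on the requested skill". None of the function names, pools, or numbers below come from the paper; in particular, the toy regret of 0.9 is not the reported 0.94.

```python
def trust(counts, skill, beta, gate=False, prior=(1.0, 1.0)):
    """Trust in one agent for `skill`. counts maps skill -> (successes, trials)."""
    a, b = prior
    own_s, own_n = counts.get(skill, (0, 0))
    s, n = a + own_s, a + b + own_n
    if gate and own_n == 0:
        # Zero-evidence gate (assumed form): without any evidence on this
        # skill, the agent gets the prior only and cannot borrow.
        return s / n
    for k, (ks, kn) in counts.items():
        if k != skill:
            s += beta * ks
            n += beta * kn
    return s / n


def route(pool, skill, beta, gate=False):
    """Send the task to the agent with the highest trust for `skill`."""
    return max(pool, key=lambda name: trust(pool[name], skill, beta, gate))


def regret(true_rate, chosen, skill):
    """Best achievable true success rate minus the chosen agent's true rate."""
    best = max(rates[skill] for rates in true_rate.values())
    return best - true_rate[chosen][skill]


# Honest specialist with real target-skill evidence vs. an attacker who
# farmed cheap evidence elsewhere and has none on the target skill.
pool = {
    "honest": {"target": (9, 10)},
    "attacker": {"cheap": (200, 200)},
}
true_rate = {"honest": {"target": 0.9}, "attacker": {"target": 0.0}}

ungated_choice = route(pool, "target", beta=0.5)
gated_choice = route(pool, "target", beta=0.5, gate=True)

# Residual risk: one purchased target-skill success gets past the gate.
seeded_pool = {
    "honest": {"target": (9, 10)},
    "attacker": {"cheap": (200, 200), "target": (1, 1)},
}
seeded_choice = route(seeded_pool, "target", beta=0.5, gate=True)
```

In this sketch the ungated router picks the attacker (regret 0.9), the gated router picks the honest agent (regret 0), and the gated router is fooled again once the attacker holds a single target-skill success. The gate therefore raises the attacker's cost rather than removing the channel, which is the sense in which the attack is bounded but not eliminated.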
