Progressive Autonomy as Preference Learning: A Formalization of Trust Calibration for Agentic Tool Use

arXiv cs.AI·Changkun Ou

5/20/2026

Quick Take

The paper formalizes trust calibration in agentic tool use as a preference-learning problem.

Key Points

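Introduces a policy gateway that decides whether an agent's proposed action may execute autonomously or requires human approval.
Maintains a Gaussian-process posterior over a latent human risk-tolerance function, observed through a probit likelihood on binary approve/deny feedback.
Classifies the action space into allow/block/ask regions rather than optimizing a design.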

[Submitted on 18 May 2026]

Abstract: We formalize trust calibration for agentic tool use (deciding when an automated agent's proposed action may execute autonomously versus require human approval) as a preference-learning problem. A policy gateway maintains a Gaussian-process posterior over a latent human risk-tolerance function, observed through a probit likelihood on binary approve/deny feedback, and escalates to the human exactly where the approval outcome is most uncertain. We show this is structurally an instance of Preferential Bayesian Optimization, inheriting its inference machinery (approximate Gaussian-process classification) and its sample-efficiency argument (uncertainty-targeted querying), while differing in objective: classifying an action space into allow/block/ask regions rather than optimizing a design.
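
To make the allow/block/ask idea concrete, here is a minimal sketch of how a gateway might turn a posterior over the latent risk-tolerance value into a decision. It is an illustration under stated assumptions, not the paper's implementation: the posterior mean and variance at an action are taken as given (the Gaussian-process inference itself is omitted), and the function names and the `block_below` / `allow_above` thresholds are hypothetical. The one standard fact it relies on is that, for a probit likelihood and a Gaussian posterior on the latent value, the predictive approval probability has the closed form Phi(mean / sqrt(1 + variance)).

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF Phi(z), the probit response function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approval_probability(mean: float, variance: float) -> float:
    """Predictive P(approve) when the latent value f ~ N(mean, variance).

    Closed form for a probit likelihood:
        E[Phi(f)] = Phi(mean / sqrt(1 + variance))
    """
    if variance < 0.0:
        raise ValueError("variance must be non-negative")
    return std_normal_cdf(mean / math.sqrt(1.0 + variance))


def gate(mean: float, variance: float,
         block_below: float = 0.2, allow_above: float = 0.8) -> str:
    """Map a posterior at one action to 'allow', 'block', or 'ask'.

    The thresholds are hypothetical; the band between them is where the
    approval outcome is uncertain, so the human is asked.
    """
    p = approval_probability(mean, variance)
    if p >= allow_above:
        return "allow"
    if p <= block_below:
        return "block"
    return "ask"


if __name__ == "__main__":
    # (posterior mean, posterior variance) for four hypothetical actions
    for mean, variance in [(3.0, 0.1), (-3.0, 0.1), (0.0, 1.0), (3.0, 50.0)]:
        print(mean, variance, gate(mean, variance))
```

In this sketch the last action has a favorable mean but a very wide posterior, so its predictive probability is pulled toward 0.5 and it is escalated rather than allowed. Each approve/deny answer would then be fed back as a new observation, which is what is expected to shrink the ask region over time.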

Subjects:	Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2605.19151 [cs.AI]
	(or arXiv:2605.19151v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2605.19151 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Changkun Ou
[v1] Mon, 18 May 2026 22:11:15 UTC (128 KB)

— Originally published at arxiv.org