Knowing When to Ask: Self-Gated Clarification for Hierarchical Language Agents
Quick Answer
The study introduces ACTION-RATING, which places clarification inside the action space of hierarchical language agents. On Harmonized Tariff Schedule classification (a 30,000-node taxonomy, 9 LLMs across 4 families), Information-Seeking Effectiveness rises from 50% to 74%.
Quick Take
The results show a regime shift from mandatory clarification (no viable branch) to opportunistic clarification (residual uncertainty despite a leading candidate). Information-Seeking Effectiveness, the measure behind the 50% to 74% figure, is a local diagnostic of what happens right after a help interaction, not a final-task metric.
Key Points
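- ACTION-RATING rates clarification on the same ordinal scale as navigation, so asking competes directly with acting at every decision point.
- Information-Seeking Effectiveness rose from 50% to 74% on a 30,000-node Harmonized Tariff Schedule taxonomy.
- Two information-seeking modes were identified: mandatory and opportunistic.
- Under the controlled answer channel, accuracy gains reached +16.2% at 10-digit, which the authors read as an upper bound rather than a deployment estimate.
- A separability test supports an empirical separation between where an agent seeks help and the quality of the help it receives.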
Article Content
From source RSS / original summary
arXiv:2606.11349v1 Announce Type: new
Abstract: In hierarchical reasoning, failures often originate at intermediate decision points where the agent commits to a wrong branch without recognizing that it lacks critical information.
Rather than treating clarification as an external uncertainty trigger, we propose ACTION-RATING, a formulation that places it inside the agent's action space on a shared ordinal scale with navigation, so that asking competes directly with acting at every decision point and help-seeking becomes observable at intermediate states. Two structurally distinct information-seeking modes emerge from the agent's own ratings: mandatory (no viable branch) and opportunistic (residual uncertainty despite a leading candidate).
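The idea of asking competing with acting on one scale can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 1-5 scale, the `VIABLE` cut-off, the strict tie-break, and the names `ASK`, `Decision`, and `decide` are all assumptions made for illustration, and the abstract does not specify how the two modes are derived from the ratings.

```python
from dataclasses import dataclass
from typing import Dict, Optional

ASK = "ASK"  # hypothetical label for the clarification action
VIABLE = 3   # assumed viability cut-off on a 1-5 scale; not from the paper


@dataclass
class Decision:
    action: str          # chosen branch label, or ASK
    mode: Optional[str]  # "mandatory", "opportunistic", or None when navigating


def decide(ratings: Dict[str, int]) -> Decision:
    """Pick the top-rated action; ASK is rated on the same scale as branches."""
    branches = {a: r for a, r in ratings.items() if a != ASK}
    if not branches:
        # No candidate branches at all: treated here as "no viable branch".
        return Decision(ASK, "mandatory")

    best_branch = max(branches, key=branches.get)
    best_rating = branches[best_branch]
    ask_rating = ratings.get(ASK, 0)

    # Assumed tie-break: asking wins only if it is rated strictly higher.
    if ask_rating <= best_rating:
        return Decision(best_branch, None)
    if best_rating < VIABLE:
        # Every branch is rated below the cut-off.
        return Decision(ASK, "mandatory")
    # A leading candidate exists, but asking is still rated higher.
    return Decision(ASK, "opportunistic")
```

With illustrative branch labels, `decide({"8471": 2, "8473": 1, "ASK": 5})` yields the mandatory mode, `decide({"8471": 4, "8473": 2, "ASK": 5})` yields the opportunistic mode, and `decide({"8471": 5, "8473": 2, "ASK": 3})` navigates to `"8471"` without asking.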
On Harmonized Tariff Schedule classification (30,000-node taxonomy, three benchmarks, 9 LLMs across 4 families), we observe a regime shift from mandatory to opportunistic clarification, with Information-Seeking Effectiveness (ISE), a local diagnostic defined as the fraction of help interactions followed by a correct next navigation step (not a final-task metric), rising from 50% to 74%. Three diagnostic contrasts fail to reproduce this structure.
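The ISE definition above reduces to a simple ratio, sketched below in Python. The function name `ise`, the boolean-per-interaction input format, and the choice to return `None` when there are no help interactions are assumptions for illustration; the abstract gives only the verbal definition.

```python
from typing import Iterable, Optional


def ise(next_step_correct: Iterable[bool]) -> Optional[float]:
    """Fraction of help interactions followed by a correct next navigation step.

    Each element stands for one help interaction and is True when the
    navigation step taken right after it was correct.
    """
    outcomes = list(next_step_correct)
    if not outcomes:
        return None  # assumed convention: ISE is undefined with no help interactions
    return sum(outcomes) / len(outcomes)
```

As a made-up example, 50 help interactions of which 37 are followed by a correct step give an ISE of 37/50 = 0.74. Because the ratio only looks at the step after each help interaction, it says nothing on its own about whether the final classification was correct.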
A separability test shows that the information-seeking pattern (mode split, ISE ranking) persists when answer quality is degraded (-18.8% accuracy), supporting an empirical separation between where an agent seeks help and the quality of the help it receives. Under the controlled answer channel, accuracy gains reach +16.2% at 10-digit; we read this as an upper bound on what better localization could unlock, not a deployment estimate.