Declarative Skills for AI Agents in Knowledge-Grounded Tool-Use Workflows
Quick Answer
This study compares declarative AI agents with imperative ones in customer-service workflows. Under high-quality retrieval, declarative skills improve accuracy on procedural tasks and reduce orchestration errors, while the imperative state machine does not reliably improve task success or compliance.
Quick Take
The analysis formalizes three agent types as policy classes within a decentralized partially-observable Markov decision process and tests them on five language models and two retrieval regimes. Retrieval quality emerges as a dominant bottleneck for every agent type.
Key Points
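- Declarative agents are equipped with natural-language skill files appended to the system prompt and decide their own control flow.
- Retrieval quality is a dominant bottleneck: when evidence is incomplete or skewed, all agent types degrade substantially, and skill files cannot recover the lost performance.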
- Under high-quality retrieval, declarative skills enhance procedural task accuracy.
- The imperative state machine is brittle and does not reliably improve task success or compliance.
- The study formalizes agents within a decentralized partially-observable Markov decision process.
Article Content
From source RSS / original summary: arXiv:2606.06923v1. Abstract: We study orchestration mechanisms for tool-using AI agents in realistic customer-service workflows over an unstructured knowledge base. We argue that declarative agents -- AI agents equipped with natural-language skill files appended to the system prompt -- are an effective orchestration paradigm.
Concretely, we compare (i) a DeclarativeAgent that reads three domain-specific skill files at inference time and decides its own control flow, (ii) an ImperativeAgent based on a programmatic state machine with explicit phases, and (iii) an unscaffolded baseline agent modeled after the $\tau$-Knowledge benchmark agent. Our ImperativeAgent is motivated by externalised-control inference as in Recursive Language Models and graph-based orchestration frameworks.
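To make the contrast between the two orchestration styles concrete, here is a minimal sketch. It is not the paper's implementation: the skill texts, the phase names, and the helper functions `build_declarative_prompt` and `run_imperative` are all hypothetical, chosen only to illustrate "skill files appended to the system prompt" versus "a programmatic state machine with explicit phases".

```python
from enum import Enum, auto

# Hypothetical skill texts; the abstract does not show the actual skill files.
SKILLS = {
    "retrieval": "Search the knowledge base before answering policy questions.",
    "tool_use": "Confirm required arguments with the user before calling a tool.",
    "escalation": "Hand off to a human when policy does not cover the request.",
}


def build_declarative_prompt(base_prompt: str, skills: dict[str, str]) -> str:
    """Declarative style: append natural-language skills to the system prompt.

    The model reads the skills at inference time and chooses its own control flow.
    """
    sections = [f"Skill: {name}\n{text}" for name, text in skills.items()]
    return "\n\n".join([base_prompt, *sections])


class Phase(Enum):
    # Hypothetical phases; the abstract only says the phases are explicit.
    RETRIEVE = auto()
    PLAN = auto()
    ACT = auto()
    DONE = auto()


# Imperative style: a fixed transition table, so code decides the control flow.
NEXT_PHASE = {
    Phase.RETRIEVE: Phase.PLAN,
    Phase.PLAN: Phase.ACT,
    Phase.ACT: Phase.DONE,
}


def run_imperative(max_steps: int = 10) -> list[Phase]:
    """Walk the state machine and return the sequence of phases visited."""
    phase = Phase.RETRIEVE
    trace: list[Phase] = []
    for _ in range(max_steps):
        trace.append(phase)
        if phase is Phase.DONE:
            break
        phase = NEXT_PHASE[phase]
    return trace


prompt = build_declarative_prompt("You are a customer-service agent.", SKILLS)
trace = run_imperative()
```

In the declarative case the only artifact is a longer prompt; in the imperative case the order of phases is hard-coded, which is one plausible source of the rigidity the abstract describes.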
We formalise the three agents as policy classes within a decentralised partially-observable Markov decision process and analyse their information-theoretic and structural properties; we then test the predicted differences empirically on five language models and two retrieval regimes. Our results show that retrieval quality is a dominant bottleneck for AI agents: when evidence is incomplete or skewed, all agents degrade substantially, and skill files cannot recover lost performance.
Under high-quality retrieval, however, declarative skills consistently improve accuracy on procedural tasks and reduce orchestration errors, while the imperative state machine proves brittle and does not reliably improve task success or compliance.
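For readers unfamiliar with the formalism named in the abstract, a decentralised partially-observable Markov decision process is conventionally written as the tuple below. This is the standard textbook definition, given here as background; the paper's own notation and the exact way it maps the three agents onto policy classes may differ.

```latex
\[
  \mathcal{M} = \bigl\langle \mathcal{I},\ \mathcal{S},\ \{\mathcal{A}_i\}_{i \in \mathcal{I}},\ T,\ R,\ \{\Omega_i\}_{i \in \mathcal{I}},\ O,\ \gamma \bigr\rangle
\]
% I        : finite set of agents
% S        : set of environment states
% A_i      : action set of agent i; a joint action is a = (a_1, ..., a_n)
% T        : transition model, T(s' | s, a)
% R        : shared reward, R(s, a)
% Omega_i  : observation set of agent i
% O        : observation model, O(o | s', a), with joint observation o = (o_1, ..., o_n)
% gamma    : discount factor in [0, 1)
\[
  \pi_i : (\Omega_i \times \mathcal{A}_i)^{*} \times \Omega_i \to \Delta(\mathcal{A}_i)
\]
```

Each policy $\pi_i$ maps an agent's local action-observation history to a distribution over its actions; no agent sees the true state directly, which is why the quality of what is observed (here, retrieved evidence) can bound what any policy achieves.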