Toward Reliable Design of LLM-Enabled Agentic Workflows: Optimizing Latency-Reliability-Cost Tradeoffs
Quick Take
This paper presents a framework for analyzing the tradeoffs among latency, reliability, and cost in LLM-enabled agentic workflows. It introduces performance models for both LLM and non-LLM agents, then studies the design of sequential workflows under latency and cost constraints. The main results are a water-filling token allocation policy and a characterization of optimal workflow reliability in terms of shadow prices.
Key Points
- Introduces performance models for LLM and non-LLM agents that relate computational effort to output quality.
- Derives a water-filling policy for token allocation.
- Characterizes optimal workflow reliability in terms of shadow prices.
- Focuses on sequential workflows under latency and cost constraints.
- Targets AI systems built from multiple interacting agents, some powered by LLMs and others by conventional computational modules.
Article Excerpt
arXiv:2605.23929v1. Abstract: Modern AI systems increasingly rely on workflows composed of multiple interacting agents, some powered by large language models (LLMs) and others by conventional computational modules. This paper analyzes the fundamental tradeoffs between latency, reliability, and cost in LLM-enabled agentic workflows.
We introduce performance models for both LLM and non-LLM agents that capture the relationship between computational effort and output quality, incorporating the impact of reasoning and output tokens for LLM agents using a parametric exponential reliability function. Then, we study the design of sequential workflows under latency and cost constraints. Main results include a water-filling token allocation policy and characterizations of optimal workflow reliability in terms of shadow prices.
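To make the water-filling idea concrete, here is a minimal sketch under assumptions that are not taken from the paper. Each agent i is given a hypothetical exponential reliability r_i(t_i) = 1 - a_i * exp(-b_i * t_i), the sequential workflow succeeds only if every agent succeeds, and every token has the same unit cost. The function name `allocate_tokens` and the parameters `a`, `b`, and `budget` are illustrative only. Under these assumptions the optimal allocation has a water-filling-style form with a single multiplier `lam` on the token budget, which plays the role of a shadow price and can be found by bisection.

```python
import math


def allocate_tokens(a, b, budget, iters=200):
    """Split a token budget across sequential agents (illustrative model only).

    Assumed reliability of agent i with t_i tokens:
        r_i = 1 - a[i] * exp(-b[i] * t_i),  with 0 < a[i] <= 1 and b[i] > 0.
    Maximizes sum(log r_i) subject to sum(t_i) <= budget and t_i >= 0.
    Returns (tokens, lam), where lam is the multiplier on the budget constraint.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")

    def tokens_at(lam):
        # Stationarity gives b*x / (1 - x) = lam with x = a*exp(-b*t);
        # solving for t and clipping at zero yields the allocation below.
        return [max(0.0, math.log(ai * (1.0 + bi / lam)) / bi)
                for ai, bi in zip(a, b)]

    # Total tokens decrease as lam grows, so bracket the root and bisect.
    lo, hi = 1e-12, 1.0
    while sum(tokens_at(hi)) > budget:
        hi *= 2.0
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint: lam spans many decades
        if sum(tokens_at(mid)) > budget:
            lo = mid
        else:
            hi = mid
    return tokens_at(hi), hi


def workflow_reliability(a, b, tokens):
    """Probability that every agent succeeds under the assumed model."""
    return math.prod(1.0 - ai * math.exp(-bi * ti)
                     for ai, bi, ti in zip(a, b, tokens))


if __name__ == "__main__":
    a, b = [0.9, 0.9], [0.01, 0.02]  # made-up parameters
    tokens, lam = allocate_tokens(a, b, budget=500)
    print(tokens, lam, workflow_reliability(a, b, tokens))
```

In this sketch an agent receives zero tokens when its marginal gain at t_i = 0 is already below `lam`, which is the clipping behavior that gives water-filling its name. If the budget is large enough that extra tokens add almost nothing, the returned allocation may leave part of the budget unused.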