Workflow-to-Skill: Skill Creation via Routing-Workflow-Semantics-Attachments Decomposition
Quick Answer
The paper introduces RWSA, a workflow-oriented intermediate representation for automatic Skill construction in large language model agents, and W2S, a framework built on it that improves behavioral replay consistency by 10.5% over summarization- and prompting-based baselines.
Quick Take
W2S segments traces, reconciles branches, and compresses redundancy while preserving evidence and confidence annotations. It targets the core difficulty of interaction traces: they are fragmented and redundant, and they may miss rare but safety-critical behaviors.
Key Points
- RWSA decomposes Skills into Workflow structure, execution Semantics, and runtime Attachments.
- The W2S framework improves behavioral replay consistency by 10.5% over summarization- and prompting-based baselines.
- The approach treats traces as executable runtime specifications rather than compressible text.
- The evaluation covers 70 Skills.
Article Content
From source RSS / original summary
arXiv:2606.06893v1. Announce Type: new.
Abstract: Large language model agents increasingly rely on Skills to encode procedural knowledge, yet high-quality Skills remain costly to hand-write. This paper studies automatic Skill construction from heterogeneous interaction evidence, including demonstrations, agent trajectories, tool traces, and execution logs.
We argue that trace-to-skill construction is not a simple summarization task, because traces are fragmented, redundant, and may miss rare but safety-critical behaviors. To address this, we introduce RWSA, a workflow-oriented intermediate representation that decomposes Skills into Workflow structure, execution Semantics, and runtime Attachments, capturing task decomposition, control flow, verification, safety, rollback, and state management.
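To make the three-part decomposition concrete, here is a minimal Python sketch of what such a representation might look like. It is not the paper's schema: every class and field name is hypothetical, and the assignment of the six listed concerns to the three components (task decomposition and control flow to Workflow, verification and safety to Semantics, rollback and state management to Attachments) is an assumption, since the abstract does not spell out that mapping. The title also names Routing, which the abstract does not describe, so it is left out here.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Task decomposition and control flow (assumed mapping)."""
    steps: list[str]
    # step name -> names of the steps that may follow it
    transitions: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class Semantics:
    """Per-step execution rules (assumed mapping)."""
    verification: dict[str, str] = field(default_factory=dict)
    safety: dict[str, str] = field(default_factory=dict)


@dataclass
class Attachments:
    """Runtime support data (assumed mapping)."""
    rollback: dict[str, str] = field(default_factory=dict)
    state_keys: list[str] = field(default_factory=list)


@dataclass
class SkillDraft:
    """Hypothetical container tying the three components together."""
    name: str
    workflow: Workflow
    semantics: Semantics = field(default_factory=Semantics)
    attachments: Attachments = field(default_factory=Attachments)

    def unknown_step_references(self) -> set[str]:
        """Return step names used anywhere but not declared in workflow.steps."""
        declared = set(self.workflow.steps)
        referenced = set(self.workflow.transitions)
        for targets in self.workflow.transitions.values():
            referenced.update(targets)
        referenced.update(self.semantics.verification)
        referenced.update(self.semantics.safety)
        referenced.update(self.attachments.rollback)
        return referenced - declared


# Illustrative data only; the task and its steps are invented.
draft = SkillDraft(
    name="deploy_service",
    workflow=Workflow(
        steps=["build", "deploy", "smoke_test"],
        transitions={"build": ["deploy"], "deploy": ["smoke_test"]},
    ),
    semantics=Semantics(verification={"smoke_test": "HTTP 200 from /health"}),
    attachments=Attachments(rollback={"deploy": "redeploy previous image"}),
)
```

Keeping verification and rollback rules as separate fields, rather than folding them into a prose description, is what lets a simple check such as `unknown_step_references` catch a rule that points at a step the workflow never declares.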
Building on RWSA, we propose W2S, a framework that segments traces, induces local Skill drafts, aligns shared structures, reconciles branches, and compresses redundancy while preserving evidence and confidence annotations. Experiments on 70 Skills show that W2S improves behavioral replay consistency by 10.5% over summarization- and prompting-based baselines, highlighting the need to treat traces as executable runtime specifications rather than compressible text.
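The five stages named above can be illustrated with a toy Python pipeline. This is a sketch under strong assumptions, not the W2S algorithm: traces are reduced to flat lists of action names, the `END` delimiter is hypothetical, alignment is simplified to a longest common prefix, and confidence is simply the fraction of episodes that support a branch. All function names are invented for illustration.

```python
from collections import defaultdict

BOUNDARY = "END"  # hypothetical episode delimiter in the raw log


def segment(log: list[str]) -> list[list[str]]:
    """Split a flat event log into episodes at BOUNDARY markers."""
    episodes, current = [], []
    for event in log:
        if event == BOUNDARY:
            if current:
                episodes.append(current)
            current = []
        else:
            current.append(event)
    if current:
        episodes.append(current)
    return episodes


def induce(episode: list[str]) -> list[str]:
    """Local draft: collapse immediate repeats such as retries."""
    draft = []
    for event in episode:
        if not draft or draft[-1] != event:
            draft.append(event)
    return draft


def align(drafts: list[list[str]]) -> list[str]:
    """Shared structure: the longest prefix common to every draft."""
    prefix = []
    for column in zip(*drafts):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def reconcile_and_compress(drafts: list[list[str]], prefix: list[str]) -> list[dict]:
    """Merge identical branches, keeping episode ids as evidence."""
    evidence = defaultdict(list)
    for episode_id, draft in enumerate(drafts):
        evidence[tuple(draft[len(prefix):])].append(episode_id)
    return [
        {"steps": list(branch), "evidence": ids, "confidence": len(ids) / len(drafts)}
        for branch, ids in evidence.items()
    ]


def build_skill(log: list[str]) -> dict:
    """Run the toy pipeline end to end."""
    drafts = [induce(episode) for episode in segment(log)]
    prefix = align(drafts)
    return {"shared": prefix, "branches": reconcile_and_compress(drafts, prefix)}


# Invented log: two ordinary purchases and one rare abort-with-rollback episode.
log = [
    "login", "search", "search", "buy", "END",
    "login", "search", "buy", "END",
    "login", "search", "abort", "rollback", "END",
]
skill = build_skill(log)
```

In this toy run the rare `abort`/`rollback` branch survives compression with its own evidence list and a low confidence score instead of being averaged away, which is the property the abstract contrasts with plain summarization.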