Inducing Reasoning Primitives from Agent Traces
Quick Take
Reasoning Primitive Induction improves ReAct-style LLM agents by mining successful reasoning traces, clustering recurrent reasoning moves, and converting the most frequent ones into a compact library of typed pseudo-tools. The induced libraries beat the agent that generated the traces by up to +44pp (RuleArena NBA), with further gains on MuSR and NatPlan, and they outperform AWM at lower average inference cost.
Key Points
- Induced libraries outperform the very agent that generated their traces.
- Gains are +44pp on RuleArena NBA (30 -> 74), +30pp on MuSR team allocation (38 -> 68), and +22pp on NatPlan meeting planning (7 -> 29).
- A single fixed configuration improves over zero-shot Chain-of-Thought on all five comparable subtasks.
- The method matches or surpasses expert-authored decompositions.
- It outperforms AWM at lower average inference cost.
Article Excerpt
From the source RSS summary (arXiv:2606.02994v1, announce type: new). Abstract: ReAct-style LLM agents often rediscover the same reasoning routines across problems, yet leave those routines trapped in transient scratchpads. We introduce Reasoning Primitive Induction, a single-pass method that mines successful ReAct traces, clusters recurrent reasoning moves, and converts the most frequent moves into a compact library of typed pseudo-tools.
Each pseudo-tool is specified by a natural-language docstring interpreted by an LLM at invocation time, and a standard ReAct loop composes these primitives at test time. The central result is that induced libraries outperform the very agent that generated their traces: by +44pp on RuleArena NBA (30 -> 74), +30pp on MuSR team allocation (38 -> 68), and +22pp on NatPlan meeting planning (7 -> 29).
Across five comparable subtasks spanning narrative deduction, rule application, and constraint-satisfaction planning, a single fixed configuration improves over zero-shot Chain-of-Thought on every subtask, matches or surpasses expert-authored decompositions, and outperforms AWM at lower average inference cost.
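The mine, cluster, and convert steps described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the trace format, the `move_key` normalizer (a crude stand-in for whatever clustering the paper uses), the `PseudoTool` class, the generated docstring wording, and the `llm` callable are all hypothetical names and assumptions chosen for illustration.

```python
import re
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PseudoTool:
    name: str
    docstring: str  # natural-language spec, read by an LLM at invocation time


def move_key(step: str) -> str:
    """Assumed stand-in for clustering: keep only lowercase letter runs,
    so 'Look up rule 7' and 'Look up rule 12' land in the same group."""
    return " ".join(re.findall(r"[a-z]+", step.lower()))


def induce_library(traces: list[dict], k: int = 2) -> list[PseudoTool]:
    """Single pass: mine successful traces, group moves, keep the k most frequent."""
    groups: dict[str, list[str]] = defaultdict(list)
    for trace in traces:
        if not trace["success"]:  # only successful traces are mined
            continue
        for step in trace["steps"]:
            groups[move_key(step)].append(step)
    counts = Counter({key: len(members) for key, members in groups.items()})
    return [
        PseudoTool(
            name=key.replace(" ", "_"),
            docstring=f"Apply the recurring move '{key}' to the given input.",
        )
        for key, _ in counts.most_common(k)
    ]


def invoke(tool: PseudoTool, argument: str, llm: Callable[[str], str]) -> str:
    """A pseudo-tool has no code body: the docstring is handed to an LLM."""
    prompt = f"Tool: {tool.name}\nSpec: {tool.docstring}\nInput: {argument}\nOutput:"
    return llm(prompt)


# Toy traces (invented for this sketch, not data from the paper).
traces = [
    {"success": True, "steps": ["Look up rule 7", "Check constraint 2", "Look up rule 12"]},
    {"success": True, "steps": ["Look up rule 3", "Check constraint 5", "Guess the answer"]},
    {"success": False, "steps": ["Guess the answer"] * 4},
]
library = induce_library(traces, k=2)
tool_names = [tool.name for tool in library]

# Placeholder LLM that just reports the prompt length; swap in a real model call.
reply = invoke(library[0], "salary cap exception", llm=lambda p: f"({len(p)} chars seen)")
```

In this toy run the failed trace is ignored, so its repeated "Guess the answer" steps do not inflate that move's count. At test time a standard ReAct loop would choose among the entries in `library` and call `invoke` for each selected primitive.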