Skills on the Fly: Test-Time Adaptive Skill Synthesis for LLM Agents

arXiv cs.CL·Jingxing Wang, Chenyu Zhou, Zhihui Fu, Jun Wang, Weiwen Liu, Weinan Zhang, Jianghao Lin

Quick Take

SkillTTA enables LLM agents to adaptively synthesize task-specific skills at test time.

Key Points

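Retrieves a small set of training trajectories relevant to the current task and synthesizes them into a temporary textual skill.
Outperforms static trajectory-to-skill synthesis on SpreadsheetBench and BigCodeBench (Pass@1).
Uses failed trajectories to improve skill generation, since they expose recurring evaluator-facing mistakes.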

[Submitted on 16 May 2026]

Abstract: LLM agents benefit from reusable skills, yet test-time tasks often require guidance more specific than a static skill library can provide. We propose SkillTTA, a Test-Time Adaptive Skill Synthesis method that retrieves a small set of training trajectories relevant to the current task and synthesizes them into a temporary, task-specific textual skill. The solver model is kept fixed, so adaptation happens entirely through generated context rather than parameter updates. We evaluate the method on SpreadsheetBench, ALFWorld, and BigCodeBench. Compared with static trajectory-to-skill synthesis using GPT-5.5, task-specific skills improve SpreadsheetBench Pass@1 from 0.397 to 0.505 and BigCodeBench Pass@1 from 0.517 to 0.651. On ALFWorld, the method matches a heavier memory-learning baseline within four points of success rate while producing the shortest successful trajectories among reported methods. Ablations on SpreadsheetBench further show that synthesized skills outperform raw trajectory prompting, that top-k retrieval should stay small, and that failed trajectories are especially useful because they expose recurring evaluator-facing mistakes.
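
The retrieve-then-synthesize loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which retriever or prompts are used, so the word-overlap (Jaccard) similarity, the prompt wording, and every name here (Trajectory, retrieve, build_synthesis_prompt, solve_with_temporary_skill) are assumptions made for the example. The synthesizer and solver are passed in as plain callables so that any model client could be plugged in.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    """Hypothetical record of one training-time attempt."""
    task: str      # task description
    steps: str     # textual log of what the agent did
    success: bool  # whether the evaluator accepted the result


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap; an assumed stand-in for the real retriever."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(task: str, pool: List[Trajectory], k: int = 3) -> List[Trajectory]:
    """Return the k trajectories whose task text is most similar to `task`."""
    ranked = sorted(pool, key=lambda t: similarity(task, t.task), reverse=True)
    return ranked[:k]


def build_synthesis_prompt(task: str, retrieved: List[Trajectory]) -> str:
    """Assemble the text a synthesizer model would turn into a skill."""
    parts = [f"Current task: {task}", "Relevant past attempts:"]
    for i, t in enumerate(retrieved, 1):
        outcome = "SUCCESS" if t.success else "FAILURE"
        # Failed attempts are kept on purpose so their mistakes can be surfaced.
        parts.append(f"[{i}] ({outcome}) {t.task}\n{t.steps}")
    parts.append("Write a short skill: steps to follow and mistakes to avoid.")
    return "\n\n".join(parts)


def solve_with_temporary_skill(
    task: str,
    pool: List[Trajectory],
    synthesizer: Callable[[str], str],
    solver: Callable[[str], str],
    k: int = 3,
) -> str:
    """Retrieve, synthesize a one-off skill, then call the unchanged solver."""
    retrieved = retrieve(task, pool, k)
    skill = synthesizer(build_synthesis_prompt(task, retrieved))
    # Adaptation happens only through this context; no parameters are updated.
    return solver(f"Skill:\n{skill}\n\nTask:\n{task}")
```

In this sketch the skill is rebuilt for every incoming task and then discarded, which mirrors the "temporary, task-specific" framing. Keeping k small follows the ablation finding reported in the abstract; the default of 3 is an arbitrary choice for the example.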

Comments:	10 pages, 4 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2605.16986 [cs.CL]
	(or arXiv:2605.16986v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2605.16986 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Jingxing Wang
[v1] Sat, 16 May 2026 13:14:15 UTC (446 KB)

Originally published at arxiv.org
