SkillSmith: Compiling Agent Skills into Boundary-Guided Runtime Interfaces

arXiv cs.AI·Duling Xu, Zheng Chen, Zaifeng Pan, Jiawei Guan, Dong Dong, Jialin Li, Bangzheng Pu

5/18/2026


Quick Answer

SkillSmith is a boundary-first compiler-runtime framework for LLM-based agent systems. On the SkillsBench benchmark, it reduces solve-stage token usage by 57.44% and solve time by 50.57% compared with using raw skills.

Quick Take

SkillSmith compiles skill packages offline into minimal executable interfaces, so that at runtime an agent accesses and executes only the components relevant to the task. This cuts irrelevant context injection and repeated skill-specific reasoning and planning. Artifacts compiled by a stronger model can also be reused by a smaller or more efficient runtime model, which improves task accuracy in cases where raw skill interpretation fails.

Key Points

SkillSmith reduces solve-stage token usage by 57.44% compared to raw skills.
Achieves a 50.57% reduction in solve time, making it 2.02x faster.
Minimizes irrelevant context injection and redundant reasoning overhead.
Compiled artifacts can be reused by smaller or more efficient runtime models.
Evaluated on the SkillsBench benchmark, where it also reduces thinking iterations by 42.99%.
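
The relationship between a percentage reduction and a speedup factor can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the only inputs are the percentages reported in the abstract. A reduction of r percent leaves (1 - r/100) of the original, so the factor is the reciprocal of that fraction.

```python
def factor_from_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction into a multiplicative factor.

    Hypothetical helper for illustration; not part of SkillSmith.
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in [0, 100)")
    remaining = 1.0 - reduction_pct / 100.0  # fraction of the original that is left
    return 1.0 / remaining


# Percentages as reported for SkillsBench in the abstract.
solve_time_factor = factor_from_reduction(50.57)
token_factor = factor_from_reduction(57.44)
iteration_factor = factor_from_reduction(42.99)

print(f"solve time: {solve_time_factor:.2f}x faster")
print(f"tokens: {token_factor:.2f}x fewer")
print(f"thinking iterations: {iteration_factor:.2f}x fewer")
```

The first line reproduces the 2.02x figure quoted alongside the 50.57% solve-time reduction. Because the monetary cost is described as token-proportional, its reduction matches the token reduction by construction.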


[Submitted on 12 May 2026]


Abstract: Recently, skills have been widely adopted in large language model (LLM)-based agent systems across various domains. In existing frameworks, skills are typically injected into the agent reasoning loop as contextual guidance once matched to a runtime task, enabling specialized task-solving capabilities. We find that this execution paradigm introduces two major sources of redundancy: irrelevant context injection and repeated skill-specific reasoning and planning. To this end, we propose SkillSmith, a boundary-first compiler-runtime framework that compiles skill packages offline into minimal executable interfaces. By extracting fine-grained operational boundaries from skills, SkillSmith enables agents to dynamically access and execute only the relevant components at runtime, thereby minimizing unnecessary context injection and redundant reasoning overhead. In the evaluation on the SkillsBench benchmark, SkillSmith reduces solve-stage token usage by 57.44%, thinking iterations by 42.99%, solve time by 50.57% (2.02x faster), and token-proportional monetary cost by 57.44% compared with using raw skills. Moreover, compiled artifacts produced by a stronger model can be reused by a smaller or more efficient runtime model, improving task accuracy in cases where raw skill interpretation fails. The source code and data are available at this https URL.
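
To make the compile-then-dispatch idea concrete, here is a toy sketch of splitting a skill into separately addressable operations offline and loading only the matching ones at runtime. Everything in it is an assumption made for illustration: the `## name: keywords` skill format, the `Operation` type, the keyword matching, and the whitespace token count are hypothetical and are not taken from the SkillSmith source code, which may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """One separately addressable piece of a skill (hypothetical type)."""
    name: str
    keywords: frozenset
    instructions: str


def compile_skill(raw_skill: str) -> list:
    """Offline step: split a raw skill into operations.

    Assumes a made-up format where each operation starts with a header
    line '## name: keyword, keyword' followed by its instructions.
    """
    operations = []
    for block in raw_skill.split("## ")[1:]:
        header, _, body = block.partition("\n")
        name, _, keyword_text = header.partition(":")
        keywords = frozenset(
            k.strip().lower() for k in keyword_text.split(",") if k.strip()
        )
        operations.append(Operation(name.strip(), keywords, body.strip()))
    return operations


def select_operations(operations: list, task: str) -> list:
    """Runtime step: keep only operations whose keywords appear in the task."""
    task_words = set(task.lower().split())
    return [op for op in operations if op.keywords & task_words]


def count_tokens(text: str) -> int:
    """Crude whitespace proxy for a tokenizer (an assumption, not a real one)."""
    return len(text.split())


RAW_SKILL = """## merge_pdfs: merge, combine
Load each input file and append its pages to one writer.
## extract_text: extract, text
Open the file and read the text layer of every page.
## rotate_pages: rotate
Apply the requested rotation to the selected pages.
"""

operations = compile_skill(RAW_SKILL)
selected = select_operations(operations, "extract text from report.pdf")

raw_tokens = count_tokens(RAW_SKILL)
selected_tokens = sum(count_tokens(op.instructions) for op in selected)
print([op.name for op in selected], raw_tokens, selected_tokens)
```

In this toy setup the agent's context receives one operation instead of the whole skill, which is the kind of saving the abstract attributes to avoiding irrelevant context injection. The real system extracts operational boundaries with a model rather than with keyword headers.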

Subjects:	Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
Cite as:	arXiv:2605.15215 [cs.AI]
	(or arXiv:2605.15215v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2605.15215 arXiv-issued DOI via DataCite

Submission history

From: Zaifeng Pan
[v1] Tue, 12 May 2026 09:25:25 UTC (464 KB)

— Originally published at arxiv.org
