OpenSkill: Open-World Self-Evolution for LLM Agents | AI Deep Signal

OpenSkill: Open-World Self-Evolution for LLM Agents

arXiv cs.AI·Zhiling Yan, Dingjie Song, Hanrong Zhang, Wei Liang, Yuxuan Zhang, Yutong Dai, Lifang He, Philip S. Yu, Ran Xu, Xiang Li, Lichao Sun

6/8/2026

·~1 min·6/8/2026·en·3

Quick Answer

OpenSkill introduces a self-evolving framework for LLM agents that builds skills and verification signals from open-world resources without supervision.

Quick Take

It achieved the highest automated pass rate across three benchmarks while ensuring skills transfer across models and aligning self-built verifiers with ground-truth outcomes.

Key Points

OpenSkill operates without target-task supervision, relying solely on open-world resources.
The framework synthesizes knowledge from documentation and the web into transferable skills.
It achieved the best automated pass rate across three benchmarks with two target agents.
Skills transfer across models without requiring model-specific adaptations.
Self-built verifiers align with ground-truth outcomes despite no prior access.

Paper Resources

Read Paperarxiv.org View PDFarxiv.org

Source Excerpt

arXiv:2606. 06741v1 Announce Type: new Abstract: Self-evolving agents requires adaptation after deployment, but existing approaches assume a usable learning loop, such as curated skills, successful trajectories, or verifier signals. Real open-world deployments may provide none of these, offering only a task prompt. In this work, we study open-world self-evolution, where an agent must build both its skills and its own verification signals from scratch, using open-world resources but no target-task supervision. …

Read on arxiv.org

Want this in your inbox every morning?

Daily brief at your local 8am — bilingual EN/中文, free.

Subscribe — it's free

More from arXiv cs.AI

See more →

arXiv cs.AI·Sumit Verma, Pritam Prasun, Pritish Kumar

2d ago

FeaturedOriginal

RAIL Guard: Closing the Evaluation-to-Remediation Gap in Responsible AI for Agents

AI Summary

RAIL Guard introduces a closed-loop AI pipeline for large language models (LLMs) that evaluates outputs across eight dimensions and iteratively remediates failures, achieving 96.9% convergence compared to 49.1% for traditional block-and-retry methods. The system reduces unsafe agent executions by 33% without impacting task completion and is available as open-source SDKs.

#LLM #Agent #Open Source #Policy

OpenSkill: Open-World Self-Evolution for LLM Agents

Quick Answer

Quick Take

Key Points

Paper Resources

Source Excerpt

Want this in your inbox every morning?

More from arXiv cs.AI

RAIL Guard: Closing the Evaluation-to-Remediation Gap in Responsible AI for Agents

Automatic Ordinary Differential Equations Discovery For Biological Systems Using Powered Agentic System

The Emerging Paradigm of Geospatial Foundation Models: From Pre-Training to Agentic Reasoning

Quick Answer

Quick Take

Key Points

Paper Resources

Source Excerpt

Want this in your inbox every morning?

More from arXiv cs.AI

RAIL Guard: Closing the Evaluation-to-Remediation Gap in Responsible AI for LLM Agents

Automatic Ordinary Differential Equations Discovery For Biological Systems Using Large Language Model Powered Agentic System

The Emerging Paradigm of Geospatial Foundation Models: From Pre-Training to Agentic Reasoning

RAIL Guard: Closing the Evaluation-to-Remediation Gap in Responsible AI for Agents

Automatic Ordinary Differential Equations Discovery For Biological Systems Using Powered Agentic System