Online Skill Learning for Web Agents via State-Grounded Dynamic Retrieval
Quick Take
State-Grounded Dynamic Retrieval (SGDR) is an online skill learning method for web agents that retrieves skills step by step, based on the current webpage state, rather than once per task. On WebArena, SGDR reaches average success rates of 37.5% with GPT-4.1 and 24.3% with Qwen3-4B, relative gains of 10.6% and 10.0% over the strongest baseline, respectively.
Key Points
- SGDR enables stepwise skill reuse tailored to current webpage states.
- It utilizes a sliding-window extraction process for reusable sub-procedures.
- The method connects skill retrieval with executable actions via dual text-code representation.
- SGDR outperforms strong baselines in five domains on WebArena.
- Code for SGDR is publicly available on GitHub.
Article Content
From source RSS / original summary
arXiv:2606.04391v1 Announce Type: new
Abstract: Language agents increasingly rely on reusable skills to improve multi-step web automation across related tasks. A growing line of work studies online skill learning, where agents continually induce skills from previous task trajectories and reuse them in future tasks on the fly. However, existing methods mainly reuse skills at the task level: a fixed set of skills is retrieved based on the initial task instruction and then held fixed throughout execution.
This static strategy is misaligned with web execution, where the appropriate next action depends not only on the task goal but also on the current webpage state, which often transitions into situations that the initial skills fail to cover. To address this gap, we propose State-Grounded Dynamic Retrieval (SGDR), an online skill learning method that enables stepwise skill reuse for web agents.
SGDR consists of three components: a sliding-window extraction process that turns completed trajectories into reusable sub-procedures invokable at intermediate execution states, a dual text-code representation that connects skill retrieval with executable action, and a state-grounded dynamic retrieval mechanism that matches skills to both the task goal and the current webpage state.
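The three components above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Skill` class, the helper names, the fixed window size, the token-overlap scoring, and the `click`/`select` action strings are all hypothetical stand-ins chosen for illustration. It only shows the general shape: windows over a finished trajectory become skills, each skill carries a text side (for retrieval) and a code side (for execution), and retrieval is re-run at each step against both the goal and the current page state.

```python
import re
from dataclasses import dataclass


@dataclass
class Skill:
    description: str  # text side: what retrieval matches against (assumed format)
    code: str         # code side: executable action sequence (hypothetical actions)


def extract_skills(trajectory, window=2):
    """Slide a fixed-size window over a completed trajectory.

    Each window of consecutive steps becomes one candidate sub-procedure.
    """
    skills = []
    for i in range(len(trajectory) - window + 1):
        steps = trajectory[i:i + window]
        skills.append(Skill(
            description="; ".join(s["note"] for s in steps),
            code="\n".join(s["action"] for s in steps),
        ))
    return skills


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(skill, goal, state):
    """Assumed scoring: overlap with the goal plus overlap with the page state."""
    desc = tokens(skill.description)
    return jaccard(desc, tokens(goal)) + jaccard(desc, tokens(state))


def retrieve(skills, goal, state, k=1):
    """Return the top-k skills for the current step, not just the initial task."""
    return sorted(skills, key=lambda s: score(s, goal, state), reverse=True)[:k]


# Hypothetical completed trajectory; notes and actions are made up.
trajectory = [
    {"note": "open the orders page", "action": "click('Orders')"},
    {"note": "filter orders by status pending", "action": "select('Status', 'Pending')"},
    {"note": "export filtered orders as csv", "action": "click('Export CSV')"},
]
skills = extract_skills(trajectory, window=2)

goal = "export pending orders"
state_a = "home page with open orders link"
state_b = "orders page showing filter by status and export csv button"

# Same goal, different page states: retrieval is repeated at every step.
for state in (state_a, state_b):
    best = retrieve(skills, goal, state)[0]
    print(state, "->", best.description)
```

With the same goal, the top-ranked skill changes as the page state changes, which is the behavior a task-level retriever that only sees the initial instruction cannot provide. A real system would likely replace the token overlap with a learned or embedding-based matcher.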
Experiments on WebArena across five domains show that SGDR consistently outperforms strong baselines, achieving average success rates of 37.5% with GPT-4.1 and 24.3% with Qwen3-4B, corresponding to relative gains of 10.6% and 10.0% over the strongest baseline, respectively. The code is available at https://github.com/plusnli/skill-dynamic-retrieval.
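Because the reported gains are relative rather than absolute, a quick back-of-the-envelope check can help when reading them. The sketch below assumes the usual definition, relative gain = (new - baseline) / baseline, and backs out the baseline success rate that the reported numbers would imply. The resulting values are derived estimates under that assumption, not figures stated in the source, and `implied_baseline` is a hypothetical helper name.

```python
def implied_baseline(success_rate, relative_gain_pct):
    """Baseline rate implied by a success rate and a relative gain (both in %).

    Assumes relative gain = (new - baseline) / baseline.
    """
    return success_rate / (1 + relative_gain_pct / 100)


# Reported averages and relative gains from the abstract.
gpt_baseline = implied_baseline(37.5, 10.6)
qwen_baseline = implied_baseline(24.3, 10.0)

print(round(gpt_baseline, 1), round(qwen_baseline, 1))
print(round(37.5 - gpt_baseline, 1), round(24.3 - qwen_baseline, 1))
```

Under this assumption the implied baselines are about 33.9% and 22.1%, so the absolute improvements are a few percentage points rather than ten.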