APEX: Adaptive Principle EXtraction -- A Three-Layer Self-Evolution Framework for Production AI Agents
Quick Answer
APEX introduces a three-layer co-evolution framework for AI agents, achieving a 90% improvement in health score over baseline.
Quick Take
Implemented on Joe, a production-grade super AI agent, APEX distills six novel reusable principles and selects a research-first workflow topology in a single evolutionary run. The authors report that this multi-dimensional co-evolution substantially outperforms traditional single-axis harness optimization.
Key Points
- APEX achieves an APEX Health Score of 0.570, up from a baseline of 0.300 (+90%).
- Implemented on Joe, an agent built on NVIDIA Nemotron and designed for the NVIDIA Agent Challenge 2026.
- Distilled six novel reusable principles during a single evolutionary run.
- Selected a research-first workflow topology scoring 0.900, a 20% improvement.
- The evolutionary run cost only 4 LLM calls (approximately 270 seconds) on a local qwen2.5-coder:32b instance.
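The percentage figures above follow from simple relative-change arithmetic. The sketch below checks the reported +90% and, assuming the +20% topology gain is also a relative improvement (the summary does not say so explicitly), back-computes the baseline that gain would imply. The helper names `relative_improvement` and `implied_baseline` are illustrative and do not come from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative change of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


def implied_baseline(new: float, improvement: float) -> float:
    """Baseline implied by a score and its relative improvement (assumption)."""
    return new / (1.0 + improvement)


# Health score: 0.300 -> 0.570, as reported in the summary.
health_gain = relative_improvement(0.300, 0.570)

# Topology score: 0.900 at +20%. The baseline is NOT stated in the source;
# this value is only what a relative +20% would imply.
topology_baseline = implied_baseline(0.900, 0.20)

print(f"health score gain: {health_gain:.0%}")
print(f"implied topology baseline: {topology_baseline:.3f}")
```

If the +20% were instead an absolute difference in score, the implied baseline would differ, so the second figure should be read as a consistency check rather than a reported result.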
Article Content
From source RSS / original summary: arXiv:2606.15363v1. Abstract: Self-improvement in AI agents has emerged as a key research frontier: systems that modify their own prompts, workflows, and decision rules based on accumulated operational experience. The state-of-the-art Self-Harness framework [1] achieves 14--21% improvement on Terminal-Bench-2.0 by mining failure clusters and patching the agent harness.
However, Self-Harness optimises only one dimension -- the prompt harness -- leaving behavioural principles and workflow topology unchanged. We propose APEX (Adaptive Principle EXtraction), a three-layer co-evolution framework that simultaneously evolves: (L1) the harness via failure-mode patching, (L2) behavioural principles via success-trace distillation [2], and (L3) the agent workflow topology via structural fitness-based selection [6].
We implement APEX on Joe [13], a production-grade super AI Agent built on NVIDIA Nemotron and designed as an Edge AI Agent Factory for the NVIDIA Agent Challenge 2026, managing a 15-node compute fleet using 114 real task traces collected over 18 days. APEX achieves an APEX Health Score of 0.570 (+90% vs. baseline 0.300) in a single evolutionary run, distilling 6 novel reusable principles and selecting a research-first workflow topology scoring 0.900 (+20%).
Our results demonstrate that multi-dimensional co-evolution substantially outperforms single-axis harness optimisation, at a cost of only 4 LLM calls (~270 s) on a local qwen2.5-coder:32b instance.
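The three layers named in the abstract (harness patching from failures, principle distillation from successes, and fitness-based topology selection) can be illustrated with a toy single-pass loop. This is a minimal sketch under heavy assumptions, not the authors' implementation: `Trace`, `AgentConfig`, and `evolve_once` are hypothetical names, plain string operations stand in for the LLM calls the paper uses, and the traces and the second topology's fitness value are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Trace:
    task: str
    success: bool
    note: str  # failure mode if failed, observed behaviour if successful


@dataclass
class AgentConfig:
    harness_patches: list[str] = field(default_factory=list)
    principles: list[str] = field(default_factory=list)
    topology: str = "default"


def evolve_once(config: AgentConfig, traces: list[Trace],
                topology_fitness: dict[str, float]) -> AgentConfig:
    """One co-evolution pass over all three layers (toy stand-in logic)."""
    # L1: cluster failures by mode and emit one harness patch per cluster.
    failures = Counter(t.note for t in traces if not t.success)
    patches = [f"guard against: {mode}" for mode, _ in failures.most_common()]
    # L2: distill principles from successful traces (deduplicated).
    principles = sorted({t.note for t in traces if t.success})
    # L3: select the candidate topology with the highest fitness.
    best = max(topology_fitness, key=topology_fitness.get)
    return AgentConfig(config.harness_patches + patches,
                       config.principles + principles, best)


# Invented example traces; the real system used 114 task traces.
traces = [
    Trace("deploy", False, "timeout"),
    Trace("deploy", False, "timeout"),
    Trace("report", True, "check fleet status before acting"),
]
# 0.9 matches the reported research-first score; 0.5 is a made-up alternative.
fitness = {"research-first": 0.9, "hypothetical-alternative": 0.5}

evolved = evolve_once(AgentConfig(), traces, fitness)
print(evolved)
```

The point of the sketch is structural: all three layers are updated from the same batch of traces in one pass, which is what distinguishes the approach from patching the harness alone.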