Self-GC: Self-Governing Context for Long-Horizon LLM Agents
Quick Answer
Self-GC treats the context of a long-horizon LLM agent as indexed, recoverable objects whose lifecycle is governed at runtime. On a 33-session Hard Set, it prunes 43.95% of prefix tokens while leaving 84.85% of future continuations unaffected.
Quick Take
On a 332-session production-derived suite, three planner backbones reach no-impact rates of 91.27% to 94.58%, against 77.71% to 87.46% for heuristic baselines. In production, an online account-level split reduces daytime average input tokens by 10% to 15%, with peak reductions near 20%.
Key Points
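- Self-GC governs the lifecycle of agent context objects instead of relying on heuristics such as chronological pruning and tool-output masking.
- On a 33-session Hard Set, it prunes 43.95% of prefix tokens while leaving 84.85% of future continuations unaffected (heuristic baselines: 54.55% to 69.70%).
- On a 332-session production-derived suite, three planner backbones reach no-impact rates of 91.27% to 94.58% (baselines: 77.71% to 87.46%).
- In production, an online account-level split reduces daytime average input tokens by 10% to 15%, with peak reductions near 20%.
- The paper frames context management as runtime lifecycle control over indexed, recoverable objects rather than post hoc text cleanup.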
Article Content
From source RSS / original summary
arXiv:2607.00692v1 Announce Type: new
Abstract: Long-horizon LLM agents accumulate tool results, files, plans, and user constraints that are too structured to be treated as a disposable text suffix. Current systems mostly rely on in-run heuristics such as chronological pruning and tool-output masking, or on final self-summary near a context limit. Heuristics are cheap but blind to future dependencies; summaries preserve narrative state but often hide exact evidence, locators, and editable artifacts.
We present Self-GC, where GC denotes self-governing context while deliberately echoing garbage collection: the system does not merely reclaim unused tokens, but governs the lifecycle of agent context objects. Self-GC turns user turns, tool spans, and skill state into indexed objects; asks a side-channel planner to propose fold, mask, and prune actions; and lets the harness enforce recoverable sidecars, safe commit boundaries, and cache-aware commit. On a 33-session Hard Set, Self-GC prunes 43.95% of prefix tokens while leaving 84.85% of future continuations unaffected, compared with no-impact rates of 54.55% to 69.70% for heuristic baselines. On a 332-session production-derived suite, three planner backbones reach no-impact rates of 91.27% to 94.58%, while baselines remain at 77.71% to 87.46%. In production, an online account-level split reduces daytime average input tokens by 10% to 15%, with peak reductions near 20%.
These results point to context management as runtime lifecycle control over indexed, recoverable objects rather than post hoc text cleanup.
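The abstract describes the mechanism only at a high level, so the Python sketch below is an illustration under stated assumptions, not the paper's implementation. It shows one way indexed context objects, fold/mask/prune actions, and a recoverable sidecar could fit together. The names `ContextObject`, `Action`, `Harness`, `apply`, `restore`, and `render` are hypothetical, as is the meaning given to each action here: fold substitutes a planner-supplied summary, mask substitutes a placeholder, and prune removes the object from the live context. The side-channel planner, safe commit boundaries, and cache-aware commit are left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    # Assumed semantics; the abstract names the actions but does not define them.
    FOLD = "fold"    # replace the body with a short summary
    MASK = "mask"    # replace the body with a placeholder
    PRUNE = "prune"  # drop the object from the live context


@dataclass
class ContextObject:
    obj_id: int
    kind: str  # e.g. "user_turn", "tool_span", "skill_state"
    text: str


@dataclass
class Harness:
    """Hypothetical harness: applies proposed actions, keeps originals recoverable."""
    live: dict = field(default_factory=dict)     # obj_id -> object shown to the model
    sidecar: dict = field(default_factory=dict)  # obj_id -> original, for recovery
    next_id: int = 0

    def add(self, kind: str, text: str) -> int:
        obj_id = self.next_id
        self.next_id += 1
        self.live[obj_id] = ContextObject(obj_id, kind, text)
        return obj_id

    def apply(self, obj_id: int, action: Action, summary: str = "") -> bool:
        obj = self.live.get(obj_id)
        # Refuse unknown objects and objects that already have a saved original.
        if obj is None or obj_id in self.sidecar:
            return False
        self.sidecar[obj_id] = obj  # save the original before changing anything
        if action is Action.PRUNE:
            del self.live[obj_id]
        elif action is Action.MASK:
            self.live[obj_id] = ContextObject(obj_id, obj.kind, f"[masked object {obj_id}]")
        else:  # Action.FOLD
            self.live[obj_id] = ContextObject(obj_id, obj.kind, summary)
        return True

    def restore(self, obj_id: int) -> bool:
        original = self.sidecar.pop(obj_id, None)
        if original is None:
            return False
        self.live[obj_id] = original
        return True

    def render(self) -> str:
        # Index order stands in for the original chronological order.
        return "\n".join(self.live[i].text for i in sorted(self.live))


harness = Harness()
constraint = harness.add("user_turn", "Keep the public API unchanged.")
build_log = harness.add("tool_span", "build log line 1\nbuild log line 2")
harness.apply(build_log, Action.MASK)  # the log leaves the prompt but stays recoverable
print(harness.render())
harness.restore(build_log)             # a later step needs the exact log again
print(harness.render())
```

Because every action first copies the original into the sidecar, a planner mistake costs a restore rather than lost evidence, which is the property the abstract contrasts with irreversible heuristics and summaries.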