Agentic evolution of physically constrained foundation models
Quick Answer
This paper shows that a new multi-agent discovery engine can autonomously design hardware-compliant systems, evolving methods such as Q-Enhance and MoE-Salient-AQ that outperform human-engineered heuristics.
Quick Take
A new discovery engine autonomously designs hardware-compliant systems, evolving methods like Q-Enhance and MoE-Salient-AQ that outperform human-engineered heuristics. Using a bandwidth-efficient Sensitivity Profile, the authors deployed a 235-billion-parameter model on a dual-A100 server, reducing memory requirements by 75% with only a 0.64% accuracy drop.
Key Points
- The engine evolved two hardware-aware compression methods: Q-Enhance and MoE-Salient-AQ.
- Q-Enhance mitigates long-context accuracy loss in dense models.
- MoE-Salient-AQ outperforms state-of-the-art manual sparse Mixture-of-Experts designs by 3.7% in sub-3-bit regimes.
- The authors deployed a 235-billion-parameter model on a dual-A100 server.
- Memory requirements fell by 75% with an accuracy degradation of only 0.64%.
Article Content
From source RSS / original summary: arXiv:2606.25532v1 (announce type: new)
Abstract: Artificial intelligence increasingly drives automated scientific discovery, yet contemporary generalist agents lack physical grounding, frequently hallucinating hardware-incompatible designs. Here, we present a physically grounded discovery engine that autonomously architects hardware-compliant computing systems.
Anchored by an Evolutionary Knowledge Graph structuring past scientific innovations, the framework extracts an "algorithmic Chain-of-Thought" to transform blind stochastic search into directed structural evolution.
Applied to the extreme testbed of foundation model deployment, the engine evolved two hardware-aware compression methodologies surpassing human-engineered heuristics: Q-Enhance mitigates long-context accuracy loss in dense models, and MoE-Salient-AQ outperforms state-of-the-art manual sparse Mixture-of-Experts designs by 3.7% at sub-3-bit regimes.
Utilizing a bandwidth-efficient Sensitivity Profile, we successfully deployed a massive 235-billion-parameter model onto a constrained dual-A100 server, reducing memory requirements by 75% with a marginal 0.64% accuracy degradation. By transforming unconstrained combinatorial search into knowledge-driven autonomy, this establishes a scalable hardware-software co-design paradigm for machine-driven discovery within strict physical boundaries.
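To put the 75% figure in context, the following back-of-the-envelope sketch estimates weight memory before and after compression. It is not taken from the paper: the 16-bit baseline precision and the 80 GB A100 variant are assumptions made here for illustration, and the helper name `weight_memory_gb` is hypothetical. Only the parameter count, the 75% reduction, and the two-GPU setup come from the abstract.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 235e9    # from the abstract: 235 billion parameters
REDUCTION = 0.75      # from the abstract: 75% lower memory requirements
NUM_GPUS = 2          # from the abstract: dual-A100 server
BASELINE_BITS = 16    # ASSUMPTION: FP16/BF16 baseline, not stated in the source
GPU_MEMORY_GB = 80    # ASSUMPTION: 80 GB A100 variant, not stated in the source

baseline_gb = weight_memory_gb(NUM_PARAMS, BASELINE_BITS)
compressed_gb = baseline_gb * (1 - REDUCTION)
avg_bits = BASELINE_BITS * (1 - REDUCTION)   # implied average bits per weight
budget_gb = GPU_MEMORY_GB * NUM_GPUS

print(f"baseline weights:   {baseline_gb:.1f} GB")
print(f"compressed weights: {compressed_gb:.1f} GB")
print(f"implied average:    {avg_bits:.1f} bits per weight")
print(f"fits in {budget_gb} GB budget: {compressed_gb <= budget_gb}")
```

Under these assumptions the uncompressed weights (about 470 GB) would not fit in the combined 160 GB of two such GPUs, while the compressed weights (about 117.5 GB) would. The estimate covers weights only; KV cache and activation memory are ignored.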


