PEAM: Parametric Embodied Agent Memory through Contrastive Internalization of Experience in Minecraft

arXiv cs.AI·Yuchen Guo, Junli Gong, Hongmin Cai, Yiu-ming Cheung, Weifeng Su

5/28/2026

Quick Take

PEAM is a memory framework for Minecraft agents that pairs a slow deliberative LLM with a fast parametric module, improving long-horizon task performance and mitigating forgetting of previously consolidated skills. It decides when to consolidate through a self-triggered mechanism and treats failure as a training signal, and it reports better parametric-versus-retrieval efficiency than retrieval-based embodied agents.

Key Points

PEAM uses a multimodal Mixture-of-Experts LoRA architecture for continual learning.
The framework improves long-horizon task performance in Minecraft experiments.
Failure-correction trajectory pairs are internalized through a joint behavioral-cloning and contrastive objective.
PEAM's self-triggered consolidation mechanism transfers across task distributions without re-tuning.
It improves parametric-versus-retrieval efficiency over retrieval-based embodied agents and parametric memory variants.

Article Content

From source RSS / original summary

arXiv:2605.27762v1. Abstract: We present PEAM, a Parametric Embodied Agent Memory framework in Minecraft that transforms agent memory from inference-time retrieval into parameter-resident skills internalized through experience. PEAM pairs a slow deliberative LLM for open-ended reasoning with a fast parametric module for reflexive execution of consolidated skills.
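As a rough illustration of the slow/fast pairing described above, the following Python sketch routes a task to a consolidated skill when one exists and otherwise falls back to a deliberative planner. The class `DualPathAgent`, the category keys, and the callables are hypothetical names introduced here; the abstract does not describe PEAM's actual interfaces.

```python
from typing import Callable, Dict, Tuple


class DualPathAgent:
    """Hypothetical dispatcher, not PEAM's API.

    Consolidated skills run on the fast path; anything else goes to the
    slow deliberative planner (an LLM in the paper, a plain callable here).
    """

    def __init__(self, slow_planner: Callable[[str], str]) -> None:
        self.slow_planner = slow_planner
        # Maps a skill category to its fast, already-consolidated policy.
        self.fast_skills: Dict[str, Callable[[str], str]] = {}

    def consolidate(self, category: str, skill: Callable[[str], str]) -> None:
        """Register a consolidated skill for one category."""
        self.fast_skills[category] = skill

    def act(self, category: str, observation: str) -> Tuple[str, str]:
        """Return (path_used, action) for the given observation."""
        skill = self.fast_skills.get(category)
        if skill is not None:
            return "fast", skill(observation)
        return "slow", self.slow_planner(observation)


# Usage: before consolidation the planner handles mining; afterwards the
# fast path does, while unrelated categories still use the planner.
agent = DualPathAgent(slow_planner=lambda obs: "plan:" + obs)
before = agent.act("mining", "see stone")
agent.consolidate("mining", lambda obs: "mine")
after = agent.act("mining", "see stone")
```

The point of the sketch is only the control flow: consolidation changes which path handles a category without touching the planner itself.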

The fast module is a multimodal Mixture-of-Experts LoRA architecture with per-category physically isolated adapters, enabling parameter-level continual learning without catastrophic forgetting. We treat failure as a first-class training signal: failure-correction trajectory pairs are internalized through a joint behavioral-cloning and contrastive objective, so the agent learns not only what succeeds but also how corrected actions differ from failed ones.
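The two ideas in this paragraph can be sketched in plain Python under stated assumptions. `AdapterBank` keeps one low-rank pair (A, B) per category so that changing one category's adapter cannot alter another's output. `joint_loss` adds a behavioral-cloning term (negative log-likelihood of the corrected action) to a margin-based contrastive term that pushes the corrected action above the failed one. The margin form, the `weight` and `margin` parameters, and all names are assumptions made for illustration; the abstract does not give PEAM's exact objective or adapter layout.

```python
import math
from typing import Dict, List

Matrix = List[List[float]]


def matvec(m: Matrix, x: List[float]) -> List[float]:
    """Multiply matrix m by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]


class AdapterBank:
    """Hypothetical bank of isolated low-rank adapters, one per category."""

    def __init__(self, categories: List[str], d_in: int, d_out: int, rank: int) -> None:
        # A is rank x d_in, B is d_out x rank. B starts at zero so a fresh
        # adapter contributes nothing (the usual LoRA initialization); A is
        # a constant here for determinism rather than random.
        self.adapters: Dict[str, Dict[str, Matrix]] = {
            c: {
                "A": [[0.01] * d_in for _ in range(rank)],
                "B": [[0.0] * rank for _ in range(d_out)],
            }
            for c in categories
        }

    def delta(self, category: str, x: List[float]) -> List[float]:
        """Low-rank update B(Ax) using only the given category's adapter."""
        adapter = self.adapters[category]
        return matvec(adapter["B"], matvec(adapter["A"], x))


def softmax(logits: List[float]) -> List[float]:
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(logits: List[float], corrected: int, failed: int,
               margin: float = 1.0, weight: float = 0.5) -> float:
    """Assumed form: BC term plus a weighted hinge on the log-probability gap."""
    probs = softmax(logits)
    bc = -math.log(probs[corrected])
    gap = math.log(probs[corrected]) - math.log(probs[failed])
    contrastive = max(0.0, margin - gap)
    return bc + weight * contrastive


bank = AdapterBank(["mining", "crafting"], d_in=3, d_out=2, rank=1)
bank.adapters["mining"]["B"][0][0] = 1.0  # stand-in for training "mining" only
loss_good = joint_loss([2.0, 0.0, 0.0], corrected=0, failed=1)
loss_bad = joint_loss([0.0, 2.0, 0.0], corrected=0, failed=1)
```

In this sketch the "crafting" adapter still outputs zeros after the "mining" adapter changes, and the loss is lower when the policy already prefers the corrected action over the failed one.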

To govern consolidation, PEAM introduces two components: a parameterization-worthiness score for deciding which experience should be internalized, and a scale-free self-triggered consolidation mechanism for deciding when to internalize without task-specific hand-tuned thresholds. Because the trigger transfers across task distributions without re-tuning, the agent is self-evolving.
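One way to picture a scale-free trigger is a z-score test against the agent's own running statistics, sketched below in Python. This is an assumption made for illustration, not PEAM's mechanism: the abstract does not say how the worthiness score is computed or how the trigger is defined, so the score is treated here as a given number, and `SelfTrigger`, `z_threshold`, and `warmup` are hypothetical names.

```python
import math
from typing import List


class SelfTrigger:
    """Hypothetical trigger: fire when a score is unusually high relative
    to the running mean and standard deviation of earlier scores."""

    def __init__(self, z_threshold: float = 2.0, warmup: int = 5) -> None:
        self.z_threshold = z_threshold  # dimensionless, so no per-task units
        self.warmup = warmup            # scores to observe before firing
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0                   # running sum of squared deviations

    def observe(self, score: float) -> bool:
        """Return True if this score should trigger consolidation."""
        fire = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0.0:
                fire = (score - self.mean) / std > self.z_threshold
        # Welford's online update of the mean and squared deviations.
        self.count += 1
        delta = score - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (score - self.mean)
        return fire


def run(scores: List[float]) -> List[bool]:
    trigger = SelfTrigger()
    return [trigger.observe(s) for s in scores]


scores = [1.0, 1.2, 0.9, 1.1, 1.0, 1.05, 5.0, 1.0]
fires = run(scores)
fires_scaled = run([100.0 * s for s in scores])
```

Because the test compares a score to statistics measured in the same units, multiplying every score by a constant leaves the decisions unchanged, which is the sense of "scale-free" assumed in this sketch.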

Experiments in Minecraft show that PEAM improves long-horizon task performance, mitigates forgetting on previously consolidated skills, and improves parametric-versus-retrieval efficiency over retrieval-based embodied agents and parametric memory variants.

