CoreMem: Riemannian Retrieval and Fisher-Guided Distillation for Long-Term Memory in Dialogue Agents
Quick Take
CoreMem is a resource-efficient edge-cloud memory architecture for dialogue agents that combines Riemannian retrieval with Fisher-guided discrete token distillation to support long-term memory on 8 GB VRAM devices. Evaluated on the LOCOMO and LongMemEval-S benchmarks, it reports gains of +4.51 pp in Open-domain and +4.17 pp in Temporal reasoning while staying within that memory budget.
Key Points
- Riemannian retrieval replaces cosine matching with a locally adaptive Fisher-Rao metric, penalizing hub memories via Mahalanobis distance.
- Fisher-guided discrete token distillation (FDTD) performs hierarchical sentence-to-token compression, with sensitivity scores derived from Fisher information traces.
- On the LOCOMO and LongMemEval-S benchmarks, CoreMem reports gains of +4.51 pp in Open-domain and +4.17 pp in Temporal reasoning.
- Operates within an 8 GB VRAM budget, suitable for consumer-grade edge devices.
- Targets two weaknesses of existing systems: the hubness problem in high-dimensional retrieval and syntactic fragmentation during compression.
Article Content
From source RSS / original summary: arXiv:2606.18406v1 (Announce Type: new)
Abstract: Personalized dialogue agents require continuous long-term memory to maintain coherent interactions across multiple sessions. However, deploying these capabilities on consumer-grade hardware (e.g., 8 GB VRAM edge devices) introduces severe memory and compute bottlenecks. Existing systems typically rely on isotropic cosine similarity for retrieval and heuristic rules for context compression.
These approaches lack a unified theoretical foundation, frequently suffering from the hubness problem in high-dimensional retrieval and syntactic fragmentation during compression. To overcome these limitations, we propose CoreMem, a resource-efficient edge-cloud memory architecture fundamentally unified by information geometry.
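To make the hubness problem concrete, here is a minimal sketch that counts how often each item appears in other items' cosine top-k lists (its k-occurrence). It is an illustration on synthetic Gaussian vectors, not CoreMem code; the function name `k_occurrence` and the data are assumptions made for this example. In high dimensions a few items (hubs) tend to collect far more than k appearances, so the same memories get retrieved for many unrelated queries.

```python
import numpy as np

def k_occurrence(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Count how often each row appears in the other rows' cosine top-k lists."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # an item is never its own neighbor
    # Indices of the k most similar items per row (order within the k is arbitrary).
    topk = np.argpartition(-sims, k - 1, axis=1)[:, :k]
    return np.bincount(topk.ravel(), minlength=len(embeddings))

rng = np.random.default_rng(0)
for dim in (3, 300):
    counts = k_occurrence(rng.standard_normal((500, dim)), k=10)
    # The mean is always k; a maximum far above k signals hub items.
    print(dim, counts.mean(), counts.max())
```

Comparing the maximum count against the mean of k for the low- and high-dimensional runs gives a quick, informal read on how skewed the neighbor lists are.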
First, Riemannian retrieval replaces cosine matching with a locally adaptive Fisher-Rao metric, effectively penalizing hub memories via Mahalanobis distance with O(Ndr) Woodbury acceleration for real-time search. Second, Fisher-guided discrete token distillation (FDTD) introduces a hierarchical sentence-to-token compression mechanism. It derives sensitivity scores from Fisher information traces, providing a principled compression-KL tradeoff augmented with explicit structural syntax protection.
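The O(Ndr) cost quoted for retrieval is what the Woodbury identity gives when the metric has diagonal-plus-low-rank structure. The abstract does not specify how CoreMem builds its locally adaptive metric, so the sketch below assumes a covariance of the form Σ = λI + UUᵀ with U of shape (d, r); the names `woodbury_mahalanobis`, `dense_mahalanobis`, `U`, and `lam` are hypothetical. Under that assumption, Σ⁻¹ = (1/λ)[I − U(λI_r + UᵀU)⁻¹Uᵀ], so squared distances to N memories need only an (N, d) by (d, r) product and an r-by-r solve.

```python
import numpy as np

def woodbury_mahalanobis(query, memories, U, lam):
    """Squared Mahalanobis distances under Sigma = lam*I + U @ U.T.

    Assumed low-rank structure (not taken from the paper); Sigma is never formed.
    """
    r = U.shape[1]
    diffs = memories - query                  # (N, d)
    proj = diffs @ U                          # (N, r): the O(N*d*r) step
    K = lam * np.eye(r) + U.T @ U             # (r, r) inner Woodbury matrix
    solved = np.linalg.solve(K, proj.T).T     # (N, r): K^{-1} applied to each row
    sq_norm = np.einsum("nd,nd->n", diffs, diffs)
    correction = np.einsum("nr,nr->n", proj, solved)
    return (sq_norm - correction) / lam

def dense_mahalanobis(query, memories, U, lam):
    """Reference version that inverts the full (d, d) matrix, for checking only."""
    d = U.shape[0]
    sigma_inv = np.linalg.inv(lam * np.eye(d) + U @ U.T)
    diffs = memories - query
    return np.einsum("nd,nd->n", diffs @ sigma_inv, diffs)

rng = np.random.default_rng(0)
U = rng.standard_normal((64, 8))              # d=64, r=8 (illustrative sizes)
memories = rng.standard_normal((1000, 64))
query = rng.standard_normal(64)
fast = woodbury_mahalanobis(query, memories, U, lam=0.5)
print(np.allclose(fast, dense_mahalanobis(query, memories, U, lam=0.5)))
print(np.argsort(fast)[:5])                   # indices of the five nearest memories
```

Directions spanned by U are down-weighted in the distance, which is one way a metric of this kind could discount components shared by many memories; whether that matches CoreMem's hub penalty in detail is not stated in the abstract.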
Evaluated on the LOCOMO and LongMemEval-S benchmarks, CoreMem achieves strong accuracy improvements, yielding substantial gains in Open-domain (+4.51 pp) and Temporal (+4.17 pp) reasoning. Extensive profiling confirms that CoreMem operates seamlessly within a strict 8 GB VRAM budget, successfully bridging the gap between resource-constrained edge devices and the demand for theoretically grounded, lifelong memory agents.


