User as Engram: Internalizing Per-User Memory as Local Parametric Edits
Quick Answer
The 'User as Engram' paper proposes storing each user's facts as local edits to the hash-keyed memory table of an Engram model, with the reasoning skill carried in one shared adapter. It reports direct recall on par with per-user LoRA, 5.6x higher indirect-reasoning accuracy on average, and a roughly 33,000x smaller memory footprint than per-user LoRA adapters.
Key Points
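- User-specific facts are stored as surgical edits to the hash-keyed memory table of an Engram model; reasoning skill lives in one shared adapter.
- Matches per-user LoRA on direct recall and achieves 5.6x higher indirect-reasoning accuracy on average.
- Has a roughly 33,000x smaller memory footprint than writing the same facts as a per-user LoRA.
- Different users' facts land in disjoint hash slots, so many users coexist in one shared table with additive, lossless edits.
- Past ~100 facts, a per-user Engram table overtakes a retrieval pipeline running on a 2.5x larger model.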
Article Content
From source RSS / original summary: arXiv:2606.19172v1. Abstract: Personal memory in a language model is two problems: content and reasoning skill. The brain keeps the two apart (a sparse, local engram in the hippocampus for each episode, a slow neocortex for the shared skills that interpret it), so a new fact need not overwrite everything else. Most personalization today keeps a user's facts outside the weights, in a natural-language memory file or a retrieval index.
When facts are written into the model instead, the standard recipe is the per-user LoRA adapter, which does the opposite of the brain, folding content and skill into one global weight delta. Writing a user's facts as a LoRA contaminates text unrelated to them; writing the same facts as local Engram rows leaves that text mathematically untouched, at a roughly 33,000x smaller memory footprint.
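The contrast between a global weight delta and a local row edit can be illustrated with a toy sketch. This is not the paper's code: the 3x3 base weight, the rank-1 adapter values, and the two-row lookup table are all hypothetical, chosen only to show that a dense low-rank update can shift the output for an input unrelated to the stored fact, while an edit to one table row leaves every other row identical.

```python
# Toy contrast (illustrative assumption, not the paper's implementation):
# a dense low-rank delta versus an edit to a single row of a lookup table.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def add_low_rank(W, B, A):
    """Return W + B @ A, a LoRA-style update (B is d x r, A is r x d)."""
    r = len(A)
    return [
        [W[i][j] + sum(B[i][k] * A[k][j] for k in range(r))
         for j in range(len(W[0]))]
        for i in range(len(W))
    ]

# Hypothetical base weight (identity) and rank-1 adapter.
W_base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[1.0], [0.0], [0.0]]
A = [[0.5, 0.5, 0.5]]
W_lora = add_low_rank(W_base, B, A)

# An input that has nothing to do with the fact the adapter was trained on.
unrelated_input = [0.0, 1.0, 0.0]
base_out = matvec(W_base, unrelated_input)
lora_out = matvec(W_lora, unrelated_input)
lora_changed_unrelated = base_out != lora_out

# Hypothetical lookup table: row index -> value vector.
table_base = {3: [0.1, 0.2], 7: [0.3, 0.4]}
table_edit = {k: list(v) for k, v in table_base.items()}
table_edit[7] = [a + b for a, b in zip(table_edit[7], [1.0, 0.0])]  # edit row 7 only

# Row 3 was never touched, so it compares equal element by element.
row_edit_changed_unrelated = table_base[3] != table_edit[3]
```

In this toy setup the low-rank update changes the output for the unrelated input, whereas the row edit leaves the untouched row exactly as it was. A real adapter affects any input that is not orthogonal to its input directions; how much it does so in practice is an empirical question the paper addresses, not this sketch.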
We therefore propose User as Engram: store a user's content as surgical edits to the hash-keyed memory table of an Engram model, and carry the reasoning skill in one shared adapter. This layered design matches per-user LoRA's direct recall while delivering 5.6x higher indirect-reasoning accuracy on average, and never makes a single user worse at reasoning than the untouched base.
The edit is a glass box: writing a fact switches on its lookup at exactly the trigger, adds the value the answer needs, leaves every other position unchanged to the last bit, and fails if written into the wrong layer. Because different users' facts land in disjoint hash slots, their edits compose: many users live in one shared table at once, stacking additively and losslessly, where a per-user LoRA, a single global weight delta, admits only one.
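A minimal sketch of how edits in disjoint hash slots might compose is given below. Everything in it is an assumption made for illustration: the `ToyEngramTable` class, the `slot_for` and `merge` helpers, the SHA-256 keying over a token tuple, the table size, and the value width are hypothetical and not taken from the paper. The sketch also does not model layers, so the "wrong layer" failure mode is out of scope.

```python
import hashlib

DIM = 4               # toy value width (assumption)
NUM_SLOTS = 2 ** 32   # toy slot count (assumption); rows are stored sparsely

def slot_for(trigger):
    """Hash a trigger n-gram (a tuple of tokens) to a table slot."""
    key = "\x1f".join(trigger).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SLOTS

class ToyEngramTable:
    """Sparse table: only edited slots are stored; all others read as zeros."""

    def __init__(self):
        self.rows = {}

    def write(self, trigger, value):
        """Add `value` to the slot for `trigger` and return the slot index."""
        slot = slot_for(trigger)
        row = self.rows.get(slot, [0.0] * DIM)
        self.rows[slot] = [a + b for a, b in zip(row, value)]
        return slot

    def lookup(self, trigger):
        """Return a copy of the value stored at the slot for `trigger`."""
        return list(self.rows.get(slot_for(trigger), [0.0] * DIM))

def merge(*tables):
    """Stack several users' edit tables additively into one shared table."""
    shared = ToyEngramTable()
    for table in tables:
        for slot, value in table.rows.items():
            row = shared.rows.get(slot, [0.0] * DIM)
            shared.rows[slot] = [a + b for a, b in zip(row, value)]
    return shared

# Two hypothetical users, each writing one fact at a distinct trigger.
alice_trigger = ("alice", "'s", "dog", "is", "named")
bob_trigger = ("bob", "'s", "dog", "is", "named")

alice = ToyEngramTable()
alice.write(alice_trigger, [1.0, 0.0, 0.0, 0.0])
bob = ToyEngramTable()
bob.write(bob_trigger, [0.0, 2.0, 0.0, 0.0])

shared = merge(alice, bob)
```

As long as the two triggers hash to different slots, `shared.lookup(alice_trigger)` returns the same value as `alice.lookup(alice_trigger)`, and a trigger nobody wrote to still reads as zeros. With a hash of this width a collision between a handful of triggers is extremely unlikely, though the sketch does nothing to rule one out.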
Compared with retrieval, a per-user Engram table does not grow with the population the retriever must search, so past ~100 facts it overtakes a retrieval pipeline on a 2.5x larger model.