Persona Without Substrate: Regime-Dependence and the LLM Individuation Problem
Quick Answer
This paper argues that Beckmann & Butlin's framework for LLM individuation rests on an unargued assumption, and that experiments on Qwen3-4B-Instruct and Mistral-7B-Instruct-v0.2 undermine it: representational content in LLMs is regime-dependent.
Quick Take
Beckmann & Butlin's framework for LLM individuation assumes that the same direction picks out the same content under prompt-conditioning, gradient-descent fine-tuning, and inference-time steering. The study presents four empirical results from Qwen3-4B-Instruct and Mistral-7B-Instruct-v0.2 that undermine this assumption, and it proposes a (vehicle, regime) pair as the identity unit for representational content.
Key Points
- Four empirical results from persona-topology experiments challenge the cross-regime co-reference assumption.
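- Prompt-extracted vectors and fine-tune basins are non-collinear, which indicates regime dependence.
- Fictional personas displace the model along real-anchor directions more strongly than real anchors do.
- Contradictory-valenced mixtures are biased toward an attractor determined by training history.
- Compositional algebra is asymmetric between inference-time arithmetic and fine-tune-time chimera training.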
- Proposes a new identity unit for LLMs as a (vehicle, regime) pair.
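The last key point, identity as a (vehicle, regime) pair, can be read as a composite key. The sketch below is only an illustration under that reading: the names `Regime`, `ContentId`, and `same_vehicle`, and the string `"persona_dir"`, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Regime(Enum):
    # The three regimes named in the abstract.
    PROMPT = "prompt-conditioning"
    FINETUNE = "gradient-descent fine-tuning"
    STEERING = "inference-time steering"


@dataclass(frozen=True)
class ContentId:
    """Hypothetical identity key: content is indexed by vehicle AND regime."""
    vehicle: str   # e.g. a label for a direction in activation space
    regime: Regime


def same_vehicle(a: ContentId, b: ContentId) -> bool:
    """True when two keys share a vehicle, regardless of regime."""
    return a.vehicle == b.vehicle


prompt_id = ContentId("persona_dir", Regime.PROMPT)
finetune_id = ContentId("persona_dir", Regime.FINETUNE)

# Same vehicle, different regime: distinct identity units under this reading.
print(same_vehicle(prompt_id, finetune_id))  # True
print(prompt_id == finetune_id)              # False
```

Under a vehicle-only view, the two keys above would name one thing; with the regime included in the key, they are distinct by construction.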
Article Content
From source RSS / original summary
arXiv:2607.00006v1
Abstract: Beckmann & Butlin's (2026) ontological framework for the LLM individuation problem inherits an unargued cross-regime co-reference assumption from the persona-vectors literature: that the same direction picks out the same content under prompt-conditioning, gradient-descent fine-tuning, and inference-time steering. We present four empirical wedges from persona-topology experiments on Qwen3-4B-Instruct and Mistral-7B-Instruct-v0.2 that jointly undermine the assumption: non-collinearity of prompt-extracted vectors and fine-tune basins; fictional personas displacing the model along real-anchor directions more strongly than real anchors do; contradictory-valenced mixtures biased toward a training-history-determined attractor; and asymmetric compositional algebra under inference-time arithmetic versus fine-tune-time chimera training.
We propose regime-indexed individuation: the identity unit for representational content is a (vehicle, regime) pair, not a vehicle alone. Under this framework, Beckmann & Butlin's three candidate positions describe three different regime-internal objects rather than competing for the same referent; the same diagnosis applies to Mollo & Millière, Chalmers, and Cerullo.
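The first wedge in the abstract, non-collinearity of a prompt-extracted vector and a fine-tune direction, comes down to comparing two directions. A minimal sketch of one common way to do that is cosine similarity. This is not the paper's procedure: the function names, the tolerance, and the toy vectors are all assumptions for illustration, and no numbers here are results from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def is_collinear(u, v, tol=0.05):
    """Treat u and v as collinear when |cos| is within tol of 1.

    The tolerance is an arbitrary illustrative choice.
    """
    return abs(cosine(u, v)) >= 1.0 - tol


# Toy 3-d stand-ins (hypothetical values, not measured data).
prompt_vector = [1.0, 0.0, 0.0]    # direction extracted via prompting
finetune_shift = [0.6, 0.8, 0.0]   # displacement after fine-tuning

print(round(cosine(prompt_vector, finetune_shift), 3))  # 0.6
print(is_collinear(prompt_vector, finetune_shift))      # False
```

If the cross-regime co-reference assumption held in the strongest form, one would expect the two directions to be close to collinear; a similarity well below 1 in a check like this is the kind of observation the abstract calls non-collinearity.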