Universal Quantum Transformer
Quick Take
The Universal Quantum Transformer (UQT) is a quantum-native architecture that uses multi-qubit systems for exact mathematical reasoning. Its authors report that it avoids the stochastic instability and over-parameterization seen in classical networks. On a 5-qubit substrate it learns cyclic modular arithmetic ($\mathbb{Z}_{11}$) and non-Abelian algebra (the $S_4$ permutation group). The architecture was also deployed on IBM Quantum hardware; the claimed efficiency gains over classical self-attention are theoretical.
Key Points
- UQT uses parameterized geometric phase embedding for exact mathematical reasoning.
- Achieves deterministic generalization, termed crystallization, beyond classical grokking.
- Successfully learns cyclic modular arithmetic and non-Abelian algebra on 5 qubits.
- Claims theoretical computational and memory advantages over classical self-attention.
- Deployed on current IBM Quantum (NISQ) hardware as a demonstration of viability.
Article Content
From source RSS / original summary
arXiv:2606.00045v1 Announce Type: new
Abstract: Classical continuous-space neural networks fundamentally struggle to lock into exact mathematical symmetries, such as modular arithmetic and non-commutative algebra. To approximate these discrete logical rules, they often rely on massive parameter scaling, resulting in stochastic instability even after delayed generalization phenomena known as grokking.
Here, we introduce the Universal Quantum Transformer (UQT), a fundamentally novel, quantum-native computing architecture that uses the physical properties of multi-qubit systems as a universal inductive bias for exact mathematical and algebraic reasoning. Rather than translating classical neural mechanisms, our framework relies entirely on parameterized geometric phase embedding and $SU(2)$ wave-interference.
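To see why a phase can carry cyclic structure, consider the following minimal sketch. It is not the paper's circuit: it uses a plain complex phase rather than a parameterized $SU(2)$ gate, and the function names `phase_embed`, `phase_decode`, and `add_via_phase` are hypothetical. It assumes a residue $k$ modulo 11 is encoded as the unit complex number $e^{2\pi i k/11}$, so that multiplying two phases corresponds to adding the residues modulo 11.

```python
import cmath

N = 11  # modulus, chosen to match the Z_11 task named in the abstract


def phase_embed(k, n=N):
    """Map residue k to a point on the unit circle (illustrative assumption)."""
    return cmath.exp(2j * cmath.pi * (k % n) / n)


def phase_decode(z, n=N):
    """Recover the residue from a unit-circle phase by rounding the angle."""
    angle = cmath.phase(z) % (2 * cmath.pi)  # shift into [0, 2*pi)
    return round(angle * n / (2 * cmath.pi)) % n


def add_via_phase(a, b, n=N):
    """Add two residues by multiplying their phases, then decoding."""
    return phase_decode(phase_embed(a, n) * phase_embed(b, n), n)
```

For example, `add_via_phase(7, 9)` agrees with `(7 + 9) % 11`. The point of the sketch is only that the wrap-around of modular addition comes for free from the periodicity of the phase, which is one plausible reading of the "inductive bias" the abstract describes.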
We demonstrate that the quantum attention circuit, operating on a highly compact 5-qubit substrate, perfectly learns two highly distinct formal classes: cyclic modular arithmetic ($\mathbb{Z}_{11}$) and non-Abelian algebra (the $S_4$ permutation group). While classical attention-based networks exhibit stochastic instability at convergence, the UQT achieves mathematically exact, deterministic generalization. We refer to this phenomenon as crystallization: a step beyond the well-known phenomenon of grokking.
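For readers unfamiliar with the two target structures, the sketch below spells them out classically. It only defines the ground-truth operations a model would have to learn; it says nothing about how the UQT learns them. The helper names `add_mod`, `compose`, and `S4` are our own.

```python
from itertools import permutations


def add_mod(a, b, n=11):
    """Group operation of Z_11: addition modulo 11 (commutative)."""
    return (a + b) % n


def compose(p, q):
    """Group operation of S_4: apply permutation q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


# All 24 permutations of four symbols, each stored as a tuple of images.
S4 = list(permutations(range(4)))

# Two transpositions that do not commute, showing S_4 is non-Abelian.
p = (1, 0, 2, 3)  # swaps symbols 0 and 1
q = (0, 2, 1, 3)  # swaps symbols 1 and 2
non_commuting = compose(p, q) != compose(q, p)
```

Here `compose(p, q)` is `(1, 2, 0, 3)` while `compose(q, p)` is `(2, 0, 1, 3)`, so the order of operands matters in $S_4$, whereas `add_mod(a, b)` always equals `add_mod(b, a)`. That contrast is why the abstract treats the two tasks as distinct formal classes.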
Crucially, this framework yields massive computational and memory advantages by theoretically bypassing the quadratic bottleneck of classical self-attention, and by logarithmically compressing the required representation dimension to eliminate the massive over-parameterization inherent to classical networks. Finally, we deploy this architecture on noisy intermediate-scale quantum (NISQ) hardware, proving its viability on current IBM Quantum computers.
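The logarithmic-compression claim can be made concrete with a simple counting argument, sketched below. This is a back-of-the-envelope illustration, not the paper's analysis: it assumes each group element is assigned its own computational basis state, and the function names are hypothetical. Under that assumption, $n$ qubits span $2^n$ basis states, so labelling $d$ distinct elements needs about $\lceil \log_2 d \rceil$ qubits, while a classical self-attention layer over a length-$L$ sequence computes $L^2$ pairwise scores.

```python
def qubits_needed(num_states):
    """Smallest n with 2**n >= num_states (one basis state per element)."""
    if num_states < 1:
        raise ValueError("num_states must be positive")
    return (num_states - 1).bit_length()


def hilbert_dimension(num_qubits):
    """Number of computational basis states of an n-qubit register."""
    return 2 ** num_qubits


def attention_score_entries(seq_len):
    """Entries in a classical self-attention score matrix (quadratic in length)."""
    return seq_len * seq_len
```

Under this counting, the 11 elements of $\mathbb{Z}_{11}$ fit in 4 qubits and the 24 elements of $S_4$ fit in 5, and a 5-qubit register offers 32 basis states. That is consistent with the 5-qubit substrate mentioned in the abstract, although the abstract does not state how the qubits are actually allocated.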
These results establish parameterized quantum topology as a universally superior physical substrate for exact artificial intelligence.
