When Irregularity Helps: A Subclass Analysis of Inductive Bias in Neural Morphology
Quick Take
A study of Japanese past-tense verb inflection finds that one rare irregular subtype (<1% of the data) accounts for a disproportionate share of neural model errors.
Key Points
- A very small, structurally specific irregular subtype (<1% of data) accounts for a disproportionate share of model errors.
- Removing that subtype improves generalization more than removing all irregular verbs.
- Morphological evaluation should include finer-grained subclass analysis beyond standard conjugation categories.
Abstract: Neural morphological generation systems often achieve high aggregate accuracy on benchmark datasets, yet such performance can conceal systematic errors concentrated in rare morphological subclasses. We examine Japanese past-tense verb inflection and show that a very small, structurally specific irregular subtype (<1% of data) accounts for a disproportionate share of model errors. Controlled ablation experiments demonstrate that removing this subtype yields larger improvements in generalization than removing all irregular verbs, indicating that not all irregularity contributes equally to model instability. These findings suggest that error concentration is driven by the interaction between extreme low-frequency morphological patterns and specific morphophonological processes, particularly gemination. We argue that morphological evaluation should incorporate finer-grained subclass analysis beyond standard conjugation categories.
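The idea of error concentration in the abstract can be made concrete with a small calculation: compare each subclass's share of the data with its share of the errors. The following is a minimal sketch, not the authors' evaluation code. The function name `error_concentration`, the subclass labels, and the toy counts are all invented for illustration and do not come from the paper.

```python
from collections import Counter


def error_concentration(records):
    """Summarize how errors are distributed across subclasses.

    records: iterable of (subclass_label, is_correct) pairs.
    Returns {subclass: (data_share, error_share, ratio)} where
    ratio = error_share / data_share; a ratio well above 1 means the
    subclass contributes more errors than its frequency would suggest.
    """
    records = list(records)
    totals = Counter(label for label, _ in records)
    errors = Counter(label for label, ok in records if not ok)
    n_total = len(records)
    n_errors = sum(errors.values())

    report = {}
    for label, count in totals.items():
        data_share = count / n_total
        error_share = errors[label] / n_errors if n_errors else 0.0
        report[label] = (data_share, error_share, error_share / data_share)
    return report


# Hypothetical toy predictions (made-up counts, NOT results from the paper):
# 1000 test items, 20 errors in total, so aggregate accuracy is 98%.
toy = (
    [("regular", True)] * 940 + [("regular", False)] * 10
    + [("irregular_other", True)] * 38 + [("irregular_other", False)] * 4
    + [("irregular_geminating", True)] * 2 + [("irregular_geminating", False)] * 6
)

report = error_concentration(toy)
for label, (data_share, error_share, ratio) in sorted(report.items()):
    print(f"{label:22s} data={data_share:.3f} errors={error_share:.3f} ratio={ratio:.1f}")
```

In this toy setup the aggregate accuracy looks high, yet the invented `irregular_geminating` class makes up 0.8% of the items and 30% of the errors. A per-subclass breakdown of this kind is one way to surface the pattern the abstract describes, which an aggregate score alone would hide.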
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2605.20558 [cs.CL] (or arXiv:2605.20558v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2605.20558 (arXiv-issued DOI via DataCite, pending registration)
Submission history
From: Wen Zhang
[v1] Tue, 19 May 2026 23:18:47 UTC (33 KB)
— Originally published at arxiv.org