The Holistic Storage of Verb+Up Phrases in Text-based and Audio-based Language Models

arXiv cs.CL·Zachary Nicholas Houghton, Yu Zhou, Dan Pluth, Vijay K. Gurbani

6/15/2026

Quick Answer

This study probes the internal representations of text-based LLMs and an ASR model to test whether verb+up phrases are stored holistically, that is, whether they develop distinct representations as a function of frequency and predictability.

Quick Take

The findings support usage-based theories of language: both the text-based LLMs and the ASR model show evidence of holistic storage of verb+up phrases, driven by frequency and predictability.

Key Points

Holistic storage of verb+up phrases is driven by frequency and predictability.
Both the text-based LLMs and the ASR model show evidence of holistic storage in their internal representations.
The results support usage-based theories of language.
Previous research focused more on abstract knowledge than multi-word units.
Findings suggest a need for further exploration of holistic storage in language models.


Article Excerpt

From source RSS / original summary

arXiv:2606.13993v1. Abstract: A crucial aspect of linguistic capability is the ability to trade off between stored representations and abstract knowledge: one must retrieve learned representations, but also generate novel ones by applying productive rules. While recent work has examined abstract knowledge in language models, holistic storage of multi-word units has received far less attention.

We probe internal representations in text-based LLMs and an ASR model, testing whether V+up phrasal verbs develop distinct representations as a function of frequency and predictability. All models show evidence of holistic storage driven by frequency and predictability, further supporting usage-based theories of language.
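To make the two predictors concrete, here is a minimal sketch of how frequency and predictability could be computed for V+up phrases from corpus counts. It is an illustration only: the excerpt does not say how the authors operationalize either measure. The toy corpus, the verb list, and the function name `verb_up_stats` are all hypothetical, and predictability is assumed here to mean the conditional probability of "up" given the verb.

```python
from collections import Counter

# Hypothetical toy corpus (tokenized sentences); not data from the paper.
corpus = [
    "pick up the phone".split(),
    "pick up the pace".split(),
    "pick the red one".split(),
    "give up now".split(),
    "give me the book".split(),
    "give him a call".split(),
    "walk up the hill".split(),
]

# Hypothetical verb inventory used for the illustration.
VERBS = {"pick", "give", "walk"}


def verb_up_stats(sentences, verbs):
    """Return frequency and predictability of 'verb + up' for each verb.

    frequency      = number of times the verb is immediately followed by 'up'
    predictability = P(up | verb), assumed here as count(verb up) / count(verb)
    """
    verb_counts = Counter()
    phrase_counts = Counter()
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok in verbs:
                verb_counts[tok] += 1
                if i + 1 < len(sent) and sent[i + 1] == "up":
                    phrase_counts[tok] += 1
    return {
        verb: {
            "frequency": phrase_counts[verb],
            "predictability": phrase_counts[verb] / total,
        }
        for verb, total in verb_counts.items()
    }


stats = verb_up_stats(corpus, VERBS)
for verb, values in sorted(stats.items()):
    print(verb, values)
```

In this toy data, "pick up" is the most frequent phrase while "walk up" is the most predictable, which shows why the two measures can dissociate and are worth treating as separate predictors.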
