Modular Monolingual Adaptation using Pretrained Language Models

arXiv cs.CL·Nalin Kumar, Ondřej Dušek


·~1 min·6/8/2026·en·0

Quick Answer

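This study introduces a modular approach to adapting pretrained language models to low-resource languages: the tokens are replaced, the corresponding embeddings are frozen, and the rest of the model is tuned. Experiments cover Scottish Gaelic, Irish, and Quechua, the last with only 8.5k training instances.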

Quick Take

The authors hypothesize that full-model finetuning is often unnecessary when adapting a pretrained language model to a new language. Replacing the tokens, freezing the corresponding embeddings, and tuning the remainder of the model improves results on NLU tasks (mask filling, NER, and POS) for low-resource languages, including Quechua with only 8.5k training instances.

Key Points

Proposes a modular tuning method: replace the tokens, freeze the corresponding embeddings, and tune the rest of the model.
Focuses on low-resource languages: Scottish Gaelic, Irish, and Quechua.
Reports improved performance on NLU tasks: mask filling, NER, and POS.
Includes Quechua as a very low-resource case, with only 8.5k training instances.
Analyzes the effectiveness of training strategies, the choice of pretrained embeddings, and models.

Article Excerpt

From source RSS / original summary

arXiv:2606.06738v1. Abstract: Building monolingual language models (LMs) for low-resource languages typically relies on adapting pretrained language models (PLMs) by finetuning the whole model on the target language. This approach is widely favored over training from scratch, as it enables effective knowledge transfer. Additionally, prior work has shown that using a language-specific tokenizer can enhance the adaptability.

In this work, we hypothesize that full model tuning is often unnecessary and propose a more modular approach. Specifically, we replace the tokens, freeze the corresponding embeddings, and tune the rest of the model. We use Scottish Gaelic, Irish, and Quechua for our experiments, with Quechua being a very low-resource language (8.5k training instances).
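The idea of freezing the embeddings while tuning the rest of the model can be illustrated with a toy sketch. This is not the authors' implementation: the model below is a hypothetical stand-in (an embedding table plus a single linear layer), and the names `make_model`, `predict`, and `sgd_step` are invented for illustration. It only shows that a training step can leave the embedding table untouched while the remaining parameters still learn.

```python
import random


def make_model(vocab_size, dim, seed=0):
    """Toy model: an embedding table plus one linear layer standing in for 'the rest'."""
    rng = random.Random(seed)
    return {
        # One vector per token of the (assumed) new, language-specific vocabulary.
        "embeddings": [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                       for _ in range(vocab_size)],
        # Hypothetical stand-in for all non-embedding parameters.
        "body": [rng.uniform(-0.1, 0.1) for _ in range(dim)],
    }


def predict(model, token_id):
    """Score a token as the dot product of its embedding and the body weights."""
    emb = model["embeddings"][token_id]
    return sum(e * w for e, w in zip(emb, model["body"]))


def sgd_step(model, token_id, target, frozen=("embeddings",), lr=0.1):
    """One SGD step on 0.5 * (prediction - target)^2, skipping frozen groups.

    Returns the loss measured before the update.
    """
    emb = model["embeddings"][token_id]
    body = model["body"]
    error = predict(model, token_id) - target
    grad_body = [error * e for e in emb]   # d(loss)/d(body)
    grad_emb = [error * w for w in body]   # d(loss)/d(embedding row)
    if "body" not in frozen:
        model["body"] = [w - lr * g for w, g in zip(body, grad_body)]
    if "embeddings" not in frozen:
        model["embeddings"][token_id] = [e - lr * g for e, g in zip(emb, grad_emb)]
    return 0.5 * error ** 2


if __name__ == "__main__":
    model = make_model(vocab_size=5, dim=4)
    for step in range(3):
        loss = sgd_step(model, token_id=2, target=1.0)  # embeddings stay frozen
        print(f"step {step}: loss before update = {loss:.4f}")
```

In a real setup the same effect would normally come from the training framework's own mechanism for excluding parameters from gradient updates; the dictionary of parameter groups here is only a convenience for the sketch.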

Evaluation on natural language understanding (NLU) tasks -- mask filling, NER, and POS -- shows that our proposed approach improves performance when adapting models to low-resource languages. Additionally, we provide a comprehensive analysis of the effectiveness of training strategies, the choice of pretrained embeddings, and models.

