Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology
Quick Take
This study uses historical cosmology as a controlled setting for studying domain adaptation in language models. Fine-tuning a larger pretrained model on a pre-Copernican corpus shifts explanatory framing strongly toward premodern perspectives, while the distribution of cosmological stances within each frame stays comparatively stable. The findings suggest that domain adaptation primarily reshapes linguistic frameworks, with changes in stance following secondarily.
Key Points
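- Phase 1 trained a small model from scratch on a pre-Copernican corpus with explicit heliocentric references removed.
- Phase 2 fine-tuned a larger pretrained model with QLoRA on the same corpus, producing a large, statistically significant shift toward premodern explanatory framing.
- An LLM-as-judge framework labeled each output's cosmological stance (geocentric, heliocentric, or ambiguous) and explanatory frame (premodern or modern).
- Geocentric outputs increased mainly because outputs were redistributed across explanatory frames, not because stance changed within a frame.
- Domain adaptation may primarily reshape the linguistic frameworks that continuations are generated from, with stance changes emerging secondarily.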
Article Content
From source RSS / original summary. arXiv:2605.30415v1, Announce Type: new. Abstract: We investigate how domain adaptation reshapes explanatory behavior in language models using historical cosmology as a controlled setting. In Phase 1, we train a small language model from scratch on a pre-Copernican corpus from which explicit heliocentric references were removed, and evaluate whether Earth-motion or heliocentric continuations nevertheless emerge.
In Phase 2, we fine-tune a larger pretrained model using QLoRA on the same corpus in order to study how adaptation modifies explanatory framing and cosmological stance. Model outputs are evaluated using an LLM-as-judge framework that labels both cosmological stance (geocentric, heliocentric, or ambiguous) and explanatory frame (premodern versus modern).
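The two-axis labeling scheme described above can be illustrated with a minimal sketch. The abstract does not say how the judge's replies are formatted, so the JSON shape, the field names `stance` and `frame`, and the helpers `parse_judge_reply` and `tally` are all assumptions made for illustration, not the authors' pipeline.

```python
import json
from collections import Counter
from dataclasses import dataclass

# Label sets taken from the abstract; everything else here is hypothetical.
STANCES = {"geocentric", "heliocentric", "ambiguous"}
FRAMES = {"premodern", "modern"}


@dataclass(frozen=True)
class JudgeLabel:
    stance: str
    frame: str


def parse_judge_reply(reply: str) -> JudgeLabel:
    """Parse one judge reply, assumed to be a JSON object with two fields."""
    data = json.loads(reply)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("judge reply must be a JSON object")
    stance = str(data.get("stance", "")).strip().lower()
    frame = str(data.get("frame", "")).strip().lower()
    if stance not in STANCES:
        raise ValueError(f"unknown stance: {stance!r}")
    if frame not in FRAMES:
        raise ValueError(f"unknown frame: {frame!r}")
    return JudgeLabel(stance=stance, frame=frame)


def tally(replies):
    """Count outputs per (frame, stance) pair."""
    labels = (parse_judge_reply(r) for r in replies)
    return Counter((label.frame, label.stance) for label in labels)
```

Rejecting out-of-vocabulary labels, instead of silently mapping them to "ambiguous", keeps judge formatting errors from being counted as genuine stance judgments.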
In the constrained setting of Phase 1, the smaller models occasionally generate local Earth-motion continuations, but these remain globally unstable and insufficient to support coherent cosmological reasoning. In Phase 2, fine-tuning induces a large and statistically significant shift toward premodern explanatory framing, while the conditional cosmological stance distributions remain comparatively stable within those frames.
As a result, increases in geocentric outputs arise primarily from redistribution over explanatory regimes rather than from direct modification of stance. These results suggest that domain adaptation may primarily reshape the linguistic frameworks from which continuations are generated, with changes in stance emerging secondarily from those shifts.
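The redistribution argument can be made concrete with a small decomposition. Because P(geocentric) = Σ_f P(f) · P(geocentric | f) over frames f, the change in geocentric rate between the base and fine-tuned model splits exactly into a frame-shift term and a within-frame term. The sketch below is one way to compute that split. The function names and the counts in the usage note are hypothetical, chosen only to illustrate the arithmetic, and are not results from the paper.

```python
FRAMES = ("premodern", "modern")


def frame_probs(counts):
    """P(frame), from a dict mapping (frame, stance) -> count."""
    total = sum(counts.values())
    return {
        f: sum(n for (fr, _), n in counts.items() if fr == f) / total
        for f in FRAMES
    }


def geo_given_frame(counts):
    """P(geocentric | frame); 0.0 if a frame has no outputs."""
    out = {}
    for f in FRAMES:
        in_frame = sum(n for (fr, _), n in counts.items() if fr == f)
        geo = counts.get((f, "geocentric"), 0)
        out[f] = geo / in_frame if in_frame else 0.0
    return out


def decompose(base, tuned):
    """Split the change in P(geocentric) into two additive parts.

    frame_shift:  change due to moving probability mass between frames,
                  holding the base model's within-frame stance rates fixed.
    within_frame: change due to stance rates moving inside each frame,
                  weighted by the tuned model's frame distribution.
    """
    p0, p1 = frame_probs(base), frame_probs(tuned)
    g0, g1 = geo_given_frame(base), geo_given_frame(tuned)
    frame_shift = sum((p1[f] - p0[f]) * g0[f] for f in FRAMES)
    within_frame = sum(p1[f] * (g1[f] - g0[f]) for f in FRAMES)
    total = sum(p1[f] * g1[f] for f in FRAMES) - sum(p0[f] * g0[f] for f in FRAMES)
    return frame_shift, within_frame, total
```

With made-up counts in which the base model is 20% premodern and the tuned model 80% premodern, and the within-frame geocentric rates are identical across both models, `decompose` attributes the entire increase to `frame_shift` and none to `within_frame`, which is the pattern the abstract describes qualitatively.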