Compatibility-Aware Dynamic Fine-Tuning for Large Language Models
Quick Answer
The paper introduces Compatibility-Aware Dynamic Fine-Tuning (CADFT), which extends Dynamic Fine-Tuning (DFT) to control sample-level optimization variance when fine-tuning large language models.
Quick Take
CADFT uses a dynamic, policy-dependent compatibility signal, derived from model likelihoods, to suppress high-variance updates from demonstrations that are incompatible with the current policy. The authors report improved stability and generalization in supervised fine-tuning.
Key Points
- CADFT extends Dynamic Fine-Tuning by addressing sample-level optimization variance.
- It derives a compatibility signal from model likelihoods and uses it to modulate supervised updates.
- The authors report improved stability, generalization, and cold-start reinforcement learning initialization.
- CADFT remains fully supervised and does not depend on explicit reward modeling.
- It adds a delayed, low-frequency rewriting strategy that turns persistently incompatible demonstrations into learnable targets.
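The summary does not give the paper's formulas, so the following is only a minimal sketch of how a likelihood-derived compatibility signal might scale a supervised update. Two things here are assumptions rather than the authors' method: that the token-level correction in DFT weights each token's loss by its own probability (held constant), and that sample-level compatibility is the geometric-mean token probability. The function names are hypothetical.

```python
import math
from typing import List


def dft_token_weights(token_logprobs: List[float]) -> List[float]:
    """Assumed DFT-style weights: each target token's own probability.

    In an autodiff framework these would be detached so that no gradient
    flows through the weights themselves.
    """
    return [math.exp(lp) for lp in token_logprobs]


def compatibility(token_logprobs: List[float]) -> float:
    """Hypothetical sample-level signal: geometric-mean token probability.

    Lies in (0, 1]; low values mean the current policy finds the
    demonstration unlikely, i.e. a demonstration-policy mismatch.
    """
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def cadft_style_loss(token_logprobs: List[float]) -> float:
    """Compatibility-scaled, token-weighted negative log-likelihood."""
    c = compatibility(token_logprobs)  # treated as a constant scale
    weights = dft_token_weights(token_logprobs)
    token_losses = [-w * lp for w, lp in zip(weights, token_logprobs)]
    return c * sum(token_losses) / len(token_losses)


if __name__ == "__main__":
    compatible = [-0.1, -0.2]    # tokens the policy already finds likely
    incompatible = [-3.0, -4.0]  # tokens the policy finds unlikely
    print(compatibility(compatible), cadft_style_loss(compatible))
    print(compatibility(incompatible), cadft_style_loss(incompatible))
```

Under these assumptions, a demonstration the policy finds unlikely receives a small scale factor, so its contribution to the update shrinks instead of dominating the batch.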
Article Content
arXiv:2606.11206v1. Abstract: Supervised Fine-Tuning (SFT) is the predominant paradigm for aligning large language models (LLMs), yet it suffers from optimization instability and limited generalization. Recent work attributes this issue to pathological gradient scaling and proposes Dynamic Fine-Tuning (DFT) to correct it at the token level.
However, DFT assumes all demonstrations are equally suitable learning targets, an assumption violated by the strong heterogeneity of large-scale instruction data, where demonstration-policy mismatch induces high-variance updates at the sample level. We introduce Compatibility-Aware Dynamic Fine-Tuning (CADFT), a principled extension of DFT that controls sample-level optimization variance.
CADFT derives a dynamic, policy-dependent compatibility signal from model likelihoods to modulate supervised updates, suppressing high-variance gradients from incompatible demonstrations. We further propose a delayed, low-frequency compatibility-guided rewriting strategy to transform persistently incompatible demonstrations into learnable targets. We show that CADFT can be interpreted as a variance-controlled estimator that generalizes token-level stabilization in DFT to the sample level.
Extensive experiments demonstrate improved stability, generalization, and cold-start reinforcement learning initialization, while remaining fully supervised and independent of explicit reward modeling.
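The abstract describes the rewriting strategy only as delayed, low-frequency, and aimed at persistently incompatible demonstrations. One way such a schedule could be organized is sketched below. The class name, the warm-up and interval parameters, the sliding window, and the threshold are all hypothetical choices for illustration, not details from the paper, and the rewriting step itself is left out.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class RewriteScheduler:
    """Flags persistently incompatible samples, checked at a low frequency."""

    def __init__(self, warmup_steps: int, every: int, window: int,
                 threshold: float) -> None:
        self.warmup_steps = warmup_steps  # "delayed": no rewriting before this step
        self.every = every                # "low-frequency": check every N steps
        self.window = window              # number of recent scores kept per sample
        self.threshold = threshold        # scores below this count as incompatible
        self.history: Dict[str, Deque[float]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def record(self, sample_id: str, compatibility: float) -> None:
        """Store the latest compatibility score observed for a sample."""
        self.history[sample_id].append(compatibility)

    def due(self, step: int) -> bool:
        """True only after warm-up, and then only once every `every` steps."""
        past_warmup = step >= self.warmup_steps
        return past_warmup and (step - self.warmup_steps) % self.every == 0

    def candidates(self, step: int) -> List[str]:
        """Samples whose whole recent window stayed below the threshold."""
        if not self.due(step):
            return []
        return sorted(
            sample_id
            for sample_id, scores in self.history.items()
            if len(scores) == self.window and max(scores) < self.threshold
        )
```

Requiring a full window of low scores is what makes a sample "persistently" incompatible in this sketch: a demonstration that was unlikely once but has since become learnable is not flagged.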