SLAP: Stratified Loss-based Pruning for On-Policy Data-Efficient Instruction Tuning

arXiv cs.CL·Run Zou, Jianhang Ding, Yifan Ding, Wen Wu, Hao Chen, Renshu Gu

5/26/2026

·~1 min·5/26/2026·en·1

Quick Answer

Quick Take

SLAP introduces a batch-aware data selection framework that optimizes instruction tuning for large language models like LLaMA and ChatGLM, achieving 20-40% less training data while maintaining performance. This method significantly reduces computational costs and improves model capabilities across diverse tasks such as multi-turn dialogue and multilingual translation.

Key Points

SLAP evaluates entire batch compositions for improved data selection efficiency.
Utilizes distribution-aware stratified sampling for comprehensive data coverage.
Achieves superior results with 20-40% less training data than traditional methods.
Outperforms state-of-the-art techniques across various model architectures.
Reduces computational costs while enhancing model performance.

Paper Resources

Read Paperarxiv.org View PDFarxiv.org

Article Content

From source RSS / original summary

arXiv:2605. 23969v1 Announce Type: new Abstract: Instruction tuning has optimized the specialized capabilities of large language models (LLMs), but it often requires extensive datasets and prolonged training times. The challenge lies in developing specific capabilities by identifying useful data and efficiently fine-tuning. High-quality and diverse pruned data can help models achieve lossless performance at a lower cost.

In this paper, we propose \textbf{SLAP}, a novel batch-aware data selection framework that evaluates the learnability of entire batch compositions rather than individual. SLAP ensures comprehensive data distribution coverage through distribution-aware stratified sampling while maximizing intra-batch diversity through relative distance optimization.

By leveraging Hessian-approximated gradient information for dynamic batch selection, SLAP significantly outperforms existing state-of-the-art methods across multiple model architectures (LLaMA, ChatGLM) and diverse downstream tasks including multi-turn dialogue, multilingual translation, and question answering.

Most notably, SLAP achieves superior performance with 20-40\% less training data compared to full dataset training, substantially reducing computational costs while maintaining or improving model capabilities. These results establish SLAP as a powerful approach for efficient and effective instruction tuning of large language models.

Read on arxiv.org

Want this in your inbox every morning?

Daily brief at your local 8am — bilingual EN/中文, free.

Subscribe — it's free

More from arXiv cs.CL

See more →

arXiv cs.CL·Miguel Arana-Catania, Catherine Conisbee, Matthew Kidd

2d ago

FeaturedOriginal

Letting the Data Speak: Extracting Keywords from Crowdsourced Collections with AI

AI Summary

The study evaluates three NLP approaches—Named Entity Recognition, Keyword Extraction, and Topic Modelling—using the Their Finest Hour Online Archive to automate keyword extraction from crowdsourced WWII collections. Findings suggest that while NLP methods show promise, no single approach is sufficient, and ethical considerations in automated keyword extraction are crucial for responsible stewardship.

#AI Coding #Inference #Open Source #Policy

SLAP: Stratified Loss-based Pruning for On-Policy Data-Efficient Instruction Tuning

Quick Answer

Quick Take

Key Points

Paper Resources

Article Content

Want this in your inbox every morning?

More from arXiv cs.CL

Letting the Data Speak: Extracting Keywords from Crowdsourced Collections with AI

Quantifying Prior Dominance in Systems

Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?

Quick Answer

Quick Take

Key Points

Paper Resources

Article Content

Want this in your inbox every morning?

More from arXiv cs.CL

Letting the Data Speak: Extracting Keywords from Crowdsourced Collections with AI

Quantifying Prior Dominance in RAG Systems

Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?

Quantifying Prior Dominance in Systems