SLAT: Segment-Level Adaptive Trimming for Efficient CoT Reasoning
Quick Take
SLAT (Segment-Level Adaptive Trimming) is a reinforcement learning framework for making chain-of-thought reasoning in large reasoning models more efficient. It cuts reasoning length by 50% relative to uncompressed baselines while keeping accuracy competitive. Rather than penalizing every token equally, it selectively suppresses redundant segments, addressing the structural redundancy ("overthinking") that inflates reasoning chains without improving answer correctness. Empirical results on standard benchmarks indicate that SLAT achieves a superior accuracy-efficiency trade-off.
Key Points
- SLAT reduces reasoning length by 50% compared to uncompressed baselines.
- The framework targets high-probability segments with low marginal utility.
- Empirical results indicate a superior accuracy-efficiency Pareto frontier.
- Existing methods typically rely on token-uniform length penalties, which are coarse and segment-agnostic and can suppress useful reasoning along with redundancy.
- SLAT is grounded in a theoretical characterization of segment suboptimality.
Article Content
From source RSS / original summary
arXiv:2605.30832v1 Announce Type: new
Abstract: Recent advances in Large Reasoning Models have significantly improved chain-of-thought (CoT) capabilities via reinforcement learning (RL). However, generated reasoning chains frequently suffer from structural redundancy (i.e., overthinking), incurring high computational overhead without improving answer correctness.
Existing mitigation strategies typically rely on token-uniform length penalties, which provide coarse, segment-agnostic pressure toward shorter outputs and can inadvertently suppress useful reasoning alongside redundancy. To address this, we demonstrate that inefficiency concentrates in high-probability segments with low marginal utility.
We derive a theoretical characterization of segment suboptimality under the correctness-length trade-off objective and propose SLAT (Segment-Level Adaptive Trimming), an RL framework that selectively suppresses redundant segments based on this criterion. Empirical results on standard benchmarks indicate that SLAT establishes a superior accuracy-efficiency Pareto frontier, reducing reasoning length by 50% relative to uncompressed baselines while maintaining competitive accuracy.
Overall, our results suggest that theoretically grounded, segment-aware trimming is a promising direction for efficient CoT reasoning in large language models.
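The contrast the abstract draws between a token-uniform length penalty and segment-aware trimming can be illustrated with a small sketch. This is not the paper's method: the abstract does not state SLAT's actual suboptimality criterion, so the Segment fields, the thresholds in is_redundant, and the reward shape below are all hypothetical stand-ins chosen only to show the difference in where the length pressure lands.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One reasoning segment. Every field is a hypothetical stand-in."""
    n_tokens: int            # segment length in tokens
    mean_logprob: float      # average token log-probability under the policy
    marginal_utility: float  # assumed estimate of the segment's gain in correctness


def uniform_penalty(segments: List[Segment], lam: float) -> float:
    """Token-uniform length penalty: every token costs the same."""
    return lam * sum(s.n_tokens for s in segments)


def is_redundant(seg: Segment,
                 logprob_floor: float = -0.5,
                 utility_ceiling: float = 0.01) -> bool:
    """Assumed criterion: high probability AND low marginal utility.

    The thresholds are illustrative, not taken from the paper.
    """
    return (seg.mean_logprob >= logprob_floor
            and seg.marginal_utility <= utility_ceiling)


def segment_penalty(segments: List[Segment], lam: float) -> float:
    """Segment-aware penalty: only tokens in redundant segments are charged."""
    return lam * sum(s.n_tokens for s in segments if is_redundant(s))


def reward(correct: bool, penalty: float) -> float:
    """Correctness minus length penalty (a generic trade-off objective)."""
    return (1.0 if correct else 0.0) - penalty


# A toy three-segment chain: useful, redundant, useful.
chain = [
    Segment(n_tokens=40, mean_logprob=-1.2, marginal_utility=0.30),
    Segment(n_tokens=60, mean_logprob=-0.1, marginal_utility=0.00),
    Segment(n_tokens=20, mean_logprob=-0.8, marginal_utility=0.15),
]

uniform = reward(True, uniform_penalty(chain, lam=0.001))
segment_aware = reward(True, segment_penalty(chain, lam=0.001))
```

In this toy chain the uniform penalty charges all 120 tokens, including the two segments that contribute to the answer, while the segment-aware penalty charges only the 60 tokens of the middle segment. That is the sense in which a segment-agnostic penalty can suppress useful reasoning alongside redundancy.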