SLAT: Segment-Level Adaptive Trimming for Efficient CoT Reasoning

arXiv cs.AI·Jian Yao, Xiongcai Luo, Ran Cheng, Kay Chen Tan

6/1/2026

Quick Take

SLAT (Segment-Level Adaptive Trimming) is a reinforcement learning framework for shortening chain-of-thought reasoning in Large Reasoning Models. It reduces reasoning length by 50% relative to uncompressed baselines while maintaining competitive accuracy. Rather than penalizing every token equally, it selectively suppresses redundant segments, targeting the structural redundancy (overthinking) that adds computational overhead without improving answer correctness. The authors report a superior accuracy-efficiency Pareto frontier on standard benchmarks.

Key Points

SLAT reduces reasoning length by 50% compared to uncompressed baselines.
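The framework targets high-probability segments with low marginal utility, where the authors show inefficiency concentrates.
Empirical results on standard benchmarks indicate a superior accuracy-efficiency Pareto frontier, with accuracy remaining competitive.
Existing methods typically rely on token-uniform length penalties, which are coarse and segment-agnostic and can suppress useful reasoning along with redundancy.
SLAT is grounded in a theoretical characterization of segment suboptimality under the correctness-length trade-off objective.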

Article Content

From source RSS / original summary

arXiv:2605.30832v1 Announce Type: new Abstract: Recent advances in Large Reasoning Models have significantly improved chain-of-thought (CoT) capabilities via reinforcement learning (RL). However, generated reasoning chains frequently suffer from structural redundancy (i.e., overthinking), incurring high computational overhead without improving answer correctness.

Existing mitigation strategies typically rely on token-uniform length penalties, which provide coarse, segment-agnostic pressure toward shorter outputs and can inadvertently suppress useful reasoning alongside redundancy. To address this, we demonstrate that inefficiency concentrates in high-probability segments with low marginal utility.

We derive a theoretical characterization of segment suboptimality under the correctness-length trade-off objective and propose SLAT (Segment-Level Adaptive Trimming), an RL framework that selectively suppresses redundant segments based on this criterion. Empirical results on standard benchmarks indicate that SLAT establishes a superior accuracy-efficiency Pareto frontier, reducing reasoning length by 50% relative to uncompressed baselines while maintaining competitive accuracy.

Overall, our results suggest that theoretically grounded, segment-aware trimming is a promising direction for efficient CoT reasoning in large language models.
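
The contrast the abstract draws between a token-uniform length penalty and segment-selective suppression can be illustrated with a toy sketch. This is not the SLAT algorithm: the abstract does not state the suboptimality criterion or how segments are scored, so the `Segment` fields, the two thresholds, the weight `lam`, and both function names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One contiguous chunk of a reasoning chain (hypothetical structure)."""
    num_tokens: int
    mean_token_prob: float   # assumed: average model probability of the segment's tokens
    marginal_utility: float  # assumed: estimate of how much the segment helps correctness


def token_uniform_penalty(segments: List[Segment], lam: float) -> List[float]:
    """Baseline: every token costs the same, so the penalty depends only on length."""
    return [lam * s.num_tokens for s in segments]


def segment_selective_penalty(
    segments: List[Segment],
    lam: float,
    prob_threshold: float = 0.9,      # assumed cutoff for "high-probability"
    utility_threshold: float = 0.05,  # assumed cutoff for "low marginal utility"
) -> List[float]:
    """Penalize only segments flagged as both high-probability and low-utility."""
    penalties = []
    for s in segments:
        redundant = (
            s.mean_token_prob >= prob_threshold
            and s.marginal_utility <= utility_threshold
        )
        penalties.append(lam * s.num_tokens if redundant else 0.0)
    return penalties


# Made-up example chain: numbers are illustrative, not taken from the paper.
chain = [
    Segment(num_tokens=40, mean_token_prob=0.70, marginal_utility=0.30),  # useful derivation
    Segment(num_tokens=60, mean_token_prob=0.95, marginal_utility=0.01),  # repeated re-check
    Segment(num_tokens=20, mean_token_prob=0.92, marginal_utility=0.20),  # confident but useful
]

uniform = token_uniform_penalty(chain, lam=0.01)
selective = segment_selective_penalty(chain, lam=0.01)
print("uniform:  ", uniform)
print("selective:", selective)
```

In this toy setting the uniform penalty pushes on all three segments, including the useful derivation, while the selective penalty falls only on the second segment, which is the one that is both confidently generated and of little use. How the paper actually estimates marginal utility and applies the signal during RL training is not described in the abstract.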

