Query-Adaptive Semantic Chunking for Retrieval-Augmented Generation: A Dynamic Strategy with Contextual Window Expansion
Quick Take
Query-Adaptive Semantic Chunking (QASC) improves Retrieval-Augmented Generation systems by building document chunks dynamically around the user's query. Evaluated on 100 technical documents and 200 queries, QASC achieved an F1-score of 0.85, a relative improvement of 18-27% over fixed chunking and 8-12% over semantic and agentic methods. Human evaluation indicates that QASC chunks are more relevant and coherent than those of existing methods.
Key Points
- QASC integrates user queries into the chunking process for improved relevance.
- Achieved an F1-score of 0.85 on 200 queries spanning four query types.
- Improved on fixed chunking by 18-27% (relative) and on semantic and agentic methods by 8-12%.
- Combines cosine similarity scoring, contextual window expansion, and chunk-level score aggregation to build chunks.
- Human evaluators rated QASC chunks as more relevant and coherent than those of existing methods.
Article Content
From source RSS / original summary
arXiv:2605.22834v1 Announce Type: new
Abstract: Retrieval-Augmented Generation (RAG) systems depend critically on document chunking quality for retrieving relevant context. Fixed chunking segments documents into uniform units irrespective of semantics or user intent, producing a precision-recall trade-off unresolvable by tuning chunk size alone. Semantic and agentic methods partially address these limitations but do not integrate user queries at the chunking stage.
We present Query-Adaptive Semantic Chunking (QASC), which dynamically constructs chunks by integrating queries into segmentation through three mechanisms: cosine similarity scoring between sentence and query embeddings to identify seed sentences, contextual window expansion around seeds to preserve coherence, and chunk-level score aggregation to ensure holistic relevance.
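The three mechanisms can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a real sentence encoder, and the names `qasc_chunks`, `seed_threshold`, and `window`, as well as the choice of the mean as the aggregation function, are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a real system would use a sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def qasc_chunks(sentences, query, seed_threshold=0.3, window=1):
    """Return (chunk_text, chunk_score) pairs, highest score first.

    seed_threshold and window are hypothetical parameters, not values from the paper.
    """
    q = embed(query)
    scores = [cosine(embed(s), q) for s in sentences]

    # 1. Seed sentences: those whose similarity to the query clears the threshold.
    seeds = [i for i, sc in enumerate(scores) if sc >= seed_threshold]

    # 2. Contextual window expansion: grow each seed by `window` sentences on
    #    both sides, merging spans that overlap or touch.
    spans = []
    for i in seeds:
        lo = max(0, i - window)
        hi = min(len(sentences) - 1, i + window)
        if spans and lo <= spans[-1][1] + 1:
            spans[-1][1] = max(spans[-1][1], hi)
        else:
            spans.append([lo, hi])

    # 3. Chunk-level aggregation: mean sentence score (an assumed choice).
    chunks = []
    for lo, hi in spans:
        agg = sum(scores[lo:hi + 1]) / (hi - lo + 1)
        chunks.append((" ".join(sentences[lo:hi + 1]), agg))
    return sorted(chunks, key=lambda c: c[1], reverse=True)


sentences = [
    "Cats sleep a lot.",
    "Chunk size affects retrieval precision.",
    "Smaller chunks raise precision but lose context.",
    "The weather is nice.",
    "Bananas are yellow.",
    "Pizza is tasty.",
]
for text, score in qasc_chunks(sentences, "How does chunk size affect precision?"):
    print(f"{score:.3f}  {text}")
```

In this toy run only the second sentence clears the threshold, and the window pulls in its two neighbours so the chunk keeps surrounding context even though one neighbour is off-topic. A larger `window` trades precision for coherence, which is the tension the aggregation step is meant to keep in check.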
We evaluate QASC on 100 technical documents across 200 queries spanning four types, comparing against fixed chunking at five granularities, recursive splitting, semantic chunking, and agentic chunking. QASC achieves an F1-score of 0.85, a relative improvement of 18-27% over fixed chunking and 8-12% over semantic and agentic alternatives. Ablation studies confirm each component contributes meaningfully. Human evaluation by three annotators (Cohen's kappa = 0.82) corroborates that QASC produces more relevant and coherent chunks than existing methods.
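To put the relative improvements in perspective, the short calculation below backs out the baseline F1-scores they would imply. It assumes "relative improvement" means (new - baseline) / baseline; the summary does not report baseline scores, so these are derived figures, not results from the paper. The function name `implied_baseline` is hypothetical.

```python
def implied_baseline(f1_new, rel_improvement):
    """Baseline F1 implied by a relative gain, assuming gain = (new - base) / base."""
    return f1_new / (1.0 + rel_improvement)


QASC_F1 = 0.85  # reported in the abstract

# Relative-improvement ranges quoted in the abstract.
ranges = {
    "fixed chunking": (0.18, 0.27),
    "semantic / agentic": (0.08, 0.12),
}

for name, (low_gain, high_gain) in ranges.items():
    # A larger gain implies a weaker baseline, so high_gain gives the lower bound.
    lower = implied_baseline(QASC_F1, high_gain)
    upper = implied_baseline(QASC_F1, low_gain)
    print(f"{name}: implied baseline F1 between {lower:.3f} and {upper:.3f}")
```

Under that assumption, fixed chunking would sit at roughly 0.67 to 0.72 F1 and the semantic and agentic baselines at roughly 0.76 to 0.79.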