MASF: A Multi-Model Adaptive Selection Framework for Abstractive Text Summarization
Quick Answer
This paper shows that the Multi-Model Adaptive Selection Framework (MASF) improves abstractive text summarization by integrating multiple fine-tuned transformer models. It achieves a BERTScore of 88.63% and outperforms LLMs such as GPT3-D2 and Falcon-7b.
Quick Take
MASF targets the inconsistent summarization quality that a single model shows across articles with varying structures and topics. Several fine-tuned transformer models each generate a candidate summary for the same article, and the framework outputs the highest-scoring one.
Key Points
- MASF integrates multiple transformer-based models for improved summarization quality.
- Achieves a BERTScore of 88.63%, the highest among the compared methods.
- Outperforms several LLMs, including GPT3-D2, Falcon-7b, and Mpt-7b.
- Uses an adaptive selection mechanism to choose the final summary.
- Fine-tuned and evaluated on the CNN/DailyMail news summarization dataset.
Article Content
From source RSS / original summary (arXiv:2606.05494v1). Abstract: Automatic text summarization has become increasingly important due to the rapid growth of digital textual information. This paper presents a Multi-Model Adaptive Summarization Framework designed to improve the robustness and quality of abstractive text summarization. Relying on a single model often leads to inconsistent summarization quality across articles with varying structures and topics.
To address this limitation, the proposed framework integrates multiple fine-tuned transformer-based summarization models and introduces an adaptive selection mechanism. In this framework, each model independently generates a candidate summary for the same input article. The generated summaries are then evaluated using automatic evaluation metrics that capture both lexical similarity and semantic relevance. Based on these scores, the framework selects the highest-quality summary as the final output.
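The generate, score, and select loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the exact metrics, how they are combined, or what text the candidates are scored against. The names `unigram_f1`, `select_summary`, `semantic_score`, and `lexical_weight` are all hypothetical, the lexical metric is a simple unigram F1 standing in for a ROUGE-style score, and the semantic metric is left as a pluggable function.

```python
from collections import Counter
from typing import Callable, Dict, Tuple


def unigram_f1(candidate: str, reference: str) -> float:
    """Lexical similarity: F1 over lowercased whitespace tokens.

    A simple stand-in for a ROUGE-1-style score (assumption, not the paper's metric).
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_summary(
    article: str,
    reference: str,
    models: Dict[str, Callable[[str], str]],
    semantic_score: Callable[[str, str], float],
    lexical_weight: float = 0.5,
) -> Tuple[str, str, float]:
    """Return (model name, summary, score) for the best-scoring candidate.

    `models` maps a name to any function that turns an article into a summary.
    `reference` is whatever text the candidates are scored against (assumption).
    The weighted sum of the two metrics is also an assumption.
    """
    if not models:
        raise ValueError("at least one summarization model is required")
    best: Tuple[str, str, float] = ("", "", float("-inf"))
    for name, summarize in models.items():
        candidate = summarize(article)  # each model works independently
        score = (
            lexical_weight * unigram_f1(candidate, reference)
            + (1.0 - lexical_weight) * semantic_score(candidate, reference)
        )
        if score > best[2]:
            best = (name, candidate, score)
    return best


if __name__ == "__main__":
    # Toy stand-ins for fine-tuned summarizers; real models would go here.
    toy_models = {
        "model_a": lambda text: "the cat sat on the mat",
        "model_b": lambda text: "stocks fell sharply",
    }
    # Placeholder semantic metric that contributes nothing to the score.
    no_semantics = lambda cand, ref: 0.0
    print(select_summary("article text", "the cat sat on a mat", toy_models, no_semantics))
```

In practice, `semantic_score` could wrap an embedding-based metric such as BERTScore, and each entry in `models` could wrap a fine-tuned transformer's generation call. Those choices are left open here because the abstract does not specify them.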
The models are fine-tuned and evaluated on the widely used CNN/DailyMail news summarization dataset. Experimental results demonstrate that the proposed framework achieves the highest BERTScore among all compared methods with a score of 88.63%. It also outperforms several LLMs such as GPT3-D2, Falcon-7b, and Mpt-7b, highlighting its effectiveness and robustness.
These findings highlight the effectiveness of leveraging multiple transformer-based models within an adaptive selection strategy to improve the quality and robustness of automatic text summarization systems.