Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study

arXiv cs.CL·Shahana Akter, Yatharth Vohra, Ankita Shukla, Souvika Sarkar

6/1/2026

Quick Take

This study presents a zero-shot multi-label topic classification framework and tests whether per-article knowledge graphs improve it. Among the base methods, keyword-enhanced classification performs best, and six of fifteen LLMs exceed the sentence-encoder baseline. Graph augmentation helps small models but hurts large ones, and self-consistency decoding yields no improvement while raising computational cost about fivefold.

Key Points

The base framework has four variants: article-only, keyword-enhanced, and a self-consistency decoding version of each. Augmenting each with a per-article knowledge graph gives eight methods in total.
Keyword-enhanced classification (AK) is the top performer among the base methods.
Graph augmentation shows mixed effects, benefiting smaller models but hindering larger ones.
Self-consistency decoding raises computation cost about fivefold without improving performance in any experiment.
Tested on fifteen LLMs and eight multi-label datasets across various domains.

Article Content

From source RSS / original summary

arXiv:2605.30465v1. Abstract: Multi-label topic classification without labeled training data is a challenging task, especially when documents contain complex relational information. We present a zero-shot multi-label topic classification framework and systematically investigate how per-article knowledge graph augmentation affects its performance.

The base framework classifies topics in documents without labeled training data and has four variants: article-only classification, keyword-enhanced classification, and self-consistency decoding variants of both. We then augment each base variant with a per-article knowledge graph, extracted from the input document as subject-predicate-object triples by a pipeline similar to KGGen.
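As a rough illustration of how these variants differ in what the model sees, the sketch below assembles a classification prompt from an article, optionally adding keywords and graph triples. Everything in it is hypothetical: the `Triple` class, the `build_prompt` function, the prompt wording, and the assumption that keywords are supplied per topic are illustrative choices, not the authors' implementation or the KGGen API.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object edge of a per-article graph (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


def build_prompt(
    article: str,
    labels: Sequence[str],
    keywords: Optional[Mapping[str, Sequence[str]]] = None,
    triples: Optional[Sequence[Triple]] = None,
) -> str:
    """Assemble a zero-shot multi-label prompt.

    keywords=None, triples=None -> article-only input
    keywords given              -> keyword-enhanced input (assumed per-topic keywords)
    triples given               -> graph-augmented version of either input
    """
    parts = [
        "Assign every topic that applies to the article.",
        "Candidate topics: " + ", ".join(labels),
    ]
    if keywords:
        parts.append("Topic keywords:")
        parts.extend(
            f"- {label}: {', '.join(keywords.get(label, []))}" for label in labels
        )
    if triples:
        parts.append("Knowledge graph (subject | predicate | object):")
        parts.extend(f"- {t.subject} | {t.predicate} | {t.obj}" for t in triples)
    parts.append("Article: " + article)
    return "\n".join(parts)
```

In this sketch the self-consistency variants would reuse the same prompt and differ only in how many times the model is sampled.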

We test all eight methods (four base and four graph-augmented) on fifteen LLMs and eight multi-label datasets across different domains. For the base framework, keyword-enhanced classification (AK) is the best-performing method, and six out of fifteen LLMs surpass the sentence-encoder baseline. Graph augmentation helps small models and hurts large ones. This shows that larger models already contain enough relational information from pretraining.
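To make the size of this comparison concrete, the sketch below enumerates the eight methods as four base variants crossed with graph augmentation on or off, then counts the combinations with fifteen models and eight datasets. The variant identifiers are hypothetical labels (the abstract only names AK), and treating every method, model, and dataset combination as one evaluation run is an assumption about the experimental design.

```python
from itertools import product

# Hypothetical identifiers for the four base variants described in the abstract.
BASE_VARIANTS = [
    "article_only",
    "keyword_enhanced",
    "article_only_self_consistency",
    "keyword_enhanced_self_consistency",
]
GRAPH_OPTIONS = [False, True]  # without / with the per-article knowledge graph


def method_grid():
    """Return every (base_variant, uses_graph) pair: 4 x 2 = 8 methods."""
    return list(product(BASE_VARIANTS, GRAPH_OPTIONS))


def count_runs(n_models=15, n_datasets=8):
    """Count method/model/dataset combinations, assuming a full cross product."""
    return len(method_grid()) * n_models * n_datasets


print(len(method_grid()))  # 8
print(count_runs())        # 960
```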

Furthermore, the self-consistency decoding variant does not show performance improvements in any experiment while increasing computation costs about fivefold.
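The cost increase follows from how self-consistency decoding works: the model is sampled several times and the answers are aggregated, so each article costs one model call per sample. The sketch below is a minimal multi-label version that keeps a label only if it appears in more than half of the samples. The function name, the majority threshold, and the default of five samples are assumptions chosen to match the roughly fivefold cost reported above; the abstract does not specify the aggregation rule or the sample count.

```python
from collections import Counter
from typing import Callable, Set


def self_consistent_labels(
    sample_fn: Callable[[], Set[str]],
    n_samples: int = 5,
    min_share: float = 0.5,
) -> Set[str]:
    """Call sample_fn n_samples times and keep labels voted for in more
    than min_share of the samples. Cost scales linearly with n_samples."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    votes: Counter = Counter()
    for _ in range(n_samples):
        # set() guards against a sampler that repeats a label in one answer.
        votes.update(set(sample_fn()))
    return {label for label, count in votes.items() if count / n_samples > min_share}


# Usage with a stand-in sampler that replays fixed answers instead of calling an LLM.
fake_answers = iter([
    {"health", "politics"},
    {"health"},
    {"health", "sports"},
    {"politics", "health"},
    {"health"},
])
print(sorted(self_consistent_labels(lambda: next(fake_answers))))  # ['health']
```

Here "health" receives five of five votes and is kept, while "politics" (two of five) and "sports" (one of five) fall below the majority threshold.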
