Graph-Augmented Retrieval for Cross-Entity Financial Sentiment Analysis: A Comparative Study
Quick Take
The study compares a two-hop Graph-RAG architecture against a vector-only baseline for cross-entity financial sentiment analysis. Graph-RAG improves entity recall by 6.4% and answer relevancy on complex multi-entity queries by 11.7%. Answer quality, measured by semantic similarity, is essentially unchanged, while mean latency rises by 22.6%. The results offer guidance for building RAG systems in multi-entity financial contexts.
Key Points
- Graph-RAG outperforms vector-only systems with a 6.4% increase in entity recall.
- Achieves an 11.7% improvement in answer relevancy for complex multi-entity queries.
- Mean latency increases by 22.6%, while latency variance falls by 80%.
- An ablation identifies tau = 0.5 as the optimal graph traversal intensity threshold for answer quality, below the production default of tau = 0.7.
- Study provides guidance for practitioners in multi-entity financial analysis.
Article Content
arXiv:2606.00062v1. Abstract: Retrieval-Augmented Generation (RAG) has become foundational for grounding large language models in domain-specific corpora, yet conventional vector-based RAG systems are fundamentally limited in their ability to capture the structured, multi-entity relationships that underpin financial market analysis.
This paper presents a comprehensive comparative study of a novel two-hop Graph-RAG architecture versus a standard vector-only baseline for cross-entity financial sentiment analysis. Our system constructs a sentiment-weighted knowledge graph of 59 equity entities from 255 news articles covering 10 major technology stocks, then augments dense retrieval with intensity-filtered graph traversal over INFLUENCES edges to surface relational evidence inaccessible to vector search alone.
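The intensity-filtered two-hop traversal described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the entity names, edge intensities, the `build_graph` and `two_hop_neighbors` helpers, and the choice to keep edges with intensity greater than or equal to tau are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical INFLUENCES edges: (source, target, intensity in [0, 1]).
# Names and weights are invented for illustration, not taken from the paper.
EDGES = [
    ("ChipCo", "FoundryCo", 0.9),
    ("ChipCo", "RivalCo", 0.6),
    ("FoundryCo", "ToolCo", 0.8),
    ("RivalCo", "LegacyCo", 0.4),
    ("FoundryCo", "PhoneCo", 0.55),
]


def build_graph(edges):
    """Build an adjacency list: source -> [(target, intensity), ...]."""
    graph = defaultdict(list)
    for src, dst, intensity in edges:
        graph[src].append((dst, intensity))
    return graph


def two_hop_neighbors(graph, seeds, tau, hops=2):
    """Return entities reachable from the seeds within `hops` steps,
    following only edges whose intensity is >= tau (assumed filter rule)."""
    seeds = set(seeds)
    frontier = set(seeds)
    reached = set()
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for dst, intensity in graph.get(node, []):
                if intensity >= tau and dst not in seeds and dst not in reached:
                    next_frontier.add(dst)
        reached |= next_frontier
        frontier = next_frontier
    return reached


graph = build_graph(EDGES)
strict = two_hop_neighbors(graph, ["ChipCo"], tau=0.7)
loose = two_hop_neighbors(graph, ["ChipCo"], tau=0.5)
```

On this toy graph, tau = 0.7 reaches only FoundryCo and ToolCo, while tau = 0.5 also pulls in RivalCo and PhoneCo. In a full pipeline, the entities returned here would presumably be used to fetch extra passages alongside the dense-retrieval results.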
We evaluate both architectures on 100 grounded queries (30 Direct, 70 Relational) using semantic similarity, entity recall, RAGAS metrics, latency benchmarks, and ablation studies. Graph-RAG achieves a statistically significant improvement in entity recall (+6.4%, p < 0.001, Wilcoxon signed-rank) and delivers substantially more relevant answers for complex multi-entity queries (+11.7% Answer Relevancy), with gains concentrating in relational question types (+16.1%).
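The abstract does not define entity recall precisely. One plausible reading is the fraction of a query's expected entities that appear in the generated answer, sketched below. The `entity_recall` helper, its case-insensitive substring matching, and the example strings are assumptions for illustration only.

```python
def entity_recall(expected_entities, answer):
    """Fraction of expected entities mentioned in the answer text.

    Matching is a naive case-insensitive substring check (an assumption);
    a real evaluation would likely use entity linking or alias tables.
    """
    expected = {e.lower() for e in expected_entities}
    if not expected:
        return 1.0  # convention assumed here: nothing to recall
    text = answer.lower()
    found = {e for e in expected if e in text}
    return len(found) / len(expected)


# Hypothetical query with three expected entities; the answer mentions two.
score = entity_recall(
    ["ChipCo", "FoundryCo", "ToolCo"],
    "ChipCo demand is lifting FoundryCo order volumes.",
)
```

Computing this score per query for both systems gives paired samples, which is the input a paired test such as `scipy.stats.wilcoxon` expects.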
Critically, these improvements come at no measurable cost to answer quality (delta = +0.001 semantic similarity, Cohen's d = 0.078), with a modest 22.6% increase in mean latency offset by an 80% reduction in latency variance. An ablation study on the graph traversal intensity threshold reveals an inverted-U relationship with answer quality, identifying tau = 0.5 as optimal over the production default of tau = 0.7.
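To make the effect-size and latency figures concrete, the sketch below computes a paired Cohen's d and a relative change. The abstract does not say which Cohen's d variant was used, so the paired form (mean difference divided by the standard deviation of the differences) is an assumption, as are the helper names and the toy numbers.

```python
from statistics import mean, stdev


def cohens_d_paired(baseline, treatment):
    """Paired Cohen's d: mean of per-query differences over their sample
    standard deviation. The variant is an assumption, not from the paper."""
    diffs = [t - b for b, t in zip(baseline, treatment)]
    return mean(diffs) / stdev(diffs)


def relative_change(old, new):
    """Relative change from old to new, e.g. 0.226 for a 22.6% increase."""
    return (new - old) / old


# Toy per-query scores, invented for illustration.
d = cohens_d_paired([1, 2, 3, 4], [2, 2, 4, 4])

# A mean latency moving from 100.0 to 122.6 (arbitrary units) is a 22.6% rise.
latency_increase = relative_change(100.0, 122.6)
```

By the usual rule of thumb, a d below 0.2 is considered negligible, which is consistent with the abstract's reading of d = 0.078 as no measurable change in answer quality.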
These findings characterize a precision-for-coverage trade-off inherent to graph-augmented retrieval and provide actionable architectural guidance for practitioners building RAG systems for multi-entity financial analysis.