Beyond Sentiment Classification: A Generative Framework for Emotion Intensity Evaluation in Text
Quick Take
A new generative framework scores emotion intensity in text on a continuous 0 to 100 scale, moving sentiment analysis beyond discrete classification.
Key Points
- The focus shifts from identifying emotions to evaluating their intensity.
- Fine-tuned open-weight generative models output continuous intensity scores from 0 to 100.
- The framework outperforms classification baselines and transfers to related constructs such as sentiment and arousal.
Abstract: We introduce a novel approach to emotion modeling that shifts the focus from
identification to evaluation, addressing the limitations of discrete classification in
applied domains such as finance. By constructing a dataset of emotional intensity
scores and fine-tuning open-weight generative language models to output continuous
values from 0 to 100, we demonstrate a more expressive, generalizable framework for
sentiment and emotion analysis. Our findings not only outperform classification
baselines but also reveal surprising generalization capabilities and transfer effects
to related constructs such as sentiment and arousal. This work contributes to the
interdisciplinary recontextualization of NLP by introducing emotion intensity
evaluation as an alternative to classification, arguing that this shift better aligns
with the needs of domains, such as finance, where the degree of emotional content is
central to interpretation and decision-making.
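
The abstract describes fine-tuning generative models to emit a continuous score from 0 to 100 rather than a class label. The sketch below illustrates one way such a setup could be wired together: formatting a training pair, parsing a number out of generated text, and scoring predictions against gold values. The prompt wording, the function names (`build_example`, `parse_score`, `mean_absolute_error`), the clamping rule, and the choice of mean absolute error are all assumptions made for illustration; the abstract does not specify any of them.

```python
import re
from typing import Optional, Sequence

# Hypothetical prompt template; the paper's actual prompt is not given.
PROMPT_TEMPLATE = (
    "Rate the intensity of {emotion} in the following text "
    "on a scale from 0 to 100. Answer with a single number.\n"
    "Text: {text}\nScore:"
)


def build_example(text: str, emotion: str, score: float) -> dict:
    """Format one (prompt, completion) pair for supervised fine-tuning."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie in [0, 100]")
    return {
        "prompt": PROMPT_TEMPLATE.format(emotion=emotion, text=text),
        # The numeric target is written as plain text for the model to generate.
        "completion": f" {score:g}",
    }


_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_score(generated: str) -> Optional[float]:
    """Extract the first number in the generated text, clamped to [0, 100].

    Returns None when the text contains no number at all.
    """
    match = _NUMBER.search(generated)
    if match is None:
        return None
    return min(100.0, max(0.0, float(match.group())))


def mean_absolute_error(
    predicted: Sequence[Optional[float]], gold: Sequence[float]
) -> float:
    """Average absolute gap between predictions and gold scores.

    Unparseable predictions (None) are skipped, which is an assumption;
    a stricter evaluation might penalize them instead.
    """
    pairs = [(p, g) for p, g in zip(predicted, gold) if p is not None]
    if not pairs:
        raise ValueError("no parseable predictions")
    return sum(abs(p - g) for p, g in pairs) / len(pairs)
```

Treating the score as generated text keeps the model architecture unchanged, at the cost of needing a parsing step such as `parse_score` to handle outputs that are malformed or out of range.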
| Comments: | 10 pages, no figures, 5 tables |
| Subjects: | Computation and Language (cs.CL); General Economics (econ.GN); General Finance (q-fin.GN) |
| Cite as: | arXiv:2605.16613 [cs.CL] |
| | (or arXiv:2605.16613v1 [cs.CL] for this version) |
| | https://doi.org/10.48550/arXiv.2605.16613 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: William Goetzmann
[v1] Fri, 15 May 2026 20:32:29 UTC (34 KB)
— Originally published at arxiv.org