Constrained Semantic Decompression in LLMs through Persian Proverb-Conditioned Story Generation

arXiv cs.CL·Zahra Habibzadeh, Paria Khoshtab, Amir Mesbah, Yadollah Yaghoobzadeh

6/12/2026

Quick Answer

This study introduces the Proverb Aligned Narrative Dataset (PAND) for proverb-conditioned story generation in Persian, revealing a significant 'decompression gap' in LLMs.

Quick Take

Current LLMs produce fluent stories but struggle to convey the moral and causal structure of proverbs accurately, which points to a need for better reasoning and refinement techniques.

Key Points

PAND pairs Persian proverbs with human-written stories and meanings.
LLMs often show strong surface-level fluency but fail to capture the underlying moral and causal structure of proverbs.
Explicit reasoning and iterative refinement can partially reduce decompression errors in narratives.
The task can extend to other forms of compressed cultural knowledge.

Article Content

From source RSS / original summary

arXiv:2606.12599v1. Abstract: Transforming a dense, abstract proverb into an engaging and morally faithful narrative requires deep cultural understanding and robust semantic grounding. We frame this problem as a 'constrained semantic decompression' task and study proverb-conditioned story generation as a testbed for abstraction-to-realization in large language models (LLMs).

Focusing on Persian, we introduce the Proverb Aligned Narrative Dataset (PAND), pairing proverbs with human-written stories and explicit meanings. Using a hybrid evaluation framework that combines a human-calibrated LLM-as-a-Judge with structural metrics, we analyze model behavior across multiple prompting regimes.
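To make the dataset and evaluation description more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract only says that PAND pairs each proverb with a human-written story and an explicit meaning, and that the evaluation combines an LLM judge with structural metrics. The record layout `PandEntry`, the function `hybrid_score`, the [0, 1] score range, and the weighted-average blend are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PandEntry:
    """Hypothetical record layout. The source only states that PAND pairs a
    proverb with a human-written story and an explicit meaning."""
    proverb: str
    meaning: str
    story: str


def hybrid_score(judge_score: float, structural_score: float,
                 judge_weight: float = 0.5) -> float:
    """Blend a judge rating with a structural metric, both assumed in [0, 1].

    The weighted average is an illustrative assumption, not the paper's formula.
    """
    for name, value in (("judge_score", judge_score),
                        ("structural_score", structural_score),
                        ("judge_weight", judge_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return judge_weight * judge_score + (1.0 - judge_weight) * structural_score
```

A single scalar like this is convenient for ranking prompting regimes, but it hides which component caused a low score, so reporting the two inputs separately may be more informative.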

Our findings reveal a persistent 'decompression gap': current LLMs often achieve strong surface-level fluency while failing to faithfully instantiate the underlying moral and causal structure encoded in proverbs. We further show that explicit reasoning and iterative refinement can partially mitigate these failures, suggesting that many decompression errors arise from difficulties in translating abstract meaning into narrative form rather than a complete lack of relevant knowledge.
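The iterative refinement mentioned above could look roughly like the following loop. This is a sketch under stated assumptions, not the procedure from the paper: `refine_story`, the `generate` and `judge` callables, the feedback string, and the `threshold` and `max_rounds` defaults are all hypothetical stand-ins for model calls and settings that the abstract does not specify.

```python
from typing import Callable


def refine_story(
    proverb: str,
    meaning: str,
    generate: Callable[[str, str, str], str],
    judge: Callable[[str, str], float],
    max_rounds: int = 3,
    threshold: float = 0.8,
) -> tuple[str, float]:
    """Draft a story, then revise until the judge score reaches the threshold.

    generate(proverb, meaning, feedback) and judge(meaning, story) are
    hypothetical stand-ins for model calls; neither comes from the paper.
    Returns the best-scoring draft seen and its score.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    best_story, best_score = "", float("-inf")
    feedback = ""  # the first round has no feedback to condition on
    for _ in range(max_rounds):
        story = generate(proverb, meaning, feedback)
        score = judge(meaning, story)
        if score > best_score:
            best_story, best_score = story, score
        if score >= threshold:
            break
        feedback = f"Previous draft scored {score:.2f}; make the moral clearer."
    return best_story, best_score
```

Keeping the best draft rather than the last one guards against a revision that scores lower than an earlier attempt.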

Our proposed task naturally extends to other forms of compressed cultural knowledge.
