Cross-Lingual Steering for Figurative Language Generation

arXiv cs.CL·Linfeng Liu, Tiffany Zhan, Louie Hong Yao, Saptarshi Ghosh, Tianyu Jiang

6/1/2026

Quick Take

This study uses activation steering to probe whether the internal signals behind figurative language generation in multilingual large language models are language-specific or reusable across languages. Directions learned in one language increase figurative generation in others, with German among the most receptive targets, and directions assembled from other languages can match or even surpass a target language's own native direction.

Key Points

Activation steering estimates a direction for each figurative category from figurative-literal activation differences in one language, then applies it during generation.
Five figurative categories were tested across six languages and four multilingual LLMs.
Within a single language, steering was most robust for metaphor and simile.
Directions learned in one language effectively transfer to enhance generation in another.
German was identified as one of the most receptive target languages.

Article Excerpt

From source RSS / original summary

arXiv:2605.30443v1. Abstract: Multilingual large language models can generate figurative language, but whether the internal signals driving this behavior are language-specific or reusable across languages is unclear. Using activation steering as a probe, we estimate a direction for a figurative category from figurative-literal activation differences in one language and apply it during generation.
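As a rough illustration of the idea, the sketch below computes a difference-of-means direction from toy activation vectors and adds it to a hidden state. The abstract does not specify the estimator, the layer, or the scaling, so the difference-of-means choice, the function names, the `alpha` strength parameter, and the toy numbers are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(figurative: Sequence[Sequence[float]],
                       literal: Sequence[Sequence[float]]) -> Vector:
    """Assumed estimator: mean figurative activation minus mean literal activation."""
    fig_mean = mean_vector(figurative)
    lit_mean = mean_vector(literal)
    return [f - l for f, l in zip(fig_mean, lit_mean)]


def apply_steering(hidden: Sequence[float],
                   direction: Sequence[float],
                   alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the direction; alpha is a hypothetical strength knob."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy activations (made-up values, 3 dimensions) for one language and one category.
figurative_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
literal_acts = [[0.0, 2.0, 0.0], [2.0, 2.0, 0.0]]

direction = steering_direction(figurative_acts, literal_acts)
steered = apply_steering([0.5, 0.5, 0.5], direction, alpha=2.0)
```

In a real model the activations would come from a chosen layer, and the shift would be applied to that layer's output at each generation step.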

Across five figurative categories, six languages, and four multilingual LLMs, these directions steer reliably within their own language, most robustly for metaphor and simile. More importantly, they transfer across languages: a direction learned in one increases the target behavior when applied to another, with German among the most receptive targets.

Going further, directions assembled from other languages can match or even surpass a target language's own native direction, while removing this shared component weakens native steering. Together, these results provide direct evidence of a reusable but target-dependent cross-lingual signal for figurative generation.
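One way to picture "assembling" a direction from other languages and "removing the shared component" is with plain vector arithmetic, as in the sketch below. The abstract does not say how either step is done, so averaging unit-normalized directions and projecting out the result are assumptions, and the language labels and numbers are hypothetical toy values.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Sequence[float]) -> Vector:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def assemble_direction(directions: Dict[str, Sequence[float]]) -> Vector:
    """Assumed recipe: average the unit-normalized per-language directions."""
    units = [normalize(d) for d in directions.values()]
    mean = [sum(col) / len(units) for col in zip(*units)]
    return normalize(mean)


def remove_component(v: Sequence[float], shared: Sequence[float]) -> Vector:
    """Project out the part of v that lies along the shared direction."""
    u = normalize(shared)
    coeff = dot(v, u)
    return [x - coeff * y for x, y in zip(v, u)]


# Hypothetical directions from two non-target languages (toy 2-D values).
other_directions = {"lang_a": [2.0, 0.0], "lang_b": [4.0, 0.0]}
shared = assemble_direction(other_directions)

# Hypothetical native direction for the target language.
native = [3.0, 4.0]
residual = remove_component(native, shared)
```

Under this reading, steering with `shared` tests whether other languages can stand in for the native direction, while steering with `residual` tests how much the native direction depends on the shared part.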

