Korean Culture into LLM Alignment: Toward Cultural Coherence
Quick Answer
This study proposes a framework for culturally coherent LLM responses in Korean contexts, improving cultural safety without large degradation of general capabilities.
Quick Take
The framework defines what a culturally coherent response is, rather than only what a model must avoid. Three frontier models each produce a candidate response under a Korean-adapted safe-response policy, and DPO fine-tuning on the resulting triplets improves the Korean cultural safe rate across six open-weight LLMs with no large degradation on Korean general-capability benchmarks.
Key Points
- Developed a Korean harm taxonomy for culturally coherent LLM responses.
- Implemented a prompt-based LLM seed generator for alignment data.
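- DPO fine-tuning improved the Korean cultural safe rate across six open-weight LLMs with no large degradation on Korean general-capability benchmarks.
- Fine-tuned models named Korean statutes and institutional procedures and, where appropriate, supplied constructive Korean-context information alongside refusal.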
- Aligned responses with Korean legal frameworks and social norms.
Article Excerpt
From source RSS / original summary. arXiv:2606.06797v1, Announce Type: new. Abstract: Cultural-aspect work on large language models is dominated by a negative target: which outputs to suppress. We argue that a constructive counterpart is also needed, a working definition of what a culturally coherent response is rather than only what it must avoid, and instantiate it for Korean.
We design an alignment-data pipeline around a prompt-based LLM seed generator that expands a Korean harm taxonomy, with a Korean-culturally-adapted safe-response policy at its centre: a per-category guideline grounded in Korean legal frameworks, social norms, and interpretive conventions, against which three frontier models each produce a candidate response.
DPO fine-tuning on the resulting triplets improves the Korean cultural safe rate across six open-weight LLMs while causing no large degradation on Korean general-capability benchmarks, and qualitative outputs show fine-tuned models naming Korean statutes and institutional procedures and, where appropriate, supplying constructive Korean-context information alongside refusal.
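The abstract does not spell out the training objective or how the preferred and dispreferred responses in each triplet are selected. As a rough illustration only, the sketch below shows the standard DPO loss for a single (prompt, chosen, rejected) triplet, assuming the summed log-probabilities of each response under the policy and a frozen reference model have already been computed. The names `PreferenceTriplet` and `dpo_loss`, the placeholder strings, and the value `beta=0.1` are assumptions made for this example, not details from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferenceTriplet:
    """One preference example: a prompt with a preferred and a dispreferred response.

    Hypothetical container for illustration; the paper's actual data schema is not given.
    """
    prompt: str
    chosen: str
    rejected: str


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, chosen only for this sketch
) -> float:
    """Standard DPO loss for one triplet, given sequence log-probabilities."""
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # The loss falls as the policy favours the chosen response more strongly
    # than the reference does, relative to the rejected response.
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))


# Placeholder triplet; real prompts and responses would come from the pipeline.
example = PreferenceTriplet(
    prompt="<prompt expanded from the harm taxonomy>",
    chosen="<response following the safe-response policy>",
    rejected="<dispreferred response>",
)
loss = dpo_loss(-4.0, -9.0, -5.0, -5.0)
```

When the policy and reference assign identical log-probabilities, the loss equals log 2; it drops below that as the policy shifts probability toward the chosen response.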