Closing the Gap at CRAC 2026: Two-Stage Adaptation for LLM-Based Multilingual Coreference Resolution
Quick Take
The CRAC 2026 submission ranked first in the LLM track and third overall, using a two-stage adaptation strategy for multilingual coreference resolution.
Key Points
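- Ranked first in the LLM track, and third overall, with an average CoNLL F1 score of 74.32 on the official test set.
- Built on the Gemma-3-27b model, fine-tuned in two stages: a multilingual base adapter followed by dataset-specific adapters.
- Effective across languages, document lengths, and annotation guidelines.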
Abstract: We present our submission to the LLM track of the 2026 Computational Models of Reference, Anaphora and Coreference (CRAC 2026) shared task. With an average CoNLL F1 score of 74.32 on the official test set, our system ranked first in the LLM track, and third overall. Our system is based on the Gemma-3-27b model, fine-tuned using a two-stage strategy with a multilingual base adapter followed by dataset-specific adapters. We represent mention spans by their headword using an XML-inspired format with local reindexing and annotate documents iteratively. These design choices proved effective across languages, document lengths, and annotation guidelines.
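The abstract does not spell out the markup, so the following is only a minimal sketch of what headword tagging with local reindexing could look like. The tag name `m`, the `id` attribute, the function `tag_heads`, and the reading of "local reindexing" as renumbering entity ids from 1 within each chunk are all assumptions made for illustration, not the authors' actual format.

```python
from typing import Dict, List, Tuple


def tag_heads(
    tokens: List[str], heads: List[Tuple[int, int]]
) -> Tuple[str, Dict[int, int]]:
    """Wrap each mention headword in an XML-like tag with a chunk-local id.

    tokens: the words of one chunk of a document.
    heads:  (token_index, global_entity_id) pairs, one per mention headword.
            Assumes at most one mention head per token (hypothetical simplification).
    Returns the tagged text and the global-to-local id mapping.
    """
    head_at = dict(heads)  # token index -> global entity id
    local_ids: Dict[int, int] = {}
    out = []
    for i, tok in enumerate(tokens):
        if i in head_at:
            gid = head_at[i]
            # Local ids are handed out as 1, 2, 3, ... in order of first appearance.
            lid = local_ids.setdefault(gid, len(local_ids) + 1)
            out.append(f'<m id="{lid}">{tok}</m>')
        else:
            out.append(tok)
    return " ".join(out), local_ids


if __name__ == "__main__":
    words = "Marie said she would call her brother".split()
    # Hypothetical gold annotation: entity 417 is Marie, entity 52 is the brother.
    text, mapping = tag_heads(words, [(0, 417), (2, 417), (5, 417), (6, 52)])
    print(text)
    print(mapping)
```

Under these assumptions, tagging only the headword keeps the markup short, and small chunk-local ids keep the numbers the model has to produce bounded regardless of how many entities the full document contains. The returned mapping is what would let chunk-level output be merged back into document-level clusters.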
| Subjects: | Computation and Language (cs.CL) |
| Cite as: | arXiv:2605.16984 [cs.CL] |
| | (or arXiv:2605.16984v1 [cs.CL] for this version) |
| | https://doi.org/10.48550/arXiv.2605.16984 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Antoine Bourgois
[v1] Sat, 16 May 2026 13:07:07 UTC (108 KB)
— Originally published at arxiv.org