BEiTScore: Reference-free Image Captioning Evaluation with an Efficient Cross-Encoder Model
Quick Take
BEiTScore is a reference-free metric that evaluates image captions efficiently using a cross-encoder model.
Key Points
- Addresses limitations of existing evaluation metrics.
- Utilizes a lightweight cross-encoder for efficiency.
- Reports state-of-the-art performance across diverse evaluation scenarios.