RETUYT-INCO at BEA 2026 Shared Task 2: Meta-prompting in Rubric-based Scoring for German · DeepSignal
arXiv cs.CL · Ignacio Sastre, Ignacio Remersaro, Facundo Díaz, Nicolás De Horta, Luis Chiruzzo, Aiala Rosá, Santiago Góngora
4d ago · ~1 min · 5/13/2026 · en
RETUYT-INCO developed a Meta-prompting method for scoring German short answers in BEA 2026.
Key Points
Participated in three tracks of the BEA 2026 shared task.
Introduced Meta-prompting for dynamic rubric-based scoring.
Achieved 6th, 4th, and 4th places in the respective tracks.
Signal Score
Low signal — niche or repeat coverage.
Weight Score
Source authority 20% 80
Community heat 20% 0
Technical impact 30% 67
≥75 high · 50–74 medium · <50 low
Why Featured
This paper describes a Meta-prompting method for rubric-based scoring that could improve automated assessment tools. It is relevant to developers, PMs, and investors interested in more efficient and accurate language evaluation systems.