Ishigaki-IDS-Bench: A Benchmark for Generating Information Delivery Specification from BIM Information Requirements
Quick Take
Ishigaki-IDS-Bench evaluates how well LLMs generate Information Delivery Specification (IDS) XML from BIM information requirements.
Key Points
- Includes 166 expert-verified BIM/IDS examples.
- Evaluates 10 LLMs; the highest macro F1 score is 65.6%.
- Supports structured generation methods conforming to domain standards.
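For readers unfamiliar with the headline metric, here is a minimal sketch of how a macro F1 score is generally computed: F1 is calculated separately for each category and then averaged with equal weight, so rare categories count as much as common ones. The category names and counts below are hypothetical and chosen only for illustration; the benchmark's own matching rules and categories are not described in this summary.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for one category from true-positive, false-positive, false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: dict[str, tuple[int, int, int]]) -> float:
    """Unweighted mean of per-category F1 scores."""
    scores = [f1_from_counts(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical per-category (tp, fp, fn) counts, not taken from the benchmark.
example_counts = {
    "entity": (8, 2, 2),
    "property": (3, 1, 5),
    "attribute": (0, 0, 4),
}

if __name__ == "__main__":
    print(f"macro F1 = {macro_f1(example_counts):.3f}")
```

Because each category contributes equally, a category the model never gets right pulls the average down sharply, which is one reason macro F1 is a common choice when categories are imbalanced.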