How Can AI Find My Model? A Model-Finding Experimental Study Considering Data Formats, Embeddings, and Retrieval Strategies
Quick Answer
This study explores how data representation, transformer-based embeddings, and retrieval strategies impact the discovery of simulation models through natural language queries.
Quick Take
Results indicate that open-source embedding models can achieve high retrieval performance and that reranking becomes more important as query complexity increases. The study provides a baseline for AI-driven model discovery.
Key Points
- How model data is represented affects model discovery performance.
- Open-source embedding models can achieve high performance in retrieval tasks.
- Reranking methods matter, especially as query complexity increases.
- The study uses recall@5 and nDCG@5 as evaluation metrics.
- Findings contribute to AI-driven composability and interoperability.
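The two metrics named above can be sketched in a few lines. This is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, recall@k is assumed to divide by the total number of relevant models (some variants divide by min(k, number relevant)), and nDCG uses the common log2 discount with linear gains.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of all relevant items that appear in the top k results.
    Assumption: denominator is the total number of relevant items."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)

def dcg_at_k(gains, k=5):
    """Discounted cumulative gain: gain at 1-based rank r is divided by log2(r + 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_ids, relevance, k=5):
    """DCG of the returned ranking divided by the DCG of an ideal ranking.
    `relevance` maps item id -> graded relevance (unlisted ids count as 0)."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(gains, k) / ideal_dcg
```

Recall@5 only asks whether relevant models made it into the top five, while nDCG@5 also rewards placing them near the top, which is why the two are often reported together.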
Article Excerpt
From source RSS / original summary (arXiv:2606.30846v1)
Abstract: Discovering simulation models for reuse remains a fundamental challenge in Modeling and Simulation (M&S). When many models coexist, identifying those that align with a given modeling intent remains difficult. Recent advances in Artificial Intelligence (AI), particularly retrieval-based approaches, offer a promising pathway to operate at this semantic layer.
In this paper, we present an experimental study investigating the impact of data representation, transformer-based embedding models, and retrieval strategies on the discovery of simulation models using natural language queries. We evaluated performance across multiple query types using standard information retrieval metrics, including recall@5 and nDCG@5.
Results show that data representation matters, open-source embedding models can achieve high performance, and reranking methods are important, especially as query complexity increases. This work provides a baseline for AI-driven model discovery and discusses its role in advancing toward AI-driven composability and interoperability.
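The retrieve-then-rerank pattern the abstract refers to can be sketched as follows. Everything here is a toy stand-in chosen so the example runs without external libraries: `embed` uses bag-of-words counts in place of a transformer embedding model, `overlap_score` is a hypothetical token-overlap scorer in place of a learned reranker, and the model descriptions are invented for illustration. None of it reflects the paper's actual models or data.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a transformer embedding: bag-of-words token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, k=5):
    """First stage: rank every model description by similarity to the query."""
    q = embed(query)
    scored = sorted(corpus.items(), key=lambda item: cosine(q, embed(item[1])), reverse=True)
    return [model_id for model_id, _ in scored[:k]]

def overlap_score(query, text):
    """Hypothetical reranking score: fraction of query tokens found in the text."""
    q_tokens = set(query.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & set(text.lower().split())) / len(q_tokens)

def rerank(query, candidates, corpus, score_fn=overlap_score):
    """Second stage: reorder only the retrieved candidates with a costlier scorer."""
    return sorted(candidates, key=lambda model_id: score_fn(query, corpus[model_id]), reverse=True)

# Invented example corpus: model id -> textual representation of the model.
corpus = {
    "sir": "epidemic sir model of disease spread in a population",
    "queue": "discrete event queue model of a bank teller",
    "traffic": "agent based traffic flow model of city intersections",
}
candidates = retrieve("disease spread model", corpus, k=2)
top = rerank("disease spread model", candidates, corpus)
```

The design point is that the first stage scores the whole corpus cheaply, and the reranker spends more computation on a short candidate list, which is where more complex queries are expected to benefit.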