FactoryLLM: A Safe and Open-Source AI Playground for Evaluating… | AI Deep Signal

FactoryLLM: A Safe and Open-Source AI Playground for Evaluating LLMs in Smart Factories

arXiv cs.AI·Yash Pulse, Yong-Bin Kang, Abhik Banerjee, Abdur Forkan, Prem Prakash Jayaraman

6/15/2026

Quick Answer

FactoryLLM is an open-source AI platform for evaluating retrieval-augmented generation (RAG) models in smart factories. All three LLMs evaluated scored above 0.88 on groundedness on 30 maintenance queries from 600 pages of documentation.

Quick Take

FactoryLLM keeps sensitive data safe by running locally, so no sensitive information has to be shared.

Key Points

FactoryLLM evaluates models using RAGAS and NVIDIA's LLM-as-a-Judge metrics.
Users can configure LLMs to analyze documents from multiple machines.
The platform demonstrated effectiveness with a case study involving an Autonomous Intelligent Vehicle.
All evaluated models achieved groundedness scores above 0.88.
Full code and documentation are publicly available for community use.
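
As a rough illustration of how a per-model groundedness figure like the one above could be aggregated, here is a minimal Python sketch. It assumes groundedness is taken as the share of an answer's claims that are supported by the retrieved context, which is one common definition; the paper's exact metric configuration is not described in this summary. The function names, model names, and scores are hypothetical toy values, not results from the paper, and the per-claim supported/unsupported labels would in practice come from a judge model.

```python
from statistics import mean


def groundedness(claims_supported: list[bool]) -> float:
    """Share of an answer's claims marked as supported by the retrieved context.

    Assumption: each boolean is one judged claim (True = supported).
    """
    if not claims_supported:
        return 0.0
    return sum(claims_supported) / len(claims_supported)


def summarize(
    per_query_scores: dict[str, list[float]], threshold: float = 0.88
) -> dict[str, tuple[float, bool]]:
    """Average per-query scores per model and flag whether the mean exceeds the threshold."""
    summary = {}
    for model, scores in per_query_scores.items():
        avg = mean(scores)
        summary[model] = (avg, avg > threshold)
    return summary


# Hypothetical toy scores for two made-up models over three queries.
toy_scores = {
    "model_a": [1.0, 0.75, 1.0],
    "model_b": [0.75, 1.0, 0.5],
}
result = summarize(toy_scores)

for model, (avg, passed) in result.items():
    print(f"{model}: mean groundedness {avg:.2f}, above threshold: {passed}")
```

The 0.88 default only mirrors the figure quoted in this summary; in a real evaluation the threshold is whatever the evaluator chooses.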

Paper Resources

Read Paper (arxiv.org) · View PDF (arxiv.org)

Source Excerpt

arXiv:2606.14119v1 Announce Type: new Abstract: Fault diagnostics and recovery in smart factories is challenging because critical information is dispersed across manuals of multiple machines which are interconnected through the manufacturing process. Large Language Models (LLMs) can provide a promising approach.

In this paper, we propose FactoryLLM, a safe and open-source AI playground designed for evaluating different LLM-based Retrieval-Augmented Generation (RAG) models by analysing documents from multiple machines across the manufacturing process. …

Read on arxiv.org
