Bridging the Sim-to-Real Gap in Reinforcement Learning-Based Industrial Dispatching through Execution Semantics

arXiv cs.AI·Jonathan Hoss, Noah Klarmann

5/29/2026

·~2 min·5/29/2026·en·1

Quick Answer

Quick Take

The proposed policy-neutral execution layer enhances industrial dispatching by addressing execution errors and decision inconsistencies, transforming execution uncertainty into structured data for better policy refinement. Evaluated through discrete-event simulation, it shows significant operational benefits, particularly under low observation lag, preventing avoidable errors before commitment.

Key Points

Introduces a policy-neutral execution layer for industrial scheduling.
Constructs decision-valid snapshots from asynchronous event streams.
Defines standardized execution contracts with explicit action admissibility.
Transforms execution failures into structured outcomes with full attribution.
Demonstrates strongest benefits under low observation lag conditions.

Paper Resources

Read Paperarxiv.org View PDFarxiv.org

Article Content

From source RSS / original summary

arXiv:2605. 29078v1 Announce Type: new Abstract: Event-driven scheduling policies are increasingly deployed in industrial environments, where decisions are made under asynchronous and partially observed system states. As a result, decision states are not temporally consistent, action admissibility is not explicitly defined, and the origin of execution errors remains ambiguous. These issues limit both reliability and interpretability.

To address this gap, a policy-neutral execution and measurement layer is proposed to mediate between scheduling policies and the industrial execution environment. The layer constructs decision-valid snapshots from asynchronous event streams, defines a standardized execution contract with explicit action admissibility, and records outcomes as divergences between policy intent, transactional outcomes, physical execution, and human intervention.

This enables a separation between decision semantics and execution behavior and makes deployment mismatch observable and structurally attributable. The proposed framework is evaluated using a discrete-event simulation. The results show analytical benefits across all observation lag regimes, as undifferentiated execution failures are transformed into structured, typed outcomes with full attribution coverage.

Operational benefits are strongest under low observation lag, where avoidable execution errors can be prevented before commitment. Overall, the layer turns execution uncertainty into supervisory data for evaluation and policy refinement.

Read on arxiv.org

Want this in your inbox every morning?

Daily brief at your local 8am — bilingual EN/中文, free.

Subscribe — it's free

More from arXiv cs.AI

See more →

arXiv cs.AI·Mihnea C. Moldoveanu, Joel A. C. Baum

4d ago

FeaturedOriginal

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

AI Summary

The paper introduces Adversarial Social Epistemology (ASE) to analyze how agents manipulate trust in public communications, highlighting mechanisms that undermine the reliability of testimony and inference. It critiques existing frameworks like epistemic bubbles and misinformation diffusion, proposing a new language for understanding trust breaches and auditing inferential chains in densely interactive environments involving humans and large language models.

#LLM #Agent #Inference #Policy

Bridging the Sim-to-Real Gap in Reinforcement Learning-Based Industrial Dispatching through Execution Semantics

Quick Answer

Quick Take

Key Points

Paper Resources

Article Content

Want this in your inbox every morning?

More from arXiv cs.AI

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

Information Limits and Attractor Dynamics in Economies of Frontier LLM Agents: A Pre-Registered Test

Onnes: A Physics-Grounded LLM Simulator for Cryogenic Fault Diagnosis in Quantum Computing Infrastructure

Quick Answer

Quick Take

Key Points

Paper Resources

Article Content

Want this in your inbox every morning?

More from arXiv cs.AI

Adversarial Social Epistemology for Assemblies of Humans and Large Language Models

Information Limits and Attractor Dynamics in Economies of Frontier LLM Agents: A Pre-Registered Test

Onnes: A Physics-Grounded Multi-Agent LLM Simulator for Cryogenic Fault Diagnosis in Quantum Computing Infrastructure

Onnes: A Physics-Grounded LLM Simulator for Cryogenic Fault Diagnosis in Quantum Computing Infrastructure