Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

arXiv cs.CV·Chenxi Tao, Seung-Kyum Choi

5h ago

·~2 min·6/1/2026·en·0

Quick Take

This review categorizes industrial visual sim-to-real into CAD-available and CAD-unavailable settings, highlighting that CAD render count alone doesn't ensure successful transfer. Empirical analysis on benchmarks like T-LESS/BOP and MVTec AD reveals that factors such as source-distribution design and detector capacity are critical for effective deployment.

Key Points

Distinguishes between CAD-available and CAD-unavailable regimes in sim-to-real.
Empirical anchors include T-LESS/BOP and MVTec AD benchmarks.
CAD render count does not guarantee successful transfer; other factors matter more.
CAD at test time enables distinct verification through mask and pose consistency.
Calls for a nuanced approach to deployment decisions rather than a single leaderboard.

Article Content

From source RSS / original summary

arXiv:2605. 30581v1 Announce Type: new Abstract: Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions.

A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under different sensors, lighting, materials, fixtures, calibration, production variation, and rare defect modes. This review reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability.

We distinguish CAD-available settings, where explicit object geometry can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification; CAD-unavailable settings, where geometry is replaced by normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomaly assumptions, foundation features, or vision-language priors; and boundary-prior settings, where approximate models, templates, reference views, or semantic correspondences preserve only part of the CAD role.

This framing connects CAD-based detection and 6D pose-estimation literature with industrial anomaly and surface-inspection literature that is usually reviewed separately. To make the taxonomy concrete, we use empirical anchors on T-LESS/BOP, MVTec AD, and VisA. The anchors show that CAD render count alone does not close transfer; source-distribution design, detector capacity, and small real calibration can matter more.

They also show that CAD at test time creates a distinct verification channel through mask, pose, and depth consistency, whereas CAD-unavailable inspection relies on calibrated normality and feature deviation. The review therefore argues against a single cross-task leaderboard and instead asks what prior grounds the deployment decision.

Reader Mode unavailable (could not extract clean content).

Read on arxiv.org

Want this in your inbox every morning?

Daily brief at your local 8am — bilingual EN/中文, free.

Subscribe — it's free

More from arXiv cs.CV

See more →

arXiv cs.CV·Taha Koleilat, Hassan Rivaz, Yiming Xiao

5d ago

FeaturedOriginal

Evi-Steer: Learning to Steer Biomedical Vision-Language Models through Efficient and Generalizable Evidential Tuning

AI Summary

Evi-Steer introduces a novel evidential tuning framework for BiomedCLIP, enabling efficient fine-tuning with only 0.11% parameter updates. It significantly enhances performance in few-shot learning and domain shifts across 15 biomedical imaging datasets, demonstrating robustness for clinical applications.

#AI Coding #Inference #Open Source

Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

Quick Take

Key Points

Article Content

Want this in your inbox every morning?

More from arXiv cs.CV

Evi-Steer: Learning to Steer Biomedical Vision-Language Models through Efficient and Generalizable Evidential Tuning

Deep Learning-Based Automated Quantification of TIMI Myocardial Perfusion Frame Count (DL-TMPFC) from Coronary Angiography: A Novel Framework for Rapid Assessment of Microvascular Dysfunction

GeoSym127K: Scalable Symbolically-verifiable Synthesis for Multimodal Geometric Reasoning

Related in this space

The Importance of Out-of-Band Metadata for Safe Autonomous Agents: The Redpanda Agentic Data Plane

TorqueAGI Announces Collaborations with NVIDIA, John Deere, and Dexterity to Advance Physical AI for Enterprise-Grade Robots

FORT Robotics Acquires Mapless AI to Expand Its Trust Platform with Remote Supervision and Active Safety Capabilities