GPU-Accelerated Inverse Structural Anastylosis from Block Collapse Dynamics
Quick Answer
The paper presents the Jenga Inverse Predictor (JIP-2), a GPU-accelerated deep learning framework that takes an image of a collapsed block assembly and predicts the most probable prior tower configuration, using a rigid-body physics engine and a dual-stream ResNet-18 model.
Quick Take
JIP-2 treats structural anastylosis as an inverse prediction task. Given an image of a collapsed block assembly, it predicts the block removal count, per-position removal probabilities, centre-of-mass imbalance, and torque risk, then generates a 3D video of the block-by-block reverse reconstruction. The authors discuss implications for computer-assisted anastylosis at the UNESCO Maya site of Uxmal, Yucatan.
Key Points
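- JIP-2 implements a complete rigid-body physics engine with OBB/SAT collision detection and a Projected Gauss-Seidel (PGS) contact solver.
- A dual-stream ResNet-18 jointly predicts block removal count, per-position removal probabilities, centre-of-mass imbalance, and torque risk.
- Training data come from 450 simulated episodes across three friction levels (mu_s in {0.25, 0.40, 0.60}).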
- The framework generates smooth 3D videos of the reconstruction process.
- Implications for UNESCO site conservation are discussed, particularly for Uxmal.
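The three friction levels in the key points can be illustrated with a small sketch. The abstract gives the values as mu_s in {0.25, 0.40, 0.60} and says friction is injected into the network as a one-hot vector. The even split of 150 episodes per level and the names `friction_one_hot` and `episode_schedule` are assumptions made here for illustration; the summary does not confirm how the 450 episodes are distributed.

```python
FRICTION_LEVELS = (0.25, 0.40, 0.60)  # mu_s values listed in the abstract
TOTAL_EPISODES = 450                  # total stated in the summary


def friction_one_hot(mu_s):
    """Encode a friction coefficient as a one-hot list over FRICTION_LEVELS."""
    if mu_s not in FRICTION_LEVELS:
        raise ValueError(f"unknown friction level: {mu_s}")
    return [1.0 if level == mu_s else 0.0 for level in FRICTION_LEVELS]


def episode_schedule(total=TOTAL_EPISODES, levels=FRICTION_LEVELS):
    """Assign a friction level to each episode, ASSUMING an even split."""
    per_level, remainder = divmod(total, len(levels))
    if remainder:
        raise ValueError("total episodes must divide evenly across levels")
    # Episodes are grouped by level: all of levels[0], then levels[1], ...
    return [mu for mu in levels for _ in range(per_level)]


if __name__ == "__main__":
    schedule = episode_schedule()
    print(len(schedule), "episodes")
    print("one-hot for mu_s=0.40:", friction_one_hot(0.40))
```

A one-hot vector keeps the three levels as separate categories instead of implying that the network should interpolate between friction values it never saw in simulation.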
Paper Resources
Abstract: The physical anastylosis of collapsed architectural monuments, the meticulous reassembly of fallen stone elements into their original structural configuration, represents one of the most intellectually demanding challenges in conservation science. Traditional approaches depend heavily on expert archaeologist judgement and manual block-by-block correspondence, a process that is both labour-intensive and inherently subjective. Inspired by the combinatorial complexity of this problem as manifested in the game of Jenga, we present the Jenga Inverse Predictor (JIP-2), a GPU-accelerated deep learning framework that addresses structural anastylosis as an inverse prediction task. Given an image of a collapsed block assembly, JIP-2 reconstructs the most probable prior tower configuration by: (1) implementing a complete rigid-body physics engine with OBB/SAT collision detection and a Projected Gauss-Seidel (PGS) contact solver accelerated with Numba JIT and CuPy CUDA; (2) applying the analytical force thresholds of Ziglar (CMU, 2006), F_app = 3*mu_s*m*g (Y-axis, torque-free) and F_app = 4*mu_s*m*g (X-axis, torque risk), over three friction levels (mu_s in {0.25, 0.40, 0.60}) across 450 simulated episodes; (3) training a dual-stream ResNet-18 that injects a friction one-hot vector and jointly predicts block removal count, per-position removal probabilities, centre-of-mass imbalance, and Ziglar torque risk; and (4) generating a smooth 3-D video of the block-by-block reverse reconstruction. We discuss implications for computer-assisted anastylosis at the UNESCO Maya site of Uxmal, Yucatan, and provide a detailed technical description of the full pipeline, architecture, and loss formulation.
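The force thresholds in step (2) are simple enough to evaluate directly. The following minimal sketch computes both thresholds for the three friction levels listed in the abstract. The function name `ziglar_thresholds`, the 0.02 kg block mass, and g = 9.81 m/s^2 are assumptions chosen for illustration, not values taken from the paper.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)
FRICTION_LEVELS = (0.25, 0.40, 0.60)  # mu_s values listed in the abstract


def ziglar_thresholds(mu_s, mass_kg, g=G):
    """Return (F_y, F_x) in newtons for one block.

    F_y = 3 * mu_s * m * g  (Y-axis, torque-free)
    F_x = 4 * mu_s * m * g  (X-axis, torque risk)
    """
    if mu_s <= 0 or mass_kg <= 0:
        raise ValueError("mu_s and mass_kg must be positive")
    base = mu_s * mass_kg * g
    return 3.0 * base, 4.0 * base


if __name__ == "__main__":
    block_mass = 0.02  # kg; hypothetical block mass, not from the paper
    for mu in FRICTION_LEVELS:
        f_y, f_x = ziglar_thresholds(mu, block_mass)
        print(f"mu_s={mu:.2f}  F_y={f_y:.3f} N  F_x={f_x:.3f} N")
```

Under both formulas the X-axis threshold is 4/3 of the Y-axis threshold regardless of friction or mass, so changing mu_s scales the two thresholds together without altering their ratio.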
| Comments: | 20 pages, GitHub link included, 6 figures |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG) |
| Cite as: | arXiv:2606.28394 [cs.CV] |
| (or arXiv:2606.28394v1 [cs.CV] for this version) | |
| https://doi.org/10.48550/arXiv.2606.28394 arXiv-issued DOI via DataCite |
Submission history
From: Alberto Munoz Dr.
[v1]
Tue, 23 Jun 2026 23:59:31 UTC (862 KB)
Originally published at arxiv.org


