RigPAPR: Rig-Based Animation of Static Neural Point Clouds from a Fixed-Viewpoint Video
Quick Answer
RigPAPR auto-rigs a static proximity attention point rendering (PAPR) point cloud and animates it from a single fixed-viewpoint video, exceeding mesh-based and Gaussian-splatting baselines by 3+ dB PSNR at novel views on synthetic subjects.
Quick Take
RigPAPR drives a static PAPR point cloud with direct linear blend skinning (LBS), using no mesh proxy, pose-dependent correction, or category template. On synthetic subjects it matches the strongest baseline at the supervised view and exceeds mesh-based and Gaussian-splatting baselines by 3+ dB PSNR at novel views, with cleaner joint-boundary renderings on both synthetic and real subjects. The result is a rigged, re-posable 3D asset recovered from a monocular fixed-viewpoint video.
Key Points
- RigPAPR auto-rigs static PAPR clouds for animation from fixed-viewpoint videos.
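- Exceeds mesh-based and Gaussian-splatting baselines by 3+ dB PSNR at novel views on synthetic subjects, while matching the strongest baseline at the supervised view.
- Renders joint boundaries more cleanly than Gaussian splats deformed by linear blend skinning, whose canonical tiling breaks into gaps and spikes under articulation.
- Uses proximity attention point rendering, which carries no per-primitive shape, so the surface re-forms with the articulation.
- Shows cleaner joint-boundary renderings on both synthetic and real subjects.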
Article Content
arXiv:2606.06685v1. Abstract: Static neural point reconstructions capture a subject at high fidelity from posed images. Given such a reconstruction, we aim to animate it to follow a monocular fixed-viewpoint driving video of the subject, whether captured or produced by image-to-video (I2V) generation, and to recover a rigged, re-posable 3D asset.
Existing methods deform Gaussian splats through direct linear blend skinning (LBS) or mesh proxies, both of which are prone to joint-boundary artifacts under articulation, even with per-primitive corrections. We trace the artifact to the representation: each splat carries an individual shape calibrated in the canonical pose to tile with its neighbours. Under rigid LBS, each splat moves with its bone but cannot bend, so the canonical tiling breaks at joint boundaries into gaps and spikes.
Proximity attention point rendering (PAPR) instead carries no per-primitive shape; each pixel is recomposed at render time from the deformed primitives' positions, so the surface re-forms naturally with the articulation. We present RigPAPR, which auto-rigs a static PAPR cloud and drives it under direct LBS from a single fixed-viewpoint video, without mesh proxy, pose-dependent correction, or category template.
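To make "direct LBS" on a point cloud concrete, here is a minimal 2D sketch of linear blend skinning. It is an illustration only, not RigPAPR's implementation: the function names, the two-bone rig, and the skinning weights are all hypothetical, and it covers only how point positions move, not how PAPR renders them.

```python
import math


def rotation_about(pivot, angle):
    """Homogeneous 3x3 transform rotating by `angle` (radians) about `pivot`."""
    c, s = math.cos(angle), math.sin(angle)
    px, py = pivot
    # p' = R (p - pivot) + pivot = R p + (pivot - R pivot)
    return [
        [c, -s, px - c * px + s * py],
        [s, c, py - s * px - c * py],
        [0.0, 0.0, 1.0],
    ]


def apply_transform(transform, point):
    """Apply a homogeneous 2D transform to a single (x, y) point."""
    x, y = point
    return (
        transform[0][0] * x + transform[0][1] * y + transform[0][2],
        transform[1][0] * x + transform[1][1] * y + transform[1][2],
    )


def linear_blend_skinning(points, weights, bone_transforms):
    """Move each point to the weight-blended average of its per-bone positions."""
    deformed = []
    for point, point_weights in zip(points, weights):
        x = y = 0.0
        for weight, transform in zip(point_weights, bone_transforms):
            bx, by = apply_transform(transform, point)
            x += weight * bx
            y += weight * by
        deformed.append((x, y))
    return deformed


# Hypothetical rig: bone 0 stays fixed, bone 1 rotates 90 degrees about the joint at (1, 0).
bones = [rotation_about((0.0, 0.0), 0.0), rotation_about((1.0, 0.0), math.pi / 2)]
points = [(0.5, 0.0), (1.0, 0.0), (1.5, 0.0)]
weights = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]  # each row sums to 1
posed = linear_blend_skinning(points, weights, bones)
```

In this toy rig the point on the fixed bone stays put, the point at the joint stays at the joint, and the point on the rotating bone swings up to roughly (1, 0.5). Only positions are transformed here; a primitive that also carries its own shape, as a splat does, would have to carry that shape along rigidly, which is the situation the abstract describes as breaking at joint boundaries.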
On synthetic subjects, RigPAPR matches the strongest baseline at the supervised view and exceeds mesh-based and Gaussian-splatting baselines at novel views by 3+dB PSNR, with cleaner joint-boundary renderings of both synthetic and real subjects.
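For readers unfamiliar with the metric, the sketch below computes PSNR from mean squared error using the standard definition, and shows what a 3 dB gap means in terms of error. The pixel values are made up for illustration and are not results from the paper; the `psnr` helper and its `max_value` parameter are hypothetical names.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up intensities in [0, 1]; every pixel is off by 0.1, so MSE is about 0.01.
reference = [0.2, 0.5, 0.8, 1.0]
rendered = [0.3, 0.4, 0.9, 0.9]
score = psnr(reference, rendered)  # about 20 dB

# A 3 dB PSNR gain corresponds to dividing the MSE by 10 ** (3 / 10), about 2x.
mse_ratio_for_3db = 10 ** (3 / 10)
```

Because PSNR is logarithmic, a 3 dB improvement at novel views corresponds to roughly halving the mean squared error against the ground-truth image.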