Making Time Editable in Video Diffusion Transformers

arXiv cs.CV·Konstantin Kuklev, Viacheslav Vasilev, Alexander Kunitsyn, Andrei Ivaniuta, Denis Dimitrov

6/10/2026

Quick Take

The proposed method extends a pretrained Diffusion Transformer (DiT) with a lightweight temporal module, enabling explicit control over motion speed and temporal structure in video generation. It preserves the original generative prior while expanding the controllable dynamic range, and it does not require redesigning the backbone architecture.

Key Points

Introduces a temporal-control methodology for video generation in Diffusion Transformers.
Augments a pretrained DiT with a lightweight temporal module for explicit time editing.
Enables control over motion speed and temporal structure without redesigning the backbone.
Preserves the original generative prior while expanding the dynamic range.
Gives users finer control over timing in video editing applications.

Article Excerpt

From source RSS / original summary

arXiv:2606.10183v1. Announce type: new. Abstract: Modern Diffusion Transformers for video generation provide limited control over the progression of time and the editing of temporal dynamics. We propose a temporal-control methodology that extends a pretrained DiT with explicit time editing, allowing control over motion speed and temporal structure without redesigning the backbone.

Its core implementation augments the pretrained model with a lightweight temporal module, preserving the original generative prior while expanding its controllable dynamic range.
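
The abstract does not describe the temporal module's internals. As a rough illustration of how an added module can leave a pretrained prior untouched, the sketch below uses a zero-initialized residual branch, a common adapter pattern that is swapped in here as an assumption and is not the authors' design. Every name in it (frozen_backbone, TemporalAdapter, gate, speed) is hypothetical.

```python
import math
import random


def frozen_backbone(tokens):
    """Stand-in for a pretrained DiT block (hypothetical): a fixed, untrained-on map."""
    return [[math.tanh(x) for x in frame] for frame in tokens]


class TemporalAdapter:
    """Hypothetical lightweight module: a speed-conditioned residual branch.

    The scalar gate starts at zero, so before any training the adapter adds
    nothing and the backbone output passes through unchanged.
    """

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # Per-channel weights of the residual branch (would be trainable).
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        # Zero-initialized gate (would be trainable).
        self.gate = 0.0

    def __call__(self, hidden, speed):
        n = len(hidden)
        out = []
        for t, frame in enumerate(hidden):
            # Time signal: normalized frame position scaled by the requested speed.
            phase = math.sin(speed * t / max(n - 1, 1))
            out.append([h + self.gate * w * phase for h, w in zip(frame, self.w)])
        return out


def controlled_block(tokens, adapter, speed):
    """Backbone output plus the adapter's speed-dependent residual."""
    return adapter(frozen_backbone(tokens), speed)


# Four frames with two channels each (toy data).
tokens = [[0.1 * (t + 1), -0.2 * t] for t in range(4)]
adapter = TemporalAdapter(dim=2)

baseline = frozen_backbone(tokens)
at_init = controlled_block(tokens, adapter, speed=2.0)

adapter.gate = 1.0  # pretend training has opened the gate
slow = controlled_block(tokens, adapter, speed=1.0)
fast = controlled_block(tokens, adapter, speed=2.0)
```

In this toy setup at_init equals baseline for any speed value, which is the sense in which a zero-initialized branch preserves the original behavior. Once the gate is nonzero, the speed argument changes the output of every frame after the first.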
