Crayotter: Traceable Multi-Agent Workflows for Long-Form Video Editing

arXiv cs.CV·Lecheng Yan, Yichong Zhang, Ben Pan, Xiaoyu Zheng, Jiawei Qian, Anqi Wu, Wenxi Li, Chenyang Lyu

6/9/2026

Quick Answer

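Crayotter is an open-source multimodal multi-agent system for prompt-driven editing of long-form video. In human evaluation it outscored two baselines, with consistent gains in theme alignment, narrative coherence, and editing smoothness.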

Quick Take

Crayotter achieves an average human evaluation score of 3.40/5 across 23 editing themes, ahead of CapCut-Mate (2.44) and CutClaw (1.70). Because each phase of the workflow leaves inspectable artifacts, a failed segment can be diagnosed and revised selectively without restarting the whole run.

Key Points

Crayotter organizes video editing into three phases: material preparation, editing research, and timeline execution.
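Each phase externalizes inspectable artifacts, such as coverage reports, multimodal analyses, editing blueprints, tool calls, and intermediate renders.
These artifacts let failed segments be diagnosed and selectively revised without a full restart.
Crayotter was evaluated on 23 editing themes against CapCut-Mate and CutClaw, with consistent gains in theme alignment, narrative coherence, and editing smoothness.
Code, traces, and examples are publicly available at https://github.com/idwts/Crayotter.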

Article Content

From source RSS / original summary

arXiv:2606.07636v1. Abstract: Editing a long-form video from heterogeneous footage requires more than selecting clips: an agent must preserve narrative intent across material preparation, timeline construction, post-production, and revision while leaving enough evidence to diagnose failures. We present Crayotter, an open-source multimodal multi-agent system for prompt-driven video editing.

Crayotter organizes production into three phases: coverage-aware material preparation, artifact-based editing research, and tool-grounded timeline execution. Each phase externalizes inspectable artifacts, including coverage reports, multimodal analyses, editing blueprints, tool calls, and intermediate renders. These artifacts make an editing run traceable and allow failed segments to be diagnosed and selectively revised instead of requiring a full restart.
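The three-phase, artifact-producing workflow described above can be illustrated with a small sketch. This is not Crayotter's code: the `Artifact` and `Trace` classes, the phase functions, and the segment and file names are all hypothetical, and the "editing" logic is a trivial stand-in. It only shows how recording one artifact per phase and per segment makes it possible to find a failed segment and rerun that segment alone.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Artifact:
    phase: str    # phase that produced the artifact
    segment: str  # segment of the edit it describes
    kind: str     # e.g. "coverage_report", "editing_blueprint", "render"
    payload: dict


@dataclass
class Trace:
    """Append-only log of artifacts (hypothetical, for illustration only)."""
    artifacts: List[Artifact] = field(default_factory=list)

    def record(self, artifact: Artifact) -> None:
        self.artifacts.append(artifact)

    def latest(self, segment: str, kind: str) -> Optional[Artifact]:
        # Most recent artifact of this kind for this segment, if any.
        for a in reversed(self.artifacts):
            if a.segment == segment and a.kind == kind:
                return a
        return None


def prepare_material(segment: str, footage: Dict[str, List[str]], trace: Trace) -> None:
    # Phase 1 stand-in: report whether any footage covers the segment.
    clips = list(footage.get(segment, []))
    trace.record(Artifact("material_preparation", segment, "coverage_report",
                          {"clips": clips, "covered": bool(clips)}))


def research_edit(segment: str, trace: Trace) -> None:
    # Phase 2 stand-in: derive a blueprint from the coverage report.
    report = trace.latest(segment, "coverage_report")
    trace.record(Artifact("editing_research", segment, "editing_blueprint",
                          {"order": list(report.payload["clips"])}))


def execute_timeline(segment: str, trace: Trace) -> None:
    # Phase 3 stand-in: a "render" succeeds only if the blueprint has clips.
    blueprint = trace.latest(segment, "editing_blueprint")
    order = blueprint.payload["order"]
    trace.record(Artifact("timeline_execution", segment, "render",
                          {"ok": len(order) > 0, "clips": order}))


def run(segments: List[str], footage: Dict[str, List[str]], trace: Trace) -> None:
    for s in segments:
        prepare_material(s, footage, trace)
        research_edit(s, trace)
        execute_timeline(s, trace)


def failed_segments(trace: Trace, segments: List[str]) -> List[str]:
    return [s for s in segments if not trace.latest(s, "render").payload["ok"]]


# Usage: "outro" has no footage, so only that segment fails and is rerun.
footage = {"intro": ["a.mp4"], "outro": []}
segments = ["intro", "outro"]
trace = Trace()
run(segments, footage, trace)
failed = failed_segments(trace, segments)

footage["outro"] = ["b.mp4"]   # fix the diagnosed gap
run(failed, footage, trace)    # selective revision, "intro" is untouched
remaining = failed_segments(trace, segments)
```

Because the trace is append-only, the earlier failed artifacts for the revised segment stay available for inspection alongside the new ones.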

We evaluate Crayotter on 23 editing themes against CapCut-Mate and CutClaw. Under human evaluation, Crayotter achieves an average score of 3.40/5, compared with 2.44 and 1.70 for the two baselines, with consistent gains in theme alignment, narrative coherence, and editing smoothness. We additionally describe a replayable trajectory schema and verifiable reward design that prepare these workflows for future policy optimization. Code, traces, and examples are publicly available at https://github.com/idwts/Crayotter.
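The abstract does not spell out the trajectory schema or the reward, so the following is only a guess at the general idea, under stated assumptions. The `Step` record, the `append_clip` and `remove_last` tool names, and the bounds-checking reward are all hypothetical. The sketch shows two properties: a trajectory of tool calls can be serialized and replayed to rebuild the same timeline, and a reward can be computed from checks that need no human judgment.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Step:
    index: int   # position of the tool call in the run
    tool: str    # hypothetical tool name
    args: dict   # JSON-serializable arguments


def dump_trajectory(steps: List[Step]) -> str:
    # One JSON object per line, so a run can be stored and diffed.
    return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in steps)


def load_trajectory(text: str) -> List[Step]:
    return [Step(**json.loads(line)) for line in text.splitlines() if line]


def replay(steps: List[Step]) -> List[dict]:
    """Rebuild the timeline by re-applying recorded tool calls in order."""
    timeline: List[dict] = []
    for s in sorted(steps, key=lambda s: s.index):
        if s.tool == "append_clip":
            timeline.append(dict(s.args))
        elif s.tool == "remove_last" and timeline:
            timeline.pop()
    return timeline


def verifiable_reward(timeline: List[dict], source_durations: Dict[str, float]) -> float:
    """Fraction of clips whose cut points lie inside a known source file."""
    if not timeline:
        return 0.0
    valid = 0
    for clip in timeline:
        duration = source_durations.get(clip["source"])
        if duration is not None and 0 <= clip["start"] < clip["end"] <= duration:
            valid += 1
    return valid / len(timeline)


# Usage: the second clip ends past the end of its 5-second source.
steps = [
    Step(0, "append_clip", {"source": "a.mp4", "start": 0.0, "end": 4.0}),
    Step(1, "append_clip", {"source": "b.mp4", "start": 2.0, "end": 9.0}),
]
durations = {"a.mp4": 10.0, "b.mp4": 5.0}

restored = load_trajectory(dump_trajectory(steps))
timeline = replay(restored)
reward = verifiable_reward(timeline, durations)
```

A check like this is "verifiable" in the sense that anyone holding the trace and the source durations can recompute it; whether Crayotter's reward uses similar checks is not stated in the abstract.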
