Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

arXiv cs.CV·Haozhe Zhao, Shuzheng Si, Zhenhailong Wang, Zheng Wang, Liang Chen, Xiaotong Li, Zhixiang Liang, Maosong Sun, Minjia Zhang

6/1/2026

Quick Take

Crafter is a multi-agent harness for generating scientific figures from diverse inputs. The authors report that it outperforms both standalone generators and an agentic baseline on PaperBanana-Bench and CraftBench. A companion system, CraftEditor, converts raster outputs into editable SVGs that are reported to surpass all baselines.

Key Points

Crafter generalizes across figure types without architectural changes.
CraftEditor converts raster outputs into editable SVGs, enhancing usability.
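Experiments show Crafter substantially outperforms both standalone generators and an agentic baseline on PaperBanana-Bench and CraftBench, with ablations confirming each component's contribution.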
CraftBench includes three figure types and four input conditions with human quality annotation.
Code and benchmark available at https://github.com/HaozheZhao/Crafter.

Article Content

From source RSS / original summary

arXiv:2605.30611v1. Abstract: Scientific figures are among the most effective means of communicating complex research ideas, yet producing publication-quality illustrations remains one of the most labor-intensive parts of paper preparation. Existing automated systems each target a single figure type under text-only input, leaving the diversity of types and conditions researchers actually use unaddressed; their raster outputs further cannot be locally revised.

Because scientific figures are structured compositions of discrete semantic components, the localized errors generators produce on such layouts demand not a stronger backbone but a harness. We instantiate this harness in two complementary systems: Crafter, a multi-agent harness for figure generation that generalizes across figure types and input conditions without architectural changes, and CraftEditor, which applies the same pattern to convert raster outputs into editable SVGs.

Moreover, we introduce CraftBench, a benchmark spanning three figure types and four input conditions with human quality annotation. Experiments show that Crafter substantially outperforms both standalone generators and the agentic baseline on PaperBanana-Bench and CraftBench, with ablations confirming each component's independent contribution; CraftEditor faithfully converts outputs into editable SVGs that surpass all baselines. Our code and benchmark are available at https://github.com/HaozheZhao/Crafter.

