TaskTok: Delving into Task Tokens for Task-driven Image Restoration

arXiv cs.CV·Hongjae Lee, Sojung Kang, Jaeseong Yu, Seung-Won Jung

Quick Take

TaskTok is a framework for Task-Driven Image Restoration (TDIR) that refines only the task-relevant latent tokens instead of updating all of them. The authors observe that task-relevant cues are unevenly distributed across the token sequence, so restoring a subset of tokens can be sufficient. They report improved performance on image classification, semantic segmentation, and object detection with high computational efficiency.

Key Points

TaskTok restores only task-relevant tokens, using a learnable token switch and a lightweight token refinement module.
Selective refinement avoids the cost of indiscriminately updating every latent token.
Experiments cover image classification, semantic segmentation, and object detection.
Source code available on GitHub for further research and development.
Task-relevant cues show index-wise specialization in the latent token space, which motivates refining only a subset of tokens.


[Submitted on 25 Jun 2026]

Abstract: While traditional image restoration focuses on perceptual quality, Task-Driven Image Restoration (TDIR) aims to maximize the performance of downstream high-level vision tasks. Recent approaches leveraging generative priors have shown promise for TDIR; however, they typically suffer from computational inefficiency and potential semantic alteration by indiscriminately updating all latent tokens. In this paper, we posit that not all visual information is equally important for machine perception. Through an analysis of the latent token space, we observe that task-relevant cues are unevenly distributed across the token sequence, exhibiting index-wise specialization. This suggests that selectively refining a subset of tokens can be sufficient for task-driven objectives. Leveraging this insight, we propose TaskTok, a novel framework that selectively restores only task-relevant tokens via a learnable token switch and a lightweight token refinement module. Extensive experiments across image classification, semantic segmentation, and object detection demonstrate that TaskTok significantly enhances task performance with high computational efficiency. The source code is available at this https URL
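The abstract does not specify how the token switch or the refinement module are built, so the following is only a minimal sketch of the general idea of index-wise selective refinement, written in plain Python. Every name here (`select_indices`, `refine_selected`, `toy_refine`, `switch_logits`, the 0.5 threshold) is a hypothetical stand-in and not taken from the paper or its code. It assumes one gate logit per token index and a hard threshold on the sigmoid of that logit; a trainable version would likely need a differentiable relaxation, which the abstract does not describe.

```python
import math
from typing import Callable, List, Sequence

Token = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def select_indices(switch_logits: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the token indices whose gate probability exceeds the threshold.

    Assumption: one logit per token index, standing in for a learned switch.
    """
    return [i for i, logit in enumerate(switch_logits) if sigmoid(logit) > threshold]


def refine_selected(
    tokens: Sequence[Token],
    switch_logits: Sequence[float],
    refine: Callable[[Token], Token],
    threshold: float = 0.5,
) -> List[Token]:
    """Apply `refine` only at selected indices; copy every other token unchanged."""
    selected = set(select_indices(switch_logits, threshold))
    return [refine(tok) if i in selected else list(tok) for i, tok in enumerate(tokens)]


def toy_refine(token: Token) -> Token:
    # Hypothetical stand-in for a refinement module: a small residual update
    # that nudges each value toward 1.0.
    return [v + 0.1 * (1.0 - v) for v in token]


tokens = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
switch_logits = [2.0, -3.0, 0.5, -1.0]  # illustrative values, not learned

selected = select_indices(switch_logits)
refined = refine_selected(tokens, switch_logits, toy_refine)

print(selected)                     # [0, 2]
print(len(selected) / len(tokens))  # 0.5, the fraction of tokens refined
```

In this toy setup the refinement function runs on two of the four tokens, and the tokens at indices 1 and 3 pass through untouched, which is the property the abstract credits for both the efficiency gain and the reduced risk of semantic alteration.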

Comments:	ECCV 2026
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
Cite as:	arXiv:2606.26615 [cs.CV]
	(or arXiv:2606.26615v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2606.26615 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Hongjae Lee [view email]
[v1] Thu, 25 Jun 2026 05:20:01 UTC (42,275 KB)

— Originally published at arxiv.org
