SSDA: Bridging Spectral and Structural Gaps via Dual Adaptation for Vision-Based Time Series Forecasting
Quick Take
SSDA enhances time series forecasting by bridging spectral and structural gaps in large vision models.
Key Points
- Introduces SSDA, a dual-branch network that adapts pre-trained LVMs both spectrally (at the data level) and structurally (at the model level).
- Uses a Spectral Magnitude Aligner (SMA) to push rendered time-series images toward natural-image spectral statistics.
- Achieves superior full-shot and few-shot performance on seven real-world benchmarks.
Abstract: Large vision models (LVMs) have recently proven to be surprisingly effective time series forecasters, simply by rendering temporal data as images. This success, however, rests on a largely unexamined premise: that rendered time series images are sufficiently close to natural images for knowledge in pre-trained models to transfer effectively. We argue that two gaps remain, spectral and structural, that fundamentally limit the potential of LVMs for time series forecasting. Spectrally, we systematically show that rendered time series images exhibit a markedly shallower power spectrum than the natural images LVMs are pre-trained to recognize. Structurally, reshaping 1D temporal sequences into 2D grids fabricates spurious spatial adjacencies while severing genuine temporal continuities, misleading the spatial inductive biases of pre-trained LVMs. To bridge these gaps, we propose SSDA, a dual-branch network that adapts spectrally and structurally to unlock the full potential of LVMs for time series forecasting. At the data level, a Spectral Magnitude Aligner (SMA) applies a 2D FFT to selectively enhance the magnitude spectrum toward natural-image statistics while preserving phase. At the model level, Structural-Guided Low-Rank Adaptation (SG-LoRA) injects position-aware temporal encodings into patch embeddings and adapts attention via low-rank updates. The two branches are adaptively fused to produce the final forecast. Extensive experiments on seven real-world benchmarks demonstrate that SSDA consistently outperforms strong LVM- and LLM-based baselines under both full-shot and few-shot settings. Code is publicly available at this https URL.
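The SMA idea from the abstract can be sketched in a few lines. The following is an illustrative reconstruction, not the authors' implementation: the function name, the `gamma` exponent, and the radial 1/f-style reweighting are assumptions based only on the stated description (2D FFT, enhance the magnitude spectrum toward natural-image statistics, preserve phase).

```python
import numpy as np

def spectral_magnitude_align(img, gamma=0.3):
    """Hedged sketch of an SMA-style transform (illustrative only).

    2D-FFT the rendered time-series image, steepen its magnitude
    spectrum toward a natural-image-like 1/f falloff via a factor
    radius**(-gamma), and reconstruct with the ORIGINAL phase.
    """
    F = np.fft.fft2(img)
    mag, phase = np.abs(F), np.angle(F)

    # Radial spatial-frequency grid; guard the DC bin against 1/0.
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2)
    radius[0, 0] = 1.0  # leave the DC magnitude unchanged

    # Reweight magnitudes; gamma=0 is the identity transform.
    new_mag = mag * radius ** (-gamma)

    # Recombine with the untouched phase; the result is real-valued
    # because the radial weight preserves Hermitian symmetry.
    return np.real(np.fft.ifft2(new_mag * np.exp(1j * phase)))
```

With `gamma=0` the transform is the identity, which makes the "preserve phase, adjust magnitude only" property easy to verify.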
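The SG-LoRA branch can likewise be sketched as a frozen projection plus a trainable low-rank correction applied after temporal encodings are injected. Names, shapes, and the additive-encoding choice below are assumptions for illustration, not the paper's code.

```python
import numpy as np

def sg_lora_forward(patches, W_frozen, A, B, temporal_enc, scale=1.0):
    """Illustrative sketch of an SG-LoRA-style projection.

    Position-aware temporal encodings are added to the patch
    embeddings, which then pass through a frozen weight plus a
    trainable low-rank update: y = x @ W + scale * (x @ A) @ B.
    """
    x = patches + temporal_enc                  # restore temporal-order cues
    return x @ W_frozen + scale * (x @ A) @ B   # rank-r update, r = A.shape[1]
```

Initializing `B` to zeros (standard LoRA practice) makes the adapted layer start out identical to the frozen one, so fine-tuning begins from the pre-trained behavior.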
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.12550 [cs.CV] |
| (or arXiv:2605.12550v1 [cs.CV] for this version) | |
| https://doi.org/10.48550/arXiv.2605.12550 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Wengen Li
[v1]
Sun, 10 May 2026 07:17:08 UTC (15,133 KB)
— Originally published at arxiv.org