Towards Large Model Feature Coding
Quick Take
The paper frames large model feature coding (LaMoFC) as a fundamental system component of split execution, where intermediate features must be transmitted and stored across devices. It presents LaMoFCBench, a feature dataset and unified evaluation pipeline spanning diverse tasks, and uses it to benchmark mainstream universal feature codecs. The results expose a profound misalignment between existing coding paradigms and the heterogeneous features that large models produce, which the authors argue calls for a fundamental departure from those paradigms.
Key Points
- LaMoFCBench covers 4 categories and 16 scenarios for diverse task requirements.
- The framework establishes a unified pipeline for fair comparisons of feature coding.
- Existing coding paradigms misalign with the heterogeneous nature of large model features.
- The study calls for a fundamental shift in feature coding approaches.
- Data and code will be available on GitHub for further research.
Article Content
From source RSS / original summary
arXiv:2605.24025v1 (Announce Type: new)
Abstract: Large models have delivered remarkable performance across a wide range of perception and generation tasks, yet practical deployment is increasingly constrained by computational and memory budgets, as well as privacy requirements. Split execution alleviates these constraints by partitioning computation across devices, but it inevitably introduces intensive transmission and storage of intermediate features.
Unlike conventional feature coding for CNNs that typically targets homogeneous spatial activation maps, modern large models generate heterogeneous features with varying statistical distributions and compression tolerances, e.g., multi-level/multi-modal representations and autoregressive context caches. These characteristics necessitate treating large model feature coding (LaMoFC) as a fundamental system component and call for a systematic evaluation framework.
In this paper, we present a comprehensive benchmark and evaluation framework for LaMoFC. We first build the feature dataset LaMoFCBench, covering diverse task requirements across 4 categories and 16 scenarios while integrating widely adopted architectures and various split-computing settings. We then specify representative split points according to practical application scenarios to extract intermediate features, establishing a unified pipeline for fair and reproducible comparisons.
Finally, we benchmark mainstream universal feature codecs, exposing the profound misalignment between existing coding paradigms and the heterogeneous nature of large model features. These findings reveal that LaMoFC demands a fundamental departure from existing paradigms, and LaMoFCBench provides the shared empirical foundation to drive this transition. The data and code will be available at https://github.com/lartpang/LaMoFCBench.
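To make the split-execution setting in the abstract concrete, here is a minimal sketch of where a feature codec sits between the two halves of a model. It is an illustration only, not the paper's method or the LaMoFCBench pipeline: `head`, `tail`, `encode`, and `decode` are hypothetical names, the two model halves are toy functions, and the codec is plain 8-bit uniform scalar quantization chosen for simplicity.

```python
import random

def head(x):
    """Hypothetical device-side half: a toy nonlinear map producing features."""
    return [max(0.0, 1.5 * v - 0.2) for v in x]

def tail(features):
    """Hypothetical server-side half: a toy readout over the features."""
    return sum(features) / len(features)

def encode(features):
    """8-bit uniform scalar quantization (an assumed stand-in for a real codec).

    Returns the byte payload plus the offset and step needed to decode it.
    """
    lo, hi = min(features), max(features)
    step = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant features
    codes = bytes(min(255, round((f - lo) / step)) for f in features)
    return codes, lo, step

def decode(codes, lo, step):
    """Map each 8-bit code back to an approximate feature value."""
    return [lo + c * step for c in codes]

if __name__ == "__main__":
    random.seed(0)
    x = [random.uniform(-1.0, 1.0) for _ in range(64)]

    feats = head(x)                      # computed on the device
    codes, lo, step = encode(feats)      # compressed for transmission
    rec = decode(codes, lo, step)        # reconstructed on the server

    print("payload bytes:", len(codes), "vs float32 bytes:", 4 * len(feats))
    print("max feature error:", max(abs(a - b) for a, b in zip(feats, rec)))
    print("task output drift:", abs(tail(feats) - tail(rec)))
```

The sketch reports two separate quantities: how far the decoded features are from the originals, and how much the downstream output changes as a result. A single global quantization range like this one presumes features with a roughly uniform dynamic range, which is the kind of homogeneity assumption the abstract says large model features do not satisfy.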