FS-DVS: A Frequency-Selective Dynamic Visual Sensing Paradigm for Enhancing Information Completeness
Quick Answer
FS-DVS (Frequency-Selective Dynamic Vision Sensor) improves information completeness by placing a learnable spatial filter, which mimics retinal ganglion cell aggregation, before event triggering. The paper reports improved object detection and action recognition performance.
Quick Take
The learned filter emphasizes mid-spatial-frequency components, consistent with the human contrast sensitivity function (CSF). Compared with simply raising sensor sensitivity or relying on post-processing, the authors report selective information enhancement with high noise resilience.
Key Points
- FS-DVS integrates a learnable spatial filter before event triggering.
- The spatial filter evolves into center-surround patterns emphasizing mid-frequency components.
- The paper reports substantial performance gains in object detection and action recognition.
- The learned filters converge to human-like contrast sensitivity function (CSF) characteristics across different tasks.
- FS-DVS provides a robust blueprint for next-generation neuromorphic sensors.
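To make the first two points concrete, here is a minimal sketch of event generation with a spatial filter applied before the threshold test. It is an illustration under stated assumptions, not the paper's simulator: the function names, the 3x3 kernels, the threshold of 0.2, and the edge-replicating border handling are all hypothetical choices, and the center-surround kernel is hand-written rather than learned.

```python
import math

def filter2d_same(image, kernel):
    """Apply an odd-sized kernel to a 2D list, replicating edge pixels."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy = min(max(y + i - ph, 0), h - 1)  # clamp row index
                    xx = min(max(x + j - pw, 0), w - 1)  # clamp column index
                    acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out

def events_between(prev_frame, next_frame, kernel, threshold=0.2, eps=1e-3):
    """Return (y, x, polarity) events from the filtered log-intensity change.

    The kernel is applied BEFORE thresholding, so each pixel's trigger
    depends on its neighborhood rather than on its own value alone.
    """
    log_prev = [[math.log(v + eps) for v in row] for row in prev_frame]
    log_next = [[math.log(v + eps) for v in row] for row in next_frame]
    f_prev = filter2d_same(log_prev, kernel)
    f_next = filter2d_same(log_next, kernel)
    events = []
    for y in range(len(f_prev)):
        for x in range(len(f_prev[0])):
            diff = f_next[y][x] - f_prev[y][x]
            if diff >= threshold:
                events.append((y, x, 1))
            elif diff <= -threshold:
                events.append((y, x, -1))
    return events

# Delta kernel: reproduces conventional per-pixel independent triggering.
DELTA = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]

# Hypothetical zero-sum center-surround kernel (hand-written, not learned).
CENTER_SURROUND = [[-0.125, -0.125, -0.125],
                   [-0.125,  1.0,   -0.125],
                   [-0.125, -0.125, -0.125]]

if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]
    brighter = [[2.0] * 4 for _ in range(4)]
    print(len(events_between(flat, brighter, DELTA)))            # 16
    print(len(events_between(flat, brighter, CENTER_SURROUND)))  # 0
```

With the delta kernel, a uniform brightness change fires every pixel. The zero-sum center-surround kernel ignores that zero-frequency change while still responding to a localized one. In the paper the kernel weights are learned end-to-end through a differentiable simulator; this sketch uses a hard threshold and fixed weights, so it only shows where the filter sits in the pipeline.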
Article Content
From source RSS / original summary
arXiv:2606.06856v1, Announce Type: new
Abstract: Dynamic vision sensors (DVS) offer exceptional temporal resolution and dynamic range by asynchronously reporting pixel-level intensity changes. However, conventional DVS rely on a per-pixel independent triggering mechanism, ignoring the spatial integration performed by biological retinal ganglion cells (RGCs).
Consequently, they lack the contrast sensitivity function (CSF) and its inherent sensitivity to mid-spatial frequencies, which inevitably leads to information incompleteness due to sub-threshold signal loss. To bridge this gap, we propose FS-DVS (Frequency-Selective Dynamic Vision Sensor), a novel paradigm that integrates a learnable spatial filter strictly preceding the event triggering process to mimic the RGC aggregation mechanism.
By developing a differentiable event simulation framework, the spatial filter can be optimized end-to-end with downstream tasks. Our study reveals that starting from a delta function, the learned spatial filters spontaneously evolve into center-surround patterns that emphasize mid-frequency components, consistently aligning with human CSF.
Beyond achieving substantial performance gains in object detection and action recognition, the consistent convergence to human-like CSF characteristics across different tasks underscores the universality of this mid-frequency selective mechanism. Compared to naively increasing sensor sensitivity or relying on post-processing, our paradigm achieves selective information enhancement with high noise resilience, providing a robust, biologically plausible blueprint for next-generation neuromorphic sensors.
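The abstract's claim that center-surround patterns emphasize mid-frequency components can be illustrated with a textbook difference-of-Gaussians (DoG) model. This is a sketch under assumptions, not the learned filter from the paper: the DoG form, the function name `dog_response`, and the sigma values (1 and 2 pixels) are chosen here for illustration only. It uses the fact that a unit-area Gaussian of width sigma has amplitude response exp(-2·π²·sigma²·f²), whereas a delta function has a flat response of 1 at every frequency.

```python
import math

def dog_response(freq, sigma_center=1.0, sigma_surround=2.0):
    """Amplitude response of a unit-area difference-of-Gaussians.

    freq is in cycles per pixel; the sigmas are in pixels (assumed values).
    """
    def gaussian_response(sigma):
        return math.exp(-2.0 * (math.pi * sigma * freq) ** 2)
    return gaussian_response(sigma_center) - gaussian_response(sigma_surround)

# Sample from 0 up to the Nyquist frequency (0.5 cycles/pixel).
freqs = [i / 100 for i in range(51)]
gains = [dog_response(f) for f in freqs]
peak_freq = freqs[gains.index(max(gains))]

if __name__ == "__main__":
    print(f"gain at 0.00 cycles/pixel: {gains[0]:.3f}")
    print(f"gain at {peak_freq:.2f} cycles/pixel (peak): {max(gains):.3f}")
    print(f"gain at 0.50 cycles/pixel: {gains[-1]:.3f}")
```

The response is zero at zero frequency, rises to a peak at an intermediate frequency, and falls off again toward Nyquist. That band-pass shape is the qualitative property the abstract attributes to the learned filters and to the human CSF; the exact peak location here depends entirely on the assumed sigmas.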
