CFCamo: A Counterfactual Detect-or-Abstain Framework for Camouflaged Object Detection
Quick Answer
CFCamo introduces a novel Counterfactual Detect-or-Abstain framework for camouflaged object detection, enhancing performance by coupling target-present detection with target-absent abstention.
Quick Take
CFCamo reaches a Pair Accuracy (PA) of 80.0-90.8% across CF-COD and improves S_alpha by +3.7 pp over the prior RL-based COD baseline on CAMO-test. The framework targets the over-detect bias that positive-only COD training induces and that standard COD evaluation leaves unmeasured.
Key Points
- CFCamo optimizes a Qwen3-VL-4B-Instruct agent using Counterfactual Sequence Policy Optimization.
- Across the CF-COD benchmark, CFCamo reaches a Pair Accuracy (PA) of 80.0-90.8%.
- Ablations that remove counterfactual coupling drop PA to 1.4-5.2% despite strong target-present COD scores.
- The improvement comes from coupling target-present detection with target-absent abstention, not merely from stronger target-present localization.
- Code and data for CFCamo are available at https://github.com/suhang2000/CFCamo.
Article Content
From source RSS / original summary
arXiv:2606.11231v1 Announce Type: new
Abstract: Vision-language reinforcement learning has recently shown strong target-present localization for camouflaged object detection (COD). Yet localization is only one side of the decision: when the agent faces an ordinary image with no camouflaged target, will it still claim that a camouflaged object exists?
Standard COD training and evaluation data are positive-only, so agents optimized under this setting can acquire an over-detect bias, a task-specific form of object hallucination that standard COD evaluation leaves unmeasured. To quantify this target-absent behavior, we construct Counterfactual COD (CF-COD), a paired benchmark that removes the camouflaged target from each held-out COD evaluation image while preserving a plausible background.
CF-COD evaluates whether a model detects the target on the original image and abstains on the target-absent counterfactual, summarized by Pair Accuracy (PA). We further introduce CFCamo, a paired counterfactual framework for COD with abstention.
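As a rough illustration of the metric described above, the sketch below computes Pair Accuracy as the fraction of original/counterfactual pairs on which the model both detects and abstains correctly. The `PairOutcome` record and the two boolean fields are hypothetical names introduced here; how the paper decides that a target counts as "detected" (for example, an overlap threshold) is not stated in this summary, so that judgment is assumed to be made upstream.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PairOutcome:
    """Outcome for one original/counterfactual image pair (hypothetical record)."""
    detected_on_original: bool         # model found the target on the original image
    abstained_on_counterfactual: bool  # model reported no target on the target-absent image


def pair_accuracy(outcomes: Iterable[PairOutcome]) -> float:
    """Fraction of pairs where the model detects on the original AND abstains on the counterfactual."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("pair_accuracy needs at least one pair")
    correct = sum(
        o.detected_on_original and o.abstained_on_counterfactual for o in outcomes
    )
    return correct / len(outcomes)


# An over-detecting model: it claims a target on both images of every pair.
always_detect = [PairOutcome(True, False) for _ in range(4)]
# A model that handles three of four pairs fully correctly.
mostly_right = [PairOutcome(True, True)] * 3 + [PairOutcome(True, False)]

print(pair_accuracy(always_detect))  # 0.0
print(pair_accuracy(mostly_right))   # 0.75
```

The `always_detect` case shows why a positive-only evaluation can miss the problem: every target-present image is handled correctly, yet no pair is.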
For training, CFCamo optimizes a Qwen3-VL-4B-Instruct agent with Counterfactual Sequence Policy Optimization (CSPO), which samples paired original-counterfactual rollouts and uses a Counterfactual Paired Reward (CPR) to couple original-image detection with counterfactual abstention. On CAMO-test, CFCamo improves S_alpha by +3.7 pp over the prior RL-based COD baseline; across CF-COD, it reaches 80.0-90.8% PA. Ablations show that removing counterfactual coupling reduces PA to 1.4-5.2% despite strong target-present COD scores, showing that target-present evaluation alone does not characterize detect-or-abstain behavior.
Overall, these results indicate that CFCamo improves COD agents by coupling target-present detection with target-absent abstention, rather than merely strengthening target-present localization. Code and data are available at https://github.com/suhang2000/CFCamo.
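The abstract does not give the formula for the Counterfactual Paired Reward, so the following is only a toy stand-in for the idea of coupling, not the paper's CPR. It assumes a localization score in [0, 1] on the original image and a boolean abstention flag on the counterfactual, and combines them multiplicatively; the function names and the multiplicative form are all assumptions made for illustration.

```python
def paired_reward(detection_score: float, abstained: bool) -> float:
    """Hypothetical coupled reward for one original/counterfactual rollout pair.

    detection_score: localization quality on the original image, assumed in [0, 1].
    abstained: whether the counterfactual rollout reported no camouflaged target.
    """
    if not 0.0 <= detection_score <= 1.0:
        raise ValueError("detection_score must lie in [0, 1]")
    # Multiplicative coupling (an assumption): the pair earns reward only if the
    # counterfactual rollout abstains, however good the localization is.
    return detection_score * (1.0 if abstained else 0.0)


def uncoupled_reward(detection_score: float) -> float:
    """Positive-only reward for comparison: it never looks at the counterfactual."""
    return detection_score


# A policy that localizes well but also claims a target on the counterfactual image.
print(uncoupled_reward(0.9))      # 0.9
print(paired_reward(0.9, False))  # 0.0
print(paired_reward(0.9, True))   # 0.9
```

Under the uncoupled reward, the over-detecting policy is indistinguishable from one that abstains correctly, which is consistent with the ablation result that strong target-present scores can coexist with very low PA.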