Orthogonal Concept Erasure for Diffusion Models
Quick Take
Orthogonal Concept Erasure (OCE) recasts editing-based concept erasure in diffusion models as multiplicative, orthogonal parameter updates, erasing up to 100 concepts in 4.3 seconds while preserving generative capacity. It targets a limitation of existing editing-based techniques: because they rely on additive updates, they struggle to erase a concept precisely without degrading overall generation.
Key Points
- OCE reformulates concept erasure as multiplicative updates for better precision.
- Empirical analysis shows concept semantics depend on neuron direction, not magnitude.
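- For multi-concept erasure, OCE uses a subspace-level objective with structured subspace manipulation to handle conflicting constraints.
- Experiments on single- and multi-concept erasure show OCE outperforming existing methods in both concept erasure and non-target preservation.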
- OCE can erase up to 100 concepts in just 4.3 seconds.
Article Content
From source RSS / original summary (arXiv:2605.28902v1)
Abstract: Concept erasure has emerged as a promising approach to mitigate undesired or unsafe content in diffusion models, yet existing methods still face significant limitations. While training-based methods are effective, their high computational cost limits scalability. Editing-based methods are more efficient and deployment-friendly, yet they struggle to simultaneously achieve precise concept erasure and preserve overall generative capacity.
We identify this core limitation of the editing-based methods as reliance on additive parameter updates. Our empirical analysis reveals that concept semantics primarily depend on neuron direction rather than neuron magnitude, while overall generative capacity relies on the angular geometry of neurons. As additive updates inherently entangle direction, magnitude, and angular geometry, they inevitably introduce unintended interference between concept erasure and overall generation performance.
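The entanglement claim can be illustrated with a toy example. The sketch below is not taken from the paper: the weight matrix `W`, the perturbation `delta`, and the helper names are hypothetical stand-ins, with each row of `W` treated as one neuron. It only shows that a generic additive edit shifts both the neuron norms and the pairwise cosines between neurons at once.

```python
import numpy as np

def neuron_norms(W):
    # Magnitude of each neuron (row of W).
    return np.linalg.norm(W, axis=1)

def pairwise_cosines(W):
    # Cosine similarity between every pair of neurons (angular geometry).
    unit = W / np.linalg.norm(W, axis=1, keepdims=True)
    return unit @ unit.T

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))             # toy layer: 4 neurons, 8 inputs (assumption)
delta = 0.5 * rng.standard_normal((4, 8))   # hypothetical additive edit
W_add = W + delta

# Largest change in neuron magnitude and in pairwise cosine after the edit.
norm_drift = np.abs(neuron_norms(W_add) - neuron_norms(W)).max()
angle_drift = np.abs(pairwise_cosines(W_add) - pairwise_cosines(W)).max()
print(f"max norm drift:  {norm_drift:.3f}")
print(f"max angle drift: {angle_drift:.3f}")
```

Both drift values come out nonzero for a generic `delta`, which is the interference the abstract attributes to additive updates.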
To address this, we propose Orthogonal Concept Erasure (OCE), which reformulates editing-based erasure as multiplicative parameter updates from a geometric perspective. Specifically, OCE applies layer-wise orthogonal transformations derived from a closed-form solution to the parameters, enabling precise concept erasure while preserving the neuron magnitude and angular geometry.
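Why an orthogonal transformation preserves magnitude and angular geometry can be checked numerically. The sketch below does not reproduce OCE's closed-form solution, which the abstract does not spell out; it substitutes a Householder reflection as one simple orthogonal map that sends a hypothetical `target` direction onto a hypothetical `anchor` direction. All names here are illustrative assumptions.

```python
import numpy as np

def householder_map(a, b):
    # Orthogonal matrix H (a reflection) with H @ a_hat == b_hat,
    # where a_hat and b_hat are the unit-normalized inputs.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = a - b
    if np.linalg.norm(v) < 1e-12:
        return np.eye(a.size)  # directions already coincide
    return np.eye(a.size) - 2.0 * np.outer(v, v) / (v @ v)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))   # toy layer, rows are neurons (assumption)
target = rng.standard_normal(8)   # hypothetical direction of the concept to erase
anchor = rng.standard_normal(8)   # hypothetical neutral direction

Q = householder_map(target, anchor)
W_rot = W @ Q.T                   # multiplicative update: each neuron w becomes Q @ w

# The Gram matrix W W^T encodes all neuron norms and pairwise angles.
gram_preserved = np.allclose(W_rot @ W_rot.T, W @ W.T)
is_orthogonal = np.allclose(Q @ Q.T, np.eye(8))
print(gram_preserved, is_orthogonal)
```

Because `Q` is orthogonal, `W_rot @ W_rot.T` equals `W @ W.T`, so every neuron norm and every pairwise angle is unchanged even though individual directions have moved. This contrasts with an additive edit, which generally changes both.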
Furthermore, to address conflicting constraints in multi-concept erasure, OCE introduces a subspace-level objective with structured subspace manipulation, yielding a more effective and scalable erasure. Extensive experiments on single- and multi-concept erasure demonstrate that OCE outperforms existing methods in concept erasure and non-target preservation, erasing up to 100 concepts in 4.3 s. Code: https://github.com/HansSunY/OCE