Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2

arXiv cs.AI·Zhanfeng Feng, Shuai Guo, Xin Di, Long Peng, Yang Cao, Zhengjun Zha

5/27/2026

Quick Take

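Tail-Aware HiFloat4 is a post-training quantization (PTQ) submission to a low-bit text-to-video generation quantization challenge, targeting Wan2.2. It adapts the public ViDiT-Q pipeline to the HiFloat4 numerical format, keeps numerically sensitive boundary modules in high precision, and uses activation-tail-aware percentile calibration to reduce the influence of rare calibration outliers, while leaving the runtime HiFloat4 arithmetic and sampling pipeline unchanged.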

Key Points

Adapts ViDiT-Q pipeline for Wan2.2 using HiFloat4 format.
Quantizes the main linear layers in both Wan2.2 transformer modules with W4A4 HiFloat4 fake quantization.
Maintains high precision in numerically sensitive boundary modules.
Introduces activation-tail-aware percentile calibration for channel-mask construction.
Reduces the influence of rare calibration outliers while keeping the runtime HiFloat4 arithmetic and sampling pipeline unchanged.
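
As a rough illustration of what W4A4 "fake quantization" of a linear layer means in general, here is a minimal NumPy sketch. It is not the authors' code. The HiFloat4 code points are not described in this summary, so a symmetric signed 4-bit integer grid is swapped in as a stand-in. The names `fake_quant_int4` and `fake_quant_linear`, and the per-tensor abs-max scales, are assumptions made for illustration only.

```python
import numpy as np

def fake_quant_int4(x: np.ndarray, scale: float) -> np.ndarray:
    """Quantize-dequantize x on a symmetric signed 4-bit integer grid.

    Stand-in grid (assumption): this is NOT the HiFloat4 format.
    """
    q = np.clip(np.round(x / scale), -8, 7)  # snap to one of 16 integer levels
    return q * scale                          # map back to floating point

def fake_quant_linear(x, w, x_scale, w_scale):
    """W4A4-style linear layer: weights and activations both pass through the grid."""
    return fake_quant_int4(x, x_scale) @ fake_quant_int4(w, w_scale).T

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))   # toy activations: (tokens, in_features)
w = rng.standard_normal((4, 8))   # toy weights: (out_features, in_features)

# Per-tensor abs-max scales (assumption; chosen only to keep the sketch short).
x_scale = float(np.abs(x).max()) / 7
w_scale = float(np.abs(w).max()) / 7

y_ref = x @ w.T                                    # full-precision output
y_fq = fake_quant_linear(x, w, x_scale, w_scale)   # fake-quantized output
print("mean abs error:", np.abs(y_ref - y_fq).mean())
```

The computation still runs in floating point. Only the values are restricted to the 4-bit grid, which is what makes this "fake" quantization suitable for measuring accuracy impact without a low-bit kernel.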

Article Excerpt

From source RSS / original summary

arXiv:2605.26628v1. Abstract: This report describes Tail-Aware HiFloat4, our submission to the low-bit text-to-video generation quantization challenge. Our method adapts the public ViDiT-Q post-training quantization pipeline to Wan2.2 under the HiFloat4 numerical format. We quantize the main linear layers in both Wan2.2 transformer modules with W4A4 HiFloat4 fake quantization, keep numerically sensitive boundary modules in high precision, and introduce an activation-tail-aware percentile calibration module for channel-mask construction. Together with compact PTQ-state restoration, this design reduces the influence of rare calibration outliers while keeping the runtime HiFloat4 arithmetic and sampling pipeline unchanged.
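
To make the idea of percentile calibration concrete, the following NumPy sketch compares an abs-max statistic with a high-percentile statistic on synthetic calibration activations. It is a generic illustration under stated assumptions, not the method from the report: the abstract does not say how the channel mask is defined or used, so `channel_mask`, its median-ratio rule, the 99.9 percentile, and the synthetic data are all hypothetical.

```python
import numpy as np

def channel_ranges(acts: np.ndarray, percentile: float = 99.9) -> np.ndarray:
    """Per-channel magnitude statistic over calibration samples.

    acts has shape (num_samples, num_channels).
    percentile=100.0 reduces to the plain abs-max statistic.
    """
    return np.percentile(np.abs(acts), percentile, axis=0)

def channel_mask(ranges: np.ndarray, ratio: float = 4.0) -> np.ndarray:
    """Hypothetical rule: flag channels whose range exceeds ratio x the median range."""
    return ranges > ratio * np.median(ranges)

rng = np.random.default_rng(0)
acts = rng.standard_normal((10_000, 16))
acts[123, 5] = 500.0   # one rare outlier sample in channel 5
acts[:, 9] *= 20.0     # channel 9 is consistently wide

max_ranges = channel_ranges(acts, 100.0)   # dominated by the single outlier
tail_ranges = channel_ranges(acts, 99.9)   # ignores the top 0.1% of samples

print("abs-max mask:   ", np.flatnonzero(channel_mask(max_ranges)))
print("percentile mask:", np.flatnonzero(channel_mask(tail_ranges)))
```

With the abs-max statistic, a single extreme sample is enough to flag channel 5. With the percentile statistic, only the consistently wide channel 9 is flagged, which is the sense in which percentile calibration reduces the influence of rare outliers.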
