

Poster

Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion

Anle Ke · Xu Zhang · Tong Chen · Ming Lu · Chao Zhou · Jiawen Gu · Zhan Ma

West Exhibition Hall B2-B3 #W-308
Thu 17 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Existing multimodal large model-based image compression frameworks often rely on a fragmented integration of semantic retrieval, latent compression, and generative models, resulting in suboptimal performance in both reconstruction fidelity and coding efficiency. To address these challenges, we propose a residual-guided ultra lowrate image compression framework named ResULIC, which incorporates residual signals into both semantic retrieval and the diffusion-based generation process. Specifically, we introduce Semantic Residual Coding (SRC) to capture the semantic disparity between the original image and its compressed latent representation. A perceptual fidelity optimizer is further applied for superior reconstruction quality. Additionally, we present the Compression-aware Diffusion Model (CDM), which establishes an optimal alignment between bitrates and diffusion time steps, improving compression-reconstruction synergy. Extensive experiments demonstrate the effectiveness of ResULIC, which outperforms state-of-the-art diffusion-based methods in both objective and subjective quality, achieving BD-rates of -80.7% and -66.3% in terms of LPIPS and FID, respectively.
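For readers unfamiliar with the BD-rate figures quoted above, the following is a minimal sketch of how a Bjøntegaard-style rate difference can be computed from two rate-quality curves. It is not the authors' evaluation code. The function names (`bd_rate`, `_interp`, `_mean_log_rate`) and the sample points are hypothetical, and the sketch uses piecewise-linear interpolation of log-bitrate instead of the cubic fit commonly used in practice. A negative result means the test codec needs fewer bits than the anchor to reach the same metric value.

```python
import math

def _interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x; xs ascending and distinct."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            w = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + w * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the curve's metric range")

def _mean_log_rate(points, lo, hi, samples=1000):
    """Average log-bitrate of one curve over the metric interval [lo, hi]."""
    pts = sorted(points, key=lambda p: p[1])      # sort by metric value
    xs = [metric for _, metric in pts]
    ys = [math.log(rate) for rate, _ in pts]
    step = (hi - lo) / samples
    # Clamp to hi so rounding never pushes a sample past the curve's end.
    vals = [_interp(min(hi, lo + k * step), xs, ys) for k in range(samples + 1)]
    area = sum((vals[k] + vals[k + 1]) / 2 * step for k in range(samples))
    return area / (hi - lo)

def bd_rate(anchor, test):
    """Average bitrate change (%) of `test` relative to `anchor` at equal metric.

    Each curve is a list of (bitrate, metric) pairs, e.g. (bpp, LPIPS).
    The metric may be lower-is-better or higher-is-better; only the
    overlapping metric range of the two curves is compared.
    """
    lo = max(min(m for _, m in anchor), min(m for _, m in test))
    hi = min(max(m for _, m in anchor), max(m for _, m in test))
    if hi <= lo:
        raise ValueError("curves do not overlap in metric range")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0

# Made-up example: the test codec reaches each LPIPS value at half the bitrate.
anchor_curve = [(0.01, 0.40), (0.02, 0.30), (0.04, 0.20), (0.08, 0.15)]
test_curve = [(rate / 2, metric) for rate, metric in anchor_curve]
print(round(bd_rate(anchor_curve, test_curve), 2))
```

Averaging in the log-rate domain makes the result a relative (percentage) change, so it does not depend on the bitrate units.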

Lay Summary:

Our research addresses a critical challenge in image compression: avoiding incorrect textures (e.g., wrong colors or shapes) when using AI to reconstruct images from tiny files. Our solution combines two strategies:

1. Semantic Residual Coding: AI compares the original and compressed images to identify and restore lost semantic details.
2. Compression-Aware Diffusion Models: The AI’s image-generation process is dynamically adjusted based on compression levels, ensuring sharp, realistic results even at minimal file sizes.

These advancements are particularly vital for bandwidth-constrained applications, such as emergency communications or satellite imaging, where both accuracy and efficiency are paramount. By combining these strategies, our work bridges the gap between extreme compression and faithful visual reconstruction, delivering a robust solution for resource-limited scenarios.
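The idea of tying the generation process to the compression level can be illustrated with a small sketch. This is not the paper's Compression-aware Diffusion Model. It assumes a standard linear DDPM noise schedule and treats compression distortion as additive noise on a unit-variance signal, then picks the diffusion time step whose noise-to-signal ratio is closest to that distortion. The function names, schedule parameters, and distortion values are all hypothetical.

```python
import math

def alpha_bar_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear DDPM beta schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def matching_timestep(distortion_var, alpha_bars):
    """Time step whose noise-to-signal ratio best matches `distortion_var`.

    In DDPM, x_t = sqrt(ab) * x_0 + sqrt(1 - ab) * eps, so after rescaling
    by 1 / sqrt(ab) the noise variance relative to the signal is (1 - ab) / ab.
    The comparison is done in the log domain.
    """
    best_t, best_gap = 0, float("inf")
    for t, ab in enumerate(alpha_bars):
        nsr = (1.0 - ab) / ab
        gap = abs(math.log(nsr) - math.log(distortion_var))
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

alpha_bars = alpha_bar_schedule()
# Assumed distortion levels: heavier compression leaves more distortion.
t_low_rate = matching_timestep(0.5, alpha_bars)    # very low bitrate
t_high_rate = matching_timestep(0.05, alpha_bars)  # higher bitrate
print(t_low_rate > t_high_rate)
```

Under these assumptions, a lower bitrate maps to a later (noisier) starting step, so the generator does more work exactly where compression removed more information.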
