Yuanting Fan, Chengxu Liu, Ruhao Tian, Xueming Qian
IEEE TCSVT 2024
Super-resolution (SR) aims to restore a high-resolution (HR) image from its low-resolution (LR) counterpart. Existing works pursue an overall average recovery across all regions to provide better visual quality for human viewing. If we instead aim to perform super-resolution for machine recognition rather than human viewing, the solution should change accordingly. From this insight, we propose a new SR pipeline, called InstanceSR, which treats each region in the LR image differently and devotes more resources to recovering the foreground regions where instances exist. In particular, InstanceSR consists of an encoder that formulates the LR image into a set of difficulty tokens according to the instance distribution in each sub-region, and a decoder based on a multi-exit network structure that recovers the sub-regions corresponding to these difficulty tokens while consuming different amounts of computation. Extensive quantitative and qualitative evaluations on three widely used benchmarks containing small instances demonstrate the superiority of InstanceSR over state-of-the-art models, especially in the recovery of regions where instances exist. Moreover, comparisons using SR results on three challenging small object detection benchmarks verify that InstanceSR consistently boosts detection accuracy and has great potential for subsequent machine recognition.
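The difficulty-token idea above can be illustrated with a toy sketch: partition an instance mask into patches and map each patch's instance coverage to a difficulty level, which would then select an exit of the multi-exit decoder. The patch size, the three-level scheme, and all names here are illustrative assumptions, not details from the paper.

```python
import numpy as np

def assign_difficulty_tokens(instance_mask, patch_size=4):
    """Toy sketch: score each patch of a binary instance mask by its
    instance coverage; higher coverage -> deeper (costlier) decoder exit.
    Levels and thresholds are illustrative, not the paper's actual scheme."""
    h, w = instance_mask.shape
    tokens = {}
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            coverage = instance_mask[i:i + patch_size, j:j + patch_size].mean()
            if coverage == 0.0:
                level = 0  # pure background: cheapest, shallowest exit
            elif coverage < 0.5:
                level = 1  # partially covered by an instance: middle exit
            else:
                level = 2  # instance-dense: deepest exit, most computation
            tokens[(i, j)] = level
    return tokens

# Usage: an 8x8 mask whose top-left 4x4 patch contains one instance.
mask = np.zeros((8, 8))
mask[0:4, 0:4] = 1.0
tokens = assign_difficulty_tokens(mask)
```

In this toy setup, the instance-covered patch is routed to the deepest exit while the three background patches take the cheapest path, which is the resource-allocation behavior the abstract describes.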
Yuanting Fan, Chengxu Liu, Nengzhong Yin, Changlong Gao, Xueming Qian
ECCV 2024
Diffusion models (DMs) have shown promising results on single-image super-resolution and other image-to-image translation tasks. Benefiting from more computational resources and longer inference times, they are able to yield more realistic images. Existing DM-based super-resolution methods pursue an overall average recovery across all regions via iterative refinement, ignoring that different input image regions require different numbers of timesteps to reconstruct. In this work, we observe that previous DM-based super-resolution methods waste computational resources reconstructing invisible details. To improve the utilization of computational resources, we propose AdaDiffSR, a DM-based SR pipeline with a dynamic timesteps sampling strategy (DTSS). Specifically, by introducing a multi-metrics latent entropy module (MMLE), we achieve dynamic perception of the latent spatial information gain during the denoising process, which guides the dynamic selection of timesteps. In addition, we adopt a progressive feature injection module (PFJ) that dynamically injects the original image features into the denoising process based on the current information gain, generating images with both fidelity and realism. Experiments show that AdaDiffSR achieves performance comparable to current state-of-the-art DM-based SR methods while consuming fewer computational resources and less inference time on both synthetic and real-world datasets.
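The dynamic timestep selection described above can be sketched as an early-stopping loop over denoising steps: keep iterating only while the measured information gain stays above a threshold. The gain values, threshold, and function names below are illustrative assumptions; the paper's MMLE module computes the gain from multiple metrics in latent space, which this sketch abstracts away as a given sequence.

```python
def dynamic_timestep_sampling(gains, max_steps, threshold=0.05):
    """Toy sketch of dynamic timestep selection: run denoising steps only
    while the per-step information gain (here, a precomputed sequence
    standing in for an MMLE-style estimate) exceeds a threshold."""
    used_steps = []
    for t, gain in enumerate(gains):
        if t >= max_steps or gain < threshold:
            break  # further refinement adds negligible information
        used_steps.append(t)
    return used_steps

# Usage: a region whose information gain decays quickly stops after 3 steps
# instead of exhausting a fixed schedule.
steps = dynamic_timestep_sampling([0.5, 0.3, 0.1, 0.04, 0.03], max_steps=10)
```

The point of the sketch is the control flow, not the numbers: easy regions exit the refinement loop early, so the saved steps can be spent on regions that still show meaningful gain.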