Paper: Blind Super-Resolution With Iterative Kernel Correction


  • Motivation
    • Estimate the kernel
    • Correct the kernel
  • Method
    • Network Architecture of SR model
    • Network Architecture of $\mathcal{P}$ and $\mathcal{C}$
  • Experiments
  • Comprehension


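Motivation

The key point is that a mismatch between the blur kernel assumed by the SR model and the kernel that actually produced the LR image leaves characteristic artifacts in the result, so the right kernel has to be chosen. For example, if the LR image was generated with a Gaussian kernel of width $\sigma_{LR}$ and the SR model assumes a width $\sigma_{SR}$, the SR result shows unnatural ringing artifacts when $\sigma_{SR} > \sigma_{LR}$ and is over-smoothed when $\sigma_{SR} < \sigma_{LR}$.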

Estimate the kernel

A straightforward method is to adopt a predictor function that estimates the kernel directly from the LR image. Let $k' = \mathcal{P}(I^{LR}; \theta_{\mathcal{P}})$ be the kernel estimated by the predictor $\mathcal{P}$ with parameters $\theta_{\mathcal{P}}$, and let $k$ be the ground-truth kernel.
Then the predictor can be optimized by minimizing the $L_2$ distance
$$\theta_{\mathcal{P}} = \arg\min_{\theta_{\mathcal{P}}} \| k - \mathcal{P}(I^{LR}; \theta_{\mathcal{P}}) \|_2^2$$
However, accurate estimation of the kernel is impossible because the problem is ill-posed, so the authors look for a way to correct the estimate.
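As a small illustration of this objective, here is a minimal NumPy sketch that evaluates the squared $L_2$ distance between a ground-truth kernel and an estimate. The helper names (`gaussian_kernel`, `kernel_l2_loss`), the 21×21 kernel size, and the two widths are my own choices for illustration, and the "estimate" is just another Gaussian standing in for the output of $\mathcal{P}$.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Isotropic Gaussian blur kernel of shape (size, size), normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def kernel_l2_loss(k_true, k_est):
    """Squared L2 distance ||k - k'||_2^2 between two kernels."""
    diff = np.asarray(k_true) - np.asarray(k_est)
    return float(np.sum(diff**2))

k = gaussian_kernel(21, 2.0)        # ground-truth kernel (assumed width)
k_prime = gaussian_kernel(21, 2.6)  # stand-in for the predictor output
loss = kernel_l2_loss(k, k_prime)   # positive, and zero only when k' == k
```

In actual training this loss would be averaged over a batch and minimized with respect to $\theta_{\mathcal{P}}$; the sketch only shows what is being measured.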

Correct the kernel

The idea is to exploit the intermediate SR results. Let $\mathcal{C}$ be the corrector function with parameters $\theta_{\mathcal{C}}$, $k'$ the current kernel estimate, and $I^{SR}$ the SR result obtained with $k'$; then
$$\theta_{\mathcal{C}} = \arg\min_{\theta_{\mathcal{C}}} \| k - (\mathcal{C}(I^{SR}; \theta_{\mathcal{C}}) + k') \|_2^2$$
To avoid over- or under-fitting, smaller correction steps are applied repeatedly, refining the kernel until it approaches the ground truth.
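To make the corrector objective concrete, the sketch below evaluates it for three hypothetical corrector outputs: no correction, half of the residual $k - k'$, and the full residual. The 3×3 kernels are made-up numbers and `correction` simply stands in for $\mathcal{C}(I^{SR}; \theta_{\mathcal{C}})$; nothing here is a trained network.

```python
import numpy as np

def corrector_loss(k_true, k_prime, correction):
    """||k - (C(I^SR) + k')||_2^2, with `correction` standing in for C(I^SR)."""
    return float(np.sum((k_true - (correction + k_prime)) ** 2))

# Made-up 3x3 kernels for illustration only.
k_true = np.array([[0.0, 0.1, 0.0],
                   [0.1, 0.6, 0.1],
                   [0.0, 0.1, 0.0]])
k_prime = np.full((3, 3), 1.0 / 9.0)   # a poor initial estimate (box kernel)
residual = k_true - k_prime            # the ideal output of the corrector

loss_no_correction = corrector_loss(k_true, k_prime, np.zeros_like(k_true))
loss_half_step = corrector_loss(k_true, k_prime, 0.5 * residual)
loss_full_step = corrector_loss(k_true, k_prime, residual)
```

The loss vanishes when the corrector outputs exactly the residual, and a half step already cuts it to a quarter, which is why several partial corrections can still converge.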


Method

Let $\mathcal{F}$ be an SR model, $\mathcal{P}$ a kernel predictor, and $\mathcal{C}$ a corrector. PCA is used to reduce the dimensionality of the kernel space. The kernel after dimension reduction is denoted by $h$, where $h = Mk$ and $M$ is the dimension reduction matrix. An initial estimate $h_0$ is given by the predictor, $h_0 = \mathcal{P}(I^{LR})$, and the first SR result is $I_0^{SR} = \mathcal{F}(I^{LR}, h_0)$. The iterative kernel correction algorithm can then be written as
$$\begin{aligned} \Delta h_i &= \mathcal{C}(I^{SR}_{i-1}, h_{i-1}) \\ h_i &= h_{i-1} + \Delta h_i \\ I^{SR}_i &= \mathcal{F}(I^{LR}, h_i) \end{aligned}$$
After $t$ iterations, $I^{SR}_t$ is the final result of IKC.
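The control flow above can be sketched in a few lines of NumPy. This is only a toy under loud assumptions: `fit_pca_matrix` builds $M$ from sampled Gaussian kernels via SVD, and the three `toy_*` callables are hypothetical stand-ins for the trained networks. In particular, the toy "SR result" is just the kernel-code mismatch itself, and the toy corrector removes half of it per step. The kernel size, code dimension, and iteration count are illustrative values.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Isotropic Gaussian blur kernel, normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def fit_pca_matrix(kernels, dim):
    """Return M of shape (dim, size*size); rows are the top principal directions."""
    X = np.stack([k.ravel() for k in kernels])
    X = X - X.mean(axis=0)                       # center the samples before SVD
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:dim]

def ikc(lr, predictor, corrector, sr_model, num_iters):
    """Iterative kernel correction loop; returns the final SR result and code."""
    h = predictor(lr)                 # h_0 = P(I^LR)
    sr = sr_model(lr, h)              # I_0^SR = F(I^LR, h_0)
    for _ in range(num_iters):
        delta_h = corrector(sr, h)    # Delta h_i = C(I^SR_{i-1}, h_{i-1})
        h = h + delta_h               # h_i = h_{i-1} + Delta h_i
        sr = sr_model(lr, h)          # I_i^SR = F(I^LR, h_i)
    return sr, h

# Toy setup (assumption): 21x21 Gaussian kernels reduced to a 10-dim code h = M k.
sigmas = np.linspace(0.5, 4.0, 30)
M = fit_pca_matrix([gaussian_kernel(21, s) for s in sigmas], dim=10)
h_true = M @ gaussian_kernel(21, 2.0).ravel()    # code of the true kernel
h_init = M @ gaussian_kernel(21, 3.0).ravel()    # deliberately wrong first guess

toy_predictor = lambda lr: h_init                # stand-in for P
toy_sr_model = lambda lr, h: h - h_true          # "artifacts" = code mismatch
toy_corrector = lambda sr, h: -0.5 * sr          # stand-in for C: partial step

lr_image = np.zeros((8, 8))                      # placeholder LR image
sr_final, h_final = ikc(lr_image, toy_predictor, toy_corrector,
                        toy_sr_model, num_iters=7)
err_before = np.linalg.norm(h_init - h_true)
err_after = np.linalg.norm(h_final - h_true)
```

Because the toy corrector halves the mismatch at every step, the code error shrinks geometrically over the iterations. A real corrector has to infer the mismatch from image artifacts, so its behaviour will not be this clean.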

Network Architecture of SR model

SRMD, an SR method for multiple blur kernels, has two problems.

  1. The kernel maps do not actually contain the information of the image.
  2. The influence of kernel information is only considered at the first layer.

SFTMD is therefore proposed, which uses spatial feature transform (SFT) layers.

SRResNet is used as the backbone (of course, it can be swapped for another network), and SFT layers apply an affine transformation to the feature maps $F$, conditioned on the kernel maps $\mathcal{H}$, through a scaling and shifting operation:
$$\mathrm{SFT}(F, \mathcal{H}) = \gamma \odot F + \beta$$
where $\gamma$ and $\beta$ are the scaling and shifting parameters and $\odot$ is the element-wise product.
The kernel maps $\mathcal{H}$ are obtained by stretching $h$: all elements of the $i$-th map are equal to the $i$-th element of $h$.
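The stretching and the SFT operation can be sketched as follows. This is a simplified illustration, not the layer from the paper: I assume $\gamma$ and $\beta$ are each produced by a single 1×1 convolution over the concatenation of $F$ and $\mathcal{H}$, with random weights in place of learned ones. All function names and sizes are hypothetical.

```python
import numpy as np

def stretch_kernel_code(h, height, width):
    """Stretch h of shape (b,) into kernel maps (b, H, W); map i is constant h[i]."""
    return np.broadcast_to(h[:, None, None], (h.shape[0], height, width)).copy()

def conv1x1(x, weight, bias):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in), bias is (C_out,)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def sft(features, kernel_maps, w_gamma, b_gamma, w_beta, b_beta):
    """SFT(F, H) = gamma * F + beta, with gamma and beta conditioned on [F, H]."""
    cond = np.concatenate([features, kernel_maps], axis=0)
    gamma = conv1x1(cond, w_gamma, b_gamma)   # per-pixel scaling
    beta = conv1x1(cond, w_beta, b_beta)      # per-pixel shifting
    return gamma * features + beta

rng = np.random.default_rng(0)
C, b, H, W = 4, 3, 5, 6                       # illustrative sizes
features = rng.standard_normal((C, H, W))
h = np.array([0.2, -1.0, 0.5])                # reduced kernel code
kernel_maps = stretch_kernel_code(h, H, W)

out = sft(features, kernel_maps,
          rng.standard_normal((C, C + b)), np.ones(C),
          rng.standard_normal((C, C + b)), np.zeros(C))

# With zero weights, gamma = 1 and beta = 0, so the layer reduces to the identity.
identity_out = sft(features, kernel_maps,
                   np.zeros((C, C + b)), np.ones(C),
                   np.zeros((C, C + b)), np.zeros(C))
```

Because every kernel map is constant over space, any spatial variation in `gamma` and `beta` in this sketch comes from the feature maps rather than from the kernel code.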

Network Architecture of $\mathcal{P}$ and $\mathcal{C}$


Experiments

IKC always comes out best in the reported experiments.


Comprehension

  1. How can I train the predictor $\mathcal{P}$?
    Although the paper says that kernel estimation is an ill-posed problem, the authors still end up with a trained model. Does it over-fit or under-fit? What does the loss curve look like, and when should I stop training?
  2. Spatially uniform
    How? The paper says that its use of SFT differs from the application in semantic super-resolution: it only employs the transformation characteristic of SFT layers.
    So I don't understand whether the segmentation information is what provides the spatial variability.
  3. Is IKC with PCA best?
    Why? The paper says that PCA provides a feature representation, and that IKC learns the relationship between the SR images and these features rather than the Gaussian kernel. But why can't IKC learn features from the kernel directly?

