Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation (CVPR 2020)

2024-07-02 07:23:26

3. Our NICE-GAN

3.1. General Formulation

No Independent Component for Encoding (NICE).

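Take the discriminator $D_y$ on domain $y$ as an example: $D_y$ consists of an encoder $E_y^D$ and a classifier $C_y$.

$D_y$ keeps learning to judge whether an image belongs to domain $y$, so the features extracted by its encoder $E_y^D$ are highly informative. The $y \rightarrow x$ generator can therefore reuse $E_y^D$ instead of training an encoder of its own.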
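The sharing can be sketched structurally as follows. This is a toy in plain Python, not the paper's network: `encoder_y`, `classifier_y`, and `decoder_y_to_x` are hypothetical stand-ins that operate on lists of floats. The only point is that the discriminator and the generator call the same encoder.

```python
# Toy stand-ins (hypothetical, not the paper's layers): each "network" is a
# plain function on a list of floats.

def encoder_y(image):
    """Shared encoder E_y^D: maps an image to a feature vector."""
    return [2.0 * p for p in image]

def classifier_y(features):
    """Classifier C_y: maps features to a single real/fake score."""
    return sum(features) / len(features)

def decoder_y_to_x(features):
    """Decoder half of the y -> x generator."""
    return [f + 1.0 for f in features]

def discriminator_y(image):
    # D_y = C_y(E_y^D(image))
    return classifier_y(encoder_y(image))

def generator_y_to_x(image):
    # The generator has no encoder of its own; it reuses E_y^D.
    return decoder_y_to_x(encoder_y(image))

image_y = [0.1, 0.2, 0.3]
score = discriminator_y(image_y)
translated = generator_y_to_x(image_y)
```

Because both paths go through `encoder_y`, any change to the encoder affects the discriminator and the generator at once, which is what motivates the decoupled training in Section 3.3.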

3.2. Architecture

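Multi-Scale Discriminators $D_x$ and $D_y$.

The first architectural improvement: the discriminators use a multi-scale structure.

Earlier work also used multi-scale discriminators. There, the image is down-sampled to a series of sizes, and each resized image is fed to its own discriminator.

The approach in this paper is more efficient. The structure of the multi-scale discriminators is shown in Figure 2, with three levels of classifiers in total: $\left\{ C_x^0, C_x^1, C_x^2 \right\}$.

In short, the feature map produced by the encoder is fed into $C_x^0$; a further convolution yields a feature map that is fed into $C_x^1$; and one more convolution yields the feature map that is fed into $C_x^2$.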
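The data flow through the three levels can be sketched with NumPy. This is a sketch under stated assumptions, not the architecture from Figure 2: `downsample` uses 2x2 average pooling as a stand-in for the convolution between levels, `classify` is a placeholder for a classifier head, and the tensor sizes are made up.

```python
import numpy as np

def downsample(feat):
    """Stand-in for the convolution between levels: 2x2 average pooling."""
    c, h, w = feat.shape
    return feat.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def classify(feat):
    """Stand-in for a classifier head: one logit per spatial location."""
    return feat.mean(axis=0)

def multi_scale_logits(encoded):
    """Feed the encoder output to C^0, then smaller feature maps to C^1 and C^2."""
    feat0 = encoded              # input to C^0
    feat1 = downsample(feat0)    # input to C^1
    feat2 = downsample(feat1)    # input to C^2
    return [classify(f) for f in (feat0, feat1, feat2)]

rng = np.random.default_rng(0)
encoded = rng.standard_normal((8, 16, 16))  # (channels, height, width), assumed sizes
logits = multi_scale_logits(encoded)
shapes = [l.shape for l in logits]
```

The image passes through the encoder only once, and the coarser scales are derived from that single feature map rather than from separately resized copies of the image.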

The second architectural improvement: the CAM attention from U-GAT-IT is upgraded to a residual version.

$E_x(x)$ is the feature map produced by the encoder, and CAM learns a weight $w$. U-GAT-IT uses $w$ to reweight $E_x(x)$, which gives a reweighted feature map (also called the attention map).

This paper instead introduces a trainable parameter $\gamma$ that linearly combines the original $E_x(x)$ with the weighted $E_x(x)$, i.e. $\gamma \times w \times E_x(x) + E_x(x)$.
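The residual combination is a one-line computation. The sketch below assumes that $w$ holds one weight per channel and that the feature map has shape (channels, height, width); the numeric values are made up for illustration.

```python
import numpy as np

def residual_cam(features, w, gamma):
    """Return gamma * w * E_x(x) + E_x(x), with one weight per channel."""
    weighted = w[:, None, None] * features   # U-GAT-IT-style reweighting
    return gamma * weighted + features       # residual combination

features = np.ones((3, 2, 2))        # E_x(x): (channels, height, width)
w = np.array([0.5, 1.0, 2.0])        # per-channel CAM weights (made-up values)

out_zero = residual_cam(features, w, gamma=0.0)
out_one = residual_cam(features, w, gamma=1.0)
```

With $\gamma = 0$ the expression reduces to the original feature map, so $\gamma$ controls how much of the attention-weighted signal is mixed in.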

The third architectural improvement: spectral normalization is applied to the discriminators.
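For reference, spectral normalization divides a weight matrix by its largest singular value, which is usually estimated by power iteration. The NumPy sketch below illustrates the general technique on a small 2-D matrix; it is not taken from the NICE-GAN code, and the function name and iteration count are assumptions.

```python
import numpy as np

def spectral_normalize(weight, n_iters=50, eps=1e-12):
    """Divide a 2-D weight matrix by a power-iteration estimate of its
    largest singular value."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(weight.shape[0])
    for _ in range(n_iters):
        v = weight.T @ u
        v /= np.linalg.norm(v) + eps
        u = weight @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ weight @ v          # estimated largest singular value
    return weight / sigma, sigma

weight = np.array([[3.0, 0.0], [0.0, 1.0]])
normalized, sigma = spectral_normalize(weight)
top_singular_value = np.linalg.svd(normalized, compute_uv=False)[0]
```

In PyTorch the same idea is available as `torch.nn.utils.spectral_norm`, which wraps a layer and typically runs one power-iteration step per forward pass instead of iterating to convergence.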

3.3. Decoupled Training

Because the encoder is reused, "it will incur inconsistency if we apply conventional adversarial training." (The paper does not give a theoretical explanation for this.)

"To overcome this defect, we decouple the training of $E_x$ from that of the generator $G_{x\rightarrow y}$."
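The paper's abstract describes the strategy as training the encoder only when maximizing the adversarial loss and keeping it frozen otherwise. The toy below shows just that update pattern in plain Python. The scalar "parameters", the gradients, and the `sgd_step` helper are all hypothetical and stand in for real networks and losses.

```python
# Hypothetical scalar "parameters" and made-up gradients; only the update
# pattern matters here.
params = {"encoder": 1.0, "classifier": 1.0, "decoder": 1.0}
grads = {"encoder": 1.0, "classifier": 1.0, "decoder": 1.0}

def sgd_step(params, grads, trainable, lr=0.1):
    """Update only the parameters named in `trainable`; leave the rest frozen."""
    return {
        name: value - lr * grads[name] if name in trainable else value
        for name, value in params.items()
    }

# Discriminator step: encoder and classifier are updated together.
after_d = sgd_step(params, grads, trainable={"encoder", "classifier"})

# Generator step: the encoder is frozen, so only the decoder moves.
after_g = sgd_step(after_d, grads, trainable={"decoder"})
```

In a real framework the frozen encoder would usually be handled by excluding its parameters from the generator's optimizer or by detaching its output during the generator step.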
