Contents

  • Assignment Overview
  • 1. Network Visualization (PyTorch)
    • 1.1 Helper Functions
  • 2. Pretrained Model
    • 2.1 Load some ImageNet images
  • 3. Saliency Maps
    • 3.1 Hint: PyTorch's `gather()` method
  • 4. Fooling Images
  • 5. Class visualization

Assignment Overview

  • Assignment homepage: Assignment #3
  • Assignment goal:
  • Assignment source code: NetworkVisualization-PyTorch.ipynb
  • This assignment is based on PyTorch

1. Network Visualization (PyTorch)

In this notebook we will explore the use of image gradients for generating new images.

When training a model, we define a loss function which measures our current unhappiness with the model’s performance; we then use backpropagation to compute the gradient of the loss with respect to the model parameters, and perform gradient descent on the model parameters to minimize the loss.

Here we will do something slightly different. We will start from a convolutional neural network model which has been pretrained to perform image classification on the ImageNet dataset. We will use this model to define a loss function which quantifies our current unhappiness with our image, then use backpropagation to compute the gradient of this loss with respect to the pixels of the image. We will then keep the model fixed, and perform gradient descent on the image to synthesize a new image which minimizes the loss.

In this notebook we will explore three techniques for image generation:

  • Saliency Maps: Saliency maps are a quick way to tell which part of the image influenced the classification decision made by the network.
  • Fooling Images: We can perturb an input image so that it appears the same to humans, but will be misclassified by the pretrained network.
  • Class Visualization: We can synthesize an image to maximize the classification score of a particular class; this can give us some sense of what the network is looking for when it classifies images of that class.

This notebook uses PyTorch; we have provided another notebook which explores the same concepts in TensorFlow. You only need to complete one of these two notebooks.

1.1 Helper Functions

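Our pretrained model was trained on images that had been preprocessed by subtracting the per-channel mean and dividing by the per-channel standard deviation. We define a few helper functions for performing and undoing this preprocessing.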

def preprocess(img, size=224):
    transform = T.Compose([
        T.Resize(size),
        T.ToTensor(),
        T.Normalize(mean=SQUEEZENET_MEAN.tolist(),
                    std=SQUEEZENET_STD.tolist()),
        T.Lambda(lambda x: x[None]),
    ])
    return transform(img)

def deprocess(img, should_rescale=True):
    transform = T.Compose([
        T.Lambda(lambda x: x[0]),
        T.Normalize(mean=[0, 0, 0], std=(1.0 / SQUEEZENET_STD).tolist()),
        T.Normalize(mean=(-SQUEEZENET_MEAN).tolist(), std=[1, 1, 1]),
        T.Lambda(rescale) if should_rescale else T.Lambda(lambda x: x),
        T.ToPILImage(),
    ])
    return transform(img)

def rescale(x):
    low, high = x.min(), x.max()
    x_rescaled = (x - low) / (high - low)
    return x_rescaled

def blur_image(X, sigma=1):
    X_np = X.cpu().clone().numpy()
    X_np = gaussian_filter1d(X_np, sigma, axis=2)
    X_np = gaussian_filter1d(X_np, sigma, axis=3)
    X.copy_(torch.Tensor(X_np).type_as(X))
    return X
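To make the normalize/denormalize round trip concrete, here is a minimal dependency-free sketch on a single RGB pixel. The names MEAN, STD, normalize and denormalize are hypothetical stand-ins, and the statistics are assumed to be the usual ImageNet per-channel values; they are not taken from the helper code itself.

```python
# Assumed per-channel statistics (the usual ImageNet values); stand-ins for
# SQUEEZENET_MEAN and SQUEEZENET_STD, which are defined elsewhere in the notebook.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize(pixel):
    """Subtract the channel mean, then divide by the channel std."""
    return [(v - m) / s for v, m, s in zip(pixel, MEAN, STD)]

def denormalize(pixel):
    """Undo normalize: multiply by std first, then add the mean back.

    This mirrors the two chained Normalize steps in deprocess:
    Normalize(mean=0, std=1/STD) multiplies by STD, and
    Normalize(mean=-MEAN, std=1) adds MEAN.
    """
    return [v * s + m for v, m, s in zip(pixel, MEAN, STD)]

pixel = [0.2, 0.5, 0.9]            # one RGB pixel with values in [0, 1]
restored = denormalize(normalize(pixel))
print(restored)
```

The order matters when undoing the transform: scaling by the standard deviation has to happen before the mean is added back.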

2. Pretrained Model

For all of our image generation experiments, we will start with a convolutional neural network which was pretrained to perform image classification on ImageNet. We can use any model here, but for the purposes of this assignment we will use SqueezeNet [1], which achieves accuracies comparable to AlexNet but with a significantly reduced parameter count and computational complexity.

Using SqueezeNet rather than AlexNet or VGG or ResNet means that we can easily perform all image generation experiments on CPU.

[1] Iandola et al, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5MB model size”, arXiv 2016

Download the pretrained model

# Download and load the pretrained SqueezeNet model.
model = torchvision.models.squeezenet1_1(pretrained=True)

# We don't want to train the model, so tell PyTorch not to compute gradients
# with respect to model parameters.
for param in model.parameters():
    param.requires_grad = False

# You may see a warning about deprecated initialization; that's fine, please continue to the next steps.

2.1 Load some ImageNet images

We have provided a few example images from the validation set of the ImageNet ILSVRC 2012 Classification dataset. To download these images, descend into cs231n/datasets/ and run get_imagenet_val.sh.

Since they come from the validation set, our pretrained model did not see these images during training.

Run the following cell to visualize some of these images, along with their ground-truth labels.

from cs231n.data_utils import load_imagenet_val
X, y, class_names = load_imagenet_val(num=5)

plt.figure(figsize=(12, 6))
for i in range(5):
    plt.subplot(1, 5, i + 1)
    plt.imshow(X[i])
    plt.title(class_names[y[i]])
    plt.axis('off')
plt.gcf().tight_layout()

3. Saliency Maps

Using this pretrained model, we will compute class saliency maps as described in Section 3.1 of [2].

A saliency map tells us the degree to which each pixel in the image affects the classification score for that image. To compute it, we compute the gradient of the unnormalized score corresponding to the correct class (which is a scalar) with respect to the pixels of the image. If the image has shape (3, H, W) then this gradient will also have shape (3, H, W); for each pixel in the image, this gradient tells us the amount by which the classification score will change if the pixel changes by a small amount. To compute the saliency map, we take the absolute value of this gradient, then take the maximum value over the 3 input channels; the final saliency map thus has shape (H, W) and all entries are nonnegative.

[2] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR Workshop 2014.
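The reduction from a gradient of shape (3, H, W) to a saliency map of shape (H, W) can be sketched without any framework. The function name saliency_from_gradient and the toy gradient values below are made up for illustration; in the assignment the gradient comes from a backward pass through the network.

```python
def saliency_from_gradient(grad):
    """Reduce a (C, H, W) gradient, given as nested lists, to an (H, W) map.

    For every pixel, take the absolute value of the gradient in each channel
    and keep the largest one.
    """
    channels = len(grad)
    height, width = len(grad[0]), len(grad[0][0])
    return [
        [max(abs(grad[c][i][j]) for c in range(channels)) for j in range(width)]
        for i in range(height)
    ]

# Toy gradient with C=3 channels and a 2x2 image (illustrative values only).
grad = [
    [[0.1, -0.5], [0.0, 0.2]],    # channel 0
    [[-0.3, 0.4], [0.0, -0.9]],   # channel 1
    [[0.2, 0.1], [0.0, 0.3]],     # channel 2
]
saliency = saliency_from_gradient(grad)
print(saliency)
```

Every entry of the result is nonnegative, and the sign of the original gradient can no longer be recovered from it.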

3.1 Hint: PyTorch's gather() method

Recall in Assignment 1 you needed to select one element from each row of a matrix;
if s is a numpy array of shape (N, C) and y is a numpy array of shape (N,) containing integers 0 <= y[i] < C, then s[np.arange(N), y] is a numpy array of shape (N,) which selects one element from each row of s using the indices in y.

In PyTorch you can perform the same operation using the gather() method. If s is a PyTorch Tensor of shape (N, C) and y is a PyTorch Tensor of shape (N,) containing longs in the range 0 <= y[i] < C, then

s.gather(1, y.view(-1, 1)).squeeze()

will be a PyTorch Tensor of shape (N,) containing one entry from each row of s, selected according to the indices in y.

Run the following cell to see an example.
You can also read the documentation for the gather method and the squeeze method.

# Example of using gather to select one entry from each row in PyTorch
def gather_example():
    N, C = 4, 5
    s = torch.randn(N, C)
    y = torch.LongTensor([1, 2, 1, 3])
    print(s)
    print(y)
    print(s.gather(1, y.view(-1, 1)).squeeze())

gather_example()

Output:

tensor([[ 1.3367, -0.1561,  1.0817,  0.4078,  1.7038],
        [-0.8004, -1.3615, -2.1395,  0.1292,  1.5055],
        [-0.6311,  0.3392, -0.4773,  2.1220, -0.5365],
        [ 0.6265,  2.3000,  1.2274,  0.5011, -0.6861]])
tensor([1, 2, 1, 3])
tensor([-0.1561, -2.1395,  0.3392,  0.5011])
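For readers who want to see the indexing rule spelled out, the following plain-Python sketch reproduces what gather does along dimension 1, using the numbers printed above. The helper gather_dim1 is a hypothetical stand-in written for this illustration, not part of PyTorch.

```python
def gather_dim1(s, index):
    """Plain-Python version of the dim=1 gather rule: out[i][j] = s[i][index[i][j]]."""
    return [[s[i][j] for j in idx_row] for i, idx_row in enumerate(index)]

# The scores printed by gather_example() above.
s = [
    [1.3367, -0.1561, 1.0817, 0.4078, 1.7038],
    [-0.8004, -1.3615, -2.1395, 0.1292, 1.5055],
    [-0.6311, 0.3392, -0.4773, 2.1220, -0.5365],
    [0.6265, 2.3000, 1.2274, 0.5011, -0.6861],
]
y = [1, 2, 1, 3]

index = [[label] for label in y]                     # plays the role of y.view(-1, 1)
picked = [row[0] for row in gather_dim1(s, index)]   # plays the role of .squeeze()
print(picked)
```

The index has to be reshaped to (N, 1) because gather expects an index with the same number of dimensions as the source; the trailing squeeze then drops that extra dimension.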

Given the inputs and labels, compute the gradient of the correct-class score with respect to the input

def compute_saliency_maps(X, y, model):
    """
    Compute a class saliency map using the model for images X and labels y.

    Input:
    - X: Input images; Tensor of shape (N, 3, H, W)
    - y: Labels for X; LongTensor of shape (N,)
    - model: A pretrained CNN that will be used to compute the saliency map.

    Returns:
    - saliency: A Tensor of shape (N, H, W) giving the saliency maps for the input
      images.
    """
    # Make sure the model is in "test" mode
    model.eval()

    # Make input tensor require gradient
    X.requires_grad_()

    saliency = None
    ##############################################################################
    # TODO: Implement this function. Perform a forward and backward pass through #
    # the model to compute the gradient of the correct class score with respect  #
    # to each input image. You first want to compute the loss over the correct   #
    # scores (we'll combine losses across a batch by summing), and then compute  #
    # the gradients with a backward pass.                                        #
    ##############################################################################
    # (1) Forward pass: compute the scores for every class
    scores = model(X)  # shape (N, C)
    # (2) Select the correct-class scores to backpropagate from
    correct_class_scores = scores.gather(1, y.view(-1, 1)).squeeze()  # shape (N,)
    # (3) Backpropagate from the correct-class scores to get the gradient of
    #     each score with respect to every pixel of its image.
    #     Tensor.backward(gradient=None, retain_graph=None, create_graph=False)
    #     gradient: has the same shape as the tensor and acts as the upstream
    #     gradient in the chain rule; it can be omitted only for a scalar tensor.
    #     Our output is a vector, so we pass an upstream gradient of ones with
    #     the same shape, which is equivalent to summing the scores.
    correct_class_scores.backward(torch.ones_like(correct_class_scores))
    # (4) Take the absolute value, then the maximum over the channels
    saliency = X.grad.data  # (N, 3, H, W)
    saliency = saliency.abs()
    saliency, index = torch.max(saliency, dim=1)  # (N, H, W)
    return saliency

Visualize the class saliency maps we computed

def show_saliency_maps(X, y):
    # Convert X and y from numpy arrays to Torch Tensors
    X_tensor = torch.cat([preprocess(Image.fromarray(x)) for x in X], dim=0)
    y_tensor = torch.LongTensor(y)

    # Compute saliency maps for images in X
    saliency = compute_saliency_maps(X_tensor, y_tensor, model)

    # Convert the saliency map from Torch Tensor to numpy array and show images
    # and saliency maps together.
    saliency = saliency.numpy()
    N = X.shape[0]
    for i in range(N):
        plt.subplot(2, N, i + 1)
        plt.imshow(X[i])
        plt.axis('off')
        plt.title(class_names[y[i]])
        plt.subplot(2, N, N + i + 1)
        plt.imshow(saliency[i], cmap=plt.cm.hot)
        plt.axis('off')
        plt.gcf().set_size_inches(12, 5)
    plt.show()

show_saliency_maps(X, y)

In the code above, plt.imshow(img, cmap=plt.cm.hot) uses the "hot" colormap, so the saliency values are displayed as a heat map.


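Question
**Q:** A friend of yours suggests that in order to find an image that maximizes the correct score, we can perform gradient ascent on the input image, but instead of the gradient we can actually use the saliency map in each step to update the image. Is this assertion true? Why or why not?

My answer:
No. The saliency map here is the absolute value of each pixel's gradient (maximized over the three channels), so it only reflects the relative magnitude of each pixel's influence on the score and is not the same as the gradient itself. With the sign and the per-channel information discarded, a step along the saliency map is not guaranteed to increase the score.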
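Because the saliency map keeps only gradient magnitudes, the effect of dropping the sign can be seen on a one-dimensional toy problem. The score function, starting point and step size below are made up for this illustration and have nothing to do with the network.

```python
def score(x):
    """Toy score with a single maximum at x = 1."""
    return -(x - 1.0) ** 2

def grad(x):
    """Derivative of score with respect to x."""
    return -2.0 * (x - 1.0)

def ascend(x, use_abs, steps=20, lr=0.1):
    """Run gradient ascent, optionally replacing the gradient by its absolute value."""
    for _ in range(steps):
        g = grad(x)
        x += lr * (abs(g) if use_abs else g)
    return x

x_start = 3.0
x_true = ascend(x_start, use_abs=False)   # follows the signed gradient
x_abs = ascend(x_start, use_abs=True)     # follows a saliency-style magnitude

print(score(x_start), score(x_true), score(x_abs))
```

Starting to the right of the maximum, the signed gradient moves x back toward 1, while the magnitude-only update keeps pushing x further right and the score gets worse.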

4. Fooling Images

We can also use image gradients to generate “fooling images” as discussed in [3].
Given an image and a target class, we can perform gradient ascent over the image to maximize the target class, stopping when the network classifies the image as the target class. Implement the following function to generate fooling images.

[3] Szegedy et al, “Intriguing properties of neural networks”, ICLR 2014

def make_fooling_image(X, target_y, model):
    """
    Generate a fooling image that is close to X, but that the model classifies
    as target_y.

    Inputs:
    - X: Input image; Tensor of shape (1, 3, 224, 224)
    - target_y: An integer in the range [0, 1000)
    - model: A pretrained CNN

    Returns:
    - X_fooling: An image that is close to X, but that is classified as target_y
      by the model.
    """
    # Initialize our fooling image to the input image, and make it require gradient
    X_fooling = X.clone()
    X_fooling = X_fooling.requires_grad_()

    learning_rate = 1
    max_iter = 100
    ##############################################################################
    # TODO: Generate a fooling image X_fooling that the model will classify as   #
    # the class target_y. You should perform gradient ascent on the score of the #
    # target class, stopping when the model is fooled.                           #
    # When computing an update step, first normalize the gradient:               #
    #   dX = learning_rate * g / ||g||_2                                         #
    #                                                                            #
    # You should write a training loop.                                          #
    #                                                                            #
    # HINT: For most examples, you should be able to generate a fooling image    #
    # in fewer than 100 iterations of gradient ascent.                           #
    # You can print your progress over iterations to check your algorithm.       #
    ##############################################################################
    for i in range(max_iter):
        scores = model(X_fooling)
        _, predictions = scores.max(1)
        # Stop as soon as the model predicts the target class
        if predictions.item() == target_y:
            break
        # The target-class score is a scalar, so backward() needs no argument
        target_score = scores[0, target_y]
        target_score.backward()
        image_grad = X_fooling.grad
        with torch.no_grad():
            # Normalized gradient ascent step on the image
            X_fooling += learning_rate * (image_grad / image_grad.norm())
            X_fooling.grad.zero_()
    ##############################################################################
    #                             END OF YOUR CODE                               #
    ##############################################################################
    return X_fooling
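The normalized update dX = learning_rate * g / ||g||_2 and the stopping rule can be tried out on a tiny linear classifier, with no framework involved. Everything in this sketch (the weight matrix W, the helpers class_scores, predict and fool, and the starting point) is hypothetical; for a linear model the gradient of a class score with respect to the input is simply that class's weight row.

```python
import math

# Toy linear classifier with 3 classes and 2 input features (made-up weights).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

def class_scores(x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def predict(x):
    s = class_scores(x)
    return max(range(len(s)), key=s.__getitem__)

def fool(x, target, lr=0.5, max_iter=100):
    """Normalized gradient ascent on the target score until the prediction flips."""
    x = list(x)
    for step in range(max_iter):
        if predict(x) == target:
            return x, step
        g = W[target]                              # gradient of the target score w.r.t. x
        norm = math.sqrt(sum(v * v for v in g))
        x = [xi + lr * gi / norm for xi, gi in zip(x, g)]   # step of length lr
    return x, max_iter

x0 = [2.0, 0.5]                                    # initially classified as class 0
x_fooled, n_steps = fool(x0, target=1)
print(predict(x0), predict(x_fooled), n_steps)
```

Dividing by the norm makes every step the same length regardless of how large the raw gradient is, so the learning rate directly controls how far the input moves per iteration.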

Run the following cell to generate a fooling image.

idx = 0
target_y = 6

X_tensor = torch.cat([preprocess(Image.fromarray(x)) for x in X], dim=0)
X_fooling = make_fooling_image(X_tensor[idx:idx+1], target_y, model)

scores = model(X_fooling)
# torch.max returns two values: (max values, indices of the max values)
assert target_y == scores.data.max(1)[1][0].item(), 'The model is not fooled!'

After generating a fooling image, run the following cell to visualize the original image, the fooling image, as well as the difference between them.

X_fooling_np = deprocess(X_fooling.clone())
X_fooling_np = np.asarray(X_fooling_np).astype(np.uint8)

plt.subplot(1, 4, 1)
plt.imshow(X[idx])
plt.title(class_names[y[idx]])
plt.axis('off')

plt.subplot(1, 4, 2)
plt.imshow(X_fooling_np)
plt.title(class_names[target_y])
plt.axis('off')

plt.subplot(1, 4, 3)
X_pre = preprocess(Image.fromarray(X[idx]))
diff = np.asarray(deprocess(X_fooling - X_pre, should_rescale=False))
plt.imshow(diff)
plt.title('Difference')
plt.axis('off')

plt.subplot(1, 4, 4)
diff = np.asarray(deprocess(10 * (X_fooling - X_pre), should_rescale=False))
plt.imshow(diff)
plt.title('Magnified difference (10x)')
plt.axis('off')

plt.gcf().set_size_inches(12, 5)
plt.show()


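As we can see, modifying the input image by gradient ascent fools the classifier fairly easily. To some extent this is because the classifier is data-driven: its decision boundaries depend on the training data.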

5. Class visualization


Random horizontal and vertical jitter

def jitter(X, ox, oy):
    """
    Helper function to randomly jitter an image.

    Inputs
    - X: PyTorch Tensor of shape (N, C, H, W)
    - ox, oy: Integers giving number of pixels to jitter along W and H axes

    Returns: A new PyTorch Tensor of shape (N, C, H, W)
    """
    if ox != 0:
        left = X[:, :, :, :-ox]
        right = X[:, :, :, -ox:]
        X = torch.cat([right, left], dim=3)
    if oy != 0:
        top = X[:, :, :-oy]
        bottom = X[:, :, -oy:]
        X = torch.cat([bottom, top], dim=2)
    return X
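The slicing in jitter is a circular shift: the last ox columns wrap around to the front, and calling it again with -ox moves them back. A one-dimensional plain-Python sketch of the same slicing pattern (the helper roll is a hypothetical name used only here):

```python
def roll(row, shift):
    """Circularly shift a list with the same slicing pattern that jitter uses."""
    if shift == 0:
        return list(row)
    left, right = row[:-shift], row[-shift:]
    return right + left

row = [0, 1, 2, 3, 4]
shifted = roll(row, 2)         # last two entries wrap around to the front
restored = roll(shifted, -2)   # a negative shift undoes it

print(shifted, restored)
```

This is why the jitter can be undone exactly after each update step by calling it with the negated offsets: no pixels are lost at the borders, they are only moved.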

Generate the class image with gradient ascent

def create_class_visualization(target_y, model, dtype, **kwargs):
    """
    Generate an image to maximize the score of target_y under a pretrained model.

    Inputs:
    - target_y: Integer in the range [0, 1000) giving the index of the class
    - model: A pretrained CNN that will be used to generate the image
    - dtype: Torch datatype to use for computations

    Keyword arguments:
    - l2_reg: Strength of L2 regularization on the image
    - learning_rate: How big of a step to take
    - num_iterations: How many iterations to use
    - blur_every: How often to blur the image as an implicit regularizer
    - max_jitter: How much to jitter the image as an implicit regularizer
    - show_every: How often to show the intermediate result
    """
    model.type(dtype)
    l2_reg = kwargs.pop('l2_reg', 1e-3)
    learning_rate = kwargs.pop('learning_rate', 25)
    num_iterations = kwargs.pop('num_iterations', 100)
    blur_every = kwargs.pop('blur_every', 10)
    max_jitter = kwargs.pop('max_jitter', 16)
    show_every = kwargs.pop('show_every', 25)

    # Randomly initialize the image as a PyTorch Tensor, and make it require gradient.
    img = torch.randn(1, 3, 224, 224).mul_(1.0).type(dtype).requires_grad_()
    loss = 0

    for t in range(num_iterations):
        # Randomly jitter the image a bit; this gives slightly nicer results
        ox, oy = random.randint(0, max_jitter), random.randint(0, max_jitter)
        # Random vertical and horizontal shift
        img.data.copy_(jitter(img.data, ox, oy))

        ########################################################################
        # TODO: Use the model to compute the gradient of the score for the     #
        # class target_y with respect to the pixels of the image, and make a   #
        # gradient step on the image using the learning rate. Don't forget the #
        # L2 regularization term!                                              #
        # Be very careful about the signs of elements in your code.            #
        ########################################################################
        scores = model(img)
        # Score of the target class (a scalar, since the batch size is 1)
        predictions = scores[0, target_y]
        loss = predictions - l2_reg * torch.sum(img * img)
        loss.backward()
        with torch.no_grad():
            # Dividing by the gradient norm fixes the step length; this is only
            # loosely similar to the gradient rescaling in Adam, which works
            # per element rather than on the whole gradient.
            img += learning_rate * img.grad / img.grad.norm()
            img.grad.zero_()
        ########################################################################
        #                             END OF YOUR CODE                         #
        ########################################################################

        # Undo the random jitter
        img.data.copy_(jitter(img.data, -ox, -oy))

        # As regularizer, clamp and periodically blur the image
        # Clamp each channel to the range of valid pixel values
        for c in range(3):
            lo = float(0.0 - SQUEEZENET_MEAN[c] / SQUEEZENET_STD[c])
            hi = float((1.0 - SQUEEZENET_MEAN[c]) / SQUEEZENET_STD[c])
            img.data[:, c].clamp_(min=lo, max=hi)
        # Gaussian blur
        if t % blur_every == 0:
            blur_image(img.data, sigma=0.5)

        # Periodically show the image
        if t == 0 or (t + 1) % show_every == 0 or t == num_iterations - 1:
            plt.imshow(deprocess(img.data.clone().cpu()))
            class_name = class_names[target_y]
            plt.title('%s\nIteration %d / %d' % (class_name, t + 1, num_iterations))
            plt.gcf().set_size_inches(4, 4)
            plt.axis('off')
            plt.show()

    return deprocess(img.data.cpu())
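Written out as equations, the update performed inside the loop above looks as follows. The symbols are notation introduced here for readability: $s_y(I)$ is the score of class target_y for image $I$, $\lambda$ corresponds to l2_reg, and $\eta$ to learning_rate.

```latex
\begin{aligned}
I^{*} &= \arg\max_{I}\; \bigl( s_y(I) - \lambda \lVert I \rVert_2^2 \bigr), \\
g &= \nabla_I s_y(I) - 2\lambda I, \\
I &\leftarrow I + \eta \, \frac{g}{\lVert g \rVert_2}.
\end{aligned}
```

The regularizer enters the objective with a minus sign because the objective is being maximized; jitter, clamping and blurring act as additional implicit regularizers outside of this formula.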

Generate the class image for a chosen class

dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to use GPU
model.type(dtype)

target_y = 76 # Tarantula
# target_y = 78 # Tick
# target_y = 187 # Yorkshire Terrier
# target_y = 683 # Oboe
# target_y = 366 # Gorilla
# target_y = 604 # Hourglass
out = create_class_visualization(target_y, model, dtype, num_iterations=100)


The image above is the class visualization for the tarantula class; it does bear some resemblance to one.

Trying other hyperparameters or classes
Here I tried the Adam optimizer:

# Use an optimizer that converges a bit faster
def create_class_visualization_adam(target_y, model, dtype, **kwargs):
    model.type(dtype)
    l2_reg = kwargs.pop('l2_reg', 1e-3)
    learning_rate = kwargs.pop('learning_rate', 25)
    num_iterations = kwargs.pop('num_iterations', 100)
    blur_every = kwargs.pop('blur_every', 10)
    max_jitter = kwargs.pop('max_jitter', 16)
    show_every = kwargs.pop('show_every', 25)

    # Randomly initialize the image as a PyTorch Tensor, and make it require gradient.
    img = torch.randn(1, 3, 224, 224).mul_(1.0).type(dtype).requires_grad_()
    optimizer = torch.optim.Adam([img], lr=learning_rate)

    for t in range(num_iterations):
        # Randomly jitter the image a bit; this gives slightly nicer results
        ox, oy = random.randint(0, max_jitter), random.randint(0, max_jitter)
        # Random vertical and horizontal shift
        img.data.copy_(jitter(img.data, ox, oy))

        scores = model(img)
        # Score of the target class (a scalar, since the batch size is 1)
        predictions = scores[0, target_y]
        loss = predictions - l2_reg * torch.sum(img * img)
        # Optimizers minimize by default, so negate the objective
        loss = -1 * loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Undo the random jitter
        img.data.copy_(jitter(img.data, -ox, -oy))

        # As regularizer, clamp and periodically blur the image
        # Clamp each channel to the range of valid pixel values
        for c in range(3):
            lo = float(0.0 - SQUEEZENET_MEAN[c] / SQUEEZENET_STD[c])
            hi = float((1.0 - SQUEEZENET_MEAN[c]) / SQUEEZENET_STD[c])
            img.data[:, c].clamp_(min=lo, max=hi)
        # Gaussian blur
        if t % blur_every == 0:
            blur_image(img.data, sigma=0.5)

        # Periodically show the image
        if t == 0 or (t + 1) % show_every == 0 or t == num_iterations - 1:
            plt.imshow(deprocess(img.data.clone().cpu()))
            class_name = class_names[target_y]
            plt.title('%s\nIteration %d / %d' % (class_name, t + 1, num_iterations))
            plt.gcf().set_size_inches(4, 4)
            plt.axis('off')
            plt.show()

    return deprocess(img.data.cpu())
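The sign flip is the key detail when handing a maximization problem to an optimizer that minimizes. The sketch below shows it with a hand-written, simplified scalar version of the Adam update rule (first and second moment estimates with bias correction); it is an illustration under made-up settings, not torch.optim.Adam itself, and the toy objective and hyperparameters are assumptions.

```python
import math

def objective(x):
    """Toy quantity we want to MAXIMIZE; its maximum is at x = 3."""
    return -(x - 3.0) ** 2

def loss_grad(x):
    """Gradient of loss(x) = -objective(x) = (x - 3)^2, which the optimizer minimizes."""
    return 2.0 * (x - 3.0)

def adam_minimize(x, grad_fn, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Simplified scalar Adam: moving averages of g and g^2 with bias correction."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)   # descent step on the loss
    return x

x_best = adam_minimize(0.0, loss_grad)
print(x_best, objective(x_best))
```

Minimizing the negated objective moves x toward the maximizer of the original objective, which is the same reason the visualization code multiplies its loss by -1 before calling backward().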

Results
Adam optimization:

SGD optimization:
