Implemented this assignment based on my own understanding and the reference materials listed at the end.
FullyConnectedNets.ipynb

Fully-Connected Neural Nets

In the previous homework you implemented a fully-connected two-layer neural network on CIFAR-10. The implementation was simple but not very modular since the loss and gradient were computed in a single monolithic function. This is manageable for a simple two-layer network, but would become impractical as we move to bigger models. Ideally we want to build networks using a more modular design so that we can implement different layer types in isolation and then snap them together into models with different architectures.

In this exercise we will implement fully-connected networks using a more modular approach. For each layer we will implement a forward and a backward function. The forward function will receive inputs, weights, and other parameters and will return both an output and a cache object storing data needed for the backward pass, like this:

def layer_forward(x, w):
    """ Receive inputs x and weights w """
    # Do some computations ...
    z = # ... some intermediate value
    # Do some more computations ...
    out = # the output

    cache = (x, w, z, out) # Values we need to compute gradients

    return out, cache

The backward pass will receive upstream derivatives and the cache object, and will return gradients with respect to the inputs and weights, like this:

def layer_backward(dout, cache):
    """
    Receive derivative of loss with respect to outputs and cache,
    and compute derivative with respect to inputs.
    """
    # Unpack cache values
    x, w, z, out = cache

    # Use values in cache to compute derivatives
    dx = # Derivative of loss with respect to x
    dw = # Derivative of loss with respect to w

    return dx, dw

After implementing a bunch of layers this way, we will be able to easily combine them to build classifiers with different architectures.

In addition to implementing fully-connected networks of arbitrary depth, we will also explore different update rules for optimization, and introduce Dropout as a regularizer and Batch Normalization as a tool to more efficiently optimize deep networks.

# As usual, a bit of setup
import time
import numpy as np
import matplotlib.pyplot as plt
from cs231n.classifiers.fc_net import *
from cs231n.data_utils import get_CIFAR10_data
from cs231n.gradient_check import eval_numerical_gradient, eval_numerical_gradient_array
from cs231n.solver import Solver

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

def rel_error(x, y):
    """ returns relative error """
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

# Load the (preprocessed) CIFAR10 data.
data = get_CIFAR10_data()
for k, v in data.iteritems():
    print '%s: ' % k, v.shape
X_val:  (1000, 3, 32, 32)
X_train:  (49000, 3, 32, 32)
X_test:  (1000, 3, 32, 32)
y_val:  (1000,)
y_train:  (49000,)
y_test:  (1000,)

Affine layer: forward

Open the file cs231n/layers.py and implement the affine_forward function.
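The affine layer flattens each example into a row vector and applies a linear transform, out = x.reshape(N, -1).dot(w) + b. As a hedged reference, one possible implementation looks roughly like this (the graded version belongs in cs231n/layers.py):

def affine_forward(x, w, b):
    """
    x: input of shape (N, d_1, ..., d_k), flattened to (N, D) with D = prod(d_i)
    w: weights of shape (D, M)
    b: biases of shape (M,)
    Returns out of shape (N, M) and a cache for the backward pass.
    """
    N = x.shape[0]
    out = x.reshape(N, -1).dot(w) + b  # flatten each example, then matrix-multiply
    cache = (x, w, b)
    return out, cache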

Once you are done you can test your implementation by running the following:

# Test the affine_forward function
num_inputs = 2
input_shape = (4, 5, 6)
output_dim = 3

input_size = num_inputs * np.prod(input_shape)
weight_size = output_dim * np.prod(input_shape)

x = np.linspace(-0.1, 0.5, num=input_size).reshape(num_inputs, *input_shape)
w = np.linspace(-0.2, 0.3, num=weight_size).reshape(np.prod(input_shape), output_dim)
b = np.linspace(-0.3, 0.1, num=output_dim)

out, _ = affine_forward(x, w, b)
correct_out = np.array([[ 1.49834967,  1.70660132,  1.91485297],
                        [ 3.25553199,  3.5141327,   3.77273342]])

# Compare your output with ours. The error should be around 1e-9.
print 'Testing affine_forward function:'
print 'difference: ', rel_error(out, correct_out)
Testing affine_forward function:
difference:  9.76984946819e-10

Affine layer: backward

Now implement the affine_backward function and test your implementation using numeric gradient checking.
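Since out = x_flat.dot(w) + b, the chain rule gives dx = dout.dot(w.T) (reshaped back to x's shape), dw = x_flat.T.dot(dout), and db = dout.sum(axis=0). A minimal sketch along those lines:

def affine_backward(dout, cache):
    """
    dout: upstream derivative of shape (N, M)
    cache: the (x, w, b) tuple saved by affine_forward
    Returns dx, dw, db with the same shapes as x, w, b.
    """
    x, w, b = cache
    N = x.shape[0]
    x_flat = x.reshape(N, -1)            # (N, D)
    dx = dout.dot(w.T).reshape(x.shape)  # back through the matrix product
    dw = x_flat.T.dot(dout)              # (D, M)
    db = dout.sum(axis=0)                # bias gradient sums over the batch
    return dx, dw, db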

# Test the affine_backward function
x = np.random.randn(10, 2, 3)
w = np.random.randn(6, 5)
b = np.random.randn(5)
dout = np.random.randn(10, 5)

dx_num = eval_numerical_gradient_array(lambda x: affine_forward(x, w, b)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: affine_forward(x, w, b)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: affine_forward(x, w, b)[0], b, dout)

_, cache = affine_forward(x, w, b)
dx, dw, db = affine_backward(dout, cache)

# The error should be around 1e-10
print 'Testing affine_backward function:'
print 'dx error: ', rel_error(dx_num, dx)
print 'dw error: ', rel_error(dw_num, dw)
print 'db error: ', rel_error(db_num, db)
Testing affine_backward function:
dx error:  1.2176282406e-10
dw error:  7.13340106347e-11
db error:  1.72702200302e-11

ReLU layer: forward

Implement the forward pass for the ReLU activation function in the relu_forward function and test your implementation using the following:
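ReLU is just an elementwise max(0, x), and the cache only needs the input. A minimal sketch (assuming numpy is imported as np, as in the setup cell):

def relu_forward(x):
    """Elementwise ReLU; caches the input for the backward pass."""
    out = np.maximum(0, x)
    cache = x
    return out, cache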

# Test the relu_forward function
x = np.linspace(-0.5, 0.5, num=12).reshape(3, 4)

out, _ = relu_forward(x)
correct_out = np.array([[ 0.,          0.,          0.,          0.,        ],
                        [ 0.,          0.,          0.04545455,  0.13636364,],
                        [ 0.22727273,  0.31818182,  0.40909091,  0.5,       ]])

# Compare your output with ours. The error should be around 1e-8
print 'Testing relu_forward function:'
print 'difference: ', rel_error(out, correct_out)
Testing relu_forward function:
difference:  4.99999979802e-08

ReLU layer: backward

Now implement the backward pass for the ReLU activation function in the relu_backward function and test your implementation using numeric gradient checking:
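The backward pass simply zeroes the upstream gradient wherever the forward input was not positive. A sketch, assuming the cache holds the original input:

def relu_backward(dout, cache):
    """Pass the gradient through only where the forward input was positive."""
    x = cache
    dx = dout * (x > 0)
    return dx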

x = np.random.randn(10, 10)
dout = np.random.randn(*x.shape)

dx_num = eval_numerical_gradient_array(lambda x: relu_forward(x)[0], x, dout)

_, cache = relu_forward(x)
dx = relu_backward(dout, cache)

# The error should be around 1e-12
print 'Testing relu_backward function:'
print 'dx error: ', rel_error(dx_num, dx)
Testing relu_backward function:
dx error:  3.27563601875e-12

“Sandwich” layers

There are some common patterns of layers that are frequently used in neural nets. For example, affine layers are frequently followed by a ReLU nonlinearity. To make these common patterns easy, we define several convenience layers in the file cs231n/layer_utils.py.
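These convenience layers just chain the primitive forward and backward functions and bundle their caches into a single tuple; roughly (assuming the affine_* and relu_* functions sketched above):

def affine_relu_forward(x, w, b):
    """Affine transform followed by a ReLU; returns the output and both caches."""
    a, fc_cache = affine_forward(x, w, b)
    out, relu_cache = relu_forward(a)
    return out, (fc_cache, relu_cache)

def affine_relu_backward(dout, cache):
    """Backward pass for the affine-ReLU sandwich: undo the ReLU, then the affine layer."""
    fc_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    return affine_backward(da, fc_cache)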

For now take a look at the affine_relu_forward and affine_relu_backward functions, and run the following to numerically gradient check the backward pass:

from cs231n.layer_utils import affine_relu_forward, affine_relu_backward

x = np.random.randn(2, 3, 4)
w = np.random.randn(12, 10)
b = np.random.randn(10)
dout = np.random.randn(2, 10)

out, cache = affine_relu_forward(x, w, b)
dx, dw, db = affine_relu_backward(dout, cache)

dx_num = eval_numerical_gradient_array(lambda x: affine_relu_forward(x, w, b)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: affine_relu_forward(x, w, b)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: affine_relu_forward(x, w, b)[0], b, dout)

print 'Testing affine_relu_forward:'
print 'dx error: ', rel_error(dx_num, dx)
print 'dw error: ', rel_error(dw_num, dw)
print 'db error: ', rel_error(db_num, db)
Testing affine_relu_forward:
dx error:  8.06453565365e-11
dw error:  6.08088884271e-10
db error:  7.82671961624e-12

Loss layers: Softmax and SVM

You implemented these loss functions in the last assignment, so we’ll give them to you for free here. You should still make sure you understand how they work by looking at the implementations in cs231n/layers.py.
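For reference, a numerically stable softmax loss can be written roughly as below; the provided version in cs231n/layers.py may differ in details, and svm_loss follows the same (scores, labels) -> (loss, dscores) interface. (Assumes numpy as np.)

def softmax_loss(x, y):
    """
    x: class scores of shape (N, C); y: integer labels of shape (N,)
    Returns the average cross-entropy loss and the gradient with respect to x.
    """
    shifted = x - np.max(x, axis=1, keepdims=True)   # subtract the row max for stability
    probs = np.exp(shifted)
    probs /= np.sum(probs, axis=1, keepdims=True)
    N = x.shape[0]
    loss = -np.sum(np.log(probs[np.arange(N), y])) / N
    dx = probs.copy()
    dx[np.arange(N), y] -= 1                         # gradient of cross-entropy w.r.t. scores
    dx /= N
    return loss, dx

With near-zero random scores and 10 classes this gives loss close to log(10), roughly 2.3, which is why the sanity check below expects 2.3.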

You can make sure that the implementations are correct by running the following:

num_classes, num_inputs = 10, 50
x = 0.001 * np.random.randn(num_inputs, num_classes)
y = np.random.randint(num_classes, size=num_inputs)

dx_num = eval_numerical_gradient(lambda x: svm_loss(x, y)[0], x, verbose=False)
loss, dx = svm_loss(x, y)

# Test svm_loss function. Loss should be around 9 and dx error should be 1e-9
print 'Testing svm_loss:'
print 'loss: ', loss
print 'dx error: ', rel_error(dx_num, dx)

dx_num = eval_numerical_gradient(lambda x: softmax_loss(x, y)[0], x, verbose=False)
loss, dx = softmax_loss(x, y)

# Test softmax_loss function. Loss should be 2.3 and dx error should be 1e-8
print '\nTesting softmax_loss:'
print 'loss: ', loss
print 'dx error: ', rel_error(dx_num, dx)
Testing svm_loss:
loss:  8.99822429951
dx error:  3.62260263876e-09

Testing softmax_loss:
loss:  2.30240797834
dx error:  8.28111324059e-09

[autoreload of cs231n.layers failed: Traceback (most recent call last):
  File "/home/sparkliu/anaconda2/lib/python2.7/site-packages/IPython/extensions/autoreload.py", line 247, in check
    superreload(m, reload, self.old_objects)
  File "cs231n/layers.py", line 86
SyntaxError: Non-ASCII character '\xe8' in file cs231n/layers.py on line 86, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
]

Two-layer network

In the previous assignment you implemented a two-layer neural network in a single monolithic class. Now that you have implemented modular versions of the necessary layers, you will reimplement the two layer network using these modular implementations.

Open the file cs231n/classifiers/fc_net.py and complete the implementation of the TwoLayerNet class. This class will serve as a model for the other networks you will implement in this assignment, so read through it to make sure you understand the API. You can run the cell below to test your implementation.
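Before running it, here is a rough sketch of how the modular layers compose into the two-layer model. This is a hypothetical standalone function for illustration only, not the exact TwoLayerNet.loss method; it assumes the architecture affine - relu - affine - softmax with L2 regularization and the layer functions sketched above:

import numpy as np

def two_layer_loss(params, X, y=None, reg=0.0):
    """Hypothetical standalone version of a two-layer net's loss built from modular layers."""
    W1, b1 = params['W1'], params['b1']
    W2, b2 = params['W2'], params['b2']

    h, cache1 = affine_relu_forward(X, W1, b1)   # first layer: affine + ReLU
    scores, cache2 = affine_forward(h, W2, b2)   # second layer: affine only
    if y is None:
        return scores                            # test-time: just return scores

    loss, dscores = softmax_loss(scores, y)
    loss += 0.5 * reg * (np.sum(W1 * W1) + np.sum(W2 * W2))

    grads = {}
    dh, grads['W2'], grads['b2'] = affine_backward(dscores, cache2)
    _, grads['W1'], grads['b1'] = affine_relu_backward(dh, cache1)
    grads['W1'] += reg * W1                      # add the L2 regularization gradient
    grads['W2'] += reg * W2
    return loss, grads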

N, D, H, C = 3, 5, 50, 7
X = np.random.randn(N, D)
y = np.random.randint(C, size=N)

std = 1e-2
model = TwoLayerNet(input_dim=D, hidden_dim=H, num_classes=C, weight_scale=std)

print 'Testing initialization ... '
W1_std = abs(model.params['W1'].std() - std)
b1 = model.params['b1']
W2_std = abs(model.params['W2'].std() - std)
b2 = model.params['b2']
assert W1_std < std / 10, 'First layer weights do not seem right'
assert np.all(b1 == 0), 'First layer biases do not seem right'
assert W2_std < std / 10, 'Second layer weights do not seem right'
assert np.all(b2 == 0), 'Second layer biases do not seem right'

print 'Testing test-time forward pass ... '
model.params['W1'] = np.linspace(-0.7, 0.3, num=D*H).reshape(D, H)
model.params['b1'] = np.linspace(-0.1, 0.9, num=H)
model.params['W2'] = np.linspace(-0.3, 0.4, num=H*C).reshape(H, C)
model.params['b2'] = np.linspace(-0.9, 0.1, num=C)
X = np.linspace(-5.5, 4.5, num=N*D).reshape(D, N).T
scores = model.loss(X)
correct_scores = np.asarray(
  [[11.53165108,  12.2917344,   13.05181771,  13.81190102,  14.57198434,  15.33206765,  16.09215096],
   [12.05769098,  12.74614105,  13.43459113,  14.1230412,   14.81149128,  15.49994135,  16.18839143],
   [12.58373087,  13.20054771,  13.81736455,  14.43418138,  15.05099822,  15.66781506,  16.2846319 ]])
scores_diff = np.abs(scores - correct_scores).sum()
assert scores_diff < 1e-6, 'Problem with test-time forward pass'

print 'Testing training loss (no regularization)'
y = np.asarray([0, 5, 1])
loss, grads = model.loss(X, y)
correct_loss = 3.4702243556
assert abs(loss - correct_loss) < 1e-10, 'Problem with training-time loss'

model.reg = 1.0
loss, grads = model.loss(X, y)
correct_loss = 26.5948426952
assert abs(loss - correct_loss) < 1e-10, 'Problem with regularization loss'

for reg in [0.0, 0.7]:
    print 'Running numeric gradient check with reg = ', reg
    model.reg = reg
    loss, grads = model.loss(X, y)
    for name in sorted(grads):
        f = lambda _: model.loss(X, y)[0]
        grad_num = eval_numerical_gradient(f, model.params[name], verbose=False)
        print '%s relative error: %.2e' % (name, rel_error(grad_num, grads[name]))
Testing initialization ...
Testing test-time forward pass ...
Testing training loss (no regularization)
Running numeric gradient check with reg =  0.0
W1 relative error: 1.83e-08
W2 relative error: 3.20e-10
b1 relative error: 9.83e-09
b2 relative error: 4.33e-10
Running numeric gradient check with reg =  0.7
W1 relative error: 2.53e-07
W2 relative error: 7.98e-08
b1 relative error: 1.56e-08
b2 relative error: 9.09e-10

Solver

In the previous assignment, the logic for training models was coupled to the models themselves. Following a more modular design, for this assignment we have split the logic for training models into a separate class.

Open the file cs231n/solver.py and read through it to familiarize yourself with the API. After doing so, use a Solver instance to train a TwoLayerNet that achieves at least 50% accuracy on the validation set.

model = TwoLayerNet()
solver = None

##############################################################################
# TODO: Use a Solver instance to train a TwoLayerNet that achieves at least  #
# 50% accuracy on the validation set.                                        #
##############################################################################
solver = Solver(model, data,
                update_rule='sgd',
                optim_config={
                  'learning_rate': 1e-3,
                },
                lr_decay=0.95,
                num_epochs=10, batch_size=100,
                print_every=100)
solver.train()
print solver.best_val_acc
##############################################################################
#                             END OF YOUR CODE                               #
##############################################################################
(Iteration 1 / 4900) loss: 2.301963
(Epoch 0 / 10) train acc: 0.136000; val_acc: 0.143000
(Iteration 101 / 4900) loss: 1.914707
(Iteration 201 / 4900) loss: 1.801741
(Iteration 301 / 4900) loss: 1.561005
(Iteration 401 / 4900) loss: 1.556094
(Epoch 1 / 10) train acc: 0.443000; val_acc: 0.464000
(Iteration 501 / 4900) loss: 1.532592
(Iteration 601 / 4900) loss: 1.422208
(Iteration 701 / 4900) loss: 1.399126
(Iteration 801 / 4900) loss: 1.619161
(Iteration 901 / 4900) loss: 1.564740
(Epoch 2 / 10) train acc: 0.490000; val_acc: 0.461000
(Iteration 1001 / 4900) loss: 1.526923
(Iteration 1101 / 4900) loss: 1.246151
(Iteration 1201 / 4900) loss: 1.361200
(Iteration 1301 / 4900) loss: 1.376708
(Iteration 1401 / 4900) loss: 1.331583
(Epoch 3 / 10) train acc: 0.518000; val_acc: 0.492000
(Iteration 1501 / 4900) loss: 1.394482
(Iteration 1601 / 4900) loss: 1.672659
(Iteration 1701 / 4900) loss: 1.194596
(Iteration 1801 / 4900) loss: 1.304127
(Iteration 1901 / 4900) loss: 1.367667
(Epoch 4 / 10) train acc: 0.534000; val_acc: 0.498000
(Iteration 2001 / 4900) loss: 1.563325
(Iteration 2101 / 4900) loss: 1.116682
(Iteration 2201 / 4900) loss: 1.261018
(Iteration 2301 / 4900) loss: 1.225400
(Iteration 2401 / 4900) loss: 1.397259
(Epoch 5 / 10) train acc: 0.551000; val_acc: 0.486000
(Iteration 2501 / 4900) loss: 1.212678
(Iteration 2601 / 4900) loss: 1.444812
(Iteration 2701 / 4900) loss: 1.287897
(Iteration 2801 / 4900) loss: 1.261478
(Iteration 2901 / 4900) loss: 1.187157
(Epoch 6 / 10) train acc: 0.572000; val_acc: 0.502000
(Iteration 3001 / 4900) loss: 1.170833
(Iteration 3101 / 4900) loss: 1.379905
(Iteration 3201 / 4900) loss: 1.153035
(Iteration 3301 / 4900) loss: 1.268465
(Iteration 3401 / 4900) loss: 1.171372
(Epoch 7 / 10) train acc: 0.577000; val_acc: 0.467000
(Iteration 3501 / 4900) loss: 1.299025
(Iteration 3601 / 4900) loss: 1.245792
(Iteration 3701 / 4900) loss: 1.009821
(Iteration 3801 / 4900) loss: 1.284845
(Iteration 3901 / 4900) loss: 1.224792
(Epoch 8 / 10) train acc: 0.582000; val_acc: 0.523000
(Iteration 4001 / 4900) loss: 1.215382
(Iteration 4101 / 4900) loss: 1.008467
(Iteration 4201 / 4900) loss: 1.203199
(Iteration 4301 / 4900) loss: 0.834487
(Iteration 4401 / 4900) loss: 0.917840
(Epoch 9 / 10) train acc: 0.564000; val_acc: 0.520000
(Iteration 4501 / 4900) loss: 1.237451
(Iteration 4601 / 4900) loss: 1.238394
(Iteration 4701 / 4900) loss: 1.134911
(Iteration 4801 / 4900) loss: 1.320462
(Epoch 10 / 10) train acc: 0.607000; val_acc: 0.517000
0.523
# Run this cell to visualize training loss and train / val accuracy
plt.subplot(2, 1, 1)
plt.title('Training loss')
plt.plot(solver.loss_history, 'o')
plt.xlabel('Iteration')

plt.subplot(2, 1, 2)
plt.title('Accuracy')
plt.plot(solver.train_acc_history, '-o', label='train')
plt.plot(solver.val_acc_history, '-o', label='val')
plt.plot([0.5] * len(solver.val_acc_history), 'k--')
plt.xlabel('Epoch')
plt.legend(loc='lower right')
plt.gcf().set_size_inches(15, 12)
plt.show()

Multilayer network

Next you will implement a fully-connected network with an arbitrary number of hidden layers.

Read through the FullyConnectedNet class in the file cs231n/classifiers/fc_net.py.

Implement the initialization, the forward pass, and the backward pass. For the moment don’t worry about implementing dropout or batch normalization; we will add those features soon.
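The main new piece is building the parameter dictionary in a loop over the layer sizes [input_dim] + hidden_dims + [num_classes]. A sketch of the idea, written as a hypothetical helper (the real work happens inside FullyConnectedNet.__init__, and names here are illustrative):

import numpy as np

def init_fc_params(hidden_dims, input_dim, num_classes, weight_scale=1e-2):
    """Gaussian-initialized weights and zero biases for an arbitrary-depth FC net."""
    dims = [input_dim] + list(hidden_dims) + [num_classes]
    params = {}
    for i in range(len(dims) - 1):
        params['W%d' % (i + 1)] = weight_scale * np.random.randn(dims[i], dims[i + 1])
        params['b%d' % (i + 1)] = np.zeros(dims[i + 1])
    return params

The forward pass then loops affine_relu_forward over layers 1..L-1, applies a plain affine_forward for the final scores, and the backward pass walks the cached tuples in reverse order.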

Initial loss and gradient check

As a sanity check, run the following to check the initial loss and to gradient check the network both with and without regularization. Do the initial losses seem reasonable?

For gradient checking, you should expect to see errors around 1e-6 or less.

N, D, H1, H2, C = 2, 15, 20, 30, 10
X = np.random.randn(N, D)
y = np.random.randint(C, size=(N,))

for reg in [0, 3.14]:
    print 'Running check with reg = ', reg
    model = FullyConnectedNet([H1, H2], input_dim=D, num_classes=C,
                              reg=reg, weight_scale=5e-2, dtype=np.float64)

    loss, grads = model.loss(X, y)
    print 'Initial loss: ', loss

    for name in sorted(grads):
        f = lambda _: model.loss(X, y)[0]
        grad_num = eval_numerical_gradient(f, model.params[name], verbose=False, h=1e-5)
        print '%s relative error: %.2e' % (name, rel_error(grad_num, grads[name]))
Running check with reg =  0
Initial loss:  2.29903373285
W1 relative error: 1.28e-06
W2 relative error: 8.00e-07
W3 relative error: 2.71e-08
b1 relative error: 7.03e-09
b2 relative error: 1.11e-08
b3 relative error: 1.45e-10
Running check with reg =  3.14
Initial loss:  7.1852394783
W1 relative error: 1.03e-07
W2 relative error: 6.52e-08
W3 relative error: 2.40e-08
b1 relative error: 1.59e-08
b2 relative error: 3.38e-09
b3 relative error: 2.29e-10

As another sanity check, make sure you can overfit a small dataset of 50 images. First we will try a three-layer network with 100 units in each hidden layer. You will need to tweak the learning rate and initialization scale, but you should be able to overfit and achieve 100% training accuracy within 20 epochs.

# TODO: Use a three-layer Net to overfit 50 training examples.
num_train = 50
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

# weight_scale = 1e-2
# learning_rate = 1e-4
weight_scale = 4e-2
learning_rate = 1e-3
model = FullyConnectedNet([100, 100],
                          weight_scale=weight_scale, dtype=np.float64)
solver = Solver(model, small_data,
                print_every=10, num_epochs=20, batch_size=25,
                update_rule='sgd',
                optim_config={
                  'learning_rate': learning_rate,
                })
solver.train()

plt.plot(solver.loss_history, 'o')
plt.title('Training loss history')
plt.xlabel('Iteration')
plt.ylabel('Training loss')
plt.show()
(Iteration 1 / 40) loss: 16.766332
(Epoch 0 / 20) train acc: 0.140000; val_acc: 0.098000
(Epoch 1 / 20) train acc: 0.240000; val_acc: 0.082000
(Epoch 2 / 20) train acc: 0.460000; val_acc: 0.109000
(Epoch 3 / 20) train acc: 0.560000; val_acc: 0.114000
(Epoch 4 / 20) train acc: 0.780000; val_acc: 0.127000
(Epoch 5 / 20) train acc: 0.840000; val_acc: 0.133000
(Iteration 11 / 40) loss: 0.163433
(Epoch 6 / 20) train acc: 0.900000; val_acc: 0.125000
(Epoch 7 / 20) train acc: 0.940000; val_acc: 0.128000
(Epoch 8 / 20) train acc: 0.960000; val_acc: 0.123000
(Epoch 9 / 20) train acc: 0.940000; val_acc: 0.134000
(Epoch 10 / 20) train acc: 0.980000; val_acc: 0.134000
(Iteration 21 / 40) loss: 0.042606
(Epoch 11 / 20) train acc: 1.000000; val_acc: 0.130000
(Epoch 12 / 20) train acc: 1.000000; val_acc: 0.127000
(Epoch 13 / 20) train acc: 1.000000; val_acc: 0.125000
(Epoch 14 / 20) train acc: 1.000000; val_acc: 0.126000
(Epoch 15 / 20) train acc: 1.000000; val_acc: 0.126000
(Iteration 31 / 40) loss: 0.012419
(Epoch 16 / 20) train acc: 1.000000; val_acc: 0.128000
(Epoch 17 / 20) train acc: 1.000000; val_acc: 0.128000
(Epoch 18 / 20) train acc: 1.000000; val_acc: 0.128000
(Epoch 19 / 20) train acc: 1.000000; val_acc: 0.129000
(Epoch 20 / 20) train acc: 1.000000; val_acc: 0.129000

Now try to use a five-layer network with 100 units on each layer to overfit 50 training examples. Again you will have to adjust the learning rate and weight initialization, but you should be able to achieve 100% training accuracy within 20 epochs.

# TODO: Use a five-layer Net to overfit 50 training examples.
num_train = 50
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

# learning_rate = 1e-3
# weight_scale = 1e-5
learning_rate = 1e-3
weight_scale = 8e-2
model = FullyConnectedNet([100, 100, 100, 100],
                          weight_scale=weight_scale, dtype=np.float64)
solver = Solver(model, small_data,
                print_every=10, num_epochs=20, batch_size=25,
                update_rule='sgd',
                optim_config={
                  'learning_rate': learning_rate,
                })
solver.train()

plt.plot(solver.loss_history, 'o')
plt.title('Training loss history')
plt.xlabel('Iteration')
plt.ylabel('Training loss')
plt.show()
(Iteration 1 / 40) loss: 56.223744
(Epoch 0 / 20) train acc: 0.160000; val_acc: 0.122000
(Epoch 1 / 20) train acc: 0.100000; val_acc: 0.093000
(Epoch 2 / 20) train acc: 0.300000; val_acc: 0.106000
(Epoch 3 / 20) train acc: 0.480000; val_acc: 0.122000
(Epoch 4 / 20) train acc: 0.760000; val_acc: 0.134000
(Epoch 5 / 20) train acc: 0.820000; val_acc: 0.124000
(Iteration 11 / 40) loss: 3.030085
(Epoch 6 / 20) train acc: 0.800000; val_acc: 0.138000
(Epoch 7 / 20) train acc: 0.940000; val_acc: 0.135000
(Epoch 8 / 20) train acc: 0.980000; val_acc: 0.132000
(Epoch 9 / 20) train acc: 1.000000; val_acc: 0.135000
(Epoch 10 / 20) train acc: 1.000000; val_acc: 0.134000
(Iteration 21 / 40) loss: 0.013649
(Epoch 11 / 20) train acc: 1.000000; val_acc: 0.133000
(Epoch 12 / 20) train acc: 1.000000; val_acc: 0.131000
(Epoch 13 / 20) train acc: 1.000000; val_acc: 0.133000
(Epoch 14 / 20) train acc: 1.000000; val_acc: 0.132000
(Epoch 15 / 20) train acc: 1.000000; val_acc: 0.132000
(Iteration 31 / 40) loss: 0.002347
(Epoch 16 / 20) train acc: 1.000000; val_acc: 0.133000
(Epoch 17 / 20) train acc: 1.000000; val_acc: 0.133000
(Epoch 18 / 20) train acc: 1.000000; val_acc: 0.133000
(Epoch 19 / 20) train acc: 1.000000; val_acc: 0.133000
(Epoch 20 / 20) train acc: 1.000000; val_acc: 0.132000

Inline question:

Did you notice anything about the comparative difficulty of training the three-layer net vs training the five layer net?

Answer:

The five-layer net is far more sensitive to weight_scale than the three-layer net: the three-layer net overfits for a wide range of weight_scale values, while the five-layer net needs a larger weight_scale before it can overfit. This is because the signal carried through the weights shrinks as the network gets deeper, so a larger initialization scale is required. The learning rate, on the other hand, behaves about the same for both networks.

Update rules

So far we have used vanilla stochastic gradient descent (SGD) as our update rule. More sophisticated update rules can make it easier to train deep networks. We will implement a few of the most commonly used update rules and compare them to vanilla SGD.

SGD+Momentum

Stochastic gradient descent with momentum is a widely used update rule that tends to make deep networks converge faster than vanilla stochastic gradient descent.

Open the file cs231n/optim.py and read the documentation at the top of the file to make sure you understand the API. Implement the SGD+momentum update rule in the function sgd_momentum and run the following to check your implementation. You should see errors less than 1e-8.
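The rule keeps a per-parameter velocity that accumulates gradients with a decay factor mu: v = mu * v - lr * dw, then w += v. A sketch consistent with the optim.py convention that every update rule maps (w, dw, config) to (next_w, config); the defaults shown here are assumptions:

import numpy as np

def sgd_momentum(w, dw, config=None):
    """SGD with momentum: v = mu * v - lr * dw;  w += v."""
    if config is None:
        config = {}
    config.setdefault('learning_rate', 1e-2)
    config.setdefault('momentum', 0.9)
    v = config.get('velocity', np.zeros_like(w))

    v = config['momentum'] * v - config['learning_rate'] * dw
    next_w = w + v
    config['velocity'] = v   # store the velocity so the next call continues from it
    return next_w, config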

from cs231n.optim import sgd_momentum

N, D = 4, 5
w = np.linspace(-0.4, 0.6, num=N*D).reshape(N, D)
dw = np.linspace(-0.6, 0.4, num=N*D).reshape(N, D)
v = np.linspace(0.6, 0.9, num=N*D).reshape(N, D)

config = {'learning_rate': 1e-3, 'velocity': v}
next_w, _ = sgd_momentum(w, dw, config=config)

expected_next_w = np.asarray([
  [ 0.1406,      0.20738947,  0.27417895,  0.34096842,  0.40775789],
  [ 0.47454737,  0.54133684,  0.60812632,  0.67491579,  0.74170526],
  [ 0.80849474,  0.87528421,  0.94207368,  1.00886316,  1.07565263],
  [ 1.14244211,  1.20923158,  1.27602105,  1.34281053,  1.4096    ]])
expected_velocity = np.asarray([
  [ 0.5406,      0.55475789,  0.56891579,  0.58307368,  0.59723158],
  [ 0.61138947,  0.62554737,  0.63970526,  0.65386316,  0.66802105],
  [ 0.68217895,  0.69633684,  0.71049474,  0.72465263,  0.73881053],
  [ 0.75296842,  0.76712632,  0.78128421,  0.79544211,  0.8096    ]])

print 'next_w error: ', rel_error(next_w, expected_next_w)
print 'velocity error: ', rel_error(expected_velocity, config['velocity'])
next_w error:  8.88234703351e-09
velocity error:  4.26928774328e-09

Once you have done so, run the following to train a six-layer network with both SGD and SGD+momentum. You should see the SGD+momentum update rule converge faster.

num_train = 4000
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

solvers = {}

for update_rule in ['sgd', 'sgd_momentum', 'Nesterov_momentum']:
    print 'running with ', update_rule
    model = FullyConnectedNet([100, 100, 100, 100, 100], weight_scale=5e-2)

    solver = Solver(model, small_data,
                    num_epochs=5, batch_size=100,
                    update_rule=update_rule,
                    optim_config={
                      'learning_rate': 1e-2,
                    },
                    verbose=True)
    solvers[update_rule] = solver
    solver.train()
    print

plt.subplot(3, 1, 1)
plt.title('Training loss')
plt.xlabel('Iteration')

plt.subplot(3, 1, 2)
plt.title('Training accuracy')
plt.xlabel('Epoch')

plt.subplot(3, 1, 3)
plt.title('Validation accuracy')
plt.xlabel('Epoch')

for update_rule, solver in solvers.iteritems():
    plt.subplot(3, 1, 1)
    plt.plot(solver.loss_history, 'o', label=update_rule)

    plt.subplot(3, 1, 2)
    plt.plot(solver.train_acc_history, '-o', label=update_rule)

    plt.subplot(3, 1, 3)
    plt.plot(solver.val_acc_history, '-o', label=update_rule)

for i in [1, 2, 3]:
    plt.subplot(3, 1, i)
    plt.legend(loc='upper center', ncol=4)
plt.gcf().set_size_inches(15, 15)
plt.show()
running with  sgd
(Iteration 1 / 200) loss: 2.628549
(Epoch 0 / 5) train acc: 0.113000; val_acc: 0.104000
(Iteration 11 / 200) loss: 2.228658
(Iteration 21 / 200) loss: 2.190695
(Iteration 31 / 200) loss: 2.017840
(Epoch 1 / 5) train acc: 0.243000; val_acc: 0.222000
(Iteration 41 / 200) loss: 1.914843
(Iteration 51 / 200) loss: 1.913829
(Iteration 61 / 200) loss: 1.902821
(Iteration 71 / 200) loss: 1.924258
(Epoch 2 / 5) train acc: 0.273000; val_acc: 0.274000
(Iteration 81 / 200) loss: 1.767148
(Iteration 91 / 200) loss: 1.782822
(Iteration 101 / 200) loss: 1.781782
(Iteration 111 / 200) loss: 1.583985
(Epoch 3 / 5) train acc: 0.366000; val_acc: 0.312000
(Iteration 121 / 200) loss: 1.829295
(Iteration 131 / 200) loss: 1.683784
(Iteration 141 / 200) loss: 1.714471
(Iteration 151 / 200) loss: 1.666869
(Epoch 4 / 5) train acc: 0.397000; val_acc: 0.305000
(Iteration 161 / 200) loss: 1.729787
(Iteration 171 / 200) loss: 1.568279
(Iteration 181 / 200) loss: 1.585935
(Iteration 191 / 200) loss: 1.764712
(Epoch 5 / 5) train acc: 0.419000; val_acc: 0.334000

running with  sgd_momentum
(Iteration 1 / 200) loss: 2.576404
(Epoch 0 / 5) train acc: 0.090000; val_acc: 0.108000
(Iteration 11 / 200) loss: 2.182130
(Iteration 21 / 200) loss: 2.022693
(Iteration 31 / 200) loss: 2.005269
(Epoch 1 / 5) train acc: 0.296000; val_acc: 0.293000
(Iteration 41 / 200) loss: 1.963764
(Iteration 51 / 200) loss: 1.801444
(Iteration 61 / 200) loss: 1.548850
(Iteration 71 / 200) loss: 1.824885
(Epoch 2 / 5) train acc: 0.372000; val_acc: 0.312000
(Iteration 81 / 200) loss: 1.616399
(Iteration 91 / 200) loss: 1.571472
(Iteration 101 / 200) loss: 1.596961
(Iteration 111 / 200) loss: 1.387090
(Epoch 3 / 5) train acc: 0.439000; val_acc: 0.347000
(Iteration 121 / 200) loss: 1.429546
(Iteration 131 / 200) loss: 1.595473
(Iteration 141 / 200) loss: 1.328942
(Iteration 151 / 200) loss: 1.418077
(Epoch 4 / 5) train acc: 0.474000; val_acc: 0.349000
(Iteration 161 / 200) loss: 1.582628
(Iteration 171 / 200) loss: 1.460142
(Iteration 181 / 200) loss: 1.498357
(Iteration 191 / 200) loss: 1.365589
(Epoch 5 / 5) train acc: 0.531000; val_acc: 0.338000

running with  Nesterov_momentum
(Iteration 1 / 200) loss: 2.594123
(Epoch 0 / 5) train acc: 0.096000; val_acc: 0.116000
(Iteration 11 / 200) loss: 2.037924
(Iteration 21 / 200) loss: 1.731799
(Iteration 31 / 200) loss: 1.927105
(Epoch 1 / 5) train acc: 0.351000; val_acc: 0.298000
(Iteration 41 / 200) loss: 1.733407
(Iteration 51 / 200) loss: 1.825247
(Iteration 61 / 200) loss: 1.697723
(Iteration 71 / 200) loss: 1.599464
(Epoch 2 / 5) train acc: 0.447000; val_acc: 0.341000
(Iteration 81 / 200) loss: 1.640420
(Iteration 91 / 200) loss: 1.559612
(Iteration 101 / 200) loss: 1.637911
(Iteration 111 / 200) loss: 1.434598
(Epoch 3 / 5) train acc: 0.488000; val_acc: 0.351000
(Iteration 121 / 200) loss: 1.529465
(Iteration 131 / 200) loss: 1.184129
(Iteration 141 / 200) loss: 1.353501
(Iteration 151 / 200) loss: 1.340935
(Epoch 4 / 5) train acc: 0.565000; val_acc: 0.351000
(Iteration 161 / 200) loss: 1.240142
(Iteration 171 / 200) loss: 1.312745
(Iteration 181 / 200) loss: 1.285002
(Iteration 191 / 200) loss: 1.362102
(Epoch 5 / 5) train acc: 0.593000; val_acc: 0.321000

RMSProp and Adam

RMSProp [1] and Adam [2] are update rules that set per-parameter learning rates by using a running average of the second moments of gradients.

In the file cs231n/optim.py, implement the RMSProp update rule in the rmsprop function and implement the Adam update rule in the adam function, and check your implementations using the tests below.

[1] Tijmen Tieleman and Geoffrey Hinton. “Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude.” COURSERA: Neural Networks for Machine Learning 4 (2012).

[2] Diederik Kingma and Jimmy Ba, “Adam: A Method for Stochastic Optimization”, ICLR 2015.
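Both rules follow the same (w, dw, config) -> (next_w, config) signature and keep their running statistics in the config dict. A hedged sketch of both; the default hyperparameter values are assumptions and may differ from the starter code:

import numpy as np

def rmsprop(w, dw, config=None):
    """Scale the step by a decaying average of squared gradients."""
    if config is None:
        config = {}
    config.setdefault('learning_rate', 1e-2)
    config.setdefault('decay_rate', 0.99)
    config.setdefault('epsilon', 1e-8)
    config.setdefault('cache', np.zeros_like(w))

    cache = config['decay_rate'] * config['cache'] + (1 - config['decay_rate']) * dw ** 2
    next_w = w - config['learning_rate'] * dw / (np.sqrt(cache) + config['epsilon'])
    config['cache'] = cache
    return next_w, config

def adam(w, dw, config=None):
    """Adam: bias-corrected first and second moment estimates of the gradient."""
    if config is None:
        config = {}
    config.setdefault('learning_rate', 1e-3)
    config.setdefault('beta1', 0.9)
    config.setdefault('beta2', 0.999)
    config.setdefault('epsilon', 1e-8)
    config.setdefault('m', np.zeros_like(w))
    config.setdefault('v', np.zeros_like(w))
    config.setdefault('t', 0)

    config['t'] += 1
    config['m'] = config['beta1'] * config['m'] + (1 - config['beta1']) * dw
    config['v'] = config['beta2'] * config['v'] + (1 - config['beta2']) * dw ** 2
    m_hat = config['m'] / (1 - config['beta1'] ** config['t'])   # bias correction
    v_hat = config['v'] / (1 - config['beta2'] ** config['t'])
    next_w = w - config['learning_rate'] * m_hat / (np.sqrt(v_hat) + config['epsilon'])
    return next_w, config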

# Test RMSProp implementation; you should see errors less than 1e-7
from cs231n.optim import rmsprop

N, D = 4, 5
w = np.linspace(-0.4, 0.6, num=N*D).reshape(N, D)
dw = np.linspace(-0.6, 0.4, num=N*D).reshape(N, D)
cache = np.linspace(0.6, 0.9, num=N*D).reshape(N, D)

config = {'learning_rate': 1e-2, 'cache': cache}
next_w, _ = rmsprop(w, dw, config=config)

expected_next_w = np.asarray([
  [-0.39223849, -0.34037513, -0.28849239, -0.23659121, -0.18467247],
  [-0.132737,   -0.08078555, -0.02881884,  0.02316247,  0.07515774],
  [ 0.12716641,  0.17918792,  0.23122175,  0.28326742,  0.33532447],
  [ 0.38739248,  0.43947102,  0.49155973,  0.54365823,  0.59576619]])
expected_cache = np.asarray([
  [ 0.5976,      0.6126277,   0.6277108,   0.64284931,  0.65804321],
  [ 0.67329252,  0.68859723,  0.70395734,  0.71937285,  0.73484377],
  [ 0.75037008,  0.7659518,   0.78158892,  0.79728144,  0.81302936],
  [ 0.82883269,  0.84469141,  0.86060554,  0.87657507,  0.8926    ]])

print 'next_w error: ', rel_error(expected_next_w, next_w)
print 'cache error: ', rel_error(expected_cache, config['cache'])
next_w error:  9.52468751104e-08
cache error:  2.64779558072e-09
# Test Adam implementation; you should see errors around 1e-7 or less
from cs231n.optim import adam

N, D = 4, 5
w = np.linspace(-0.4, 0.6, num=N*D).reshape(N, D)
dw = np.linspace(-0.6, 0.4, num=N*D).reshape(N, D)
m = np.linspace(0.6, 0.9, num=N*D).reshape(N, D)
v = np.linspace(0.7, 0.5, num=N*D).reshape(N, D)

config = {'learning_rate': 1e-2, 'm': m, 'v': v, 't': 5}
next_w, _ = adam(w, dw, config=config)

expected_next_w = np.asarray([
  [-0.40094747, -0.34836187, -0.29577703, -0.24319299, -0.19060977],
  [-0.1380274,  -0.08544591, -0.03286534,  0.01971428,  0.0722929 ],
  [ 0.1248705,   0.17744702,  0.23002243,  0.28259667,  0.33516969],
  [ 0.38774145,  0.44031188,  0.49288093,  0.54544852,  0.59801459]])
expected_v = np.asarray([
  [ 0.69966,     0.68908382,  0.67851319,  0.66794809,  0.65738853],
  [ 0.64683452,  0.63628604,  0.6257431,   0.61520571,  0.60467385],
  [ 0.59414753,  0.58362676,  0.57311152,  0.56260183,  0.55209767],
  [ 0.54159906,  0.53110598,  0.52061845,  0.51013645,  0.49966   ]])
expected_m = np.asarray([
  [ 0.48,        0.49947368,  0.51894737,  0.53842105,  0.55789474],
  [ 0.57736842,  0.59684211,  0.61631579,  0.63578947,  0.65526316],
  [ 0.67473684,  0.69421053,  0.71368421,  0.73315789,  0.75263158],
  [ 0.77210526,  0.79157895,  0.81105263,  0.83052632,  0.85      ]])

print 'next_w error: ', rel_error(expected_next_w, next_w)
print 'v error: ', rel_error(expected_v, config['v'])
print 'm error: ', rel_error(expected_m, config['m'])
next_w error:  1.13956917985e-07
v error:  4.20831403811e-09
m error:  4.21496319311e-09

Once you have debugged your RMSProp and Adam implementations, run the following to train a pair of deep networks using these new update rules:

learning_rates = {'rmsprop': 1e-4, 'adam': 1e-3}
for update_rule in ['adam', 'rmsprop']:
    print 'running with ', update_rule
    model = FullyConnectedNet([100, 100, 100, 100, 100], weight_scale=5e-2)

    solver = Solver(model, small_data,
                    num_epochs=5, batch_size=100,
                    update_rule=update_rule,
                    optim_config={
                      'learning_rate': learning_rates[update_rule]
                    },
                    verbose=True)
    solvers[update_rule] = solver
    solver.train()
    print

plt.subplot(3, 1, 1)
plt.title('Training loss')
plt.xlabel('Iteration')

plt.subplot(3, 1, 2)
plt.title('Training accuracy')
plt.xlabel('Epoch')

plt.subplot(3, 1, 3)
plt.title('Validation accuracy')
plt.xlabel('Epoch')

for update_rule, solver in solvers.iteritems():
    plt.subplot(3, 1, 1)
    plt.plot(solver.loss_history, 'o', label=update_rule)

    plt.subplot(3, 1, 2)
    plt.plot(solver.train_acc_history, '-o', label=update_rule)

    plt.subplot(3, 1, 3)
    plt.plot(solver.val_acc_history, '-o', label=update_rule)

for i in [1, 2, 3]:
    plt.subplot(3, 1, i)
    plt.legend(loc='upper center', ncol=4)
plt.gcf().set_size_inches(15, 15)
plt.show()
running with  adam
(Iteration 1 / 200) loss: 2.801725
(Epoch 0 / 5) train acc: 0.121000; val_acc: 0.126000
(Iteration 11 / 200) loss: 2.117279
(Iteration 21 / 200) loss: 1.847289
(Iteration 31 / 200) loss: 1.769301
(Epoch 1 / 5) train acc: 0.403000; val_acc: 0.341000
(Iteration 41 / 200) loss: 1.743352
(Iteration 51 / 200) loss: 1.770679
(Iteration 61 / 200) loss: 1.611959
(Iteration 71 / 200) loss: 1.566798
(Epoch 2 / 5) train acc: 0.435000; val_acc: 0.332000
(Iteration 81 / 200) loss: 1.638967
(Iteration 91 / 200) loss: 1.399417
(Iteration 101 / 200) loss: 1.483174
(Iteration 111 / 200) loss: 1.380184
(Epoch 3 / 5) train acc: 0.499000; val_acc: 0.361000
(Iteration 121 / 200) loss: 1.365415
(Iteration 131 / 200) loss: 1.498618
(Iteration 141 / 200) loss: 1.195903
(Iteration 151 / 200) loss: 1.469171
(Epoch 4 / 5) train acc: 0.481000; val_acc: 0.372000
(Iteration 161 / 200) loss: 1.528632
(Iteration 171 / 200) loss: 1.209119
(Iteration 181 / 200) loss: 1.215279
(Iteration 191 / 200) loss: 1.234940
(Epoch 5 / 5) train acc: 0.578000; val_acc: 0.370000

running with  rmsprop
(Iteration 1 / 200) loss: 2.687890
(Epoch 0 / 5) train acc: 0.127000; val_acc: 0.116000
(Iteration 11 / 200) loss: 2.123464
(Iteration 21 / 200) loss: 2.136270
(Iteration 31 / 200) loss: 1.832075
(Epoch 1 / 5) train acc: 0.354000; val_acc: 0.302000
(Iteration 41 / 200) loss: 1.731681
(Iteration 51 / 200) loss: 1.762667
(Iteration 61 / 200) loss: 1.732131
(Iteration 71 / 200) loss: 1.722929
(Epoch 2 / 5) train acc: 0.372000; val_acc: 0.283000
(Iteration 81 / 200) loss: 1.717579
(Iteration 91 / 200) loss: 1.439227
(Iteration 101 / 200) loss: 1.605034
(Iteration 111 / 200) loss: 1.631904
(Epoch 3 / 5) train acc: 0.442000; val_acc: 0.336000
(Iteration 121 / 200) loss: 1.618410
(Iteration 131 / 200) loss: 1.624446
(Iteration 141 / 200) loss: 1.517704
(Iteration 151 / 200) loss: 1.518375
(Epoch 4 / 5) train acc: 0.486000; val_acc: 0.350000
(Iteration 161 / 200) loss: 1.457081
(Iteration 171 / 200) loss: 1.535888
(Iteration 181 / 200) loss: 1.382148
(Iteration 191 / 200) loss: 1.450497
(Epoch 5 / 5) train acc: 0.513000; val_acc: 0.359000

Train a good model!

Train the best fully-connected model that you can on CIFAR-10, storing your best model in the best_model variable. We require you to get at least 50% accuracy on the validation set using a fully-connected net.

If you are careful it should be possible to get accuracies above 55%, but we don’t require it for this part and won’t assign extra credit for doing so. Later in the assignment we will ask you to train the best convolutional network that you can on CIFAR-10, and we would prefer that you spend your effort working on convolutional nets rather than fully-connected nets.

You might find it useful to complete the BatchNormalization.ipynb and Dropout.ipynb notebooks before completing this part, since those techniques can help you train powerful models.

best_model = None
################################################################################
# TODO: Train the best FullyConnectedNet that you can on CIFAR-10. You might   #
# find batch normalization and dropout useful. Store your best model in the    #
# best_model variable.                                                         #
################################################################################
best_val_acc = 0

model = FullyConnectedNet([100, 100, 100], reg=0, weight_scale=5e-2,
                          dtype=np.float64, use_batchnorm=True, dropout=0)

solver = Solver(model, data,
                num_epochs=10, batch_size=100,
                update_rule='adam',
                optim_config={
                  'learning_rate': 5e-3
                },
                lr_decay=0.95,
                print_every=100,
                verbose=True)
solver.train()

if solver.best_val_acc > best_val_acc:
    best_model = model
print
plt.subplot(2, 1, 1)
plt.title('Training loss')
plt.xlabel('Iteration')
plt.plot(solver.loss_history, 'o')

plt.subplot(2, 1, 2)
plt.title('Training accuracy VS Validation accuracy')
plt.xlabel('Epoch')
plt.plot(solver.train_acc_history, '-o',label='train_acc')
plt.plot(solver.val_acc_history, '-o',label='val_acc')
plt.grid(True)
plt.legend(loc='upper center', ncol=4)
plt.gcf().set_size_inches(15, 15)
plt.show()

################################################################################
#                              END OF YOUR CODE                                #
################################################################################
(Iteration 1 / 4900) loss: 2.393883
(Epoch 0 / 10) train acc: 0.161000; val_acc: 0.186000
(Iteration 101 / 4900) loss: 1.729880
(Iteration 201 / 4900) loss: 1.930090
(Iteration 301 / 4900) loss: 1.541373
(Iteration 401 / 4900) loss: 1.561599
(Epoch 1 / 10) train acc: 0.472000; val_acc: 0.457000
(Iteration 501 / 4900) loss: 1.375797
(Iteration 601 / 4900) loss: 1.424907
(Iteration 701 / 4900) loss: 1.379262
(Iteration 801 / 4900) loss: 1.457694
(Iteration 901 / 4900) loss: 1.417678
(Epoch 2 / 10) train acc: 0.495000; val_acc: 0.478000
(Iteration 1001 / 4900) loss: 1.434784
(Iteration 1101 / 4900) loss: 1.290648
(Iteration 1201 / 4900) loss: 1.352584
(Iteration 1301 / 4900) loss: 1.418072
(Iteration 1401 / 4900) loss: 1.131203
(Epoch 3 / 10) train acc: 0.537000; val_acc: 0.504000
(Iteration 1501 / 4900) loss: 1.279383
(Iteration 1601 / 4900) loss: 1.360643
(Iteration 1701 / 4900) loss: 1.302837
(Iteration 1801 / 4900) loss: 1.393452
(Iteration 1901 / 4900) loss: 1.053392
(Epoch 4 / 10) train acc: 0.577000; val_acc: 0.528000
(Iteration 2001 / 4900) loss: 1.315768
(Iteration 2101 / 4900) loss: 1.184602
(Iteration 2201 / 4900) loss: 1.111081
(Iteration 2301 / 4900) loss: 1.218965
(Iteration 2401 / 4900) loss: 1.025492
(Epoch 5 / 10) train acc: 0.585000; val_acc: 0.540000
(Iteration 2501 / 4900) loss: 1.065151
(Iteration 2601 / 4900) loss: 1.139779
(Iteration 2701 / 4900) loss: 1.277318
(Iteration 2801 / 4900) loss: 1.366566
(Iteration 2901 / 4900) loss: 1.191658
(Epoch 6 / 10) train acc: 0.614000; val_acc: 0.514000
(Iteration 3001 / 4900) loss: 1.239840
(Iteration 3101 / 4900) loss: 1.247335
(Iteration 3201 / 4900) loss: 1.196184
(Iteration 3301 / 4900) loss: 0.989457
(Iteration 3401 / 4900) loss: 1.000137
(Epoch 7 / 10) train acc: 0.656000; val_acc: 0.524000
(Iteration 3501 / 4900) loss: 0.994300
(Iteration 3601 / 4900) loss: 1.043369
(Iteration 3701 / 4900) loss: 1.085914
(Iteration 3801 / 4900) loss: 1.062207
(Iteration 3901 / 4900) loss: 1.160448
(Epoch 8 / 10) train acc: 0.636000; val_acc: 0.533000
(Iteration 4001 / 4900) loss: 1.238945
(Iteration 4101 / 4900) loss: 1.052708
(Iteration 4201 / 4900) loss: 0.993288
(Iteration 4301 / 4900) loss: 1.119087
(Iteration 4401 / 4900) loss: 0.960225
(Epoch 9 / 10) train acc: 0.648000; val_acc: 0.551000
(Iteration 4501 / 4900) loss: 1.007681
(Iteration 4601 / 4900) loss: 1.005801
(Iteration 4701 / 4900) loss: 0.912101
(Iteration 4801 / 4900) loss: 0.894692
(Epoch 10 / 10) train acc: 0.664000; val_acc: 0.554000

Test your model

Run your best model on the validation and test sets. You should achieve above 50% accuracy on the validation set.

y_test_pred = np.argmax(best_model.loss(data['X_test']), axis=1)
y_val_pred = np.argmax(best_model.loss(data['X_val']), axis=1)
print 'Validation set accuracy: ', (y_val_pred == data['y_val']).mean()
print 'Test set accuracy: ', (y_test_pred == data['y_test']).mean()
Validation set accuracy:  0.555
Test set accuracy:  0.547

References:

http://www.jianshu.com/p/9c4396653324
http://blog.csdn.net/xieyi4650/article/details/53839308
