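A Simple Neural Network Exercise

1. Imports
2. Neural network
- 2.1 Loading the dataset
- 2.2 Visualizing the dataset
- 2.3 Model representation
- 2.4 TensorFlow implementation of the model
3. NumPy implementation of the model
- 3.1 Implementing the model functions in NumPy
- 3.2 Vectorized NumPy implementation
- 3.3 NumPy broadcasting

This post is a hands-on exercise in applying neural networks. It uses a neural network for image recognition and classification, specifically to tell apart handwritten digits 0 and 1.

1. Imports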

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import matplotlib.pyplot as plt
from autils import *
%matplotlib inline
import logging
logging.getLogger("tensorflow").setLevel(logging.ERROR)
tf.autograph.set_verbosity(0)

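2. Neural network

2.1 Loading the dataset

The load_data() function shown below loads the data into the variables X and y.
The dataset contains 1000 training examples of handwritten digits, limited here to zero and one.
Each training example is a 20 pixel x 20 pixel grayscale image of a digit.
Each pixel is represented by a floating-point number indicating the grayscale intensity at that location.
The 20 x 20 grid of pixels is "unrolled" into a 400-dimensional vector.
Each training example becomes a single row in the data matrix X.
This gives us a 1000 x 400 matrix X in which every row is a training example of a handwritten digit image.

$$X = \left(\begin{array}{c} --- (x^{(1)}) --- \\ --- (x^{(2)}) --- \\ \vdots \\ --- (x^{(m)}) --- \end{array}\right)$$

The second part of the training set is a 1000 x 1 vector y that contains the labels for the training set:
y = 0 if the image shows the digit 0, and y = 1 if the image shows the digit 1.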
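To make the "unrolling" concrete, here is a minimal sketch that builds a data matrix from a few synthetic 20 x 20 images. The random values stand in for real pixel data, and `images` and `X_demo` are hypothetical names that are not part of the course's `load_data`. The plotting code in this post transposes each reshaped image, which suggests the real dataset was unrolled column by column; this sketch simply uses NumPy's default row-major order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 5 grayscale images of 20 x 20 pixels.
images = rng.random((5, 20, 20))

# "Unroll" each 20 x 20 grid into a 400-dimensional row vector.
X_demo = images.reshape(len(images), -1)

print(X_demo.shape)  # (5, 400): one row per training example

# Reshaping a row back to 20 x 20 recovers the original image.
restored = X_demo[0].reshape(20, 20)
print(np.array_equal(restored, images[0]))  # True
```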

# load dataset
X, y = load_data()

2.2 Visualizing the dataset

Inspect the data

print ('The first element of X is: ', X[0])

The first element of X is:  [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  8.56059680e-06
  1.94035948e-06 -7.37438725e-04 -8.13403799e-03 -1.86104473e-02
 -1.87412865e-02 -1.87572508e-02 -1.90963542e-02 -1.64039011e-02
 -3.78191381e-03  3.30347316e-04  1.27655229e-05  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  1.16421569e-04  1.20052179e-04
 -1.40444581e-02 -2.84542484e-02  8.03826593e-02  2.66540339e-01
  2.73853746e-01  2.78729541e-01  2.74293607e-01  2.24676403e-01
  2.77562977e-02 -7.06315478e-03  2.34715414e-04  0.00000000e+00
 ...
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00]

print ('The first element of y is: ', y[0,0])
print ('The last element of y is: ', y[-1,0])

The first element of y is:  0
The last element of y is:  1

Check the dimensions of the data

print ('The shape of X is: ' + str(X.shape))
print ('The shape of y is: ' + str(y.shape))

The shape of X is: (1000, 400)
The shape of y is: (1000, 1)

Visualize the data

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell

m, n = X.shape

fig, axes = plt.subplots(8,8, figsize=(8,8))
fig.tight_layout(pad=0.1)

for i,ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)

    # Select rows corresponding to the random indices and
    # reshape the image
    X_random_reshaped = X[random_index].reshape((20,20)).T

    # Display the image
    ax.imshow(X_random_reshaped, cmap='gray')

    # Display the label above the image
    ax.set_title(y[random_index,0])
    ax.set_axis_off()

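2.3 Model representation

The neural network you will use in this exercise is shown in the figure below.

It has three dense layers with sigmoid activations.
- Our inputs are the pixel values of the digit images.
- Since the images are of size $20 \times 20$, this gives us $400$ inputs.

The parameters are sized for a neural network with $25$ units in layer 1, $15$ units in layer 2, and $1$ output unit in layer 3.
- Recall that the dimensions of these parameters are determined as follows:
  - If the network has $s_{in}$ units in a layer and $s_{out}$ units in the next layer, then
    - $W$ will have dimension $s_{in} \times s_{out}$.
    - $b$ will be a vector with $s_{out}$ elements.
- Therefore, the shapes of W and b are
  - layer1: W1 has shape (400, 25) and b1 has shape (25,)
  - layer2: W2 has shape (25, 15) and b2 has shape (15,)
  - layer3: W3 has shape (15, 1) and b3 has shape (1,)

Note: the bias vector b could be represented as a 1-D (n,) or 2-D (n,1) array. TensorFlow uses a 1-D representation, and this post keeps that convention.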
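The shape rule above can be checked mechanically. The following is a small sketch, with `layer_sizes` and `shapes` as hypothetical names, that derives each layer's W and b shapes from the list of unit counts and then pushes one all-zero example through arrays of those shapes to confirm that the dimensions chain together.

```python
import numpy as np

layer_sizes = [400, 25, 15, 1]  # inputs, layer 1, layer 2, layer 3

# For each pair of adjacent layers, W is (s_in, s_out) and b is (s_out,).
shapes = [
    ((s_in, s_out), (s_out,))
    for s_in, s_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

for i, (w_shape, b_shape) in enumerate(shapes, start=1):
    print(f"layer{i}: W{i} shape = {w_shape}, b{i} shape = {b_shape}")

# Sanity check: one example flows through zero-filled arrays of these shapes.
a = np.zeros(layer_sizes[0])
for w_shape, b_shape in shapes:
    a = a @ np.zeros(w_shape) + np.zeros(b_shape)

print(a.shape)  # (1,)
```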

2.4 TensorFlow implementation of the model

# UNQ_C1
# GRADED CELL: Sequential model

model = Sequential(
    [
        tf.keras.Input(shape=(400,)),    # specify input size
        ### START CODE HERE ###
        Dense(25, activation='sigmoid', name = 'layer1'),
        Dense(15, activation='sigmoid', name = 'layer2'),
        Dense(1, activation='sigmoid', name = 'layer3'),
        ### END CODE HERE ###
    ], name = "my_model"
)

model.summary()

Model: "my_model"
_________________________________________________________________
Layer (type)                Output Shape              Param #
=================================================================
layer1 (Dense)              (None, 25)                10025
layer2 (Dense)              (None, 15)                390
layer3 (Dense)              (None, 1)                 16
=================================================================
Total params: 10,431
Trainable params: 10,431
Non-trainable params: 0
_________________________________________________________________

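The three parameter counts in the right-hand column of the summary are the number of elements in each layer's weight and bias arrays:

L1_num_params = 400 * 25 + 25  # W1 parameters  + b1 parameters
L2_num_params = 25 * 15 + 15   # W2 parameters  + b2 parameters
L3_num_params = 15 * 1 + 1     # W3 parameters  + b3 parameters
print("L1 params = ", L1_num_params, ", L2 params = ", L2_num_params, ",  L3 params = ", L3_num_params )

L1 params =  10025 , L2 params =  390 ,  L3 params =  16

We can further verify that the weights TensorFlow created have the same shapes as calculated above: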

[layer1, layer2, layer3] = model.layers

#### Examine Weights shapes
W1,b1 = layer1.get_weights()
W2,b2 = layer2.get_weights()
W3,b3 = layer3.get_weights()
print(f"W1 shape = {W1.shape}, b1 shape = {b1.shape}")
print(f"W2 shape = {W2.shape}, b2 shape = {b2.shape}")
print(f"W3 shape = {W3.shape}, b3 shape = {b3.shape}")

W1 shape = (400, 25), b1 shape = (25,)
W2 shape = (25, 15), b2 shape = (15,)
W3 shape = (15, 1), b3 shape = (1,)

xx.get_weights returns NumPy arrays. The weights can also be accessed directly in their tensor form. Note the shape of the tensors in the final layer.

print(model.layers[2].weights)

[<tf.Variable 'layer3/kernel:0' shape=(15, 1) dtype=float32, numpy=
array([[ 2.0668435e-01],
       [ 3.0981123e-02],
       [ 1.5515453e-01],
       [-4.5015967e-01],
       [ 1.1071807e-01],
       [ 5.0223887e-02],
       [-3.4112835e-01],
       [ 3.1129056e-01],
       [ 3.3140182e-04],
       [ 7.3278010e-02],
       [-3.6888242e-01],
       [ 2.8538823e-02],
       [-1.9153926e-01],
       [ 5.5546862e-01],
       [ 2.4924773e-01]], dtype=float32)>, <tf.Variable 'layer3/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>]

Compiling and fitting the model

model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(0.001),
)

model.fit(
    X,y,
    epochs=20
)
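For intuition about what the loss measures: binary cross-entropy averages -[y log(p) + (1 - y) log(1 - p)] over the examples, where p is the predicted probability of class 1. Below is a minimal NumPy sketch of that formula. `binary_cross_entropy` is a hypothetical helper, and the clipping constant `eps` is an assumption added for numerical safety, not necessarily what Keras does internally.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)]; eps is an assumed clipping value."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

# Confident and correct predictions give a small loss ...
confident_right = binary_cross_entropy([0, 1], [0.01, 0.99])
# ... while confident and wrong predictions give a large one.
confident_wrong = binary_cross_entropy([0, 1], [0.99, 0.01])

print(confident_right, confident_wrong)
```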

Let's see how well the model predicts:

prediction = model.predict(X[0].reshape(1,400))  # a zero
print(f" predicting a zero: {prediction}")
prediction = model.predict(X[500].reshape(1,400))  # a one
print(f" predicting a one:  {prediction}")

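The output of the model is interpreted as a probability. In the first example above, the input is a zero, and the model predicts that the probability of the input being a one is nearly zero.

In the second example, the input is a one, and the model predicts that the probability of the input being a one is close to 1.

As in logistic regression, the probability is compared to a threshold to make the final prediction.

if prediction >= 0.5:
    yhat = 1
else:
    yhat = 0
print(f"prediction after threshold: {yhat}")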
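The same threshold can be applied to many predictions at once with an element-wise comparison. This is a small sketch; the values in `probabilities` are made up for illustration and do not come from the trained model.

```python
import numpy as np

# Hypothetical model outputs for five examples, shape (5, 1).
probabilities = np.array([[0.02], [0.97], [0.50], [0.49], [0.81]])

# The comparison yields booleans; astype(int) turns them into 0/1 labels.
yhat_all = (probabilities >= 0.5).astype(int)

print(yhat_all.ravel())  # [0 1 1 0 1]
```

Note that a probability of exactly 0.5 is assigned to class 1 because the comparison is `>=`.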

Compare the model's predictions with the true labels for 64 random samples:

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell

m, n = X.shape

fig, axes = plt.subplots(8,8, figsize=(8,8))
fig.tight_layout(pad=0.1,rect=[0, 0.03, 1, 0.92]) #[left, bottom, right, top]

for i,ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)

    # Select rows corresponding to the random indices and
    # reshape the image
    X_random_reshaped = X[random_index].reshape((20,20)).T

    # Display the image
    ax.imshow(X_random_reshaped, cmap='gray')

    # Predict using the Neural Network
    prediction = model.predict(X[random_index].reshape(1,400))
    if prediction >= 0.5:
        yhat = 1
    else:
        yhat = 0

    # Display the label above the image
    ax.set_title(f"{y[random_index,0]},{yhat}")
    ax.set_axis_off()
fig.suptitle("Label, yhat", fontsize=16)
plt.show()

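3. NumPy implementation of the model

3.1 Implementing the model functions in NumPy

The main goal here is to get familiar with the low-level implementation, such as the dense function.

Use a for loop to visit each unit (j) in the layer. For each unit, compute the dot product of the input with that unit's weights (W[:,j]) and add the unit's bias (b[j]) to form z. Then apply the activation function g(z) to that result.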

Exercise:

# UNQ_C2
# GRADED FUNCTION: my_dense

def my_dense(a_in, W, b, g):
    """
    Computes dense layer
    Args:
      a_in (ndarray (n, )) : Data, 1 example
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, )) : bias vector, j units
      g    activation function (e.g. sigmoid, relu..)
    Returns
      a_out (ndarray (j,))  : j units
    """
    units = W.shape[1]
    a_out = np.zeros(units)
    ### START CODE HERE ###
    for j in range(units):
        w = W[:, j]                  # lowercase w: the weights of unit j, an ndarray of shape (n,)
        z = np.dot(a_in, w) + b[j]
        a_out[j] = g(z)
    ### END CODE HERE ###
    return(a_out)

A custom sequential function:

def my_sequential(x, W1, b1, W2, b2, W3, b3):
    a1 = my_dense(x,  W1, b1, sigmoid)
    a2 = my_dense(a1, W2, b2, sigmoid)
    a3 = my_dense(a2, W3, b3, sigmoid)
    return(a3)
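`sigmoid` is not defined anywhere in this post; it presumably comes from the `autils` helper module imported at the top. If that module is not available, a minimal NumPy stand-in could look like the sketch below. This is an assumption for illustration, not the course's implementation.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), applied element-wise."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))                      # 0.5
print(sigmoid(np.array([-10.0, 10.0])))  # close to [0, 1]
```

One caveat: the vectorized section later calls `.numpy()` on the prediction, which suggests the course's `sigmoid` returns a TensorFlow tensor. With a pure NumPy `sigmoid` like this one, that call would have to be dropped.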

Try predicting with the custom functions:

W1_tmp,b1_tmp = layer1.get_weights()
W2_tmp,b2_tmp = layer2.get_weights()
W3_tmp,b3_tmp = layer3.get_weights()

# make predictions
prediction = my_sequential(X[0], W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp )
if prediction >= 0.5:
    yhat = 1
else:
    yhat = 0
print( "yhat = ", yhat, " label= ", y[0,0])
prediction = my_sequential(X[500], W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp )
if prediction >= 0.5:
    yhat = 1
else:
    yhat = 0
print( "yhat = ", yhat, " label= ", y[500,0])

Compare the predictions of the custom NumPy functions with those of the TensorFlow model and with the true labels:

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell

m, n = X.shape

fig, axes = plt.subplots(8,8, figsize=(8,8))
fig.tight_layout(pad=0.1,rect=[0, 0.03, 1, 0.92]) #[left, bottom, right, top]

for i,ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)

    # Select rows corresponding to the random indices and
    # reshape the image
    X_random_reshaped = X[random_index].reshape((20,20)).T

    # Display the image
    ax.imshow(X_random_reshaped, cmap='gray')

    # Predict using the Neural Network implemented in Numpy
    my_prediction = my_sequential(X[random_index], W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp )
    my_yhat = int(my_prediction >= 0.5)

    # Predict using the Neural Network implemented in Tensorflow
    tf_prediction = model.predict(X[random_index].reshape(1,400))
    tf_yhat = int(tf_prediction >= 0.5)

    # Display the label above the image
    ax.set_title(f"{y[random_index,0]},{tf_yhat},{my_yhat}")
    ax.set_axis_off()
fig.suptitle("Label, yhat Tensorflow, yhat Numpy", fontsize=16)
plt.show()

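3.2 Vectorized NumPy implementation

The loop over units can be replaced by a single matrix multiplication. We can demonstrate this using the examples in X and the W1, b1 parameters from above. We use np.matmul to perform the matrix multiplication. Note that the dimensions of x and W must be compatible, as shown in the diagram above.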

x = X[0].reshape(-1,1)         # column vector (400,1)
z1 = np.matmul(x.T,W1) + b1    # (1,400)(400,25) = (1,25)
a1 = sigmoid(z1)
print(a1.shape)

A vectorized implementation of the dense function:

# UNQ_C3
# GRADED FUNCTION: my_dense_v

def my_dense_v(A_in, W, b, g):
    """
    Computes dense layer
    Args:
      A_in (ndarray (m,n)) : Data, m examples, n features each
      W    (ndarray (n,j)) : Weight matrix, n features per unit, j units
      b    (ndarray (j, ) or (1,j)) : bias vector, j units
      g    activation function (e.g. sigmoid, relu..)
    Returns
      A_out (ndarray (m,j)) : m examples, j units
    """
    ### START CODE HERE ###
    Z = np.matmul(A_in, W) + b    # b is broadcast across the m rows
    A_out = g(Z)
    ### END CODE HERE ###
    return(A_out)

The sequential function:

def my_sequential_v(X, W1, b1, W2, b2, W3, b3):
    A1 = my_dense_v(X,  W1, b1, sigmoid)
    A2 = my_dense_v(A1, W2, b2, sigmoid)
    A3 = my_dense_v(A2, W3, b3, sigmoid)
    return(A3)
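As a sanity check that the loop version and the vectorized version compute the same thing, here is a self-contained sketch that runs both on random data. The function names `dense_loop` and `dense_vectorized`, the local `sigmoid`, and the random weights are all stand-ins for illustration, not the trained model from this post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_loop(a_in, W, b, g):
    # One example at a time, one unit at a time.
    a_out = np.zeros(W.shape[1])
    for j in range(W.shape[1]):
        a_out[j] = g(np.dot(a_in, W[:, j]) + b[j])
    return a_out

def dense_vectorized(A_in, W, b, g):
    # All examples and all units in a single matrix product.
    return g(np.matmul(A_in, W) + b)

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 400))          # 6 hypothetical examples
W = rng.normal(size=(400, 25)) * 0.1   # random weights for a 25-unit layer
b = rng.normal(size=25)

loop_result = np.stack([dense_loop(a, W, b, sigmoid) for a in A])
vec_result = dense_vectorized(A, W, b, sigmoid)

print(vec_result.shape)                      # (6, 25)
print(np.allclose(loop_result, vec_result))  # True
```

The vectorized form avoids the Python-level loops, which is where most of the speed-up comes from on larger batches.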

Get the weights:

W1_tmp,b1_tmp = layer1.get_weights()
W2_tmp,b2_tmp = layer2.get_weights()
W3_tmp,b3_tmp = layer3.get_weights()

Predict with the new model:

Prediction = my_sequential_v(X, W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp )
Prediction.shape

TensorShape([1000, 1])

Apply the threshold to classify:

Yhat = (Prediction >= 0.5).numpy().astype(int)
print("predict a zero: ",Yhat[0], "predict a one: ", Yhat[500])

Visualize the predictions:

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell

m, n = X.shape

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
fig.tight_layout(pad=0.1, rect=[0, 0.03, 1, 0.92]) #[left, bottom, right, top]

for i, ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)

    # Select rows corresponding to the random indices and
    # reshape the image
    X_random_reshaped = X[random_index].reshape((20, 20)).T

    # Display the image
    ax.imshow(X_random_reshaped, cmap='gray')

    # Display the label above the image
    ax.set_title(f"{y[random_index,0]}, {Yhat[random_index, 0]}")
    ax.set_axis_off()
fig.suptitle("Label, Yhat", fontsize=16)
plt.show()

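That said, a few ambiguous-looking examples are still misclassified. Note the use of NumPy's where function, which accepts a conditional expression as its argument and returns the indices where it holds: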

fig = plt.figure(figsize=(1, 1))
errors = np.where(y != Yhat)
random_index = errors[0][0]
X_random_reshaped = X[random_index].reshape((20, 20)).T
plt.imshow(X_random_reshaped, cmap='gray')
plt.title(f"{y[random_index,0]}, {Yhat[random_index, 0]}")
plt.axis('off')
plt.show()
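The same `np.where` pattern can also be used to count the misclassified examples. The sketch below uses small made-up arrays (`y_demo`, `yhat_demo`) rather than the post's real `y` and `Yhat`.

```python
import numpy as np

# Hypothetical labels and thresholded predictions, both shape (6, 1).
y_demo = np.array([[0], [0], [1], [1], [1], [0]])
yhat_demo = np.array([[0], [1], [1], [1], [0], [0]])

# With only a condition, np.where returns a tuple of index arrays, one per axis.
rows, cols = np.where(y_demo != yhat_demo)

print(rows)                          # [1 4]
print(len(rows))                     # 2 misclassified examples
print(np.mean(y_demo != yhat_demo))  # fraction of examples that are wrong
```

This is why the code above indexes `errors[0][0]`: `errors[0]` holds the row indices of the mismatches, and its first entry is the first misclassified example.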

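3.3 NumPy broadcasting

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when

- they are equal, or
- one of them is 1

See the NumPy documentation on broadcasting for details.
If these conditions are not met, a "ValueError: operands could not be broadcast together" exception is raised, indicating that the arrays have incompatible shapes. Along each axis, the size of the resulting array is the input size that is not 1.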
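The rule can be written out by hand in a few lines. The following sketch uses a hypothetical helper, `broadcast_shape`, that walks two shapes from the right and applies the compatibility test, then compares the answer with what NumPy itself reports through `np.broadcast`.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply the broadcasting rule by hand; raise ValueError if incompatible."""
    result = []
    # Walk from the trailing (rightmost) dimension, treating missing dimensions as 1.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible shapes {shape_a} and {shape_b}")
    return tuple(reversed(result))

print(broadcast_shape((4, 1), (1, 3)))      # (4, 3)
print(broadcast_shape((1000, 25), (25,)))   # (1000, 25)

# Cross-check against the shape NumPy computes for the same inputs.
numpy_shape = np.broadcast(np.empty((4, 1)), np.empty((1, 3))).shape
print(numpy_shape == broadcast_shape((4, 1), (1, 3)))  # True

# Trailing dimensions 3 and 4 are neither equal nor 1, so this pair fails.
try:
    broadcast_shape((4, 3), (4,))
except ValueError as err:
    print("ValueError:", err)
```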

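The description above is a bit hard to digest, so let's look at some examples.

[Figure: broadcasting illustration, showing the arrays before and after expansion]

In the following example, a scalar is added to a matrix: the scalar is added to every element of the matrix to produce the result.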

a = np.array([1,2,3]).reshape(-1,1)  #(3,1)
b = 5
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")

Example 2:

a = np.array([1,2,3,4]).reshape(-1,1)
b = np.array([1,2,3]).reshape(1,-1)
print(a)
print(b)
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")
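The same rule is what lets the vectorized dense layer earlier in this post add one bias vector to a whole batch of examples. Below is a small sketch with made-up numbers; `Z` and `b` here are hypothetical and unrelated to the trained weights.

```python
import numpy as np

Z = np.arange(6, dtype=float).reshape(2, 3)  # hypothetical pre-bias values: 2 examples, 3 units
b = np.array([10.0, 20.0, 30.0])             # one bias per unit, shape (3,)

# (2, 3) + (3,): b is treated as shape (1, 3) and stretched across both rows.
out = Z + b
print(out)

# A (3, 1) column does not line up with a (2, 3) batch: the leading dimensions 2 and 3 clash.
try:
    Z + b.reshape(-1, 1)
except ValueError as err:
    print("ValueError:", err)
```

This is one reason to keep biases as 1-D (j,) vectors, as TensorFlow does: they broadcast across any number of examples without reshaping.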
