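Much of this content comes from the web. If anything here infringes on your work, send me a private message and I will remove it.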

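Contents

PyTorch Network-Building Notes
- 1. Data preprocessing
  - 1.1 Normalization (min-max scaling)
  - 1.2 Standardization
  - 1.3 Per-sample normalization (unit norm)
  - 1.4 Common tensor operations in PyTorch
    - 1.4.1 torch.cat
    - 1.4.2 torch.stack
    - 1.4.3 Rounding with round
    - 1.4.4 Tensor() vs. tensor()
    - 1.4.5 Taking the maximum of the classification outputs
    - 1.4.6 Slicing a dictionary
    - 1.4.7 Data type conversion
      - (1) NumPy to Tensor
      - (2) Tensor to NumPy
      - (3) Converting between Tensor and list
      - (4) Basic dtype conversion
      - (5) type_as: convert a tensor to the type of another tensor
      - (6) Using the Tensor.type() method
    - 1.4.8 Type checking with isinstance
  - 1.5 Data loading
  - 1.6 Multiplying arrays of different dimensions
  - 1.7 Function forms in PyTorch
    - 1.7.1 logsigmoid
    - 1.7.2 Activation functions
  - 1.8 Extracting and converting Excel data
- 2. Gradient operations
  - 2.1 Enabling gradients
  - 2.2 Zeroing gradients
- 3. Building network models
  - 3.1 The Sequential approach
  - 3.2 The class approach
  - 3.3 Optimizers
  - 3.4 Loss functions
  - 3.5 Changing a model's default parameters
  - 3.6 model.train and model.eval
- 4. Saving model parameters
  - 4.1 Saving only the network parameters
  - 4.2 Saving the whole network
- 5. Pitfalls encountered
  - 5.1 CrossEntropyLoss for classification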
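PyTorch Network-Building Notes

1. Data preprocessing

1.1 Normalization (min-max scaling)

Scales each feature into a range between a given minimum and maximum (usually 0 to 1). This can be done with the preprocessing.MinMaxScaler class. The usual min-max formula is (x - min(x)) / (max(x) - min(x)).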

from sklearn import preprocessing
import numpy as np

min_max_scaler = preprocessing.MinMaxScaler()
X_train = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])
X_train_minmax = min_max_scaler.fit_transform(X_train)
>>> X_train_minmax
array([[ 0.5 , 0. , 1. ],
       [ 1. , 0.5 , 0.33333333],
       [ 0. , 1. , 0. ]])
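As a cross-check on the formula above, here is a minimal NumPy sketch that applies (x - min) / (max - min) column by column. The name minmax_manual is made up for illustration, and the sketch assumes no column is constant (otherwise the denominator would be zero).

```python
import numpy as np

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])

# Column-wise minimum and maximum (axis=0 reduces over the rows).
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)

# Assumption: every column has max > min, so the division is safe.
minmax_manual = (X_train - col_min) / (col_max - col_min)
print(minmax_manual)
```

The result should agree with the MinMaxScaler output shown above.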

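1.2 Standardization

Rescales the data so that it falls into a small, specific range. Standardized values can be positive or negative and usually have small absolute values. The computation is done per feature (per column): subtract the column mean, then divide by the column standard deviation. As a result, each column is centered around 0 and has variance 1. This is z-score normalization, (x - mean(x)) / std(x). MATLAB has a dedicated function for it; in scikit-learn, sklearn.preprocessing.scale() standardizes the given data directly: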

from sklearn import preprocessing
import numpy as np

X = np.array([[1., -1., 2.], [2., 0., 0.], [0., 1., -1.]])
X_scaled = preprocessing.scale(X)
>>> X_scaled
array([[ 0.  ..., -1.22...,  1.33...],
       [ 1.22...,  0.  ..., -0.26...],
       [-1.22...,  1.22..., -1.06...]])
>>> # mean and standard deviation of the scaled data
>>> X_scaled.mean(axis=0)
array([ 0.,  0.,  0.])
>>> X_scaled.std(axis=0)
array([ 1.,  1.,  1.])
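The same z-score can be written by hand, which also exposes one detail that is easy to miss: NumPy's std defaults to the population estimate, while torch.std defaults to the unbiased (n-1) estimate. The sketch below is only an illustration; the variable names are not from the original post.

```python
import numpy as np
import torch

X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])

# NumPy: np.std uses ddof=0 (population standard deviation) by default.
z_np = (X - X.mean(axis=0)) / X.std(axis=0)

# PyTorch: torch.std is unbiased by default, so ask for the population
# version explicitly to get the same numbers as NumPy.
X_t = torch.tensor(X)
z_t = (X_t - X_t.mean(dim=0)) / X_t.std(dim=0, unbiased=False)

print(z_np)
print(z_t)
```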

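1.3 Per-sample normalization (unit norm)

This kind of normalization rescales each sample to unit norm (every sample ends up with norm 1). It is useful when a quadratic form such as the dot product, or another kernel method, is later used to measure the similarity of two samples. The idea is to compute the p-norm of each sample and divide every element of the sample by that norm, so that each processed sample has p-norm (l1-norm or l2-norm) equal to 1.
The p-norm is computed as: $\|X\|_p = (|x_1|^p + |x_2|^p + \dots + |x_n|^p)^{1/p}$
The method is mainly used in text classification and clustering. For example, the dot product of two l2-normalized TF-IDF vectors is the cosine similarity of the two vectors.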

The preprocessing.normalize() function converts the given data:

>>> X = [[ 1., -1., 2.],
...      [ 2., 0., 0.],
...      [ 0., 1., -1.]]
>>> X_normalized = preprocessing.normalize(X, norm='l2')
>>> X_normalized
array([[ 0.40..., -0.40..., 0.81...],
       [ 1. ..., 0. ..., 0. ...],
       [ 0. ..., 0.70..., -0.70...]])

The preprocessing.Normalizer() class can be fitted on the training set and then used to transform both the training and the test set:

>>> normalizer = preprocessing.Normalizer().fit(X) # fit does nothing
>>> normalizer
Normalizer(copy=True, norm='l2')
>>> normalizer.transform(X)
array([[ 0.40..., -0.40..., 0.81...],
       [ 1. ..., 0. ..., 0. ...],
       [ 0. ..., 0.70..., -0.70...]])
>>> normalizer.transform([[-1., 1., 0.]])
array([[-0.70..., 0.70..., 0. ...]])
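To make the cosine-similarity remark concrete, here is a small NumPy sketch. The helper l2_normalize is a hypothetical name introduced for this example, and it assumes that no row is all zeros.

```python
import numpy as np

def l2_normalize(rows):
    """Divide each row by its Euclidean norm (assumes no all-zero rows)."""
    rows = np.asarray(rows, dtype=float)
    norms = np.linalg.norm(rows, ord=2, axis=1, keepdims=True)
    return rows / norms

X = [[1., -1., 2.],
     [2., 0., 0.],
     [0., 1., -1.]]

X_unit = l2_normalize(X)

# With unit-norm rows, plain dot products are cosine similarities.
cosine = X_unit @ X_unit.T
print(cosine)
```

The diagonal of cosine is 1 because every row has been scaled to unit length.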

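1.4 Common tensor operations in PyTorch

1.4.1 torch.cat

Concatenates tensors along a given dimension. The number of dimensions stays the same after cat. Note that the tensors must have matching sizes in every dimension except the one being concatenated, otherwise an error is raised.

import torch
x = torch.randn(2, 3)
print(x)
print('*'*80)
y = torch.randn(1, 3)
print(y)
print('*'*80)
t = torch.cat((x, y), 0)   # shape (3, 3)
print(t)
z = torch.randn(1, 4)      # size 4 in dim 1 does not match x
# torch.cat((x, z), 0)     # raises an error: sizes must match except in dim 0

Output: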

tensor([[-1.3758, -0.3441, -1.4608],[ 1.2006, -0.7091,  0.1233]])
********************************************************************************
tensor([[-0.8673, -0.8082, -2.3864]])
********************************************************************************
tensor([[-1.3758, -0.3441, -1.4608],[ 1.2006, -0.7091,  0.1233],[-0.8673, -0.8082, -2.3864]])

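1.4.2 torch.stack

Unlike cat, stack adds a new dimension and stacks the tensors along it. All tensors must have exactly the same shape.

import torch
x = torch.randn(1, 2)
y = torch.randn(1, 2)
s0 = torch.stack((x, y), 0)   # stack along a new dim 0, shape (2, 1, 2)
s1 = torch.stack((x, y), 1)   # stack along a new dim 1, shape (1, 2, 2)
for t in (x, y, s0, s1):      # print each tensor followed by a separator
    print(t)
    print('*'*80)

Output: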

tensor([[-0.9762, -1.1769]])
********************************************************************************
tensor([[-0.6522,  0.0318]])
********************************************************************************
tensor([[[-0.9762, -1.1769]],[[-0.6522,  0.0318]]])
********************************************************************************
tensor([[[-0.9762, -1.1769],[-0.6522,  0.0318]]])
********************************************************************************
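The difference between cat and stack is easiest to see from the resulting shapes. The sketch below is illustrative only; it also shows that stack can be reproduced by unsqueezing each tensor and then calling cat.

```python
import torch

x = torch.randn(2, 3)
y = torch.randn(2, 3)

cat0 = torch.cat((x, y), dim=0)      # rows appended: shape (4, 3)
cat1 = torch.cat((x, y), dim=1)      # columns appended: shape (2, 6)
stack0 = torch.stack((x, y), dim=0)  # new leading dim: shape (2, 2, 3)

# stack is cat applied to tensors that first gain a new dimension.
via_cat = torch.cat((x.unsqueeze(0), y.unsqueeze(0)), dim=0)

print(cat0.shape, cat1.shape, stack0.shape)
```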

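1.4.3 Rounding with round

import torch

x = 2.55555
y = torch.tensor(2.55555, dtype=torch.float32)
# option 1
print('Result 1:', round(x, 3))  # round is Python's built-in function; 3 is the number of decimal places to keep
# option 2
print('Result 2:', torch.round(y))  # torch.round rounds to the nearest integer by default (recent PyTorch versions also accept a decimals keyword)
print('Result 3:', torch.round(y).item())  # item() extracts the number from the tensor

Output:

Result 1: 2.556
Result 2: tensor(3.)
Result 3: 3.0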
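If a tensor has to be rounded to a fixed number of decimal places, one approach that does not depend on the PyTorch version is to scale, round, and scale back. This is a sketch rather than the only way to do it; the names scale and rounded are illustrative.

```python
import torch

y = torch.tensor(2.55555, dtype=torch.float32)

decimals = 3
scale = 10 ** decimals

# Shift the wanted digits in front of the decimal point, round, shift back.
rounded = torch.round(y * scale) / scale

# float32 cannot store 2.556 exactly, so format it when printing.
print(f"{rounded.item():.3f}")
```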

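1.4.4 Tensor() vs. tensor()

In PyTorch, both torch.Tensor and torch.tensor can be used to create a new tensor:

>>> import torch
>>> a = torch.Tensor([1, 2])
>>> a
tensor([1., 2.])
>>> a = torch.tensor([1, 2])
>>> a
tensor([1, 2])

First, to be clear: torch.Tensor is a Python class. More precisely, it is an alias for the default tensor type torch.FloatTensor, so torch.Tensor([1,2]) calls the constructor of the Tensor class and produces a single-precision floating-point tensor.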

>>> a=torch.Tensor([1,2])
>>> a.type()
'torch.FloatTensor'

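torch.tensor(), by contrast, is just a Python function: https://pytorch.org/docs/stable/torch.html#torch.tensor . Its signature is:

torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False)

Here data can be a list, tuple, NumPy ndarray, scalar, or another type. torch.tensor copies the data (it does not reference it) and, depending on the type of the original data, produces a torch.LongTensor, torch.FloatTensor, or torch.DoubleTensor.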

>>> a=torch.tensor([1,2])
>>> a.type()
'torch.LongTensor'

>>> a=torch.tensor([1.,2.])
>>> a.type()
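'torch.FloatTensor'

>>> import numpy as np
>>> a = np.zeros(2, dtype=np.float64)
>>> a = torch.tensor(a)
>>> a.type()
'torch.DoubleTensor'

A word on torch.empty(): according to https://pytorch.org/docs/stable/torch.html?highlight=empty#torch.empty , it creates an uninitialized tensor with a specified dtype, device, and other options. Because torch.Tensor() always uses the default dtype torch.float, torch.Tensor() called with sizes can be seen as a special case of torch.empty().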
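The dtype inference and the copy behaviour described in this section can be checked in a few lines. This sketch assumes the default dtype has not been changed from torch.float32; it also uses torch.from_numpy, which shares memory with the array, as a contrast to the copying torch.tensor.

```python
import numpy as np
import torch

# dtype is inferred from the data by torch.tensor, fixed for torch.Tensor.
print(torch.tensor([1, 2]).dtype)    # integer data -> torch.int64
print(torch.tensor([1., 2.]).dtype)  # float data   -> torch.float32
print(torch.Tensor([1, 2]).dtype)    # always the default float dtype

arr = np.zeros(2, dtype=np.float64)
copied = torch.tensor(arr)       # independent copy of the data
shared = torch.from_numpy(arr)   # shares memory with arr

arr[0] = 7.0                     # modify the NumPy array in place
print(copied[0].item(), shared[0].item())
```

Only the tensor created with torch.from_numpy sees the later change to arr.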

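1.4.5 Taking the maximum of the classification outputs

# model and testX are the trained network and the test inputs defined elsewhere
with torch.no_grad():
    testY = model(testX)
print(testY)

Output: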

tensor([[  7.4433,  -1.4233,  -1.6965,  -4.9028],[ 11.1287,  -5.7861,  -2.3523,  -1.3352],[  1.6368,   4.0758,   1.5106,  -6.8918],[ 11.1269,  -6.2055,  -0.2486,  -4.0074],[  4.2791,  -7.5071,   8.0243,  -5.0912],[  3.9377,   0.1002,  -3.0278,   0.7973],[ 10.4937,  -5.5156,   0.3815,  -4.5885],[ 10.2765,  -2.4278,  -0.0422,  -7.3499],[  0.8234,   9.4561,  -2.2854,  -7.8151],[  3.6753,  -2.6943,   6.2879,  -5.9786],[  9.7963,  -1.1426,   0.2660,  -8.2053],[  5.3171,   3.5008,  -3.4102,  -5.2817],[  9.0295,  -2.3807,  -5.0728,  -2.1787],[ 12.7925,  -6.8981,  -3.3715,  -1.3687],[  2.9363,  -4.1924,  -3.8692,   5.4553],[  7.0463,  -1.8211,  -2.3471,  -1.9651],[  6.1256,  -1.4506,  -0.0740,  -4.6081],[  4.4470,   0.8657,   1.6806,  -5.3237],[  7.1012,   1.6752,   1.1116,  -9.0371],[  1.7235,  -5.7148,   6.2477,  -1.1781],[  0.8945,   4.2796,  -1.5190,  -3.4724],[  9.7305,  -2.1866,  -2.9471,  -2.3112],[  7.8209,  -2.1488,   0.8533,  -5.8382],[  0.6063,   7.9243,  -2.4863,  -5.3481],[  3.1649,  -0.0549,   3.5648,  -6.4298],[  8.4594,  -0.2936,  -0.4718,  -6.2386],[  2.6753,   2.1676,   0.6504,  -4.7133],[ 11.4688,  -4.3625,  -5.2973,  -1.6718],[ 12.7178,  -6.6919,  -4.8123,  -1.9376],[ -0.9076,  -0.9274,  -4.6698,   7.8568],[  8.5488,  -3.4524,  -1.4708,  -3.4786],[  9.8643,  -6.3564,  -2.3896,   0.1812],[ -0.3086,   6.6137,  -1.6922,  -4.2936],[  5.6480,  -0.3888,  -1.8955,  -0.7594],[  2.4999,  -2.9834,   7.2879,  -5.2193],[  4.2896,   0.3526,  -4.0778,   0.2920],[  9.1389,  -5.9225,  -0.3296,  -3.1200],[  6.9025,  -3.9361,  -2.1047,   1.1030],[  1.7949,   2.7270,  -1.1831,  -1.9257],[  4.2454,  -4.7726,   5.9915,  -4.7709],[ 10.3149,  -2.4509,  -0.5917,  -6.6981],[  0.3288,   8.1812,  -5.5801,  -0.7519],[ 10.9215,  -3.3665,  -3.9858,  -2.0602],[  9.2952,  -3.1185,  -5.7481,  -0.3535],[  2.7448,  -6.3724,  -4.5297,   7.7019],[  8.7598,  -4.8083,  -2.2426,  -0.4326],[  9.3423,  -5.7544,   0.3519,  -2.5967],[  2.0215,   2.5876,  -0.7334,  -1.8973],[  8.3974,  -1.2813,  
-0.1331,  -5.7042],[  1.4222,  -2.6100,   6.5302,  -2.1887],[  7.4289,   2.8581,   0.6636,  -8.8257],[  7.4660,  -3.3966,  -3.2598,   0.7070],[  7.7047,  -3.7917,  -0.8066,  -2.5238],[  3.9101,   3.1239,  -2.9358,  -1.0799],[  2.7316,  -3.2821,   8.4985,  -6.1583],[  9.0011,  -2.5707,  -1.6200,  -3.3008],[ -0.5210,   4.3287,  -2.8837,   0.1590],[  9.4240,  -1.8600,  -4.6306,  -0.2257],[ 10.5553,  -4.5794,  -2.8072,  -1.3519],[ -2.0982,  -1.5021,  -5.8774,  10.1451],[  8.1251,  -5.1918,  -3.6729,   0.5811],[  8.6910,  -2.0897,  -4.6669,   0.5333],[ -0.7934,   5.4703,  -0.1302,  -3.1170],[  6.9602,  -1.3405,  -0.1571,  -4.3973],[  0.1805,  -0.8911,   6.1601,  -5.5365],[  2.1057,   2.5338,  -5.6351,   2.3221],[  7.3220,   0.2707,  -4.7512,  -2.4399],[  8.4964,  -1.4643,   4.8854, -10.9043],[  3.1047,   5.5968,   0.9471,  -8.8787],[  4.9688,  -5.2696,   6.1680,  -4.2479],[  9.7998,  -3.5701,   1.4597,  -6.7401],[  1.8569,   6.1164,  -3.1263,  -4.2748],[  6.1492,   2.9876,  -7.2567,  -2.3775],[  9.4298,  -2.8283,  -7.4377,   1.5422],[ -0.6555,  -0.2519,  -5.8323,   6.8694],[  7.3518,   3.0800,  -0.9119,  -9.0124],[  6.9438,   1.7972,  -2.6768,  -6.1078],[ -0.0528,   7.3127,  -1.9607,  -3.8322],[  5.6991,   2.8540,  -3.7784,  -4.0820],[ -1.1966,  -1.2128,   6.1327,  -2.6217],[ -0.3849,   7.3386,  -2.5669,  -4.6670],[  6.3555,   1.5932,  -5.1967,  -1.0024],[  2.5816,   4.1530,  -0.7747,  -5.6864],[ -0.7420,   9.3222,   0.5745,  -7.3484],[  0.1243,  -2.8342,  10.8683,  -7.7141],[  6.9208,   1.0358,  -0.1274,  -5.5745],[  0.7077,   4.9082,   2.1944,  -6.8996],[  6.7253,  -0.3559,  -4.0509,  -1.9693],[  8.4796,  -3.4290,  -4.4795,  -1.4465],[ -0.5281,  -0.7838,  -5.0702,   7.0901],[  5.6690,   0.0732,  -3.9329,  -2.0248],[ 12.2119,  -2.2533,  -1.3228,  -7.9734],[  3.9205,   2.3429,   0.3645,  -6.9054],[  7.0275,   1.2768,  -2.3088,  -4.5443],[  0.8335,  -3.6880,   8.6731,  -5.6231],[  0.8692,   6.5459,  -5.7003,  -0.1224],[  8.7197,  -1.6967,  -3.0582,  -3.4979],[  7.0834,  
-1.9839,  -3.9747,   0.7744],[  3.1499,   4.4433,  -3.7725,  -2.7284],[  8.1010,  -2.4316,   4.0292,  -8.6694]])

Code:

print(testY.max(1))  # returns two tensors: the maximum of each row, and the index of that maximum within the row

Output:

torch.return_types.max(
values=tensor([ 7.4433, 11.1287,  4.0758, 11.1269,  8.0243,  3.9377, 10.4937, 10.2765,9.4561,  6.2879,  9.7963,  5.3171,  9.0295, 12.7925,  5.4553,  7.0463,6.1256,  4.4470,  7.1012,  6.2477,  4.2796,  9.7305,  7.8209,  7.9243,3.5648,  8.4594,  2.6753, 11.4688, 12.7178,  7.8568,  8.5488,  9.8643,6.6137,  5.6480,  7.2879,  4.2896,  9.1389,  6.9025,  2.7270,  5.9915,10.3149,  8.1812, 10.9215,  9.2952,  7.7019,  8.7598,  9.3423,  2.5876,8.3974,  6.5302,  7.4289,  7.4660,  7.7047,  3.9101,  8.4985,  9.0011,4.3287,  9.4240, 10.5553, 10.1451,  8.1251,  8.6910,  5.4703,  6.9602,6.1601,  2.5338,  7.3220,  8.4964,  5.5968,  6.1680,  9.7998,  6.1164,6.1492,  9.4298,  6.8694,  7.3518,  6.9438,  7.3127,  5.6991,  6.1327,7.3386,  6.3555,  4.1530,  9.3222, 10.8683,  6.9208,  4.9082,  6.7253,8.4796,  7.0901,  5.6690, 12.2119,  3.9205,  7.0275,  8.6731,  6.5459,8.7197,  7.0834,  4.4433,  8.1010]),
indices=tensor([0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 1, 0, 0, 1,2, 0, 0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 1, 0, 0, 3, 0, 0, 1,0, 2, 0, 0, 0, 0, 2, 0, 1, 0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 0, 1, 2, 0, 1,0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 1, 1, 2, 0, 1, 0, 0, 3, 0, 0, 0, 0, 2, 1,0, 0, 1, 0]))

Code:

print(testY.max(1)[1])

Output:

tensor([0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 1, 0, 0, 1,2, 0, 0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 0, 1, 2, 0, 1, 0, 0, 3, 0, 0, 1,0, 2, 0, 0, 0, 0, 2, 0, 1, 0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 0, 1, 2, 0, 1,0, 0, 3, 0, 0, 1, 0, 2, 1, 0, 1, 1, 2, 0, 1, 0, 0, 3, 0, 0, 0, 0, 2, 1,0, 0, 1, 0])
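The index tensor is usually what gets compared against the labels. The sketch below reuses three rows of the output above; the labels tensor is made up for illustration and is not from the original experiment.

```python
import torch

# Three rows taken from the network output printed above.
logits = torch.tensor([[7.4433, -1.4233, -1.6965, -4.9028],
                       [1.6368, 4.0758, 1.5106, -6.8918],
                       [4.2791, -7.5071, 8.0243, -5.0912]])
labels = torch.tensor([0, 1, 3])  # hypothetical ground-truth classes

values, indices = logits.max(dim=1)  # unpack (max value, index) per row
pred = logits.argmax(dim=1)          # same indices, without the values

accuracy = (pred == labels).float().mean().item()
print(pred, accuracy)
```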

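1.4.6 Slicing a dictionary

# idx_to_word is the vocabulary list built elsewhere (position = index)
word_to_idx = {word: i for i, word in enumerate(idx_to_word)}
# print(type(word_to_idx))  # a dict mapping word -> index: 'the' -> 0, 'of' -> 1, ...
# print(word_to_idx[:100])  # raises an error
# A dict cannot be sliced directly; convert it to a list first
print(list(word_to_idx.items())[:100])
print('*'*80)
print(list(word_to_idx)[:100])

Output: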

[('the', 0), ('of', 1), ('and', 2), ('one', 3), ('in', 4), ('a', 5), ('to', 6), ('zero', 7), ('nine', 8), ('two', 9), ('is', 10), ('as', 11), ('eight', 12), ('for', 13), ('s', 14), ('five', 15), ('three', 16), ('was', 17), ('by', 18), ('that', 19), ('four', 20), ('six', 21), ('seven', 22), ('with', 23), ('on', 24), ('are', 25), ('it', 26), ('from', 27), ('or', 28), ('his', 29), ('an', 30), ('be', 31), ('this', 32), ('he', 33), ('at', 34), ('which', 35), ('not', 36), ('also', 37), ('have', 38), ('were', 39), ('has', 40), ('but', 41), ('other', 42), ('their', 43), ('its', 44), ('first', 45), ('they', 46), ('had', 47), ('some', 48), ('more', 49), ('all', 50), ('can', 51), ('most', 52), ('been', 53), ('such', 54), ('who', 55), ('many', 56), ('new', 57), ('there', 58), ('used', 59), ('after', 60), ('american', 61), ('when', 62), ('time', 63), ('into', 64), ('these', 65), ('only', 66), ('see', 67), ('may', 68), ('than', 69), ('i', 70), ('world', 71), ('b', 72), ('d', 73), ('would', 74), ('no', 75), ('however', 76), ('between', 77), ('about', 78), ('over', 79), ('states', 80), ('years', 81), ('war', 82), ('people', 83), ('united', 84), ('during', 85), ('known', 86), ('if', 87), ('called', 88), ('use', 89), ('th', 90), ('often', 91), ('system', 92), ('so', 93), ('history', 94), ('state', 95), ('will', 96), ('up', 97), ('while', 98), ('where', 99)]
********************************************************************************
['the', 'of', 'and', 'one', 'in', 'a', 'to', 'zero', 'nine', 'two', 'is', 'as', 'eight', 'for', 's', 'five', 'three', 'was', 'by', 'that', 'four', 'six', 'seven', 'with', 'on', 'are', 'it', 'from', 'or', 'his', 'an', 'be', 'this', 'he', 'at', 'which', 'not', 'also', 'have', 'were', 'has', 'but', 'other', 'their', 'its', 'first', 'they', 'had', 'some', 'more', 'all', 'can', 'most', 'been', 'such', 'who', 'many', 'new', 'there', 'used', 'after', 'american', 'when', 'time', 'into', 'these', 'only', 'see', 'may', 'than', 'i', 'world', 'b', 'd', 'would', 'no', 'however', 'between', 'about', 'over', 'states', 'years', 'war', 'people', 'united', 'during', 'known', 'if', 'called', 'use', 'th', 'often', 'system', 'so', 'history', 'state', 'will', 'up', 'while', 'where']

1.4.7 Data type conversion

import torch
import numpy as np
a_numpy = np.array([1, 2, 3])

(1) NumPy to Tensor

a_tensor = torch.from_numpy(a_numpy)
print(a_tensor)

(2) Tensor to NumPy

a_numpy = a_tensor.numpy()
print(a_numpy)

(3) Converting between Tensor and list

# Tensor to list
>>> a = torch.ones([1, 5])
>>> a
tensor([[1., 1., 1., 1., 1.]])
>>> b = a.tolist()
>>> b
[[1.0, 1.0, 1.0, 1.0, 1.0]]
# list to Tensor
>>> a = list(range(1, 6))
>>> a
[1, 2, 3, 4, 5]
>>> b = torch.tensor(a)
>>> b
tensor([1, 2, 3, 4, 5])

(4) Basic dtype conversion

tensor = torch.Tensor(3, 5)
# Tensor.long() casts the tensor to long (int64)
newtensor = tensor.long()
# Tensor.half() casts the tensor to half precision (float16)
newtensor = tensor.half()
# Tensor.int() casts the tensor to int (int32)
newtensor = tensor.int()
# Tensor.double() casts the tensor to double (float64)
newtensor = tensor.double()
# Tensor.float() casts the tensor to float (float32)
newtensor = tensor.float()
# Tensor.char() casts the tensor to char (int8)
newtensor = tensor.char()
# Tensor.byte() casts the tensor to byte (uint8)
newtensor = tensor.byte()
# Tensor.short() casts the tensor to short (int16)
newtensor = tensor.short()
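The shorthand methods above have a general counterpart, Tensor.to, which takes the target dtype as an argument. A minimal sketch with illustrative values (the variable names are not from the original):

```python
import torch

t = torch.tensor([1.7, -2.3, 300.0])

as_long = t.to(torch.int64)            # same result as t.long()
as_half = t.to(torch.float16)          # same result as t.half()
as_double = t.to(dtype=torch.float64)  # same result as t.double()

# Casting float to integer drops the fractional part (toward zero).
print(as_long)
print(as_half.dtype, as_double.dtype)
```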

(5) type_as: convert a tensor to the type of another tensor

>>> a = torch.Tensor(2, 5)
>>> a
tensor([[1.9431e-19, 4.8613e+30, 1.4603e-19, 2.0704e-19, 4.7429e+30],
        [1.6530e+19, 1.8254e+31, 1.4607e-19, 6.8801e+16, 1.8370e+25]])
>>> b = torch.IntTensor(1, 2)
>>> b
tensor([[16843009,        1]], dtype=torch.int32)
>>> a.type_as(b)
tensor([[          0, -2147483648,           0,           0, -2147483648],
        [-2147483648, -2147483648,           0, -2147483648, -2147483648]],
       dtype=torch.int32)
>>> a
tensor([[1.9431e-19, 4.8613e+30, 1.4603e-19, 2.0704e-19, 4.7429e+30],
        [1.6530e+19, 1.8254e+31, 1.4607e-19, 6.8801e+16, 1.8370e+25]])

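(6) Using the Tensor.type() method

type(dtype=None, non_blocking=False): if dtype is not given, the method returns the type; otherwise it casts the object to the given type. If the object already has the right type, nothing is done and the original object is returned. (Older releases called these parameters new_type and async; async has since become a reserved word in Python.) Usage:

>>> t1 = torch.LongTensor(3, 5)
>>> print(t1.type())
torch.LongTensor
# cast to another type
>>> t2 = t1.type(torch.FloatTensor)
>>> print(t2.type())
torch.FloatTensor

Available types include:

torch.FloatTensor
torch.LongTensor
torch.HalfTensor
torch.IntTensor
torch.DoubleTensor
torch.CharTensor
torch.ByteTensor
torch.ShortTensor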

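1.4.8 Type checking with isinstance

The isinstance() function checks whether an object is of a known type, similar to type().

The syntax of isinstance() is:

isinstance(object, classinfo)

object – the instance to check.
classinfo – a class name (direct or inherited), a built-in type, or a tuple made up of these.
Return value: True if the object is an instance of classinfo (or of a subclass of it), otherwise False.

>>> a = 2
>>> isinstance(a, int)
True
>>> isinstance(a, str)
False
>>> isinstance(a, (str, int, list))
# True if the object matches any type in the tuple
True

Differences between isinstance() and type()

type() does not treat a subclass as an instance of its parent class; it ignores inheritance.
isinstance() does treat a subclass as an instance of its parent class; it takes inheritance into account.

class A:
    pass

class B(A):
    pass

isinstance(A(), A)    # returns True
type(A()) == A        # returns True
isinstance(B(), A)    # returns True
type(B()) == A        # returns False

Here we create a class A and a class B that inherits from A. Comparing A() with A gives True for both isinstance() and type(), because the types are identical. B inherits from A, so isinstance() on B() and A returns True because inheritance is taken into account, while type() on B() and A returns False because it does not look at where B comes from. To check whether an object is of a given type, isinstance() is the recommended choice.

if isinstance(h, torch.Tensor):
    pass
else:
    pass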

1.5 Data loading

# VOCAB_SIZE, C (context window), K (negatives per positive), BATCH_SIZE and
# the text / vocabulary variables are assumed to be defined elsewhere.
class WordEmbeddingDataset(torch.utils.data.Dataset):
    def __init__(self, text, word_to_idx, idx_to_word, word_freqs, word_counts):
        super(WordEmbeddingDataset, self).__init__()
        self.text_encoded = [word_to_idx.get(t, VOCAB_SIZE-1) for t in text]
        self.text_encoded = torch.LongTensor(self.text_encoded).long()
        self.word_to_idx = word_to_idx
        self.idx_to_word = idx_to_word
        self.word_freqs = torch.Tensor(word_freqs)
        self.word_counts = torch.Tensor(word_counts)

    def __len__(self):
        # total number of items in the dataset
        return len(self.text_encoded)

    def __getitem__(self, idx):  # return the data (tensors) for index idx
        center_word = self.text_encoded[idx]
        pos_indices = list(range(idx-C, idx)) + list(range(idx+1, idx+1+C))  # indices of the surrounding words
        # Keep idx+1+C from running past len(self.text_encoded):
        # i % len(self.text_encoded) equals i when i < len(self.text_encoded),
        # and wraps around to the start otherwise.
        pos_indices = [i % len(self.text_encoded) for i in pos_indices]
        pos_words = self.text_encoded[pos_indices]  # the true context words we want to predict
        # Negative sampling with torch.multinomial(); pos_words.shape[0] is the number of context words
        neg_words = torch.multinomial(self.word_freqs, K*pos_words.shape[0], True)
        return center_word, pos_words, neg_words

dataset = WordEmbeddingDataset(text, word_to_idx, idx_to_word, word_freqs, word_counts)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=0)
# Inspect the data the dataloader yields
# option 1
next(iter(dataloader))
# option 2
for i, (center_word, pos_words, neg_words) in enumerate(dataloader):
    print(center_word, pos_words, neg_words)
    if i > 5:
        break
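The dataset above depends on a corpus that is not shown, so here is a much smaller, self-contained sketch of the same Dataset/DataLoader pattern. SquaresDataset is a hypothetical toy class invented for this example, not part of the original word-embedding code.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i**2)."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        # Number of items the DataLoader can draw from.
        return len(self.x)

    def __getitem__(self, idx):
        # One sample; the DataLoader stacks samples into batches.
        return self.x[idx], self.x[idx] ** 2

loader = DataLoader(SquaresDataset(10), batch_size=4, shuffle=False, num_workers=0)

batch_sizes = [xb.shape[0] for xb, yb in loader]
first_x, first_y = next(iter(loader))
print(batch_sizes, first_x, first_y)
```

With 10 items and batch_size=4, the last batch is smaller than the others unless drop_last=True is passed.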

1.6 Multiplying arrays of different dimensions

input_embedding = self.in_embed(input_labels)  # Batch_size * embed_size
pos_embedding = self.out_embed(pos_labels)  # Batch_size * (2*C) * embed_size
neg_embedding = self.out_embed(neg_labels)  # Batch_size * (2*C*K) * embed_size
input_embedding = input_embedding.unsqueeze(2)  # Batch_size * embed_size * 1; unsqueeze(2) adds a third dimension
# torch.bmm keeps the first (batch) dimension and matrix-multiplies the remaining two
pos_dot = torch.bmm(pos_embedding, input_embedding).squeeze()  # Batch_size * (2*C) * 1 becomes Batch_size * (2*C) after squeeze()
neg_pot = torch.bmm(neg_embedding, -input_embedding).squeeze()  # Batch_size * (2*C*K)
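The shape bookkeeping is easier to follow with concrete numbers. In this sketch B, C2, and E are illustrative sizes standing in for Batch_size, 2*C, and embed_size, and random tensors stand in for the embedding lookups; torch.einsum is used only as an independent check.

```python
import torch

B, C2, E = 4, 6, 8  # batch size, number of context words (2*C), embedding size

input_embedding = torch.randn(B, E)    # stands in for in_embed(...)
pos_embedding = torch.randn(B, C2, E)  # stands in for out_embed(...)

# (B, C2, E) @ (B, E, 1) -> (B, C2, 1); squeeze(2) removes only the last dim.
pos_dot = torch.bmm(pos_embedding, input_embedding.unsqueeze(2)).squeeze(2)

# The same batched dot product, written with einsum.
pos_dot_einsum = torch.einsum('bce,be->bc', pos_embedding, input_embedding)

print(pos_dot.shape)
```

Passing the dimension to squeeze is a deliberate choice here: a bare squeeze() would also drop the batch dimension when the batch happens to contain a single sample.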

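1.7 Function forms in PyTorch

1.7.1 logsigmoid

import torch.nn.functional as F
# Log of the sigmoid. Writing it by hand as torch.log(torch.sigmoid(x)) can cause
# numerical problems (for example -inf for very negative x), so use F.logsigmoid
log_pos = F.logsigmoid(pos_dot).sum(1)
log_neg = F.logsigmoid(neg_pot).sum(1)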
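A quick way to see why the fused function is preferable is to compare it with the hand-written composition on extreme inputs. This is a sketch with illustrative values in float32.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-200.0, 0.0, 200.0])

# Hand-written composition: sigmoid(-200) underflows to 0 in float32,
# and log(0) is -inf.
naive = torch.log(torch.sigmoid(x))

# Fused implementation stays finite over the whole range.
stable = F.logsigmoid(x)

print(naive)
print(stable)
```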

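1.7.2 Activation functions

import torch.nn.functional as F
import torch.nn as nn
# Plain function calls; the names start with a lowercase letter
F.tanh(x)
F.sigmoid(x)
# Activation layers added to a network; the names start with an uppercase letter
nn.Tanh()
nn.Sigmoid()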

1.8 Extracting and converting Excel data

Option 1:

import numpy as np
import pandas as pd
import torch

# input_datas.xlsx is a spreadsheet that stores complex numbers
data_input = pd.read_excel(r"E://Datas/input_datas.xlsx")
# print(data_1)
data_input = np.array(data_input)
# data_1 = data_1.reshape(1024,2)
data_input = data_input.tolist()
data_input = np.array(data_input)
data_input = data_input.astype(complex).tolist()  # convert to complex (np.complex was removed from NumPy; the built-in complex works)
data_input = np.array(data_input)
print(data_input.shape)
data_input_r = torch.tensor(np.real(data_input), dtype=torch.float32)  # real part
data_input_i = torch.tensor(np.imag(data_input), dtype=torch.float32)  # imaginary part
# Keep every row of data_input_r except the first
new_data_input_r = torch.zeros((46,1024), dtype=torch.float32)
new_data_input_r = data_input_r[1:,:]
# print(data_input_r[46])
# print(new_data_input_r[45])

Option 2:

# Training-set input data
data_input = pd.read_excel('/content/drive/My Drive/Colab Notebooks/工作簿6.xlsx')
data_input = np.array(data_input)
data_input = data_input.tolist()
new = list()
for i in range(347):
    for j in range(1024):
        new.append(complex(data_input[i][j]))
data_input = np.array(new).reshape(347, 1024)
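The conversion step can be tried without a spreadsheet. In the sketch below, cells is a made-up stand-in for what the Excel file might contain (complex numbers stored as strings); the real file layout is not shown in the original, so treat this as an assumption.

```python
import numpy as np
import torch

# Hypothetical cell contents: complex numbers stored as strings.
cells = [['1+2j', '3-4j'],
         ['0.5j', '2']]

# Python's built-in complex() parses strings such as '1+2j'.
data = np.array([[complex(c) for c in row] for row in cells])

# Split into real and imaginary parts as float32 tensors.
real = torch.tensor(data.real, dtype=torch.float32)
imag = torch.tensor(data.imag, dtype=torch.float32)

print(real)
print(imag)
```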

2. Gradient operations

2.1 Enabling gradients

# Two ways to enable gradients
# option 1
x = torch.ones(2, 2, requires_grad=True)
# option 2
x.requires_grad_(True)
# model = torch.nn.Sequential(……)
for params in model.parameters():
    params.requires_grad_(True)

# Computation without gradient tracking
with torch.no_grad():
    for param in model.parameters():  # note the parentheses
        param -= learning_rate * param.grad
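Put together, requires_grad and torch.no_grad() are enough for a hand-written gradient-descent loop. The sketch below fits y = 3x + 1 on synthetic data; the data, the learning rate, and the number of steps are all illustrative choices.

```python
import torch

# Synthetic data for y = 3x + 1.
x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 3.0 * x + 1.0

# Parameters that autograd should track.
w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
learning_rate = 0.1

for _ in range(500):
    loss = ((x * w + b - y) ** 2).mean()
    loss.backward()                 # fills w.grad and b.grad
    with torch.no_grad():           # the update itself is not tracked
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad
    w.grad.zero_()                  # gradients accumulate unless reset
    b.grad.zero_()

print(w.item(), b.item())
```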

2.2 Zeroing gradients

# option 1
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
………………………………
optimizer.zero_grad()
# option 2
model = torch.nn.Sequential(……)
model.zero_grad()
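The reason for zeroing is that backward() adds to any gradient that is already stored. A minimal sketch on a single scalar (the variable names are illustrative):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(w ** 2).backward()
first = w.grad.item()        # d(w^2)/dw = 2w = 4

(w ** 2).backward()          # second call adds to the stored gradient
accumulated = w.grad.item()  # 4 + 4 = 8

w.grad.zero_()               # reset, as zero_grad() does for every parameter
(w ** 2).backward()
after_reset = w.grad.item()  # back to 4

print(first, accumulated, after_reset)
```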

3. Building network models

3.1 The Sequential approach

# option 1
hidden_Layers = 100
NUM_DIGITS = 10
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, hidden_Layers),  # don't forget the comma
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_Layers, 4)
)
loss_fn = torch.nn.CrossEntropyLoss()  # commonly used for classification; Softmax is built in
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
……………………
y_pred = model(input_data)
loss = loss_fn(y_pred, y_label)
optimizer.zero_grad()  # never forget to zero the gradients
loss.backward()
optimizer.step()
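For reference, here is a runnable version of the same loop on random data. The inputs X, the labels y, the seed, and the number of steps are invented for this sketch; only the layer sizes, loss, and optimizer follow the snippet above.

```python
import torch

torch.manual_seed(0)

# Made-up data: 64 samples, 10 features, 4 classes (labels are class indices).
X = torch.randn(64, 10)
y = torch.randint(0, 4, (64,))

model = torch.nn.Sequential(
    torch.nn.Linear(10, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 4),
)
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

losses = []
for step in range(200):
    y_pred = model(X)          # forward pass: raw scores, no softmax
    loss = loss_fn(y_pred, y)
    optimizer.zero_grad()      # clear gradients from the previous step
    loss.backward()            # compute new gradients
    optimizer.step()           # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```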

3.2 The class approach

# option 2
import torch.nn.functional as F

class TwoLayerNet(torch.nn.Module):
    def __init__(self, n_features, n_hidden, n_out):  # define the model architecture
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(n_features, n_hidden)  # a trailing comma here would cause an error
        self.linear2 = torch.nn.Linear(n_hidden, n_out)

    def forward(self, x):
        y_before = F.relu(self.linear1(x))
        y_pred = self.linear2(y_before)
        # equivalent: y_pred = self.linear2(self.linear1(x).clamp(min=0))
        return y_pred

net = TwoLayerNet(2, 10, 4)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.05)
……………………
y_pred = net(input_data)
loss = loss_fn(y_pred, y_label)
optimizer.zero_grad()  # never forget to zero the gradients
loss.backward()
optimizer.step()

3.3 Optimizers

optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
………………

3.4 Loss functions

loss_fn = torch.nn.CrossEntropyLoss()  # commonly used for classification; Softmax is built in
loss_fn = torch.nn.MSELoss()
………………
loss = loss_fn(y_pred, y_label)
optimizer.zero_grad()  # never forget to zero the gradients
loss.backward()

3.5 Changing a model's default parameters

# Using the Sequential approach as an example
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, hidden_Layers),  # don't forget the comma
    torch.nn.ReLU(),
    torch.nn.Linear(hidden_Layers, 4)
)
print(model)
print(model[0].weight)
# To change the values the model was initialized with by default:
torch.nn.init.normal_(model[0].weight)
torch.nn.init.normal_(model[2].weight)

Output:

Sequential(
  (0): Linear(in_features=1000, out_features=100, bias=True)
  (1): ReLU()
  (2): Linear(in_features=100, out_features=10, bias=True)
)
Parameter containing:
tensor([[ 0.6446,  0.6133, -1.2414,  ...,  0.7190,  0.1795, -0.1246],
        [ 1.5737, -1.2386, -0.7058,  ...,  0.8870,  0.0807,  0.4245],
        [-0.8080, -2.5309, -0.9246,  ..., -0.1821, -0.0434, -0.2618],
        ...,
        [-0.6270, -1.0656,  1.3784,  ...,  0.3057, -1.4967, -0.3401],
        [ 0.9599, -0.0353, -1.1812,  ...,  1.1073,  0.9129,  0.0291],
        [-1.3919, -0.1804,  0.0903,  ...,  0.5543,  0.3251,  1.8142]],
       requires_grad=True)
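When a model has many layers, indexing each one by hand gets tedious. One common alternative is Module.apply with a small initialization function. The sketch below assumes a standard deviation of 0.01 and zero biases purely for illustration; init_weights is a hypothetical helper name.

```python
import torch

def init_weights(module):
    """Re-initialize Linear layers only; other module types are left alone."""
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.normal_(module.weight, mean=0.0, std=0.01)
        torch.nn.init.zeros_(module.bias)

model = torch.nn.Sequential(
    torch.nn.Linear(10, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 4),
)

# apply() calls init_weights on every submodule, then on the model itself.
model.apply(init_weights)

print(model[0].weight.std().item(), model[0].bias.abs().sum().item())
```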

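3.6 model.train and model.eval

Each of the two calls has a fixed use case.

Before training the model, call:

model.train()

Before testing the model, call:

model.eval()

The program also runs without these two calls. They exist for layers that behave differently in training and in evaluation, such as Batch Normalization and Dropout. The two are explained below:

Batch Normalization

BN normalizes the activations of the intermediate layers of the network, and uses a learnable transform (the Batch Normalization Transform) so that the feature distribution each layer has learned is not destroyed.
Training works on mini-batches, whereas testing often works on a single image, so there is no mini-batch to compute statistics from. Once training is finished the parameters are fixed, so at test time the layer uses the mean and variance accumulated over all training batches instead of per-batch statistics. Batch Normalization therefore behaves differently in training and in testing.

Dropout

During training, Dropout sets each activation to zero at random with probability p and rescales the remaining ones by 1/(1-p); in evaluation mode it passes its input through unchanged.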
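The effect of the two modes is easy to observe on a standalone Dropout layer. This sketch uses p=0.5 and an input of ones as illustrative choices; the seed only makes the printout repeatable.

```python
import torch

torch.manual_seed(0)

drop = torch.nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()            # training mode: units are dropped at random
train_out = drop(x)     # kept units are scaled by 1 / (1 - p) = 2

drop.eval()             # evaluation mode: dropout is switched off
eval_out = drop(x)      # output equals the input

print(train_out.unique(), eval_out.unique())
```

Calling model.train() or model.eval() on a whole network sets this flag on every submodule at once.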

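4. Saving model parameters

4.1 Saving only the network parameters

import torch
torch.save(model.state_dict(), path)

Purpose: saves the parameters of every layer of the trained network (the weights and biases).

Here model.state_dict() returns the parameters of each layer and path is where the file is stored (the file is usually saved as .pt or .pth).

import torch
model2 = Sequential(…………)  # or: model2 = TheModelClass(*args, **kwargs)
model2.load_state_dict(torch.load(PATH))
model2.eval()
# model.eval() must be called after loading to put dropout and batch-normalization layers into inference mode. Otherwise the results will be wrong.

Purpose: loads the per-layer parameters saved at path into the network.

Note: do not write model2.load_state_dict(path) directly. The method expects a state dict (as returned by torch.load), not a path string.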
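A save/load round trip can be verified without touching the disk by writing to an in-memory buffer. In this sketch the io.BytesIO buffer stands in for a real file path, and the single Linear layer is an illustrative model.

```python
import io
import torch

model = torch.nn.Linear(3, 2)

# An in-memory buffer stands in for a file path such as a .pth file.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)  # rewind before reading

# Build a fresh model with the same architecture, then load the parameters.
model2 = torch.nn.Linear(3, 2)
model2.load_state_dict(torch.load(buffer))
model2.eval()

print(torch.equal(model.weight, model2.weight))
```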

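4.2 Saving the whole network

torch.save(net, path)

Purpose: saves the entire trained model (not only the weights and biases).

net2 = torch.load(path)

Purpose: loads the entire network saved at path. (Recent PyTorch releases default torch.load to weights_only=True, in which case loading a whole model requires passing weights_only=False.)

Note: the official recommendation is the first approach (saving the state_dict), since less is saved and it is faster.

Example: save the model parameters whenever the loss reaches a new minimum.

# 0.5 means each call to scheduler.step() halves the learning rate.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, 0.5)
loss_list = []
for epoch in range(20000):
    for start in range(0, 346, batch_size):
        end = start + batch_size
        batch_input_datas = new_data_input_r[start:end]
        batch_label_datas = new_data_label_r[start:end]
        acc_sum, err_sum = 0.0, 0.0
        new_y_pred_r = model(batch_input_datas)
        loss = loss_fn(new_y_pred_r, batch_label_datas)
        # (the backward pass and optimizer.step() are not shown in the original)
        # training accuracy:
    if epoch % 50 == 0:
        loss_value = loss.item()
        if len(loss_list) == 0 or loss_value < min(loss_list):
            torch.save(model.state_dict(), 'lm.pth')
            print("best model saved to lm.pth")
        else:
            # The loss did not go down: decay the learning rate.
            # This could also be triggered only after three checks without improvement.
            scheduler.step()  # must come after optimizer.step()
        loss_list.append(loss_value)  # record the loss in the list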

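5. Pitfalls encountered

5.1 CrossEntropyLoss for classification

When classifying with the cross-entropy loss in PyTorch, pass the ground-truth label as a class index, not as a one-hot vector. The function handles the comparison with the predictions internally, so instead of [ 0 0 0 0 1] just pass 4. (PyTorch 1.10 and later also accept floating-point class-probability targets with the same shape as the input, but class indices remain the usual form.)
Labels are numbered starting from 0; they cannot start from 1 or any other number.
Class-index labels must be a LongTensor.
The label shape must be [batch_size]. If it is [batch_size, 1], use label.squeeze() to convert it to [batch_size].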
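The rules above can be checked on a tiny batch. The logits and labels below are made up for illustration; the comparison with log_softmax followed by nll_loss shows what "Softmax is built in" means in practice.

```python
import torch
import torch.nn.functional as F

# Made-up scores for 2 samples and 3 classes (raw logits, no softmax applied).
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])

label_col = torch.tensor([[0], [2]])  # shape [batch_size, 1]: not accepted as class indices
label = label_col.squeeze(1)          # shape [batch_size], dtype int64 (Long)

loss = torch.nn.CrossEntropyLoss()(logits, label)

# CrossEntropyLoss is log_softmax followed by the negative log-likelihood.
manual = F.nll_loss(F.log_softmax(logits, dim=1), label)

print(loss.item(), manual.item())
```

squeeze(1) is used here instead of a bare squeeze() so that a batch of size 1 keeps its batch dimension.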

To be continued…
