

news 2026/5/7 18:08:27

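PyTorch Study Notes (1): Installing the PyTorch Environment

Recommended earlier study material:
1. Pytorch实战笔记_GoAI的博客-CSDN博客
2. Pytorch入门教程_GoAI的博客-CSDN博客

Installation references:
1. Video tutorial: 3分钟深度学习【环境搭建】CUDA Anaconda 简单粗暴_哔哩哔哩
2. windows10下CUDA11.1、cuDNN8.0、tensorflow-gpu2.4.1安装教程以及问题解决方法
3. Win10中CUDA、cuDNN的安装与卸载
4. Pytorch详细安装 (strongly recommended)

1. Installing CUDA

Installer download links (the main post also mentions them when discussing which version to choose): the official download page for all CUDA versions, and the official download page for all cuDNN versions.

1. Open "cuda_8.0.44_win10.exe". This step is slow, so be patient (it was also a hint that I need a new computer).
2. Choose the extraction path. It is only temporary, so the C drive is fine.
3. Extraction starts. This is slow as well, especially on my machine, which for some reason had abnormally high CPU and memory usage at the time and cost me a lot of time. When extraction finishes, the installer loads and the installation begins.
4. Pay attention here: do not pick the default "Express" option. "Express" would be better labelled "Everything"; read the small print under it, it installs the whole bundle. Installing the whole bundle is not wrong in itself, but one component keeps making the installation fail. In particular, never select Visual Studio Integration. Selecting the remaining components is enough. Install to the C drive, which makes paths easier to deal with later.

Verifying the installation:
1. The environment variables should already have been added automatically.
2. Check the version in cmd: nvcc -V
3. Change into the installation path and open the GPU runtime monitoring view.
4. Run bandwidthTest.exe (change into its directory first).
5. Run deviceQuery.exe (change into its directory first).

2. Installing cuDNN

Official download page for all cuDNN versions.

cuDNN is not really "installed": unpack the downloaded archive and copy the files in each folder into the folder of the same name under the CUDA installation path.

3. Uninstalling

cuDNN is only a set of files copied into the CUDA installation directory, so deleting those files removes it. Alternatively, uninstall CUDA and then delete the whole folder.

To uninstall CUDA, use Control Panel > Uninstall a program (do not use tools such as 360). If you cannot find the entries, sort by installation date: the top few entries that carry a version number are the CUDA components just installed. Uninstall them one by one.

PyTorch Study Notes (2): Introduction to PyTorch and Basics

1. Introduction to PyTorch

- Concept: a deep learning library developed by Facebook's AI research group; it is a Python implementation based on Torch, a library written in Lua.
- Advantages: concise, quick to pick up, good documentation and community support, open source, supports code debugging, rich ecosystem of extension libraries.

2 PyTorch basics

2.1 Tensors

- Categories: 0-d tensor (scalar), 1-d tensor (vector), 2-d tensor (matrix), 3-d tensor (time series), 4-d tensor (images), 5-d tensor (video).
- Concept: a tensor is a data container. In PyTorch its elements are numeric or boolean values.

    import torch

    # Create a tensor
    x = torch.rand(4, 3)
    print(x)
    # Build a matrix of zeros with dtype long
    x = torch.zeros(4, 3, dtype=torch.long)
    print(x)

Output:

    tensor([[0.9515, 0.6332, 0.8228],
            [0.3508, 0.0493, 0.7606],
            [0.7326, 0.7003, 0.1925],
            [0.1172, 0.8946, 0.9501]])
    tensor([[0, 0, 0],
            [0, 0, 0],
            [0, 0, 0],
            [0, 0, 0]])

Common functions for constructing a Tensor:

    Function                 Purpose
    Tensor(*sizes)           basic constructor
    tensor(data)             similar to np.array
    ones(*sizes)             all ones
    zeros(*sizes)            all zeros
    eye(*sizes)              ones on the diagonal, zeros elsewhere
    arange(s, e, step)       from s to e (e excluded) with step size step
    linspace(s, e, steps)    steps evenly spaced values from s to e
    rand/randn(*sizes)       rand: uniform on [0, 1); randn: normal distribution N(0, 1)
    normal(mean, std)        normal distribution with mean mean and standard deviation std
    randperm(m)              random permutation

Operations:
- A variable obtained by indexing shares memory with the original data, so modifying one also modifies the other.
- Use view to change the shape of a tensor.
- Broadcasting: an element-wise operation on two Tensors of different shapes may trigger the broadcasting mechanism.

    # Use view to change the shape of a tensor
    x = torch.randn(5, 4)
    y = x.view(20)
    z = x.view(-1, 5)  # -1 means this dimension is inferred from the others
    print(x.size(), y.size(), z.size())

Output:

    torch.Size([5, 4]) torch.Size([20]) torch.Size([4, 5])

Broadcasting example:

    x = tensor([[1, 2]])
    y = tensor([[1],
                [2],
                [3]])
    x + y = tensor([[2, 3],
                    [3, 4],
                    [4, 5]])

2.2 Automatic differentiation

- The autograd package provides automatic differentiation on tensors.
- Principle: if .requires_grad is set to True, every operation on the tensor is tracked. Once the computation is finished, calling .backward() computes all gradients automatically, and the gradients of the tensor are accumulated into its .grad attribute.
- Function: Tensor and Function are connected to each other and form an acyclic graph that encodes the complete computation history. Every tensor has a .grad_fn attribute that references the Function which created that Tensor.

    x = torch.ones(2, 2, requires_grad=True)
    print(x)

    tensor([[1., 1.],
            [1., 1.]], requires_grad=True)

    y = x ** 2
    print(y)

    tensor([[1., 1.],
            [1., 1.]], grad_fn=<PowBackward0>)

    z = y * y * 3
    out = z.mean()
    print("z = ", z)
    print("z mean = ", out)

    z =  tensor([[3., 3.],
            [3., 3.]], grad_fn=<MulBackward0>)
    z mean =  tensor(3., grad_fn=<MeanBackward0>)

- Backpropagating grad: each backward pass adds to the gradients already stored, so gradients are normally zeroed before a backward pass.

    out.backward()
    print(x.grad)

    tensor([[3., 3.],
            [3., 3.]])

    # Gradients accumulate across backward passes
    out2 = x.sum()
    out2.backward()
    print(x.grad)

    tensor([[4., 4.],
            [4., 4.]])

2.3 Parallel computing

- Purpose: use several GPUs for training in order to speed up training and improve what the model learns.
- CUDA: uses the GPU parallel computing framework provided by NVIDIA; the cuda() method moves a model or data to the GPU for computation.
- Approaches to parallel computing:
- Network partitioning: the parts of a model are split up and assigned to different GPUs, which execute different computations.
- Layer-wise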
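partitioning: a single layer is split across GPUs, each of which handles part of the work for that layer.
- Data parallelism (the mainstream approach): different data are assigned to different GPUs, which all execute the same task.

PyTorch Study Notes (3): The Main Building Blocks of PyTorch

1 Steps of a deep learning task

(1) Data preprocessing: load the data with a dedicated loader and train in batches to improve model performance; each training step reads a fixed number of samples and feeds them to the model.
(2) Building the deep neural network: build it layer by layer from layers with specific functions, such as convolutional layers, pooling layers, batch normalization layers and LSTM layers.
(3) Choosing the loss function and optimizer: make sure backpropagation works on the user-defined model structure.
(4) Training the model: use parallel computing to speed up training; load the data in batches, move them to the GPU, backpropagate the loss to the first layers of the network, and let the optimizer adjust the network parameters.

2 Basic configuration

- Import the required packages

    import os
    import numpy as np
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset, DataLoader
    import torch.optim as optimizer

- Set the hyperparameters in one place: batch size, initial learning rate, number of epochs, GPU configuration

    # Batch size
    batch_size = 16
    # Initial learning rate
    lr = 1e-4
    # Number of training epochs
    max_epochs = 100
    # Configure the GPU
    device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")
    device

    device(type='cuda', index=1)

3 Reading data

- Loading scheme: data are loaded through Dataset + DataLoader. Dataset defines the data format and the data transforms; DataLoader reads batches iteratively.
- A custom Dataset class implements the __init__, __getitem__ and __len__ methods.
- Parameters of torch.utils.data.DataLoader:
- batch_size: samples are read in batches; this is the number of samples per batch.
- num_workers: the number of processes used to read data.
- shuffle: whether to shuffle the data that are read.
- drop_last: drop the final samples that do not fill a complete batch, so they take no part in training.

4 Building the model

4.1 Constructing a neural network

A model is constructed through the Module class; instantiating the class completes the construction.

    # Build a multilayer perceptron
    class MLP(nn.Module):
        def __init__(self, **kwargs):
            super(MLP, self).__init__(**kwargs)
            self.hidden = nn.Linear(784, 256)
            self.act = nn.ReLU()
            self.output = nn.Linear(256, 10)

        def forward(self, x):
            o = self.act(self.hidden(x))
            return self.output(o)

    x = torch.rand(2, 784)
    net = MLP()
    print(x)
    net(x)

Output:

    tensor([[0.8924, 0.9624, 0.3262, ..., 0.8376, 0.1889, 0.9060],
            [0.3609, 0.8005, 0.5175, ..., 0.6255, 0.1462, 0.9846]])
    tensor([[-0.0902, 0.0199, 0.0677, -0.0679, 0.0799, 0.0826, 0.0628, 0.1809,
             -0.2387, 0.0366],
            [-0.2271, 0.0056, -0.0984, -0.0432, -0.0160, -0.0038, 0.0953, 0.0545,
             -0.1530, -0.0214]], grad_fn=<AddmmBackward>)

4.2 Common layers of a neural network

- A layer without model parameters

    # A layer that subtracts the mean from its input
    class MyLayer(nn.Module):
        def __init__(self, **kwargs):
            super(MyLayer, self).__init__(**kwargs)

        def forward(self, x):
            return x - x.mean()

    x = torch.tensor([0, 5, 10, 15, 20], dtype=torch.float)
    layer = MyLayer()
    layer(x)

    tensor([-10., -5., 0., 5., 10.])

- A layer with model parameters: if a Tensor is a Parameter, it is added to the model's parameter list automatically.

    #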
    # Define a list of parameters with ParameterList
    class MyListDense(nn.Module):
        def __init__(self):
            super(MyListDense, self).__init__()
            self.params = nn.ParameterList([nn.Parameter(torch.randn(4, 4)) for i in range(3)])
            self.params.append(nn.Parameter(torch.randn(4, 1)))

        def forward(self, x):
            for i in range(len(self.params)):
                x = torch.mm(x, self.params[i])
            return x

    net = MyListDense()
    print(net)

    MyListDense(
      (params): ParameterList(
        (0): Parameter containing: [torch.FloatTensor of size 4x4]
        (1): Parameter containing: [torch.FloatTensor of size 4x4]
        (2): Parameter containing: [torch.FloatTensor of size 4x4]
        (3): Parameter containing: [torch.FloatTensor of size 4x1]
      )
    )

    # Define a dictionary of parameters with ParameterDict
    class MyDictDense(nn.Module):
        def __init__(self):
            super(MyDictDense, self).__init__()
            self.params = nn.ParameterDict({
                'linear1': nn.Parameter(torch.randn(4, 4)),
                'linear2': nn.Parameter(torch.randn(4, 1))
            })
            # Add a new parameter, linear3
            self.params.update({'linear3': nn.Parameter(torch.randn(4, 2))})

        def forward(self, x, choice='linear1'):
            return torch.mm(x, self.params[choice])

    net = MyDictDense()
    print(net)

    MyDictDense(
      (params): ParameterDict(
        (linear1): Parameter containing: [torch.FloatTensor of size 4x4]
        (linear2): Parameter containing: [torch.FloatTensor of size 4x1]
        (linear3): Parameter containing: [torch.FloatTensor of size 4x2]
      )
    )

- 2-D convolutional layer: built with the nn.Conv2d class. Its model parameters are the convolution kernel and a scalar bias. During training the kernel is usually initialized randomly, and kernel and bias are then updated iteratively.

    # Apply a convolutional layer, adding and removing the extra dimensions of input and output
    def comp_conv2d(conv2d, X):
        # (1, 1) stands for the batch size and the number of channels
        X = X.view((1, 1) + X.shape)
        Y = conv2d(X)
        # Drop the first two dimensions (batch and channel), which are not of interest
        return Y.view(Y.shape[2:])

    # Note: 1 row or column is padded on each side, so 2 rows or columns are added in total
    conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
    X = torch.rand(8, 8)
    comp_conv2d(conv2d, X).shape

    torch.Size([8, 8])

- Pooling layer: directly computes the maximum or the average of the elements inside the pooling window; these are called max pooling and average pooling respectively.

    # 2-D pooling layer
    def pool2d(X, pool_size, mode='max'):
        p_h, p_w = pool_size
        Y = torch.zeros((X.shape[0] - p_h + 1, X.shape[1] - p_w + 1))
        for i in range(Y.shape[0]):
            for j in range(Y.shape[1]):
                if mode == 'max':
                    Y[i, j] = X[i: i + p_h, j: j + p_w].max()
                elif mode == 'avg':
                    Y[i, j] = X[i: i + p_h, j: j + p_w].mean()
        return Y

    X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]],
                     dtype=torch.float)
    pool2d(X, (2, 2), 'max')

    tensor([[4., 5.],
            [7., 8.]])

    pool2d(X, (2, 2), 'avg')

    tensor([[2., 3.],
            [5., 6.]])

4.3 Model examples

- The training process of a neural network:
1. Define a neural network with learnable parameters.
2. Iterate over the input dataset.
3. Process the input data with the network.
4. Compute the loss.
5. Backpropagate the gradients to the network parameters.
6. Update the network parameters, using gradient descent.

- LeNet (a feed-forward neural network)

    import torch.nn.functional as F

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            # Input image has 1 channel, output has 6 channels, 5x5 convolution kernel
            self.conv1 = nn.Conv2d(1, 6, 5)
            self.conv2 = nn.Conv2d(6, 16, 5)
            # an affine operation: y = Wx + b
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            # 2x2 max pooling
            x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
            # For a square window a single number is enough
            x = F.max_pool2d(F.relu(self.conv2(x)), 2)
            x = x.view(-1, self.num_flat_features(x))
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x

        def num_flat_features(self, x):
            # All dimensions except the batch dimension
            size = x.size()[1:]
            num_features = 1
            for s in size:
                num_features *= s
            return num_features

    net = Net()
    net

    Net(
      (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
      (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=400, out_features=120, bias=True)
      (fc2): Linear(in_features=120, out_features=84, bias=True)
      (fc3): Linear(in_features=84, out_features=10, bias=True)
    )

    # Assume the input is a random 32x32 image
    input = torch.randn(1, 1, 32, 32)
    out = net(input)
    print(out)

    tensor([[-0.0921, -0.0605, -0.0726, -0.0451, 0.1399, -0.0087, 0.1075, 0.0799,
             -0.1472, 0.0288]], grad_fn=<AddmmBackward>)

    # Zero the gradient buffers of all parameters, then backpropagate random gradients
    net.zero_grad()
    out.backward(torch.randn(1, 10))

- AlexNet

    class AlexNet(nn.Module):
        def __init__(self):
            super(AlexNet, self).__init__()
            self.conv = nn.Sequential(
                # in_channels, out_channels, kernel_size, stride, padding
                nn.Conv2d(1, 96, 11, 4),
                nn.ReLU(),
                # kernel_size, stride
                nn.MaxPool2d(3, 2),
                # Smaller convolution window; padding of 2 keeps the height and width of
                # input and output equal; the number of output channels is increased
                nn.Conv2d(96, 256, 5, 1, 2),
                nn.ReLU(),
                nn.MaxPool2d(3, 2),
                # Three consecutive convolutional layers with an even smaller window.
                # Except for the last one, the number of output channels is increased further.
                # No pooling layer follows the first two, so height and width are not reduced there
                nn.Conv2d(256, 384, 3, 1, 1),
                nn.ReLU(),
                nn.Conv2d(384, 384, 3, 1, 1),
                nn.ReLU(),
                nn.Conv2d(384, 256, 3, 1, 1),
                nn.ReLU(),
                nn.MaxPool2d(3, 2)
            )
            #
            # The fully connected layers here have several times more outputs than in LeNet.
            # Dropout layers are used to reduce overfitting
            self.fc = nn.Sequential(
                nn.Linear(256*5*5, 4096),
                nn.ReLU(),
                nn.Dropout(0.5),
                nn.Linear(4096, 4096),
                nn.ReLU(),
                nn.Dropout(0.5),
                # Output layer. Fashion-MNIST is used here, so there are 10 classes
                # instead of the 1000 in the paper
                nn.Linear(4096, 10),
            )

        def forward(self, img):
            feature = self.conv(img)
            output = self.fc(feature.view(img.shape[0], -1))
            return output

    net = AlexNet()
    print(net)

    AlexNet(
      (conv): Sequential(
        (0): Conv2d(1, 96, kernel_size=(11, 11), stride=(4, 4))
        (1): ReLU()
        (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
        (3): Conv2d(96, 256, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
        (4): ReLU()
        (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
        (6): Conv2d(256, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (7): ReLU()
        (8): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (9): ReLU()
        (10): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (11): ReLU()
        (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
      )
      (fc): Sequential(
        (0): Linear(in_features=6400, out_features=4096, bias=True)
        (1): ReLU()
        (2): Dropout(p=0.5, inplace=False)
        (3): Linear(in_features=4096, out_features=4096, bias=True)
        (4): ReLU()
        (5): Dropout(p=0.5, inplace=False)
        (6): Linear(in_features=4096, out_features=10, bias=True)
      )
    )

5 Loss functions

- Binary cross-entropy loss, torch.nn.BCELoss: computes the cross entropy for a binary classification task.

    m = nn.Sigmoid()
    loss = nn.BCELoss()
    input = torch.randn(3, requires_grad=True)
    target = torch.empty(3).random_(2)
    output = loss(m(input), target)
    output.backward()
    print('BCE loss:', output)

    BCE loss: tensor(0.9389, grad_fn=<BinaryCrossEntropyBackward>)

- Cross-entropy loss, torch.nn.CrossEntropyLoss: computes the cross entropy for multi-class classification.

    loss = nn.CrossEntropyLoss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.empty(3, dtype=torch.long).random_(5)
    output = loss(input, target)
    output.backward()
    print('CrossEntropy loss:', output)

    CrossEntropy loss: tensor(2.7367, grad_fn=<NllLossBackward>)

- L1 loss, torch.nn.L1Loss: computes the absolute value of the difference between the output y and the ground truth target.

    loss = nn.L1Loss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.randn(3, 5)
    output = loss(input,
                  target)
    output.backward()
    print('L1 loss:', output)

    L1 loss: tensor(1.0351, grad_fn=<L1LossBackward>)

- MSE loss, torch.nn.MSELoss: computes the square of the difference between the output y and the ground truth target.

    loss = nn.MSELoss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.randn(3, 5)
    output = loss(input, target)
    output.backward()
    print('MSE loss:', output)

    MSE loss: tensor(1.7612, grad_fn=<MseLossBackward>)

- Smooth L1 loss, torch.nn.SmoothL1Loss: a smoothed version of the L1 loss that reduces the influence of outliers. Compared with the L1 loss, the transition at the kink at 0 is smoother.

    loss = nn.SmoothL1Loss()
    input = torch.randn(3, 5, requires_grad=True)
    target = torch.randn(3, 5)
    output = loss(input, target)
    output.backward()
    print('Smooth L1 loss:', output)

    Smooth L1 loss: tensor(0.7252, grad_fn=<SmoothL1LossBackward>)

- Negative log-likelihood loss for a Poisson-distributed target, torch.nn.PoissonNLLLoss

    loss = nn.PoissonNLLLoss()
    log_input = torch.randn(5, 2, requires_grad=True)
    target = torch.randn(5, 2)
    output = loss(log_input, target)
    output.backward()
    print('PoissonNLL loss:', output)

    PoissonNLL loss: tensor(1.7593, grad_fn=<MeanBackward0>)

- KL divergence, torch.nn.KLDivLoss: a distance measure between continuous distributions; it can be used for regression over a discretely sampled continuous output distribution. Note that nn.KLDivLoss expects its input to be log-probabilities; the example passes plain probabilities, which is why the result is negative.

    inputs = torch.tensor([[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
    target = torch.tensor([[0.9, 0.05, 0.05], [0.1, 0.7, 0.2]], dtype=torch.float)
    loss = nn.KLDivLoss(reduction='batchmean')
    output = loss(inputs, target)
    print('KLDiv loss:', output)

    KLDiv loss: tensor(-1.0006)

- MarginRankingLoss, torch.nn.MarginRankingLoss: measures the difference (similarity) between two sets of data; it can be used for ranking tasks.

    loss = nn.MarginRankingLoss()
    input1 = torch.randn(3, requires_grad=True)
    input2 = torch.randn(3, requires_grad=True)
    target = torch.randn(3).sign()
    output = loss(input1, input2, target)
    output.backward()
    print('MarginRanking loss:', output)

    MarginRanking loss: tensor(1.1762, grad_fn=<MeanBackward0>)

- Multi-label margin loss, torch.nn.MultiLabelMarginLoss: computes the loss for multi-label classification problems.

    loss = nn.MultiLabelMarginLoss()
    x = torch.FloatTensor([[0.9, 0.2, 0.4, 0.8]])
    # The true classes are class 3 and class 0
    y = torch.LongTensor([[3, 0, -1, 1]])
    output = loss(x, y)
    print('MultiLabelMargin loss:', output)

    MultiLabelMargin loss: tensor(0.4500)

- Two-class loss, torch.nn.SoftMarginLoss: computes the logistic loss for binary classification. (In the source, the code blocks of this and the two preceding entries were shifted by one; the inputs below are restored so that they reproduce the printed result.)

    inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])
    target = torch.tensor([[-1, 1], [1, -1]], dtype=torch.float)
    loss_f = nn.SoftMarginLoss()
    output = loss_f(inputs, target)
    print('SoftMargin loss:', output)

    SoftMargin loss: tensor(0.6764)

- Multi-class hinge loss, torch.nn.MultiMarginLoss: computes the hinge loss for multi-class classification problems.

    inputs = torch.tensor([[0.3, 0.7], [0.5, 0.5]])
    target = torch.tensor([0, 1], dtype=torch.long)
    loss_f = nn.MultiMarginLoss()
    output = loss_f(inputs, target)
    print('MultiMargin loss:', output)

    MultiMargin loss: tensor(0.6000)

- Triplet loss, torch.nn.TripletMarginLoss: computes the loss for data organized as triplets; in the code these are an anchor, a positive sample and a negative sample.

    triplet_loss = nn.TripletMarginLoss(margin=1.0, p=2)
    anchor = torch.randn(100, 128, requires_grad=True)
    positive = torch.randn(100, 128, requires_grad=True)
    negative = torch.randn(100, 128, requires_grad=True)
    output = triplet_loss(anchor, positive, negative)
    output.backward()
    print('TripletMargin loss:', output)

    TripletMargin loss: tensor(1.1507, grad_fn=<MeanBackward0>)

- HingeEmbeddingLoss, torch.nn.HingeEmbeddingLoss: computes the hinge loss of the output embedding.

    loss_f = nn.HingeEmbeddingLoss()
    inputs = torch.tensor([[1., 0.8, 0.5]])
    target = torch.tensor([[1, 1, -1]])
    output = loss_f(inputs, target)
    print('HingeEmbedding loss:', output)

    HingeEmbedding loss: tensor(0.7667)

- Cosine similarity, torch.nn.CosineEmbeddingLoss: based on the cosine similarity of two vectors. With target 1 the loss is small when the two vectors are close; with target -1 the loss is small when they are dissimilar.

    loss_f = nn.CosineEmbeddingLoss()
    inputs_1 = torch.tensor([[0.3, 0.5, 0.7], [0.3, 0.5, 0.7]])
    inputs_2 = torch.tensor([[0.1, 0.3, 0.5], [0.1, 0.3, 0.5]])
    target = torch.tensor([1, -1], dtype=torch.float)
    output = loss_f(inputs_1, inputs_2, target)
    print('CosineEmbedding loss:', output)

    CosineEmbedding loss: tensor(0.5000)

- CTC loss, torch.nn.CTCLoss: for classification problems on sequential data; computes the loss between a continuous time series and a target sequence.

    # Target are to be padded
    # Sequence length
    T = 50
    # Number of classes (including the blank class)
    C = 20
    # Batch size
    N = 16
    # Target sequence length of longest target in batch (padding length)
    S = 30
    # Minimum target length, for demonstration purposes
    S_min = 10
    input = torch.randn(T, N, C).log_softmax(2).detach().requires_grad_()
    # Initialize the target (0 = blank, 1:C = classes)
    target = torch.randint(low=1, high=C, size=(N, S), dtype=torch.long)
    input_lengths = torch.full(size=(N,), fill_value=T, dtype=torch.long)
    target_lengths = torch.randint(low=S_min, high=S, size=(N,), dtype=torch.long)
    ctc_loss = nn.CTCLoss()
    loss = ctc_loss(input, target, input_lengths, target_lengths)
    loss.backward()
    print('CTC loss:', loss)

    CTC loss: tensor(6.1333, grad_fn=<MeanBackward0>)

6 Optimizers

6.1 Attributes and methods of Optimizer

- Purpose: to make solving for the parameters faster, backpropagation is combined with an optimizer that approaches the solution iteratively.
- Attributes of Optimizer:
- defaults: the hyperparameters of the optimizer
- state: the cache for each parameter
- param_groups: the parameter groups; the keys appear in the order params, lr, momentum, dampening, weight_decay, nesterov
- Methods of Optimizer:
- zero_grad(): clears the gradients of the managed parameters
- step(): performs one gradient update step
- add_param_group(): adds a parameter group
- load_state_dict(): loads a state dictionary; it can be used to resume training from a checkpoint, continuing from the previous parameters
- state_dict(): returns the dictionary describing the current state of the optimizer

6.2 Basic operations

    # Weights drawn from a normal distribution, 2 x 2
    weight = torch.randn((2, 2), requires_grad=True)
    # Set the gradient to a matrix of ones, 2 x 2
    weight.grad = torch.ones((2, 2))
    # Print the current data and grad of weight
    print("The data of weight before step:\n{}".format(weight.data))
    print("The grad of weight before step:\n{}".format(weight.grad))

    The data of weight before step:
    tensor([[-0.5871, -1.1311],
            [-1.0446, 0.2656]])
    The grad of weight before step:
    tensor([[1., 1.],
            [1., 1.]])

    # Instantiate the optimizer
    optimizer = torch.optim.SGD([weight], lr=0.1, momentum=0.9)
    # Perform one step
    optimizer.step()
    # Look at the data and grad after one step
    print("The data of weight after step:\n{}".format(weight.data))
    print("The grad of weight after step:\n{}".format(weight.grad))

    The data of weight after step:
    tensor([[-0.6871, -1.2311],
            [-1.1446, 0.1656]])
    The grad of weight after step:
    tensor([[1., 1.],
            [1., 1.]])

    # Clear the gradients (not the weights)
    optimizer.zero_grad()
    # Check that the gradients are zero
    # (recent PyTorch versions set them to None by default instead of zero)
    print("The grad of weight after optimizer.zero_grad():\n{}".format(weight.grad))

    The grad of weight after optimizer.zero_grad():
    tensor([[0., 0.],
            [0., 0.]])

    # Add the parameter weight2
    weight2 = torch.randn((3, 3), requires_grad=True)
    optimizer.add_param_group({"params": weight2, 'lr': 0.0001, 'nesterov': True})
    # Look at the current parameter groups
    print("optimizer.param_groups is\n{}".format(optimizer.param_groups))
    # Look at the current state
    opt_state_dict = optimizer.state_dict()
    print("state_dict before step:\n", opt_state_dict)

    optimizer.param_groups is
    [{'params': [tensor([[-0.6871, -1.2311],
            [-1.1446, 0.1656]], requires_grad=True)], 'lr': 0.1, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False},
     {'params': [tensor([[ 0.0411, -0.6569, 0.7445],
            [-0.7056, 1.1146, -0.4409],
            [-0.2302, -1.1507, -1.3807]], requires_grad=True)], 'lr': 0.0001, 'nesterov': True, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0}]
    state_dict before step:
     {'state': {0: {'momentum_buffer': tensor([[1., 1.],
            [1., 1.]])}}, 'param_groups': [{'lr': 0.1, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [0]}, {'lr': 0.0001, 'nesterov': True, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'params': [1]}]}

    # Perform 50 step operations
    for _ in range(50):
        optimizer.step()
    # Print the current state
    print("state_dict after step:\n", optimizer.state_dict())

    state_dict after step:
     {'state': {0: {'momentum_buffer': tensor([[0.0052, 0.0052],
            [0.0052, 0.0052]])}}, 'param_groups': [{'lr': 0.1, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [0]}, {'lr': 0.0001, 'nesterov': True, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'params': [1]}]}

7 Training and evaluation

    def train(epoch):
        # Put the model in training mode
        model.train()
        train_loss = 0
        # Loop over all the data in the DataLoader
        for data, label in train_loader:
            # Move the data to the GPU for the computation that follows
            data, label = data.cuda(), label.cuda()
            # Zero the gradients held by the optimizer
            optimizer.zero_grad()
            # Feed the data to the model
            output = model(data)
            # Compute the loss (prediction first, then label)
            loss = criterion(output, label)
            # Backpropagate the loss through the network
            loss.backward()
            # Update the model parameters with the optimizer
            optimizer.step()
            # Accumulate the training loss
            train_loss += loss.item() * data.size(0)
        train_loss = train_loss / len(train_loader.dataset)
        print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))

    def val(epoch):
        # Put the model in evaluation mode
        model.eval()
        val_loss = 0
        running_accu = 0
        # Do not track gradients
        with torch.no_grad():
            for data, label in val_loader:
                data, label = data.cuda(), label.cuda()
                output = model(data)
                preds = torch.argmax(output, 1)
                loss = criterion(output, label)
                val_loss += loss.item() * data.size(0)
                # Count the correct predictions for the accuracy
                running_accu += torch.sum(preds == label.data)
        val_loss = val_loss / len(val_loader.dataset)
        print('Epoch: {} \tValidation Loss: {:.6f}'.format(epoch, val_loss))

PyTorch Study Notes (4): A Basic Hands-On Example

This basic hands-on case ties together the introductory PyTorch material covered so far. The task is to classify images of fashion items into 10 categories, using the FashionMNIST dataset (fashion-mnist/data/fashion at master · zalandoresearch/fashion-mnist · GitHub).
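The ten categories are identified by the integer labels 0 to 9. As a minimal sketch of how a vector of model scores can be turned into a readable class name: the label names below are the ones listed in the Fashion-MNIST repository, while `FASHION_MNIST_CLASSES`, `predict_class_name` and the example scores are hypothetical names and values introduced only for illustration. Plain Python is used so that the snippet runs without a GPU or a trained model.

```python
# Label names as listed in the Fashion-MNIST repository (list index = label).
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def predict_class_name(scores):
    """Return the name of the class with the highest score (hypothetical helper)."""
    if len(scores) != len(FASHION_MNIST_CLASSES):
        raise ValueError("expected one score per class")
    # Index of the largest score, i.e. an argmax over the 10 classes.
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return FASHION_MNIST_CLASSES[best_index]


# Made-up scores standing in for the 10 logits of one image.
example_scores = [0.1, 0.0, 0.2, 0.1, 0.3, 0.0, 0.2, 2.5, 0.1, 0.4]
print(predict_class_name(example_scores))
```

With a real model the scores would come from one row of the network output, for example after calling `.tolist()` on it.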
The FashionMNIST dataset comes already split into a training set and a test set: the training set contains 60,000 images and the test set 10,000 images. Every image is a single-channel grayscale image of 28*28 pixels belonging to one of 10 classes.

1. Import the required packages

    import os
    import numpy as np
    import pandas as pd
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torch.utils.data import Dataset, DataLoader

2 Configure the training environment and hyperparameters

    # Configure the GPU; there are two ways to do this
    ## Option 1: use os.environ
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    # Option 2: use "device"; afterwards call .to(device) on everything that should live on the GPU
    # device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")

    ## Other hyperparameters: batch_size, num_workers, learning rate and the total number of epochs
    batch_size = 256
    num_workers = 4   # Windows users should set this to 0, otherwise multi-process errors occur
    lr = 1e-4
    epochs = 20

3 Reading and loading the data

Two ways are shown here:
- Download and use a built-in dataset provided by PyTorch.
- Download the data stored in csv format from a website, read it in and convert it to the expected format.

The first way only works for common datasets such as MNIST and CIFAR10, for which PyTorch provides the download officially. It is typically used to test a method quickly, for example to check whether an idea works on MNIST.

The second way requires building your own Dataset, which is essential for applying PyTorch to your own work.

The data also need some transformations: for example, the images must be resized to a common size so that they can be fed to the network, and the data must be converted to Tensors.

    from torchvision import transforms

    # Set up the data transforms
    image_size = 28
    data_transform = transforms.Compose([
        transforms.ToPILImage(),  # Depends on how the data are read; not needed for the built-in dataset
        transforms.Resize(image_size),
        transforms.ToTensor()
    ])

    ## Way 1: use the dataset that ships with torchvision; the download may take a while
    from torchvision import datasets

    train_data = datasets.FashionMNIST(root='./', train=True, download=True, transform=data_transform)
    test_data = datasets.FashionMNIST(root='./', train=False, download=True, transform=data_transform)

    ## Way 2: read the csv data and build the Dataset class yourself (a custom dataset)
    # csv download link: https://www.kaggle.com/zalando-research/fashionmnist
    class FMDataset(Dataset):
        def __init__(self, df, transform=None):
            self.df = df
            self.transform = transform
            self.images = df.iloc[:, 1:].values.astype(np.uint8)
            self.labels = df.iloc[:, 0].values

        def __len__(self):
            return len(self.images)

        def __getitem__(self, idx):
            image = self.images[idx].reshape(28, 28, 1)
            label = int(self.labels[idx])
            if self.transform is not None:
                image = self.transform(image)
            else:
                image = torch.tensor(image / 255., dtype=torch.float)
            label = torch.tensor(label, dtype=torch.long)
            return image, label

    train_df = pd.read_csv("./FashionMNIST/fashion-mnist_train.csv")
    test_df = pd.read_csv("./FashionMNIST/fashion-mnist_test.csv")
    train_data = FMDataset(train_df, data_transform)
    test_data = FMDataset(test_df, data_transform)

Once the training and test datasets have been built, define the DataLoaders that load the data during training and testing.

    train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True, num_workers=num_workers, drop_last=True)
    test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=num_workers)

After reading the data we can visualize some of them, mainly to verify that they were read correctly.

    import matplotlib.pyplot as plt

    image, label = next(iter(train_loader))
    print(image.shape, label.shape)
    plt.imshow(image[0][0], cmap="gray")

    torch.Size([256, 1, 28, 28]) torch.Size([256])
    <matplotlib.image.AxesImage at 0x1b39b49fcc8>

4. Model design

    # Use a CNN
    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, 5),
                nn.ReLU(),
                nn.MaxPool2d(2, stride=2),
                nn.Dropout(0.3),
                nn.Conv2d(32, 64, 5),
                nn.ReLU(),
                nn.MaxPool2d(2, stride=2),
                nn.Dropout(0.3)
            )
            self.fc = nn.Sequential(
                nn.Linear(64*4*4, 512),
                nn.ReLU(),
                nn.Linear(512, 10)
            )

        def forward(self, x):
            x = self.conv(x)
            x = x.view(-1, 64*4*4)
            x = self.fc(x)
            # x = nn.functional.normalize(x)
            return x

    model = Net()
    model = model.cuda()

5 Setting the loss function and optimizer

- Use the CrossEntropy loss that ships with the torch.nn module.
- PyTorch automatically converts integer labels to one-hot form to compute the CE loss.
- The labels must start from 0, and the model has no softmax layer (the loss is computed on logits). This also shows that the parts of a PyTorch training setup are not independent and have to be considered together.

    # Use the cross-entropy loss
    criterion = nn.CrossEntropyLoss()
    # Use the Adam optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)

6. Training and testing

Each is wrapped in a function for later use. Pay attention to the main differences between the two:
- how the model state is set
- whether the optimizer needs to be initialized
- whether the loss needs to be propagated back through the network
- whether the optimizer is updated at every step

In addition, the classification accuracy can be computed during testing or validation.

- Training

    def train(epoch):
        # Put the model in training mode
        model.train()
        train_loss = 0
        # Loop over all the data in the DataLoader
        for data, label in train_loader:
            # Move the data to the GPU for the computation that follows
            data, label = data.cuda(), label.cuda()
            # Zero the gradients held by the optimizer
            optimizer.zero_grad()
            # Feed the data to the model
            output = model(data)
            # Compute the loss
            loss = criterion(output, label)
            # Backpropagate the loss through the network
            loss.backward()
            # Update the model parameters with the optimizer
            optimizer.step()
            # Accumulate the training loss
            train_loss += loss.item() * data.size(0)
        train_loss = train_loss / len(train_loader.dataset)
        print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))

- Validation

    def val(epoch):
        # Put the model in evaluation mode
        model.eval()
        val_loss = 0
        gt_labels = []
        pred_labels = []
        # Do not track gradients
        with torch.no_grad():
            for data, label in test_loader:
                data, label = data.cuda(), label.cuda()
                output = model(data)
                preds = torch.argmax(output, 1)
                gt_labels.append(label.cpu().data.numpy())
                pred_labels.append(preds.cpu().data.numpy())
                loss = criterion(output, label)
                val_loss += loss.item() * data.size(0)
        # Average loss over the validation set
        val_loss = val_loss / len(test_loader.dataset)
        gt_labels, pred_labels = np.concatenate(gt_labels), np.concatenate(pred_labels)
        # Accuracy
        acc = np.sum(gt_labels == pred_labels) / len(pred_labels)
        print('Epoch: {} \tValidation Loss: {:.6f}, Accuracy: {:6f}'.format(epoch, val_loss, acc))

    for epoch in range(1, epochs + 1):
        train(epoch)
        val(epoch)

    Epoch: 1    Training Loss: 0.664049
    Epoch: 1    Validation Loss: 0.421500, Accuracy: 0.852400
    Epoch: 2    Training Loss: 0.417311
    Epoch: 2    Validation Loss: 0.349790, Accuracy: 0.871200
    Epoch: 3    Training Loss: 0.355448
    Epoch: 3    Validation Loss: 0.318987, Accuracy: 0.879500
    Epoch: 4    Training Loss: 0.323644
    Epoch: 4    Validation Loss: 0.290521, Accuracy: 0.893800
    Epoch: 5    Training Loss: 0.301900
    Epoch: 5    Validation Loss: 0.266420, Accuracy: 0.901300
    Epoch: 6    Training Loss: 0.286696
    Epoch: 6    Validation Loss: 0.246448, Accuracy: 0.909700
    Epoch: 7    Training Loss: 0.271441
    Epoch: 7    Validation Loss: 0.241845, Accuracy: 0.911200
    Epoch: 8    Training Loss: 0.260185
    Epoch: 8    Validation Loss: 0.243311, Accuracy: 0.910800
    Epoch: 9    Training Loss: 0.247986
    Epoch: 9    Validation Loss: 0.225896, Accuracy: 0.916200
    Epoch: 10   Training Loss: 0.240718
    Epoch: 10   Validation Loss: 0.227848, Accuracy: 0.914700
    Epoch: 11   Training Loss: 0.232358
    Epoch: 11   Validation Loss: 0.220180, Accuracy: 0.917500
    Epoch: 12   Training Loss: 0.223933
    Epoch: 12   Validation Loss: 0.215308, Accuracy: 0.919400
    Epoch: 13   Training Loss: 0.218354
    Epoch: 13   Validation Loss: 0.211890, Accuracy: 0.919300
    Epoch: 14   Training Loss: 0.210027
    Epoch: 14   Validation Loss: 0.209707, Accuracy: 0.922700
    Epoch: 15   Training Loss: 0.203024
    Epoch: 15   Validation Loss: 0.208233, Accuracy: 0.925600
    Epoch: 16   Training Loss: 0.196965
    Epoch: 16   Validation Loss: 0.208209, Accuracy: 0.921900
    Epoch: 17   Training Loss: 0.193155
    Epoch: 17   Validation Loss: 0.200000, Accuracy: 0.926100
    Epoch: 18   Training Loss: 0.184376
    Epoch: 18   Validation Loss: 0.197259, Accuracy: 0.926200
    Epoch: 19   Training Loss: 0.184272
    Epoch: 19   Validation Loss: 0.200259, Accuracy: 0.926000
    Epoch: 20   Training Loss: 0.172641
    Epoch: 20   Validation Loss: 0.200177, Accuracy: 0.927100

7. Saving the model

After training, torch.save can store either the model parameters or the whole model. The model can also be saved during training.

    save_path = "./FahionModel.pkl"
    torch.save(model, save_path)

PyTorch Study Notes (5): Defining, Modifying and Saving Models

I. Ways of defining a PyTorch model

- The Module class is the model construction class provided by the torch.nn module (nn.Module). It is the base class of all neural network modules, and we can inherit from it to define the model we want.
- A PyTorch model definition should contain two main parts: the initialization of the individual components (__init__) and the definition of the data flow (forward).

Based on nn.Module, a PyTorch model can be defined in three ways: Sequential, ModuleList and ModuleDict.

1. Sequential

The corresponding module is nn.Sequential().

When the forward computation of a model simply chains the computations of the individual layers, the Sequential class offers a simpler way to define the model. It accepts either an ordered dictionary (OrderedDict) of submodules or a series of submodules as arguments and adds the Module instances one by one; the forward computation of the model then runs these instances in the order in which they were added. The following re-implementation of Sequential helps to understand how it works:

    from collections import OrderedDict

    class MySequential(nn.Module):
        def __init__(self, *args):
            super(MySequential, self).__init__()
            if len(args) == 1 and isinstance(args[0], OrderedDict):  # An OrderedDict was passed in
                for key, module in args[0].items():
                    self.add_module(key, module)  # add_module adds module to self._modules (an OrderedDict)
            else:  # Some Modules were passed in
                for idx, module in enumerate(args):
                    self.add_module(str(idx), module)

        def forward(self, input):
            # self._modules is an OrderedDict, so members are visited in the order they were added
            for module in self._modules.values():
                input = module(input)
            return input

Now let us see how to define a model with Sequential. All that is needed is to list the layers of the model in order. Depending on how the layers are named, there are two ways to list them:

- Listing them directly

    import torch.nn as nn
    net = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )
    print(net)

    Sequential(
      (0): Linear(in_features=784, out_features=256, bias=True)
      (1): ReLU()
      (2): Linear(in_features=256, out_features=10, bias=True)
    )

- Using an OrderedDict

    import collections
    import torch.nn as nn
    net2 = nn.Sequential(collections.OrderedDict([
        ('fc1', nn.Linear(784, 256)),
        ('relu1', nn.ReLU()),
        ('fc2', nn.Linear(256, 10))
    ]))
    print(net2)

    Sequential(
      (fc1):
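      Linear(in_features=784, out_features=256, bias=True)
      (relu1): ReLU()
      (fc2): Linear(in_features=256, out_features=10, bias=True)
    )

As can be seen, the benefit of defining a model with Sequential is that it is simple and readable, and no forward has to be written because the order is already fixed. Sequential does, however, make the model definition less flexible: when an external input has to be added in the middle of the model, for example, Sequential is not a good fit. Choose according to the actual requirements.

2. ModuleList

The corresponding module is nn.ModuleList().

ModuleList takes a list of submodules (or layers, which must belong to the nn.Module class) as input and supports append and extend operations like a List. The weights of the submodules or layers are added to the network automatically.

    net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()])
    net.append(nn.Linear(256, 10))  # append, as for a List
    print(net[-1])  # index access, as for a List
    print(net)

    Linear(in_features=256, out_features=10, bias=True)
    ModuleList(
      (0): Linear(in_features=784, out_features=256, bias=True)
      (1): ReLU()
      (2): Linear(in_features=256, out_features=10, bias=True)
    )

Note in particular that nn.ModuleList does not define a network; it only stores different modules together. The order of the elements in a ModuleList does not determine their actual position in the network: the model definition is complete only after the forward function has specified the order of the layers. In practice a for loop does the job:

    class model(nn.Module):
        def __init__(self, ...):
            self.modulelist = ...
            ...

        def forward(self, x):
            for layer in self.modulelist:
                x = layer(x)
            return x

3. ModuleDict

The corresponding module is nn.ModuleDict().

ModuleDict works like ModuleList, except that ModuleDict makes it easier to give names to the layers of the network.

    net = nn.ModuleDict({
        'linear': nn.Linear(784, 256),
        'act': nn.ReLU(),
    })
    net['output'] = nn.Linear(256, 10)  # add
    print(net['linear'])  # access
    print(net.output)
    print(net)

    Linear(in_features=784, out_features=256, bias=True)
    Linear(in_features=256, out_features=10, bias=True)
    ModuleDict(
      (act): ReLU()
      (linear): Linear(in_features=784, out_features=256, bias=True)
      (output): Linear(in_features=256, out_features=10, bias=True)
    )

Comparison of the three approaches:
- Sequential is suited to quick experiments, since __init__ and forward do not both have to be written.
- ModuleList and ModuleDict are very convenient when an identical layer has to appear many times: one line can stand in for many.
- When information from earlier layers is needed, for example in the residual computation of ResNets where the result of the current layer is merged with the result of an earlier layer, ModuleList/ModuleDict are usually more convenient.

II. Building complex networks quickly from model blocks

The basic way to build a model: analyze the model blocks, implement the model blocks, and assemble the model from the blocks.

Take U-Net as an example. It is a segmentation model that uses skip connections between the encoder and decoder sides; the notes credit this structure with easing the degradation problem in model learning, so that deeper networks can be trained.

- Analysis of the model blocks:
1. The two convolutions inside each sub-block (DoubleConv)
2. The downsampling connection between blocks on the left side (Down), implemented with max pooling
3. The upsampling connection between blocks on the right side (Up)
4. The processing of the output layer (OutConv)
5. The lateral connections between blocks, the connection between the input and the bottom of the U-Net, and similar computations; these individual operations can be implemented in the forward function

- Implementation of the model blocks (U-Net as the example)

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Two convolutions: conv 3x3, ReLU
    class DoubleConv(nn.Module):
        """(convolution => [BN] => ReLU) * 2"""
        def __init__(self, in_channels, out_channels, mid_channels=None):
            super().__init__()
            if not mid_channels:
                mid_channels = out_channels
            self.double_conv = nn.Sequential(
                nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(mid_channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True)
            )

        def forward(self, x):
            return self.double_conv(x)

    # Downsampling: max pool 2x2
    class Down(nn.Module):
        """Downscaling with maxpool then double conv"""
        def __init__(self, in_channels, out_channels):
            super().__init__()
            self.maxpool_conv = nn.Sequential(
                nn.MaxPool2d(2),
                DoubleConv(in_channels, out_channels)
            )

        def forward(self, x):
            return self.maxpool_conv(x)

    # Upsampling: up-conv 2x2
    class Up(nn.Module):
        """Upscaling then double conv"""
        def __init__(self, in_channels, out_channels, bilinear=True):
            super().__init__()
            # if bilinear, use the normal convolutions to reduce the number of channels
            if bilinear:
                self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
                self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)
            else:
                self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
                self.conv = DoubleConv(in_channels, out_channels)

        def forward(self, x1, x2):
            x1 = self.up(x1)
            # input is CHW
            diffY = x2.size()[2] - x1.size()[2]
            diffX = x2.size()[3] - x1.size()[3]
            x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
                            diffY // 2, diffY - diffY // 2])
            x = torch.cat([x2, x1], dim=1)
            return self.conv(x)

    # Output: conv 1x1
    class OutConv(nn.Module):
        def __init__(self, in_channels, out_channels):
            super(OutConv, self).__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

        def forward(self, x):
            return self.conv(x)

- Assembling the U-Net model from the blocks

    class UNet(nn.Module):
        def __init__(self, n_channels, n_classes, bilinear=True):
            super(UNet, self).__init__()
            self.n_channels = n_channels
            self.n_classes = n_classes
            self.bilinear = bilinear
            self.inc = DoubleConv(n_channels, 64)
            self.down1 = Down(64, 128)
            self.down2 = Down(128, 256)
            self.down3 = Down(256, 512)
            factor = 2 if bilinear else 1
            self.down4 = Down(512, 1024 // factor)
            self.up1 = Up(1024, 512 // factor, bilinear)
            self.up2 = Up(512, 256 // factor, bilinear)
            self.up3 = Up(256, 128 // factor, bilinear)
            self.up4 = Up(128, 64, bilinear)
            self.outc = OutConv(64, n_classes)

        def forward(self, x):
            x1 = self.inc(x)
            x2 = self.down1(x1)
            x3 = self.down2(x2)
            x4 = self.down3(x3)
            x5 = self.down4(x4)
            x = self.up1(x5, x4)
            x = self.up2(x, x3)
            x = self.up3(x, x2)
            x = self.up4(x, x1)
            logits = self.outc(x)
            return logits

III. Modifying a PyTorch model

1. Model layers

Take ResNet50, a model predefined in PyTorch's torchvision library, as an example. Its structure is as follows:

    import torchvision.models as models
    net = models.resnet50()
    print(net)

    ResNet(
      (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
      (layer1): Sequential(
        (0): Bottleneck(
          (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (downsample): Sequential(
            (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
            (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
    ..............
      (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
      (fc): Linear(in_features=2048, out_features=1000, bias=True)
    )

To match the weights pretrained on ImageNet, the final fully connected layer (fc) has 1000 output nodes. Suppose we want to use this resnet model for a 10-class problem: we then have to modify the fc layer and replace its number of output nodes with 10. Suppose also that we think one fully connected layer is too few and want to add another. This can be done as follows:

    from collections import OrderedDict

    classifier = nn.Sequential(OrderedDict([
        ('fc1', nn.Linear(2048, 128)),
        ('relu1', nn.ReLU()),
        ('dropout1', nn.Dropout(0.5)),
        ('fc2', nn.Linear(128, 10)),
        ('output', nn.Softmax(dim=1))
    ]))
    net.fc = classifier

This replaces the last layer of the model net, named "fc", with the structure named "classifier", which we defined ourselves using the Sequential + OrderedDict style of model definition. The model can now be used for the 10-class task.

2. Adding an external input

Sometimes, during training, the model needs additional information besides its existing input. In a CNN, for example, we may want to feed in other information belonging to an image along with the image itself; an extra input variable then has to be added to the existing CNN.

The basic idea is to treat the part of the original model before the point where the input is added as one unit, and to define in forward how the unchanged part of the original model, the added input and the subsequent layers are connected.

We take the resnet50 model from torchvision as the basis, and the task is still 10-class classification. The difference is that we want to reuse the existing model structure and add an extra input variable add_variable at the second-to-last layer to assist the prediction. The implementation is as follows:

    class Model(nn.Module):
        def __init__(self, net):
            super(Model, self).__init__()
            self.net = net
            self.relu = nn.ReLU()
            self.dropout = nn.Dropout(0.5)
            self.fc_add = nn.Linear(1001, 10, bias=True)
            self.output = nn.Softmax(dim=1)

        def forward(self, x, add_variable):
            x = self.net(x)
            x = torch.cat((self.dropout(self.relu(x)), add_variable.unsqueeze(1)), 1)
            x = self.fc_add(x)
            x = self.output(x)
            return x

The key point of this implementation is that torch.cat concatenates the tensors. The resnet50 in torchvision outputs a 1000-dimensional tensor. By modifying the forward function (and defining some matching layers), we first pass this 1000-dimensional tensor through the activation layer and the dropout layer, then concatenate it with the external input variable "add_variable", and finally map it to the required output dimension of 10 with a fully connected layer.

The unsqueeze operation on add_variable keeps its number of dimensions consistent with the tensor output by net. It is typically needed when add_variable is a single number (scalar) per sample: its shape is then (batch_size, ), and a dimension of size 1 has to be added in second position before torch.cat can be applied. For the unsqueeze operation, see section 2.1 and its accompanying code.

The modified model structure can then be instantiated and used:

    import torchvision.models as models
    net = models.resnet50()
    model = Model(net).cuda()

Also remember that two inputs have to be passed during training:

    outputs = model(inputs, add_var)

3. Adding an extra output

Sometimes, besides the final output of the model, we need the result of an intermediate layer, in order to apply extra supervision and obtain better intermediate results. The basic idea is to change what the forward function in the model definition returns.

We again use resnet50 for the 10-class task. On top of the already defined model structure, both the 1000-dimensional second-to-last layer and the 10-dimensional last layer are returned. The implementation is as follows:

    class Model(nn.Module):
        def __init__(self, net):
            super(Model, self).__init__()
            self.net = net
            self.relu = nn.ReLU()
            self.dropout = nn.Dropout(0.5)
            self.fc1 = nn.Linear(1000, 10, bias=True)
            self.output = nn.Softmax(dim=1)

        def forward(self, x, add_variable):
            x1000 = self.net(x)
            x10 = self.dropout(self.relu(x1000))
            x10 = self.fc1(x10)
            x10 = self.output(x10)
            return x10, x1000

The modified model structure can then be instantiated and used:

    import torchvision.models as models
    net = models.resnet50()
    model = Model(net).cuda()
    # Also remember that there are two outputs during training
    out10, out1000 = model(inputs, add_var)

IV. Saving and loading PyTorch models

1. Model storage formats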
PyTorch models are mainly stored in three formats: pkl, pt and pth. In practical use there is no difference between them, so they are not discussed in detail here. The references at the end of this section list some material on the topic for readers who want to look further; comments are welcome.

3. What is stored

A PyTorch model consists of two parts: the model structure and the weights. The model is a class that inherits from nn.Module; the weights are stored in a dictionary whose keys are layer names and whose values are weight tensors. Accordingly there are two ways to store a model: store the whole model (structure and weights), or store only the weights.

import torch
from torchvision import models
model = models.resnet152(pretrained=True)

# save_dir is the path of the checkpoint file and must be defined by the user
# Save the whole model
torch.save(model, save_dir)
# Save only the model weights
torch.save(model.state_dict(), save_dir)

In PyTorch, the pt, pth and pkl formats all support storing either the weights or the whole model, with no difference in use.

Saving and loading the whole model:

torch.save(model, save_dir)
loaded_model = torch.load(save_dir)
loaded_model.cuda()

Saving and loading the model weights:

torch.save(model.state_dict(), save_dir)
loaded_dict = torch.load(save_dir)
loaded_model = models.resnet152()  # the model structure has to be defined first
loaded_model.load_state_dict(loaded_dict)
loaded_model.cuda()

PyTorch Study Notes (6): Advanced PyTorch training techniques

import torch
import torch.nn as nn
import torch.nn.functional as F

1 Custom loss functions

Defined as a function: compute the loss from the output and the target and return it.
Defined as a class: inherit from nn.Module and treat the loss as one more layer of the network.

Taking the Dice loss as an example, the Dice coefficient is defined as

DSC = \frac{2|X \cap Y|}{|X| + |Y|}

class DiceLoss(nn.Module):
    def __init__(self, weight=None, size_average=True):
        super(DiceLoss, self).__init__()

    def forward(self, inputs, targets, smooth=1):
        inputs = torch.sigmoid(inputs)  # F.sigmoid is deprecated
        inputs = inputs.view(-1)
        targets = targets.view(-1)
        intersection = (inputs * targets).sum()
        dice = (2. * intersection + smooth) / (inputs.sum() + targets.sum() + smooth)
        return 1 - dice

2 Adjusting the learning rate dynamically

Scheduler: a learning-rate decay strategy that addresses the problem of choosing a learning rate and helps improve accuracy.

Scheduler strategies provided by PyTorch: lr_scheduler.LambdaLR, lr_scheduler.MultiplicativeLR, lr_scheduler.StepLR, lr_scheduler.MultiStepLR, lr_scheduler.ExponentialLR, lr_scheduler.CosineAnnealingLR, lr_scheduler.ReduceLROnPlateau, lr_scheduler.CyclicLR, lr_scheduler.OneCycleLR, lr_scheduler.CosineAnnealingWarmRestarts.

Usage note: scheduler.step() must be called after optimizer.step().
Custom scheduler: modify the learning rate with a user-defined function.

3 Model fine-tuning

Concept: take a model already trained on a similar task, adjust its parameters and train it on your own data.

The fine-tuning workflow:
1. Pretrain a neural network (the source model) on the source dataset.
2. Create a new neural network (the target model) that copies all of the source model's design and parameters except the output layer.
3. Add to the target model an output layer whose size is the number of classes in the target dataset, and initialize the parameters of that layer randomly.
4. Train the target model on the target dataset.

Using an existing model structure: the pretrained argument decides whether pretrained weights are loaded.
Training specific layers: set requires_grad=False to freeze part of the network, so that gradients are computed only for the newly initialized layers.

def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False

import torchvision.models as models
# Freeze the gradients of the parameters
feature_extract = True
model = models.resnet50(pretrained=True)
set_parameter_requires_grad(model, feature_extract)
# Modify the model; for resnet50, model.fc.in_features is 2048
num_ftrs = model.fc.in_features
model.fc = nn.Linear(in_features=num_ftrs, out_features=4, bias=True)
model.fc

Output:

Linear(in_features=2048, out_features=4, bias=True)

Note: during training the model still backpropagates gradients, but parameter updates happen only in the fc layer.

4 Half-precision training

Advantage of half precision: lower GPU memory use, so more data can be loaded onto the GPU at once.
Setting up half-precision training: import autocast from torch.cuda.amp, decorate the forward function in the model definition with autocast, and during training place the step that feeds data into the model, and everything after it, inside "with autocast():".
Scope: suited to datasets whose individual samples are large, such as 3D images and video.

5 Summary

Custom loss functions can be written in two ways, as a function or as a class; it is advisable to use PyTorch's tensor operations throughout.
The learning rate can be adjusted dynamically with PyTorch's schedulers, and custom schedulers are also supported.
Model fine-tuning mainly takes an existing pretrained model, adjusts its parameters to build the target model, and trains it on the target dataset.
Half-precision training is mainly suited to datasets whose samples are large, such as 3D images and video.

PyTorch Study Notes (7): PyTorch visualization

1 Visualizing the network structure

Printing basic model information: print() only shows the basic building blocks; it does not show the output shape or the parameter count of each layer.

import torchvision.models as models
model = models.resnet18()
print(model)

Output:

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=1000, bias=True)
)

Visualizing the network structure: the torchinfo library prints the structure of a model in more detail, including module information (the type, output shape and parameter count of every layer), the total parameter count, the model size, and the memory needed for one forward or backward pass.

import torchvision.models as models
from torchinfo import summary
resnet18 = models.resnet18()  # instantiate the model
# batch_size is 1, the image has 3 channels, and its height and width are 224
summary(resnet18, (1, 3, 224, 224))

Output:

Layer (type:depth-idx)              Output Shape          Param #
ResNet                              --                    --
├─Conv2d: 1-1                       [1, 64, 112, 112]     9,408
├─BatchNorm2d: 1-2                  [1, 64, 112, 112]     128
├─ReLU: 1-3                         [1, 64, 112, 112]     --
├─MaxPool2d: 1-4                    [1, 64, 56, 56]       --
├─Sequential: 1-5                   [1, 64, 56, 56]       --
│ └─BasicBlock: 2-1                 [1, 64, 56, 56]       --
│ │ └─Conv2d: 3-1                   [1, 64, 56, 56]       36,864
│ │ └─BatchNorm2d: 3-2              [1, 64, 56, 56]       128
│ │ └─ReLU: 3-3                     [1, 64, 56, 56]       --
│ │ └─Conv2d: 3-4                   [1, 64, 56, 56]       36,864
│ │ └─BatchNorm2d: 3-5              [1, 64, 56, 56]       128
│ │ └─ReLU: 3-6                     [1, 64, 56, 56]       --
│ └─BasicBlock: 2-2                 [1, 64, 56, 56]       --
│ │ └─Conv2d: 3-7                   [1, 64, 56, 56]       36,864
│ │ └─BatchNorm2d: 3-8              [1, 64, 56, 56]       128
│ │ └─ReLU: 3-9                     [1, 64, 56, 56]       --
│ │ └─Conv2d: 3-10                  [1, 64, 56, 56]       36,864
│ │ └─BatchNorm2d: 3-11             [1, 64, 56, 56]       128
│ │ └─ReLU: 3-12                    [1, 64, 56, 56]       --
├─Sequential: 1-6                   [1, 128, 28, 28]      --
│ └─BasicBlock: 2-3                 [1, 128, 28, 28]      --
│ │ └─Conv2d: 3-13                  [1, 128, 28, 28]      73,728
│ │ └─BatchNorm2d: 3-14             [1, 128, 28, 28]      256
│ │ └─ReLU: 3-15                    [1, 128, 28, 28]      --
│ │ └─Conv2d: 3-16                  [1, 128, 28, 28]      147,456
│ │ └─BatchNorm2d: 3-17             [1, 128, 28, 28]      256
│ │ └─Sequential: 3-18              [1, 128, 28, 28]      8,448
│ │ └─ReLU: 3-19                    [1, 128, 28, 28]      --
│ └─BasicBlock: 2-4                 [1, 128, 28, 28]      --
│ │ └─Conv2d: 3-20                  [1, 128, 28, 28]      147,456
│ │ └─BatchNorm2d: 3-21             [1, 128, 28, 28]      256
│ │ └─ReLU: 3-22                    [1, 128, 28, 28]      --
│ │ └─Conv2d: 3-23                  [1, 128, 28, 28]      147,456
│ │ └─BatchNorm2d: 3-24             [1, 128, 28, 28]      256
│ │ └─ReLU: 3-25                    [1, 128, 28, 28]      --
├─Sequential: 1-7                   [1, 256, 14, 14]      --
│ └─BasicBlock: 2-5                 [1, 256, 14, 14]      --
│ │ └─Conv2d: 3-26                  [1, 256, 14, 14]      294,912
│ │ └─BatchNorm2d: 3-27             [1, 256, 14, 14]      512
│ │ └─ReLU: 3-28                    [1, 256, 14, 14]      --
│ │ └─Conv2d: 3-29                  [1, 256, 14, 14]      589,824
│ │ └─BatchNorm2d: 3-30             [1, 256, 14, 14]      512
│ │ └─Sequential: 3-31              [1, 256, 14, 14]      33,280
│ │ └─ReLU: 3-32                    [1, 256, 14, 14]      --
│ └─BasicBlock: 2-6                 [1, 256, 14, 14]      --
│ │ └─Conv2d: 3-33                  [1, 256, 14, 14]      589,824
│ │ └─BatchNorm2d: 3-34             [1, 256, 14, 14]      512
│ │ └─ReLU: 3-35                    [1, 256, 14, 14]      --
│ │ └─Conv2d: 3-36                  [1, 256, 14, 14]      589,824
│ │ └─BatchNorm2d: 3-37             [1, 256, 14, 14]      512
│ │ └─ReLU: 3-38                    [1, 256, 14, 14]      --
├─Sequential: 1-8                   [1, 512, 7, 7]        --
│ └─BasicBlock: 2-7                 [1, 512, 7, 7]        --
│ │ └─Conv2d: 3-39                  [1, 512, 7, 7]        1,179,648
│ │ └─BatchNorm2d: 3-40             [1, 512, 7, 7]        1,024
│ │ └─ReLU: 3-41                    [1, 512, 7, 7]        --
│ │ └─Conv2d: 3-42                  [1, 512, 7, 7]        2,359,296
│ │ └─BatchNorm2d: 3-43             [1, 512, 7, 7]        1,024
│ │ └─Sequential: 3-44              [1, 512, 7, 7]        132,096
│ │ └─ReLU: 3-45                    [1, 512, 7, 7]        --
│ └─BasicBlock: 2-8                 [1, 512, 7, 7]        --
│ │ └─Conv2d: 3-46                  [1, 512, 7, 7]        2,359,296
│ │ └─BatchNorm2d: 3-47             [1, 512, 7, 7]        1,024
│ │ └─ReLU: 3-48                    [1, 512, 7, 7]        --
│ │ └─Conv2d: 3-49                  [1, 512, 7, 7]        2,359,296
│ │ └─BatchNorm2d: 3-50             [1, 512, 7, 7]        1,024
│ │ └─ReLU: 3-51                    [1, 512, 7, 7]        --
├─AdaptiveAvgPool2d: 1-9            [1, 512, 1, 1]        --
├─Linear: 1-10                      [1, 1000]             513,000

Total params: 11,689,512
Trainable params: 11,689,512
Non-trainable params: 0
Total mult-adds (G): 1.81

Input size (MB): 0.60
Forward/backward pass size (MB): 39.75
Params size (MB): 46.76
Estimated Total Size (MB): 87.11

2 CNN visualization

Visualizing CNN kernels:

model = models.vgg11(pretrained=True)
dict(model.features.named_children())

Output:

{'0': Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '1': ReLU(inplace=True),
 '2': MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
 '3': Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '4': ReLU(inplace=True),
 '5': MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
 '6': Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '7': ReLU(inplace=True),
 '8': Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '9': ReLU(inplace=True),
 '10': MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
 '11': Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '12': ReLU(inplace=True),
 '13': Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '14': ReLU(inplace=True),
 '15': MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
 '16': Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '17': ReLU(inplace=True),
 '18': Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
 '19': ReLU(inplace=True),
 '20': MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)}

import matplotlib.pyplot as plt

conv1 = dict(model.features.named_children())['3']  # parameters of the conv layer named '3'
kernel_set = conv1.weight.detach()
num = len(conv1.weight.detach())
print(kernel_set.shape)
# Only the kernels of one output channel are plotted; layer '3' has 128*64 kernels in total
for i in range(0, 1):
    i_kernel = kernel_set[i]
    plt.figure(figsize=(20, 17))
    if (len(i_kernel)) > 1:
        for idx, filer in enumerate(i_kernel):
            plt.subplot(9, 9, idx + 1)
            plt.axis('off')
            plt.imshow(filer[:, :].detach(), cmap='bwr')

Output:

torch.Size([128, 64, 3, 3])

Visualizing CNN feature maps: use the hook mechanism provided by PyTorch to capture the feature maps produced during the forward pass.

CNN class activation
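map (CAM) visualization: used to judge which pixels of an image matter for the prediction; the grad-cam library can be used for this.
Quick CNN visualization with FlashTorch: the flashtorch library can visualize gradients and convolution kernels.

3 Visualizing the training process with TensorBoard

Basic logic: TensorBoard records the feature maps and weights of each layer, the training loss and so on, saves them in a folder chosen by the user, and displays them as a web page.

Visualizing the model structure: use the add_graph method to show the model structure in TensorBoard.

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5)
        self.adaptive_pool = nn.AdaptiveMaxPool2d((1, 1))
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(64, 32)
        self.relu = nn.ReLU()
        self.linear2 = nn.Linear(32, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool(x)
        x = self.conv2(x)
        x = self.pool(x)
        x = self.adaptive_pool(x)
        x = self.flatten(x)
        x = self.linear1(x)
        x = self.relu(x)
        x = self.linear2(x)
        y = self.sigmoid(x)
        return y

model = Net()
print(model)

Output:

Net(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1))
  (adaptive_pool): AdaptiveMaxPool2d(output_size=(1, 1))
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear1): Linear(in_features=64, out_features=32, bias=True)
  (relu): ReLU()
  (linear2): Linear(in_features=32, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('./runs')
writer.add_graph(model, input_to_model=torch.rand(1, 3, 224, 224))
writer.close()

Run tensorboard --logdir=./runs in the current directory to open the TensorBoard page and view the model structure.

Visualizing images: use add_image for a single image and add_images for several. Sometimes torchvision.utils.make_grid is used to tile several images into one, which is then shown with writer.add_image.

Visualizing continuous variables: use add_scalar to show how a continuous or time-series variable changes.

for i in range(500):
    x = i
    y = x ** 2
    writer.add_scalar("x", x, i)  # record the value of x at step i
    writer.add_scalar("y", y, i)  # record the value of y at step i
writer.close()

Visualizing parameter distributions: use add_histogram to show the distribution of parameters or variables.

import numpy as np

# Create normally distributed tensors to simulate a parameter matrix
def norm(mean, std):
    t = std * torch.randn((100, 20)) + mean
    return t

for step, mean in enumerate(range(-10, 10, 1)):
    w = norm(mean, 1)
    writer.add_histogram("w", w, step)
    writer.flush()
writer.close()

4 Summary

This part covered PyTorch visualization: visualizing the network structure, visualizing CNN convolution layers, and visualizing the training process with TensorBoard.
The torchinfo library shows the network structure with module information (the type, output shape and parameter count of every layer), the total parameter count, the model size, and the memory needed for one forward or backward pass.
The grad-cam library highlights the important pixels, which makes it quick to locate important regions for interpretability analysis or model improvement.
With TensorBoard, training records are created through the corresponding methods to visualize the model structure, images, continuous variables and parameter distributions.

PyTorch Study Notes (8): The PyTorch ecosystem

I. torchvision (images)

1. torchvision.datasets

Common computer vision datasets, including CIFAR, EMNIST, Fashion-MNIST and others. torchvision 0.10.0 ships the following datasets:

Caltech, CelebA, CIFAR, Cityscapes, EMNIST, FakeData, Fashion-MNIST, Flickr, ImageNet, Kinetics-400, KITTI, KMNIST, PhotoTour, Places365, QMNIST, SBD, SEMEION, STL10, SVHN, UCF101, VOC, WIDERFace

2. torchvision.transforms

Data preprocessing methods, such as enlarging or shrinking an image and flipping it horizontally or vertically.

from torchvision import transforms

# image_size is the target size and must be defined by the user
data_transform = transforms.Compose([
    transforms.ToPILImage(),  # depends on how the data is read later; not needed with built-in datasets
    transforms.Resize(image_size),
    transforms.ToTensor()
])

3. torchvision.models

Pretrained models for image classification, semantic segmentation, object detection, instance segmentation, person keypoint detection, video classification and more.

To make training more efficient and avoid unnecessary repeated work, PyTorch officially provides a number of pretrained models; the torchvision documentation lists those currently available. The notes below describe them using torchvision 0.10.0 as the reference version. For more pretrained models, the pretrained-models.pytorch repository can be used. The available pretrained models fall into the following groups.

Classification

For image classification, PyTorch officially provides the following models, and the list keeps growing:

AlexNet, VGG, ResNet, SqueezeNet, DenseNet, Inception v3, GoogLeNet, ShuffleNet v2, MobileNetV2, MobileNetV3, ResNext, Wide ResNet, MNASNet, EfficientNet, RegNet (list still being extended)

These models are pretrained on ImageNet-1k; their use is described later. The torchvision documentation also reports their accuracy on ImageNet-1k.

Semantic Segmentation

The pretrained semantic segmentation models are trained on a subset of COCO train2017 and cover the following classes (background plus 20 object categories): background, aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor.

FCN ResNet50, FCN ResNet101, DeepLabV3 ResNet50, DeepLabV3 ResNet101, LR-ASPP MobileNetV3-Large, DeepLabV3 MobileNetV3-Large (list not exhaustive)

The torchvision documentation reports the mean IoU and global pixelwise accuracy of these pretrained models.

Object Detection, Instance Segmentation and Keypoint Detection

The models for object detection, instance segmentation and person keypoint detection are likewise trained on COCO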
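train2017. The category names for instance segmentation and for person keypoint detection are listed below:

COCO_INSTANCE_CATEGORY_NAMES = [
    '__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
    'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A', 'stop sign',
    'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
    'elephant', 'bear', 'zebra', 'giraffe', 'N/A', 'backpack', 'umbrella', 'N/A', 'N/A',
    'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
    'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
    'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
    'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
    'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'N/A', 'dining table',
    'N/A', 'N/A', 'toilet', 'N/A', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
    'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book',
    'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'
]

COCO_PERSON_KEYPOINT_NAMES = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
    'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle'
]

Faster R-CNN, Mask R-CNN, RetinaNet, SSDlite, SSD (list not exhaustive)

The torchvision documentation likewise reports the box AP, keypoint AP and mask AP of these models on COCO train2017.

Video classification

The video classification models are pretrained on Kinetics-400:

ResNet 3D 18, ResNet MC 18, ResNet (2+1)D (list not exhaustive)

The torchvision documentation also reports the evaluation results of these models.

4. torchvision.io

I/O operations for video, images and files, including reading, writing, encoding and decoding.

5. torchvision.ops

Operators specific to computer vision, including but not limited to NMS, RoIAlign (a method used in Mask R-CNN) and RoIPool (a method used in Fast R-CNN).

6. torchvision.utils

Operations such as tiling images and visualizing detection and segmentation results. torchvision.utils provides visualization helpers that tile several images together and draw detection and segmentation results; see the torchvision documentation for the individual functions.

Overall, torchvision takes care of much of the repetitive and time-consuming work in common computer vision tasks. It greatly reduces the effort needed for obtaining datasets, augmenting data and pretraining models, so that computer vision tasks can be picked up more quickly.

2 PyTorchVideo (video)

Introduction: PyTorchVideo is a deep learning library focused on video understanding. It provides reusable, modular and efficient components for accelerating video understanding research. It is developed with PyTorch and supports a range of deep learning video components, such as video models, video datasets and video-specific transforms.
Features: built on PyTorch, provides a Model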
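Zoo, supports data preprocessing and common datasets, uses a modular design, supports multiple modalities, and is optimized for mobile deployment.
Ways to use it: TorchHub, PySlowFast, PyTorch Lightning.

3 torchtext (text)

Main components of torchtext

torchtext makes it convenient to preprocess text, for example truncating and padding sequences or building a vocabulary. It mainly consists of the following parts:

Data processing tools: torchtext.data.functional, torchtext.data.utils
Datasets: torchtext.datasets
Vocabulary tools: torchtext.vocab
Evaluation metrics: torchtext.data.metrics

Introduction: torchtext is PyTorch's toolkit for natural language processing (NLP); it preprocesses text, for example by truncating and padding sequences or building a vocabulary.
Building a dataset: use the Field class to define the different types of data.
Evaluation metrics: use the methods under torchtext.data.metrics to evaluate NLP tasks.

References for this section: the official torchtext documentation; atnlp/torchtext-summary.

transforms in practice

from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
# Jupyter notebook magic; remove the next line when running as a plain script
%matplotlib inline

# Load the original image
img = Image.open('./lenna.jpg')
print(img.size)
plt.imshow(img)

## transforms.CenterCrop(size)
# Crop the given image around its center
# Target larger than the image: the area outside the image is filled with 0
img_centercrop1 = transforms.CenterCrop((500, 500))(img)
print(img_centercrop1.size)
# Target smaller than the image: everything outside the target size is discarded
img_centercrop2 = transforms.CenterCrop((224, 224))(img)
print(img_centercrop2.size)
plt.subplot(1, 3, 1), plt.imshow(img), plt.title('Original')
plt.subplot(1, 3, 2), plt.imshow(img_centercrop1), plt.title('500 * 500')
plt.subplot(1, 3, 3), plt.imshow(img_centercrop2), plt.title('224 * 224')
plt.show()

## transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)
# Change the brightness, contrast, saturation and hue of the image
img_CJ = transforms.ColorJitter(brightness=1, contrast=0.5, saturation=0.5, hue=0.5)(img)
print(img_CJ.size)
plt.imshow(img_CJ)

## transforms.Grayscale(num_output_channels)
img_grey_c3 = transforms.Grayscale(num_output_channels=3)(img)
img_grey_c1 = transforms.Grayscale(num_output_channels=1)(img)
plt.subplot(1, 2, 1), plt.imshow(img_grey_c3), plt.title('channels=3')
plt.subplot(1, 2, 2), plt.imshow(img_grey_c1), plt.title('channels=1')
plt.show()

## transforms.Resize
# Resize while keeping the aspect ratio
img_resize = transforms.Resize(224)(img)
print(img_resize.size)
plt.imshow(img_resize)

## transforms.Scale
# Resize while keeping the aspect ratio; deprecated in favor of transforms.Resize,
# and newer torchvision releases may no longer provide it
img_scale = transforms.Scale(224)(img)
print(img_scale.size)
plt.imshow(img_scale)

## transforms.RandomCrop
# Randomly crop to the given size
# Set the random seed
import torch
torch.manual_seed(31)
# Random crops
img_randowm_crop1 = transforms.RandomCrop(224)(img)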
img_randowm_crop2 = transforms.RandomCrop(224)(img)
print(img_randowm_crop1.size)
plt.subplot(1, 2, 1), plt.imshow(img_randowm_crop1)
plt.subplot(1, 2, 2), plt.imshow(img_randowm_crop2)
plt.show()

## transforms.RandomHorizontalFlip
# Random horizontal (left-right) flip
# Set the random seed; the image may be left unflipped
import torch
torch.manual_seed(31)
img_random_H = transforms.RandomHorizontalFlip()(img)
print(img_random_H.size)
plt.imshow(img_random_H)

## transforms.RandomVerticalFlip
# Random vertical (top-bottom) flip
img_random_V = transforms.RandomVerticalFlip()(img)
print(img_random_V.size)
plt.imshow(img_random_V)

## transforms.RandomResizedCrop
# Crop a random region and resize it to the given size
img_random_resizecrop = transforms.RandomResizedCrop(224, scale=(0.5, 0.5))(img)
print(img_random_resizecrop.size)
plt.imshow(img_random_resizecrop)

## Combining transforms with transforms.Compose()
# An image usually goes through several operations; transforms.Compose() chains them together
transformer = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomResizedCrop((224), scale=(0.5, 1.0)),
    transforms.RandomVerticalFlip(),
])
img_transform = transformer(img)
plt.imshow(img_transform)
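The chaining done by transforms.Compose can be pictured as plain function composition. The sketch below is a pure-Python stand-in and not torchvision's implementation: SimpleCompose, resize_shorter_side and crop_to are hypothetical names introduced only for illustration, and they operate on a (width, height) tuple instead of a real image.

```python
class SimpleCompose:
    """Hypothetical stand-in for transforms.Compose: applies callables in order."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __call__(self, value):
        for step in self.steps:
            value = step(value)  # the output of one step feeds the next
        return value


def resize_shorter_side(target):
    """Toy transform: scale a (width, height) size so its shorter side equals target."""
    def apply(size):
        width, height = size
        scale = target / min(width, height)
        return round(width * scale), round(height * scale)
    return apply


def crop_to(target):
    """Toy transform: a square crop always yields a (target, target) size."""
    def apply(size):
        return target, target
    return apply


# Resize first, then crop, mirroring the order used in the transformer above.
pipeline = SimpleCompose([resize_shorter_side(256), crop_to(224)])
print(pipeline((512, 384)))
```

The order of the steps matters: each transform only sees what the previous one produced, which is why resizing is usually placed before cropping.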

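The two CenterCrop cases in the walkthrough (a target larger than the image, and a target smaller than it) come down to simple box arithmetic. This is a minimal sketch, assuming integer pixel coordinates and floor division; center_crop_box and padding_needed are hypothetical helpers, the image sizes are made up, and torchvision's own rounding may differ.

```python
def center_crop_box(image_size, crop_size):
    """Return (left, top, right, bottom) of a crop window centered on the image.

    Both arguments are (width, height) tuples. Coordinates fall outside the
    image when the crop is larger than the image; that outside region is the
    part that would be filled with zeros.
    """
    img_w, img_h = image_size
    crop_w, crop_h = crop_size
    left = (img_w - crop_w) // 2
    top = (img_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


def padding_needed(image_size, crop_size):
    """Total horizontal and vertical padding required before cropping."""
    img_w, img_h = image_size
    crop_w, crop_h = crop_size
    return max(crop_w - img_w, 0), max(crop_h - img_h, 0)


# Target smaller than the image: the window lies fully inside the image.
print(center_crop_box((512, 512), (224, 224)))
# Target larger than the image: negative coordinates mark the padded border.
print(center_crop_box((400, 400), (500, 500)))
print(padding_needed((400, 400), (500, 500)))
```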