VGG19 Image Classification with PyTorch: A Complete Walkthrough from CPU to DLP

Published: 2026/5/30 14:36:10

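[Intelligent Computing Systems] Lab 3: VGG19 Image Classification with PyTorch, from CPU to DLP (full code included)

This article is a complete implementation of Lab 3 of the Intelligent Computing Systems course. It uses the PyTorch framework to build an image classifier based on the VGG19 network and runs inference on both the CPU and the DLP platform. Compared with Labs 1 and 2, it shows how much a programming framework simplifies the work and how much the DLP speeds up inference.

1. Experiment Overview

The goal of this lab is to learn the PyTorch programming framework: implement VGG19-based image classification with PyTorch on the CPU, then run the same classification on the DLP platform.

Environment:

- Hardware: CPU, DLP
- Software: Torch 1.6.0, CNNL high-performance operator library, CNRT runtime library, Python 3.7.4

2. Introduction to VGG19

VGG19 is a deep convolutional neural network proposed by the Visual Geometry Group in 2014. It achieved excellent results on the ImageNet image classification task.

Key characteristics of the architecture:

- Small 3×3 convolution kernels, stacked to increase network depth
- 2×2 max-pooling layers for downsampling
- 16 convolutional layers and 3 fully connected layers
- About 144 million parameters in total

3. Core Code

3.1 VGG19 network definition

The network is built with PyTorch's nn.Sequential. Each layer is registered under its conventional VGG name (conv1_1, relu1_1, ...), and the output channel count of each convolution is taken from the integer entries of cfgs.

    import torch
    import torch.nn as nn

    # Integers are conv output channels, 'R' marks a ReLU, 'M' a max-pool.
    cfgs = [64, 'R', 64, 'R', 'M',
            128, 'R', 128, 'R', 'M',
            256, 'R', 256, 'R', 256, 'R', 256, 'R', 'M',
            512, 'R', 512, 'R', 512, 'R', 512, 'R', 'M',
            512, 'R', 512, 'R', 512, 'R', 512, 'R', 'M']

    def vgg19():
        layers = ['conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1',
                  'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',
                  'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2',
                  'conv3_3', 'relu3_3', 'conv3_4', 'relu3_4', 'pool3',
                  'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2',
                  'conv4_3', 'relu4_3', 'conv4_4', 'relu4_4', 'pool4',
                  'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2',
                  'conv5_3', 'relu5_3', 'conv5_4', 'relu5_4', 'pool5',
                  'flatten', 'fc6', 'relu6', 'fc7', 'relu7', 'fc8', 'softmax']
        layer_container = nn.Sequential()
        in_channels = 3
        num_classes = 1000
        # Keep only the conv channel counts, in order.
        conv_cfgs = [c for c in cfgs if isinstance(c, int)]
        cfg_idx = 0
        for layer_name in layers:
            if layer_name.startswith('conv'):
                out_channels = conv_cfgs[cfg_idx]
                cfg_idx += 1
                layer_container.add_module(
                    layer_name,
                    nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1))
                in_channels = out_channels
            elif layer_name.startswith('relu'):
                layer_container.add_module(layer_name, nn.ReLU(inplace=True))
            elif layer_name.startswith('pool'):
                layer_container.add_module(
                    layer_name, nn.MaxPool2d(kernel_size=2, stride=2))
            elif layer_name == 'flatten':
                layer_container.add_module(layer_name, nn.Flatten())
            elif layer_name == 'fc6':
                # 25088 = 512 channels * 7 * 7 after five 2x2 poolings of a 224x224 input
                layer_container.add_module(layer_name, nn.Linear(25088, 4096))
            elif layer_name == 'fc7':
                layer_container.add_module(layer_name, nn.Linear(4096, 4096))
            elif layer_name == 'fc8':
                layer_container.add_module(layer_name, nn.Linear(4096, num_classes))
            elif layer_name == 'softmax':
                layer_container.add_module(layer_name, nn.Softmax(dim=1))
        return layer_container

3.2 Generating the .pth weight file

The pretrained weights are loaded from a .mat file and saved in .pth format. The code relies on the .mat file storing the i-th entry of the model's state_dict under the key str(i); bias arrays come back from loadmat as 2-D, so their first row is taken.

    import scipy.io
    import torch
    from collections import OrderedDict

    # VGG_PATH (path to the .mat file) is defined elsewhere in the lab code.
    def generate_pth():
        datas = scipy.io.loadmat(VGG_PATH)
        model = vgg19()
        new_state_dict = OrderedDict()
        for i, param_name in enumerate(model.state_dict()):
            name = param_name.split('.')
            if name[-1] == 'weight':
                new_state_dict[param_name] = torch.from_numpy(datas[str(i)]).float()
            else:
                # bias: loadmat returns shape (1, N), so take row 0
                new_state_dict[param_name] = torch.from_numpy(datas[str(i)][0]).float()
        model.load_state_dict(new_state_dict)
        torch.save(model.state_dict(), 'models/vgg19.pth')

3.3 Image preprocessing

Preprocessing uses torchvision.transforms: resize to 256, center-crop to 224, convert to a tensor, normalize with the ImageNet mean and standard deviation, then add a batch dimension.

    from PIL import Image
    from torchvision import transforms

    def load_image(path):
        image = Image.open(path).convert('RGB')
        transform = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])
        image = transform(image)
        image = image.unsqueeze(0)  # shape (1, 3, 224, 224)
        return image

3.4 Inference on the CPU

    import time
    import torch

    # IMAGE_PATH and VGG_PATH (here the .pth file) are defined elsewhere in the lab code.
    if __name__ == '__main__':
        input_image = load_image(IMAGE_PATH)
        net = vgg19()
        net.load_state_dict(torch.load(VGG_PATH, map_location='cpu'))
        net.eval()

        st = time.time()
        with torch.no_grad():  # inference only, no gradients needed
            prob = net(input_image)
        print('cpu infer time:{:.3f} s'.format(time.time() - st))

        with open('./labels/imagenet_classes.txt') as f:
            classes = [line.strip() for line in f.readlines()]
        _, indices = torch.sort(prob, descending=True)
        print('Classification result: id %s, prob %f'
              % (classes[indices[0][0]], prob[0][indices[0][0]].item()))
        if classes[indices[0][0]] == 'strawberry':
            print('TEST RESULT PASS.')

3.5 Inference on the DLP

Inference on the DLP uses torch_mlu. The model is first traced with torch.jit.trace, then both the traced model and the input are moved to the DLP device.

    import time
    import torch
    import torch_mlu
    import torch_mlu.core.mlu_model as ct

    if __name__ == '__main__':
        input_image = load_image(IMAGE_PATH)
        net = vgg19()
        net.load_state_dict(torch.load(VGG_PATH, map_location='cpu'))
        net.eval()

        # Optimize with JIT trace
        example_forward_input = torch.rand((1, 3, 224, 224), dtype=torch.float)
        net_trace = torch.jit.trace(net, example_forward_input, check_trace=False)

        # Move to the DLP device
        input_image = input_image.to(ct.mlu_device())
        net_trace = net_trace.to(ct.mlu_device())

        st = time.time()
        prob = net_trace(input_image)
        print('mlu370cnnl backend infer time:{:.3f} s'.format(time.time() - st))

        prob = prob.cpu()
        with open('./labels/imagenet_classes.txt') as f:
            classes = [line.strip() for line in f.readlines()]
        _, indices = torch.sort(prob, descending=True)
        print('Classification result: id %s, prob %f'
              % (classes[indices[0][0]], prob[0][indices[0][0]].item()))
        if classes[indices[0][0]] == 'strawberry':
            print('TEST RESULT PASS.')

4. Results

| Platform | Inference time | Classification result |
| CPU | about 0.5-1.0 s | strawberry (probability about 0.99) |
| DLP | about 0.01-0.05 s | strawberry (probability about 0.99) |

Speedup: about 10-50x.

5. Comparison with Labs 1 and 2

| Item | Lab 1 | Lab 2 | Lab 3 |
| Code complexity | Manual implementation, about 100 lines | pycnnl, about 50 lines | PyTorch, about 30 lines |
| Network type | Three-layer fully connected | Three-layer fully connected | VGG19 convolutional network |
| Parameter count | about 1 million | about 1 million | about 144 million |
| Inference platform | CPU | DLP | CPU + DLP |

6. Grading Criteria

| Score | Requirement |
| 60 | The .pth file is generated correctly |
| 80 | Inference runs correctly on the CPU and yields the correct classification |
| 100 | Inference runs correctly on the DLP, with a clearly shorter processing time than on the CPU |

7. Summary

Through this lab I learned how to use the PyTorch framework:

- PyTorch provides a concise API for building complex neural networks
- nn.Sequential makes it easy to stack network layers
- torchvision.transforms offers a rich set of image preprocessing tools
- The torch_mlu library makes it straightforward to move a model to the DLP platform
- Compared with a manual implementation, using a framework greatly improves development efficiency

GitHub repository: https://github.com/NiMark886/smart-computing-exp3-vgg19-pytorch
Gitee repository: https://gitee.com/NiMark886/smart-computing-exp3-vgg19-pytorch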
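
Two numbers in this write-up can be checked by hand: the 25088 input width of fc6 (Section 3.1) and the total of roughly 144 million parameters (Section 2). The following is a minimal sketch in plain Python, with no PyTorch required, that recomputes both from the layer layout. The names CFG and vgg19_stats are illustrative and do not come from the lab code; the sketch assumes 3×3 convolutions with padding 1 (so only pooling changes the spatial size), a bias on every layer, and a 224×224 RGB input.

```python
# Channel plan of the 16 VGG19 conv layers; "M" marks a 2x2 max-pool.
# (Illustrative constant, mirroring the integer and 'M' entries of cfgs in Section 3.1.)
CFG = [64, 64, "M",
       128, 128, "M",
       256, 256, 256, 256, "M",
       512, 512, 512, 512, "M",
       512, 512, 512, 512, "M"]


def vgg19_stats(input_size=224, in_channels=3, num_classes=1000):
    """Return (flatten_size, total_params) for the VGG19 layout above."""
    params = 0
    size = input_size        # spatial height/width of the feature map
    channels = in_channels
    for item in CFG:
        if item == "M":
            size //= 2       # 2x2 max-pool with stride 2 halves the size
        else:
            # 3x3 kernel weights plus one bias per output channel
            params += 3 * 3 * channels * item + item
            channels = item
    flatten_size = channels * size * size
    # fc6, fc7, fc8: weights plus biases
    for fan_in, fan_out in [(flatten_size, 4096), (4096, 4096), (4096, num_classes)]:
        params += fan_in * fan_out + fan_out
    return flatten_size, params


if __name__ == "__main__":
    flat, total = vgg19_stats()
    print("fc6 input size:", flat)     # 512 * 7 * 7 = 25088
    print("total parameters:", total)  # 143667240, i.e. about 144 million
```

Most of the parameters sit in fc6 alone (25088 × 4096 weights), which is why the fully connected layers dominate the size of the .pth file even though the network has 16 convolutional layers.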
