项目背景

本项目是参加PaddlePaddle Hackathon第二期活动的任务, 在此分享将yolov5模型部署在手机端上的一些实践记录和踩坑经历。

任务目标：

参考ssd_mobilnetv1目标检测的 Android demo，使用 yolo_v5 模型在安卓手机上完成 demo 开发，输入为摄像头实时视频流，输出为包含检测框的视频流；在界面上添加一个 backend 选择开关，用户可以选择将模型运行在 CPU 或 GPU 上，保证分别运行 CPU 和 GPU 结果均正确。

项目任务链接： https://github.com/PaddlePaddle/Paddle/issues/40234

GitHub 任务提交链接: https://github.com/PaddlePaddle/Paddle-Lite-Demo/pull/237

基于之前项目的后续开发

链接： https://aistudio.baidu.com/aistudio/projectdetail/3452279

导出nb文件

基于之前的项目稍微做了调整，代码已上传至GitHub: https://github.com/thunder95/yolov5_paddle_prune

%cd /home/aistudio/
!rm -rf yolov5_paddle_prune/
!git clone  https://github.com/thunder95/yolov5_paddle_prune.git
%cd yolov5_paddle_prune
!git checkout android# 安装依赖库
!pip install gputil==1.4.0 pycocotools terminaltables
!mkdir -p /home/aistudio/.config/thunder95/
!cp /home/aistudio/Arial.ttf  /home/aistudio/.config/thunder95/

数据集

本项目不需要训练, 但需要解压并修改数据集加载的配置文件

%cd /home/aistudio/data/data127016/
!unzip /home/aistudio/data/data127016/yolov5_mot20.zip
!python create_txt.py
!sed -i 's|\/f\/dataset\/person\/out|\/home\/aistudio\/data\/data127016|g' /home/aistudio/yolov5_paddle_prune/data/coco.yaml

在models/common.py中修改，基于PaddleAPI实现PaddleLite不支持的算子silu：

Error: This model is not supported, because 1 ops are not supported on ‘arm’. These unsupported ops are: ‘silu’

修改原后处理逻辑

遇到的问题一：

[F 4/10 8:51:53.568 …-Lite/lite/kernels/opencl/image_helper.h:72 DefaultGlobalWorkSize] Unsupport DefaultGlobalWorkSize with tensor_dim.size():5, image_shape.size():2

所提Issue(https://github.com/PaddlePaddle/Paddle-Lite/issues/8803)反馈, 在PaddleLite中使用opencl不支持>4维的算子

遇到的问题二：

使用paddle.max和paddle.argmax组合会跟原网络输出无法对齐

所提Issue(https://github.com/PaddlePaddle/Paddle-Lite/issues/8647)反馈, 官方待修复这个问题。

修改models/yolo.py中检测头的foward部分

class Detect(nn.Layer):def __init__(self, nc=80, anchors=(), ch=(), inplace=True):  # detection layersuper().__init__()self.onnx_dynamic = False  # ONNX export parameterself.stride = None  # strides computed during buildself.nc = nc  # number of classesself.no = nc + 5  # number of outputs per anchorself.nl = len(anchors)  # number of detection layersself.na = len(anchors[0]) // 2  # number of anchorsself.grid = [paddle.zeros([1])] * self.nl  # init gridself.anchor_grid = [paddle.zeros([1])] * self.nl  # init anchor gridself.register_buffer('anchors', paddle.to_tensor(anchors).astype('float32').reshape([self.nl, -1, 2]))  # shape(nl,na,2)self.m = nn.LayerList(nn.Conv2D(x, self.no * self.na, 1) for x in ch)  # output convself.inplace = inplace  # use in-place ops (e.g. slice assignment)def forward(self, x):z = []  # inference outputfor i in range(self.nl):x[i] = self.m[i](x[i])  # convbs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)x[i] = x[i].reshape([bs, self.na, self.no, ny * nx]).transpose([0, 1, 3, 2])if not self.training:  # inferencey = F.sigmoid(x[i])z.append(y)return x if self.training else z

修改detect.py中提取boxes代码，将在andriod端用c实现

boxes = []  # xywh, score, cls
for l in range(len(pred)):pred[l] = pred[l].numpy()bs, ch, nd, cn = pred[l].shaperows = dim_num[l]print(l, rows)for b in range(bs):for c in range(ch):for r in range(rows):offset = r * row_nummax_cls_val = 0max_cls_id = 0score = pred[l][b, c, r, 4]if score < conf_thres:continuefor i in range(80):if pred[l][b, c, r, i + 5] > max_cls_val:max_cls_val = pred[l][b, c, r, i + 5]max_cls_id = iscore *= max_cls_valif score < conf_thres:continuey = int(r / xdim[l])x = int(r % xdim[l])tmp_box = [(pred[l][b, c, r, 0] * 2 - 0.5 + x) * strides[l],(pred[l][b, c, r, 1] * 2 - 0.5 + y) * strides[l],  # i, y, x, 2(pred[l][b, c, r, 2] * 2) ** 2 * anchors[l][c * 2],(pred[l][b, c, r, 3] * 2) ** 2 * anchors[l][c * 2 + 1],max_cls_id,score,  # scores]tmp_box[0] = tmp_box[0] - tmp_box[2] / 2  # x1tmp_box[1] = tmp_box[1] - tmp_box[3] / 2  # y1tmp_box[2] = tmp_box[0] + tmp_box[2]  # x2tmp_box[3] = tmp_box[1] + tmp_box[3]  # y2boxes.append(tmp_box)

# 导出静态图模型
%cd /home/aistudio/yolov5_paddle_prune
!python convert_static.py --weights ./weights/yolov5n.pdparams --cfg models/yolov5n.yaml --data ./data/coco.yaml --source crop.jpg

生成NB文件

安装PaddleLite优化工具paddle_lite_opt，并通过命令生成nb权重文件。

本项目推荐编译安装release/v2.11的PaddleLite, 参考issue:https://github.com/PaddlePaddle/Paddle-Lite/issues/9168

git clone https://github.com/PaddlePaddle/Paddle-Lite.gitcd Paddle-Litegit checkout release/v2.11export NDK_ROOT=~/Android/Sdk/ndk/24.0.8215888/./lite/tools/build_android.sh --with_opencl=ON --arch=armv8 --with_extra=ON --with_log=ON

%cd /home/aistudio/yolov5_paddle_prune/
!pip install paddlelite==2.11# cpu
!paddle_lite_opt --model_file=./simple_net.pdmodel --param_file=simple_net.pdiparams  --optimize_out_type=naive_buffer --optimize_out=./yolov5n_cpu --valid_targets=arm#gpu
!paddle_lite_opt --model_file=./simple_net.pdmodel --param_file=simple_net.pdiparams  --optimize_out_type=naive_buffer --optimize_out=./yolov5n_gpu --valid_targets=opencl,arm

/home/aistudio/yolov5_paddle_prune
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting paddlelite==2.11Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8b/d7/8babc059bff1d02dc85e32fb8bff3c56680db5f5b20246467f9fcbcfe62c/paddlelite-2.11-cp37-cp37m-manylinux1_x86_64.whl (47.1 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.1/47.1 MB[0m [31m16.5 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m
[?25hInstalling collected packages: paddlelite
Successfully installed paddlelite-2.11
[33mWARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.[0m[33m
[0mLoading topology data from ./simple_net.pdmodel
Loading params data from simple_net.pdiparams
1. Model is successfully loaded!
2. Model is optimized and saved into ./yolov5n_cpu.nb successfully
Loading topology data from ./simple_net.pdmodel
Loading params data from simple_net.pdiparams
1. Model is successfully loaded!
2. Model is optimized and saved into ./yolov5n_gpu.nb successfully

Adb shell运行

安裝Adb软件（以ubuntu为例）

准备一部测试用的安卓手机armv8版本，处理器Qualcomm Snapdrgon 450。

系统 -> 版本号 -> 连续点击7次，开启开发模式
设置 -> 搜索开发人员选项 -> 打开usb调试
连接安卓手机，测试连接状态

 sudo apt updatesudo apt install -y wget adbadb devices

下方出现设备，表示设备连接成功

List of devices attached
XKKBB19A10231894    device

修改本地NDK配置:
export NDK_ROOT=~/Android/Sdk/ndk/24.0.8215888

关键代码

下载代码：
git clone https://github.com/PaddlePaddle/Paddle-Lite-Demo.git

注意: 之前的版本opencl会存在精度对齐的问题，需要切换到develop或者release/v2.11，参考本人issue: https://github.com/PaddlePaddle/Paddle-Lite/issues/8807

下载, 解压并拷贝预编译库到Paddle-Lite-Demo项目: https://paddle-lite.readthedocs.io/zh/develop/quick_start/release_lib.html

/cxx/include$ cp ./* /home/hulei/projects/tmp/Paddle-Lite-Demo/libs/android/cxx/include/
/cxx/lib$ cp libpaddle_light_api_shared.so /home/hulei/projects/tmp/Paddle-Lite-Demo/libs/android/cxx/libs/arm64-v8a/

模型加载:

  // 1. Set MobileConfigMobileConfig config;config.set_model_from_file(model_file);std::cout << "model_file: " << model_file << std::endl;config.set_power_mode(static_cast<paddle::lite_api::PowerMode>(power_mode));config.set_threads(thread_num);// 2. Create PaddlePredictor by MobileConfigstd::shared_ptr<PaddlePredictor> predictor =CreatePaddlePredictor<MobileConfig>(config);// 3. Prepare input data from image// read img and pre-processstd::unique_ptr<Tensor> input_tensor0(std::move(predictor->GetInput(0)));input_tensor0->Resize({1, 3, height, width});auto *data0 = input_tensor0->mutable_data<float>();cv::Mat img = imread(img_path, cv::IMREAD_COLOR);pre_process(img, width, height, data0);// 4. Run predictordouble first_duration{-1};for (size_t widx = 0; widx < warmup; ++widx) {if (widx == 0) {auto start = GetCurrentUS();predictor->Run();first_duration = (GetCurrentUS() - start) / 1000.0;} else {predictor->Run();}}

数据输入预处理

void pre_process(const cv::Mat &img_ori, int width, int height, float *data) {cv::Mat img = img_ori.clone();int w, h, x, y;int channelLength = width * height;float r_w = width / (img.cols * 1.0);float r_h = height / (img.rows * 1.0);if (r_h > r_w) {w = width;h = r_w * img.rows;x = 0;y = (height - h) / 2;} else {w = r_h * img.cols;h = height;x = (width - w) / 2;y = 0;}cv::Mat re(h, w, CV_8UC3);cv::resize(img, re, re.size(), 0, 0, cv::INTER_CUBIC);cv::Mat out(height, width, CV_8UC3, cv::Scalar(128, 128, 128));re.copyTo(out(cv::Rect(x, y, re.cols, re.rows)));// split channelsout.convertTo(out, CV_32FC3, 1. / 255.);cv::Mat input_channels[3];cv::split(out, input_channels);for (int j = 0; j < 3; j++) {memcpy(data + width * height * j, input_channels[2 - j].data,channelLength * sizeof(float));}
}

后处理代码

void post_process(std::shared_ptr<PaddlePredictor> predictor, float thresh,std::vector<std::string> class_names, const cv::Mat &image,int in_width, int in_height) { // NOLINTconst int strides[3] = {8, 16, 32};const int anchors[3][6] = {{10, 13, 16, 30, 33, 23},{30, 61, 62, 45, 59, 119},{116, 90, 156, 198, 373, 326}};std::map<int, std::vector<Object>> raw_outputs;float r_w = in_width / static_cast<float>(image.cols);float r_h = in_height / static_cast<float>(image.rows);float r, off_x, off_y;if (r_h > r_w) {r = r_w;off_x = 0;off_y = static_cast<int>((in_height - r_w * image.rows) / 2);} else {r = r_h;off_y = 0;off_x = static_cast<int>((in_width - r_h * image.cols) / 2);}for (int k = 0; k < 3; k++) {std::unique_ptr<const Tensor> output_tensor(std::move(predictor->GetOutput(k)));auto *outptr = output_tensor->data<float>();auto shape_out = output_tensor->shape();std::vector<int> shape_new(shape_out.begin(), shape_out.end());int xdim = static_cast<int>(in_width / strides[k]);extract_boxes(outptr, &raw_outputs, strides[k], anchors[k], shape_new, r,thresh, off_x, off_y, xdim);}std::vector<Object> outs;nms(&raw_outputs, &outs, 0.45);
}

打开日志调试

模型可能算子不支持，也可能有其他问题，需要更多详细的日志信息，这是需要打开日志模式。下载PaddleLite源码重新编译：

./lite/tools/build_android_armv8.sh --with_opencl=ON --arch=armv8 --with_extra=ON --with_log=ON

运行Yolov5n推理

 cd Paddle-Lite-Demo/libs# 下载所需要的 Paddle Lite 预测库sh download.shcd ../object_detection/assets# 下载OPT 优化后模型、测试图片、标签文件sh download.shcd ../android/app/shell/cxx/yolov5n_detection# 更新 NDK_ROOT 路径，然后完成可执行文件的编译和运行sh build.sh# CMakeList.txt 里的 System 默认设置是linux；如果在Mac 运行，则需将 CMAKE_SYSTEM_NAME 变量设置为 drawn

cpu推理性能:

推理结果：

gpu推理性能: shell下相比cpu测试没有明显的性能提升，输出精度与cpu对齐

部署到Anroid手机

参考教程: https://aistudio.baidu.com/aistudio/projectdetail/3431580

注意：如果您的 Android Studio 尚未配置 NDK ，请根据 Android Studio 用户指南中的安装及配置 NDK 和 CMake 内容，预先配置好 NDK 。您可以选择最新的 NDK 版本，或者使用 Paddle Lite 预测库版本一样的 NDK

使用AndroidStudio打开工程目录: https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/shell/cxx/yolov5n_detection

更新预测库到demo工程中:

/java/jar$ cp PaddlePredictor.jar yolov5n_detection_demo/app/PaddleLite/java/

/java/so$ cp libpaddle_lite_jni.so yolov5n_detection_demo/app/PaddleLite/java/libs/arm64-v8a/

cxx/include$ cp ./* yolov5n_detection_demo/app/PaddleLite/cxx/include/

cxx/lib$ cp libpaddle_light_api_shared.so yolov5n_detection_demo/app/PaddleLite/cxx/libs/arm64-v8a/

启动Android部署到到手机上

在性能不错的小米手机上可达到50fps

基于裁剪的模型进行部署

使用上一个项目裁剪训练的权重: yolov5_paddle_prune/weights/finetune.pdparams

转换静态图时需要适配cfg配置文件，将裁剪后的模型通过cfg配置文件加载

相比之前的nb文件大小，裁剪后从 7.5MB 降低到 2.8MB.

%cd /home/aistudio/yolov5_paddle_prune
!python convert_static_prune.py --weights ./weights/finetune.pdparams --cfg cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg --data ./data/coco_person.yaml --source crop.jpg

/home/aistudio/yolov5_paddle_prune
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop workingfrom collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop workingfrom collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop workingfrom collections import Sized
[34m[1mconvert_static_prune: [0mweights=['./weights/finetune.pdparams'], cfg=cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg, source=crop.jpg, imgsz=[320, 320], conf_thres=0.01, iou_thres=0.6, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, dnn=False, single_cls=False, data=./data/coco_person.yaml, hyp=data/hyps/hyp.scratch.yaml
[31m[1mrequirements:[0m tqdm>=4.36.1 not found and is required by YOLOv5, attempting auto-update...

# 按照同样的方式生成nb文件， 并部署到同样的安卓手机,  paddle-lite-demo代码不需要额外的改动
%cd /home/aistudio/yolov5_paddle_prune/
!pip install paddlelite==2.11# cpu
!paddle_lite_opt --model_file=./prune_net.pdmodel --param_file=prune_net.pdiparams  --optimize_out_type=naive_buffer --optimize_out=./yolov5n_cpu_prune --valid_targets=arm#gpu
!paddle_lite_opt --model_file=./prune_net.pdmodel --param_file=prune_net.pdiparams  --optimize_out_type=naive_buffer --optimize_out=./yolov5n_gpu_prune --valid_targets=opencl,arm

/home/aistudio/yolov5_paddle_prune
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting paddlelite==2.11Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8b/d7/8babc059bff1d02dc85e32fb8bff3c56680db5f5b20246467f9fcbcfe62c/paddlelite-2.11-cp37-cp37m-manylinux1_x86_64.whl (47.1 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.1/47.1 MB[0m [31m12.8 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m
[?25hInstalling collected packages: paddlelite
Successfully installed paddlelite-2.11
[33mWARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.[0m[33m
[0mLoading topology data from ./prune_net.pdmodel
Loading params data from prune_net.pdiparams
1. Model is successfully loaded!
2. Model is optimized and saved into ./yolov5n_cpu_prune.nb successfully
Loading topology data from ./prune_net.pdmodel
Loading params data from prune_net.pdiparams
1. Model is successfully loaded!
2. Model is optimized and saved into ./yolov5n_gpu_prune.nb successfully

adb shell运行

替换之前的模型，推理速度大幅度降低

CPU下运行：

GPU下运行：

注意: 测试时该模型的conf阈值设置0.01，输出结果 05000319_yolov5n_detection_result.jpg

部署到安卓手机，原来的代码有点bug，设置阈值0.01无法生效，需要修改一点源码，将confThresh_替换成scoreThreshold_

写在最后

本项目详细记录了如何将yolov５模型部署到安卓手机的操作步骤和踩坑经历，不经将原始权重成功部署，还能将自己训练的裁剪后的模型也能无需修改源码进行部署。

关于作者

成都飞桨领航团团长
PPDE
AICA三期学员
PFCC成员

我在AI Studio上获得钻石等级，点亮10个徽章，来互关呀~
https://aistudio.baidu.com/aistudio/personalcenter/thirdview/89442

请点击此处查看本环境基本用法.

Please click here for more detailed instructions.

此文为搬运，原作链接：https://aistudio.baidu.com/aistudio/projectdetail/4245931

基于PaddleLite实现yolov5的移动端部署相关推荐

[Paddle Detection]基于PP-PicoDet行车检测（完成安卓端部署）
基于PP-PicoDet行车检测(完成安卓端部署)_哔哩哔哩_bilibili 一.项目简介项目背景: 基于视觉深度学习的自动驾驶场景,旨在对车载摄像头采集的视频数据进行道路场景解析(行车检测),为 ...
yolov5笔记（3）——移动端部署自己的模型（随5.0更新）
一直以来学习目标检测的最终目标就是为了移动端的部署,比方说树莓派.jetson.安卓.ios等.之前因为实在对object_detection训练出来的东西效果不满意,所以当时没继续研究移动端部署.如 ...
yolov5 | 移动端部署yolov5s模型
移动端的部署有这么几条路: (以yolov5s.pt模型为例) pt文件 --> onnx文件/torchscript文件 --> ncnn --> 安卓端部署(android st ...
tensorflow Lite 2---- 移动端部署--yolov5+训练自己的数据集
一.模型移动端环境部署可以参考: tensorflow lite 1---- 移动端部署--object detection 官方历程手把手教程_行码阁119的博客-CSDN博客二.训练模型本文 ...
基于JAVA校园外卖系统Web端计算机毕业设计源码+系统+数据库+lw文档+部署
基于JAVA校园外卖系统Web端计算机毕业设计源码+系统+数据库+lw文档+部署基于JAVA校园外卖系统Web端计算机毕业设计源码+系统+数据库+lw文档+部署本源码技术栈: 项目架构:B/S架构 ...
windows下基于libtorch的yolov5 6.0的c++部署
windows下基于libtorch的yolov5 6.0的c++部署 1.概述 libtorch是pytorch的C++版本,在需要多进程.提高推理速度等需求下会比python语言更具有优势.本文根 ...
移动端调取摄像头上面如何给出框_飞桨实战笔记：自编写模型如何在服务器和移动端部署...
作为深度学习小白一枚,从一开始摸索如何使用深度学习框架,怎么让脚本跑起来,到现在开始逐步读懂论文,看懂模型的网络结构,按照飞桨官方文档进行各种模型训练和部署,整个过程遇到了无数问题.非常感谢飞桨开 ...
YOLOv5在android端实现目标检测+跟踪+越界识别并报警
YOLOv5在android端实现目标检测+跟踪+越界识别并报警想要获取源码和相关资料说明的可以关注我的微信公众号:雨中算法屋, 后台回复越界识别即可获取,有问题也可以关注公众号加我微信联系我,相互 ...
基于suse linux系统的cacti系统部署——rpm包方式
豆丁 http://www.docin.com/p-191889788.html rpm包方式:啊扬--沙迳:2010-12-1:更改:2011/5/16:一.Cacti的简介(来源:网络):Cact ...

基于PaddleLite实现yolov5的移动端部署

项目背景

导出nb文件

数据集

在models/common.py中修改，基于PaddleAPI实现PaddleLite不支持的算子silu：

修改原后处理逻辑

生成NB文件

Adb shell运行

安裝Adb软件（以ubuntu为例）

关键代码

模型加载:

数据输入预处理

后处理代码

打开日志调试

运行Yolov5n推理

部署到Anroid手机

启动Android部署到到手机上

基于裁剪的模型进行部署

adb shell运行

写在最后

关于作者

基于PaddleLite实现yolov5的移动端部署相关推荐

最新文章

热门文章

基于PaddleLite实现yolov5的移动端部署

项目背景

导出nb文件

数据集

在models/common.py中修改， 基于PaddleAPI实现PaddleLite不支持的算子silu：

修改原后处理逻辑

生成NB文件

Adb shell运行

安裝Adb软件（以ubuntu为例）

关键代码

模型加载:

数据输入预处理

后处理代码

打开日志调试

运行Yolov5n推理

部署到Anroid手机

启动Android部署到到手机上

基于裁剪的模型进行部署

adb shell运行

写在最后

关于作者

基于PaddleLite实现yolov5的移动端部署相关推荐

最新文章

热门文章

在models/common.py中修改，基于PaddleAPI实现PaddleLite不支持的算子silu：