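First published on the Jishu community (极术社区).
If you are interested in Arm-related technology, send a private message to aijishu20 to join the technical WeChat group.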

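R329 Development Board Tutorial Series

This tutorial series covers deploying AI models on the R329 (MaixSense) board. It is planned in the following parts:

Zhouyi Compass deployment and simulation (required reading before applying for a sample board)
Running AIPU executables on the R329 development board
Running models in real time with the camera and screen on the R329 development board
Running models with Python on the R329 development board
A first look at the Debian system on the R329 development board

This article is the first part. It covers deploying Zhouyi Compass and running a simulation.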

Zhouyi Compass Deployment and Simulation

Zhouyi Compass is the toolkit for the Zhouyi NPU. This article focuses on using the AIPU NN compiler and the simulator.

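0. NN compiler workflow overview

The NN compiler converts a neural network model into an executable for the AIPU.
Internally it works as follows:

Model parser: converts a pretrained model to IR (Intermediate Representation). Supported input formats:
    1. pb (tf1.0~1.15)
    2. tflite (tf1.0~1.15)
    3. caffemodel (version 1)
    4. onnx (up to opset 15)
Quantization module: converts the float IR to an int8 IR
Generation module: uses the int8 IR to generate an AIPU executable that can run on the real chip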
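
The quantization step is described only as "float IR to int8 IR". As a rough illustration of what symmetric int8 quantization means, here is a minimal sketch. It is not the NN compiler's actual algorithm: the function names and the max-abs rule for choosing the scale are assumptions made for this example. The convention float = int8 / scale matches the way output_scale is used in section 5.

```python
# Illustrative symmetric int8 quantization; NOT the NN compiler's real code.
# All names here are hypothetical and chosen for this example only.


def symmetric_scale(values, qmax=127):
    """Return a scale such that the largest magnitude maps to qmax."""
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return 1.0  # all-zero tensor: any positive scale works
    return qmax / peak


def quantize(values, scale, qmin=-127, qmax=127):
    """Map floats to int8 codes: q = clamp(round(x * scale))."""
    return [max(qmin, min(qmax, round(v * scale))) for v in values]


def dequantize(codes, scale):
    """Recover approximate floats: x ~= q / scale."""
    return [q / scale for q in codes]


weights = [0.4, -1.0, 0.25]          # made-up float values
scale = symmetric_scale(weights)     # 127 / 1.0
codes = quantize(weights, scale)
restored = dequantize(codes, scale)
print(scale, codes, restored)
```

Each restored value differs from the original by at most half a quantization step (0.5 / scale), which is the error the calibration dataset in section 3 is meant to keep small.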

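1. Download the environment

Development uses the Docker environment provided by Sipeed (矽速科技):

Note: make sure at least 20 GB of disk space is free.

# Option 1: pull from Docker Hub (requires a proxy if Docker Hub is not directly reachable)
sudo docker pull zepan/zhouyi
# Option 2: download the image file from Baidu Netdisk (archive about 2.9 GB, about 5.3 GB after extraction)
# Link: https://pan.baidu.com/s/1yaKBPDxR_oakdTnqgyn5fg
# Extraction code: f8dr
gunzip zhouyi_docker.tar.gz
sudo docker load --input zhouyi_docker.tar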

Once the Docker image is available, run the bundled example inside the container to check that the environment works:

sudo docker run -i -t zepan/zhouyi /bin/bash
cd ~/demos/tflite
./run_sim.sh
python3 quant_predict.py

2. Generate the model file

The NN compiler currently supports the pb, tflite, caffemodel, and onnx formats, so you first need to convert your own model to one of them.

Common pretrained models can be downloaded from GitHub:
https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models

After downloading a pretrained ckpt file, convert it to a frozen pb file. A TensorFlow version between 1.13 and 1.15 is recommended here.

# Export the inference graph
# --labels_offset=1 is specific to resnet_50; the default is 0
python3 export_inference_graph.py \
    --alsologtostderr \
    --model_name=resnet_v1_50 \
    --image_size=224 \
    --labels_offset=1 \
    --output_file=/tmp/resnet_v1_50_inf.pb
# Freeze the graph with the pretrained weights
python3 freeze_graph.py \
    --input_graph=/tmp/resnet_v1_50_inf.pb \
    --input_checkpoint=/tmp/resnet_v1.ckpt \
    --input_binary=true \
    --output_graph=/tmp/resnet_v1_50_frozen.pb \
    --output_node_names=resnet_v1_50/predictions/Reshape_1

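3. Prepare the quantization calibration dataset

The quantization calibration dataset consists of two parts:

Data file: the preprocessed data, i.e. exactly what is fed to the model input, e.g. [image.numpy(), …]
Label file: the labels, e.g. np.array(label_list)

Both files must be stored in the NumPy file format. Example code snippet: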

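import os
import numpy as np
import tensorflow as tf  # TF 1.x; the .numpy() calls below need eager execution

tf.enable_eager_execution()

# label_file, img_dir, input_height, input_width, input_channel, mean and var,
# as well as the helpers aspect_preserving_resize() and central_crop(), are
# assumed to be defined elsewhere; this snippet does not show them.

# Each line of the label file has the form "<file name> <label>"
filename_list = []
label_list = []
with open(label_file) as label_data:
    for line in label_data:
        fields = line.rstrip('\n').split(' ')
        filename_list.append(fields[0])
        label_list.append(int(fields[1]))

img_num = len(label_list)
images = np.zeros([img_num, input_height, input_width, input_channel], np.float32)
for file_name, img_idx in zip(filename_list, range(img_num)):
    image_file = os.path.join(img_dir, file_name)
    img_s = tf.gfile.GFile(image_file, 'rb').read()
    image = tf.image.decode_jpeg(img_s)
    image = tf.cast(image, tf.float32)
    image = tf.clip_by_value(image, 0., 255.)
    # Resize the short side, crop the center, then force the exact input size
    image = aspect_preserving_resize(image, min(input_height, input_width), input_channel)
    image = central_crop(image, input_height, input_width)
    image = tf.image.resize_images(image, [input_height, input_width])
    image = (image - mean) / var
    image = image.numpy()
    _, _, ch = image.shape
    if ch == 1:
        # Grayscale image: repeat the single channel three times
        image = tf.tile(image, multiples=[1,1,3])
        image = image.numpy()
    images[img_idx] = image

# Save the data file and the label file in .npy format
np.save('dataset.npy', images)
labels = np.array(label_list)
np.save('label.npy', labels)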

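4. Edit the NN compiler configuration file

With the pb file and the calibration dataset ready, edit the NN compiler configuration file to generate the AIPU executable.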

[Common]
mode=build   # build: build the AIPU executable; run: run it on the simulator

[Parser]
model_name = resnet_50
detection_postprocess =
model_domain = image_classification
output = resnet_v1_50/predictions/Reshape
input_model = ./resnet_50_model/frozen.pb
input = Placeholder
input_shape = [1,224,224,3]

[AutoQuantizationTool]
model_name = resnet_50
quantize_method = SYMMETRIC
ops_per_channel = DepthwiseConv
calibration_data = ./preprocess_resnet_50_dataset/dataset.npy
calibration_label = ./preprocess_resnet_50_dataset/label.npy
preprocess_mode = normalize
quant_precision=int8
reverse_rgb = False
label_id_offset = 0

# [GBuilder] section for build mode
[GBuilder]
target=Z1_0701
outputs=./resnet_50_model/aipu_resnet_50.bin
profile= True

# [GBuilder] section for run mode
[GBuilder]
inputs=./resnet_50_model/input.bin  # binary file of the input image, in HWC order
outputs=output_resnet_50.bin   # output result
simulator=./aipu_simulator_z1  # simulator path; here it sits in the same directory
profile= True
target=Z1_0701
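
Before running the tool, it can help to check that a cfg file has the sections and keys used in the listing above. The sketch below does this with Python's standard configparser. The REQUIRED checklist and the missing_keys helper are hypothetical: they are derived from this listing, not from the NN compiler's documentation. The sketch also assumes the file carries a single [GBuilder] section (the build-mode variant here), because configparser rejects duplicate sections by default.

```python
import configparser

# Build-mode configuration copied from the listing, with one [GBuilder] section.
BUILD_CFG = """
[Common]
mode = build   # build or run

[Parser]
model_name = resnet_50
detection_postprocess =
model_domain = image_classification
output = resnet_v1_50/predictions/Reshape
input_model = ./resnet_50_model/frozen.pb
input = Placeholder
input_shape = [1,224,224,3]

[AutoQuantizationTool]
model_name = resnet_50
quantize_method = SYMMETRIC
ops_per_channel = DepthwiseConv
calibration_data = ./preprocess_resnet_50_dataset/dataset.npy
calibration_label = ./preprocess_resnet_50_dataset/label.npy
preprocess_mode = normalize
quant_precision = int8
reverse_rgb = False
label_id_offset = 0

[GBuilder]
target = Z1_0701
outputs = ./resnet_50_model/aipu_resnet_50.bin
profile = True
"""

# Hypothetical checklist based on the listing, not on the tool's own rules.
REQUIRED = {
    "Common": ["mode"],
    "Parser": ["model_name", "input_model", "input", "input_shape", "output"],
    "AutoQuantizationTool": ["calibration_data", "calibration_label"],
    "GBuilder": ["target", "outputs"],
}


def missing_keys(cfg_text):
    """Return 'Section.key' names from REQUIRED that cfg_text does not define."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(cfg_text)
    missing = []
    for section, keys in REQUIRED.items():
        for key in keys:
            if not parser.has_option(section, key):
                missing.append("%s.%s" % (section, key))
    return missing


print(missing_keys(BUILD_CFG))
```

An empty list means every key on the checklist is present. It does not mean the NN compiler will accept the file.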

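5. Simulate AIPU execution

Once the cfg file is edited, run it to get the result:

aipubuild config/resnet_50_build_run.cfg

This produces the output file output_resnet_50.bin.
The log printed during execution also gives the dequantization scale of the last layer: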

 [I]         layer_id: 76, layer_top:resnet_v1_50/predictions/Reshape_0, output_scale:[7.5395403]

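This demo is a 1000-class classifier, so output_resnet_50.bin holds 1000 bytes of int8 results. Dividing them by output_scale gives the actual float outputs.
Here we simply parse the file as int8 and take the class with the highest probability, which matches the actual class of the image.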

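import numpy as np

# Read the simulator output as raw int8 values
outputfile = './output_resnet_50.bin'
npyoutput = np.fromfile(outputfile, dtype=np.int8)
# Dividing by the positive output_scale does not change the argmax,
# so the raw int8 values are enough to pick the class
outputclass = npyoutput.argmax()
print("Predict Class is %d" % outputclass)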
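
To see float scores rather than only the winning class, the division by output_scale described above can be applied explicitly. The sketch below uses only the standard library; the helper names are made up for this example. So that it runs anywhere, it reads a small temporary file with four invented int8 values instead of the real output_resnet_50.bin.

```python
import array
import os
import tempfile


def load_int8(path):
    """Read a raw binary file as signed 8-bit integers."""
    codes = array.array("b")  # "b" = signed char, one byte per value
    with open(path, "rb") as f:
        codes.frombytes(f.read())
    return codes


def dequantize(codes, output_scale):
    """Apply the rule from the text: float value = int8 value / output_scale."""
    return [c / output_scale for c in codes]


def top_k(scores, k=5):
    """Return (class_id, score) pairs for the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]


# Stand-in for output_resnet_50.bin: four made-up int8 values.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(array.array("b", [3, -5, 120, 7]).tobytes())
    demo_path = tmp.name

OUTPUT_SCALE = 7.5395403  # output_scale taken from the log line above
demo_scores = dequantize(load_int8(demo_path), OUTPUT_SCALE)
demo_top2 = top_k(demo_scores, k=2)
os.remove(demo_path)

for class_id, score in demo_top2:
    print("class %d: %.4f" % (class_id, score))
```

With the real simulator output, pass './output_resnet_50.bin' to load_int8 and use the output_scale printed in your own log.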

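6. Simulation test material required when applying for a development board

The original model file (optional)
The calibration set files data.npy and label.npy
The NN compiler cfg file
The simulator input and output, with a comparison of the quantization error
For the detailed application process, see https://aijishu.com/e/1120000000214336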
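
The list above asks for a comparison of the quantization error but does not say which metric to use. One possible approach, sketched here as an assumption, is to dequantize the simulator output and compare it with the float model's output using cosine similarity, maximum absolute error, and top-1 agreement. The function names and the toy numbers are invented for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


def compare_outputs(float_ref, int8_codes, output_scale):
    """Compare a float reference with dequantized int8 simulator output."""
    if len(float_ref) != len(int8_codes):
        raise ValueError("outputs must have the same length")
    approx = [c / output_scale for c in int8_codes]  # float = int8 / scale
    return {
        "cosine": cosine_similarity(float_ref, approx),
        "max_abs_error": max(abs(x - y) for x, y in zip(float_ref, approx)),
        "same_top1": argmax(float_ref) == argmax(approx),
    }


# Toy numbers for illustration only; they are not real model outputs.
float_ref = [0.1, 0.7, 0.2]
int8_codes = [13, 89, 25]
report = compare_outputs(float_ref, int8_codes, output_scale=127.0)
print(report)
```

A cosine similarity close to 1 together with matching top-1 classes suggests the int8 model tracks the float model well. Which thresholds are acceptable for an application is not stated in this article.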

