Training an FCN Network from Scratch
1. Dataset preparation
- Create a new folder named sbdd under /fcn.berkeleyvision.org/data/
- trainval:
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz
Find the dataset folder inside this archive and copy it to /fcn.berkeleyvision.org/data/sbdd
- test:
http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6
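The extraction step can also be scripted. The following is a minimal sketch, assuming benchmark.tgz has already been downloaded and that it contains a directory named `dataset` somewhere inside. The function name `install_sbdd` and its arguments are hypothetical and not part of the FCN repository.

```python
import os
import shutil
import tarfile
import tempfile


def install_sbdd(archive_path, sbdd_dir):
    """Unpack archive_path and copy its `dataset` folder into sbdd_dir.

    Returns the path of the installed folder (sbdd_dir/dataset).
    """
    target = os.path.join(sbdd_dir, "dataset")
    if os.path.exists(target):
        raise FileExistsError(target + " already exists")

    scratch = tempfile.mkdtemp()
    try:
        # Only unpack archives you trust: extractall writes the paths stored in the file.
        with tarfile.open(archive_path) as tar:
            tar.extractall(scratch)

        # Find the first directory called "dataset" in the unpacked tree.
        for root, dirs, _files in os.walk(scratch):
            if "dataset" in dirs:
                os.makedirs(sbdd_dir, exist_ok=True)
                shutil.copytree(os.path.join(root, "dataset"), target)
                return target
        raise FileNotFoundError("no 'dataset' directory inside " + archive_path)
    finally:
        # Always clean up the temporary extraction directory.
        shutil.rmtree(scratch, ignore_errors=True)
```

A call such as `install_sbdd("benchmark.tgz", "fcn.berkeleyvision.org/data/sbdd")` would then leave the data in data/sbdd/dataset, which is the location the rest of this post refers to.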
2. Download the pretrained model
- Download the pretrained VGG-16 model, place it in /fcn.berkeleyvision.org/ilsvrc-nets/, and rename it to vgg16-fcn.caffemodel
- Download link:
VGG project page: http://www.robots.ox.ac.uk/~vgg/research/very_deep/
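Copying and renaming the weights is a one-liner by hand, but a small helper makes the expected location explicit. This is only a sketch: `install_weights` is a hypothetical name, and the path of the downloaded file is whatever your browser or download tool produced.

```python
import os
import shutil


def install_weights(downloaded_path, repo_root):
    """Copy a downloaded VGG-16 caffemodel to the name used in this post."""
    dest_dir = os.path.join(repo_root, "ilsvrc-nets")
    os.makedirs(dest_dir, exist_ok=True)
    # The training script in this post loads the weights under this file name.
    dest = os.path.join(dest_dir, "vgg16-fcn.caffemodel")
    shutil.copyfile(downloaded_path, dest)
    return dest
```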
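3. Modify the code
In general, do not edit test.prototxt and trainval.prototxt directly; run the net.py script instead. After it has run, also leave the fc6 and fc7 layer names in test.prototxt and trainval.prototxt as they are rather than renaming them.
- Modify val.prototxt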
layer {
  name: "data"
  type: "Python"
  top: "data"
  top: "label"
  python_param {
    module: "voc_layers"
    layer: "VOCSegDataLayer"
    # annotated by yan 20171118
    # param_str: "{\'voc_dir\': \'../data/pascal/VOC2011\', \'seed\': 1337, \'split\': \'seg11valid\', \'mean\': (104.00699, 116.66877, 122.67892)}"
    param_str: "{\'sbdd_dir\': \'../data/sbdd/dataset\', \'seed\': 1337, \'split\': \'val\', \'mean\': (104.00699, 116.66877, 122.67892)}"
  }
}
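The param_str value is a Python dictionary literal that the data layer evaluates in its setup method, and the split name is then turned into an index file path. The sketch below reproduces that parsing outside Caffe. It uses `ast.literal_eval` as a safer stand-in for the layer's `eval`, which is a substitution made here and not what voc_layers.py does.

```python
import ast

# Same dictionary literal as in val.prototxt (the \' escapes there become plain quotes).
param_str = ("{'sbdd_dir': '../data/sbdd/dataset', 'seed': 1337, "
             "'split': 'val', 'mean': (104.00699, 116.66877, 122.67892)}")

# Stand-in for the layer's eval(self.param_str); accepts literals only.
params = ast.literal_eval(param_str)

# The layer builds the index file path as '<sbdd_dir>/<split>.txt'.
split_file = '{}/{}.txt'.format(params['sbdd_dir'], params['split'])
print(split_file)  # ../data/sbdd/dataset/val.txt
```

With the settings above, the layer therefore expects a val.txt file directly inside the copied dataset folder.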
- Modify voc_layers.py
# annotated by yan
# import caffe
#
# import numpy as np
# from PIL import Image
#
# import random
#
# class VOCSegDataLayer(caffe.Layer):
#     """
#     Load (input image, label image) pairs from PASCAL VOC
#     one-at-a-time while reshaping the net to preserve dimensions.
#
#     Use this to feed data to a fully convolutional network.
#     """
#
#     def setup(self, bottom, top):
#         """
#         Setup data layer according to parameters:
#
#         - voc_dir: path to PASCAL VOC year dir
#         - split: train / val / test
#         - mean: tuple of mean values to subtract
#         - randomize: load in random order (default: True)
#         - seed: seed for randomization (default: None / current time)
#
#         for PASCAL VOC semantic segmentation.
#
#         example
#
#         params = dict(voc_dir="/path/to/PASCAL/VOC2011",
#             mean=(104.00698793, 116.66876762, 122.67891434),
#             split="val")
#         """
#         # config
#         params = eval(self.param_str)
#         self.voc_dir = params['voc_dir']
#         self.split = params['split']
#         self.mean = np.array(params['mean'])
#         self.random = params.get('randomize', True)
#         self.seed = params.get('seed', None)
#
#         # two tops: data and label
#         if len(top) != 2:
#             raise Exception("Need to define two tops: data and label.")
#         # data layers have no bottoms
#         if len(bottom) != 0:
#             raise Exception("Do not define a bottom.")
#
#         # load indices for images and labels
#         split_f = '{}/ImageSets/Segmentation/{}.txt'.format(self.voc_dir,
#                 self.split)
#         self.indices = open(split_f, 'r').read().splitlines()
#         self.idx = 0
#
#         # make eval deterministic
#         if 'train' not in self.split:
#             self.random = False
#
#         # randomization: seed and pick
#         if self.random:
#             random.seed(self.seed)
#             self.idx = random.randint(0, len(self.indices)-1)
#
#     def reshape(self, bottom, top):
#         # load image + label image pair
#         self.data = self.load_image(self.indices[self.idx])
#         self.label = self.load_label(self.indices[self.idx])
#         # reshape tops to fit (leading 1 is for batch dimension)
#         top[0].reshape(1, *self.data.shape)
#         top[1].reshape(1, *self.label.shape)
#
#     def forward(self, bottom, top):
#         # assign output
#         top[0].data[...] = self.data
#         top[1].data[...] = self.label
#
#         # pick next input
#         if self.random:
#             self.idx = random.randint(0, len(self.indices)-1)
#         else:
#             self.idx += 1
#             if self.idx == len(self.indices):
#                 self.idx = 0
#
#     def backward(self, top, propagate_down, bottom):
#         pass
#
#     def load_image(self, idx):
#         """
#         Load input image and preprocess for Caffe:
#         - cast to float
#         - switch channels RGB -> BGR
#         - subtract mean
#         - transpose to channel x height x width order
#         """
#         im = Image.open('{}/JPEGImages/{}.jpg'.format(self.voc_dir, idx))
#         in_ = np.array(im, dtype=np.float32)
#         in_ = in_[:,:,::-1]
#         in_ -= self.mean
#         in_ = in_.transpose((2,0,1))
#         return in_
#
#     def load_label(self, idx):
#         """
#         Load label image as 1 x height x width integer array of label indices.
#         The leading singleton dimension is required by the loss.
#         """
#         im = Image.open('{}/SegmentationClass/{}.png'.format(self.voc_dir, idx))
#         label = np.array(im, dtype=np.uint8)
#         label = label[np.newaxis, ...]
#         return label
#
# class SBDDSegDataLayer(caffe.Layer):
#     """
#     Load (input image, label image) pairs from the SBDD extended labeling
#     of PASCAL VOC for semantic segmentation
#     one-at-a-time while reshaping the net to preserve dimensions.
#
#     Use this to feed data to a fully convolutional network.
#     """
#
#     def setup(self, bottom, top):
#         """
#         Setup data layer according to parameters:
#
#         - sbdd_dir: path to SBDD `dataset` dir
#         - split: train / seg11valid
#         - mean: tuple of mean values to subtract
#         - randomize: load in random order (default: True)
#         - seed: seed for randomization (default: None / current time)
#
#         for SBDD semantic segmentation.
#
#         N.B.segv11alid is the set of segval11 that does not intersect with SBDD.
#         Find it here: https://gist.github.com/shelhamer/edb330760338892d511e.
#
#         example
#
#         params = dict(sbdd_dir="/path/to/SBDD/dataset",
#             mean=(104.00698793, 116.66876762, 122.67891434),
#             split="valid")
#         """
#         # config
#         params = eval(self.param_str)
#         self.sbdd_dir = params['sbdd_dir']
#         self.split = params['split']
#         self.mean = np.array(params['mean'])
#         self.random = params.get('randomize', True)
#         self.seed = params.get('seed', None)
#
#         # two tops: data and label
#         if len(top) != 2:
#             raise Exception("Need to define two tops: data and label.")
#         # data layers have no bottoms
#         if len(bottom) != 0:
#             raise Exception("Do not define a bottom.")
#
#         # load indices for images and labels
#         split_f = '{}/{}.txt'.format(self.sbdd_dir,
#                 self.split)
#         self.indices = open(split_f, 'r').read().splitlines()
#         self.idx = 0
#
#         # make eval deterministic
#         if 'train' not in self.split:
#             self.random = False
#
#         # randomization: seed and pick
#         if self.random:
#             random.seed(self.seed)
#             self.idx = random.randint(0, len(self.indices)-1)
#
#     def reshape(self, bottom, top):
#         # load image + label image pair
#         self.data = self.load_image(self.indices[self.idx])
#         self.label = self.load_label(self.indices[self.idx])
#         # reshape tops to fit (leading 1 is for batch dimension)
#         top[0].reshape(1, *self.data.shape)
#         top[1].reshape(1, *self.label.shape)
#
#     def forward(self, bottom, top):
#         # assign output
#         top[0].data[...] = self.data
#         top[1].data[...] = self.label
#
#         # pick next input
#         if self.random:
#             self.idx = random.randint(0, len(self.indices)-1)
#         else:
#             self.idx += 1
#             if self.idx == len(self.indices):
#                 self.idx = 0
#
#     def backward(self, top, propagate_down, bottom):
#         pass
#
#     def load_image(self, idx):
#         """
#         Load input image and preprocess for Caffe:
#         - cast to float
#         - switch channels RGB -> BGR
#         - subtract mean
#         - transpose to channel x height x width order
#         """
#         im = Image.open('{}/img/{}.jpg'.format(self.sbdd_dir, idx))
#         in_ = np.array(im, dtype=np.float32)
#         in_ = in_[:,:,::-1]
#         in_ -= self.mean
#         in_ = in_.transpose((2,0,1))
#         return in_
#
#     def load_label(self, idx):
#         """
#         Load label image as 1 x height x width integer array of label indices.
#         The leading singleton dimension is required by the loss.
#         """
#         import scipy.io
#         mat = scipy.io.loadmat('{}/cls/{}.mat'.format(self.sbdd_dir, idx))
#         label = mat['GTcls'][0]['Segmentation'][0].astype(np.uint8)
#         label = label[np.newaxis, ...]
#         return label
import caffe

import numpy as np
from PIL import Image

import random

class VOCSegDataLayer(caffe.Layer):
    """
    Load (input image, label image) pairs from PASCAL VOC
    one-at-a-time while reshaping the net to preserve dimensions.

    Use this to feed data to a fully convolutional network.
    """

    def setup(self, bottom, top):
        """
        Setup data layer according to parameters:

        - sbdd_dir: path to SBDD `dataset` dir
        - split: train / seg11valid
        - mean: tuple of mean values to subtract
        - randomize: load in random order (default: True)
        - seed: seed for randomization (default: None / current time)

        for SBDD semantic segmentation.

        N.B.segv11alid is the set of segval11 that does not intersect with SBDD.
        Find it here: https://gist.github.com/shelhamer/edb330760338892d511e.

        example

        params = dict(sbdd_dir="/path/to/SBDD/dataset",
            mean=(104.00698793, 116.66876762, 122.67891434),
            split="valid")
        """
        # config
        params = eval(self.param_str)
        self.sbdd_dir = params['sbdd_dir']
        self.split = params['split']
        self.mean = np.array(params['mean'])
        self.random = params.get('randomize', True)
        self.seed = params.get('seed', None)

        # two tops: data and label
        if len(top) != 2:
            raise Exception("Need to define two tops: data and label.")
        # data layers have no bottoms
        if len(bottom) != 0:
            raise Exception("Do not define a bottom.")

        # load indices for images and labels
        split_f = '{}/{}.txt'.format(self.sbdd_dir,
                self.split)
        self.indices = open(split_f, 'r').read().splitlines()
        self.idx = 0

        # make eval deterministic
        if 'train' not in self.split:
            self.random = False

        # randomization: seed and pick
        if self.random:
            random.seed(self.seed)
            self.idx = random.randint(0, len(self.indices)-1)

    def reshape(self, bottom, top):
        # load image + label image pair
        self.data = self.load_image(self.indices[self.idx])
        self.label = self.load_label(self.indices[self.idx])
        # reshape tops to fit (leading 1 is for batch dimension)
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)

    def forward(self, bottom, top):
        # assign output
        top[0].data[...] = self.data
        top[1].data[...] = self.label

        # pick next input
        if self.random:
            self.idx = random.randint(0, len(self.indices)-1)
        else:
            self.idx += 1
            if self.idx == len(self.indices):
                self.idx = 0

    def backward(self, top, propagate_down, bottom):
        pass

    def load_image(self, idx):
        """
        Load input image and preprocess for Caffe:
        - cast to float
        - switch channels RGB -> BGR
        - subtract mean
        - transpose to channel x height x width order
        """
        im = Image.open('{}/img/{}.jpg'.format(self.sbdd_dir, idx))
        in_ = np.array(im, dtype=np.float32)
        in_ = in_[:,:,::-1]
        in_ -= self.mean
        in_ = in_.transpose((2,0,1))
        return in_

    def load_label(self, idx):
        """
        Load label image as 1 x height x width integer array of label indices.
        The leading singleton dimension is required by the loss.
        """
        import scipy.io
        mat = scipy.io.loadmat('{}/cls/{}.mat'.format(self.sbdd_dir, idx))
        label = mat['GTcls'][0]['Segmentation'][0].astype(np.uint8)
        label = label[np.newaxis, ...]
        return label

class SBDDSegDataLayer(caffe.Layer):
    """
    Load (input image, label image) pairs from the SBDD extended labeling
    of PASCAL VOC for semantic segmentation
    one-at-a-time while reshaping the net to preserve dimensions.

    Use this to feed data to a fully convolutional network.
    """

    def setup(self, bottom, top):
        """
        Setup data layer according to parameters:

        - sbdd_dir: path to SBDD `dataset` dir
        - split: train / seg11valid
        - mean: tuple of mean values to subtract
        - randomize: load in random order (default: True)
        - seed: seed for randomization (default: None / current time)

        for SBDD semantic segmentation.

        N.B.segv11alid is the set of segval11 that does not intersect with SBDD.
        Find it here: https://gist.github.com/shelhamer/edb330760338892d511e.

        example

        params = dict(sbdd_dir="/path/to/SBDD/dataset",
            mean=(104.00698793, 116.66876762, 122.67891434),
            split="valid")
        """
        # config
        params = eval(self.param_str)
        self.sbdd_dir = params['sbdd_dir']
        self.split = params['split']
        self.mean = np.array(params['mean'])
        self.random = params.get('randomize', True)
        self.seed = params.get('seed', None)

        # two tops: data and label
        if len(top) != 2:
            raise Exception("Need to define two tops: data and label.")
        # data layers have no bottoms
        if len(bottom) != 0:
            raise Exception("Do not define a bottom.")

        # load indices for images and labels
        split_f = '{}/{}.txt'.format(self.sbdd_dir,
                self.split)
        self.indices = open(split_f, 'r').read().splitlines()
        self.idx = 0

        # make eval deterministic
        if 'train' not in self.split:
            self.random = False

        # randomization: seed and pick
        if self.random:
            random.seed(self.seed)
            self.idx = random.randint(0, len(self.indices)-1)

    def reshape(self, bottom, top):
        # load image + label image pair
        self.data = self.load_image(self.indices[self.idx])
        self.label = self.load_label(self.indices[self.idx])
        # reshape tops to fit (leading 1 is for batch dimension)
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)

    def forward(self, bottom, top):
        # assign output
        top[0].data[...] = self.data
        top[1].data[...] = self.label

        # pick next input
        if self.random:
            self.idx = random.randint(0, len(self.indices)-1)
        else:
            self.idx += 1
            if self.idx == len(self.indices):
                self.idx = 0

    def backward(self, top, propagate_down, bottom):
        pass

    def load_image(self, idx):
        """
        Load input image and preprocess for Caffe:
        - cast to float
        - switch channels RGB -> BGR
        - subtract mean
        - transpose to channel x height x width order
        """
        im = Image.open('{}/img/{}.jpg'.format(self.sbdd_dir, idx))
        in_ = np.array(im, dtype=np.float32)
        in_ = in_[:,:,::-1]
        in_ -= self.mean
        in_ = in_.transpose((2,0,1))
        return in_

    def load_label(self, idx):
        """
        Load label image as 1 x height x width integer array of label indices.
        The leading singleton dimension is required by the loss.
        """
        import scipy.io
        mat = scipy.io.loadmat('{}/cls/{}.mat'.format(self.sbdd_dir, idx))
        label = mat['GTcls'][0]['Segmentation'][0].astype(np.uint8)
        label = label[np.newaxis, ...]
        return label
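Because the layer opens `img/<id>.jpg` and `cls/<id>.mat` for every id listed in the split file, a missing file only surfaces when training reaches that sample. The following sketch checks the whole split up front. `find_missing` is a hypothetical helper written for this post, not something shipped with the FCN code.

```python
import os


def find_missing(sbdd_dir, split):
    """Return the files the data layer would fail to open for this split."""
    split_file = os.path.join(sbdd_dir, split + ".txt")
    with open(split_file) as f:
        indices = f.read().splitlines()

    missing = []
    for idx in indices:
        # Same layout as load_image / load_label: img/<id>.jpg and cls/<id>.mat
        for path in (os.path.join(sbdd_dir, "img", idx + ".jpg"),
                     os.path.join(sbdd_dir, "cls", idx + ".mat")):
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

Running `find_missing("../data/sbdd/dataset", "val")` before training should return an empty list if the dataset folder was copied completely.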
- Modify solve.py; for reference:
# annotated by yan 20171128
# import caffe
# import surgery, score
#
# import numpy as np
# import os
# import sys
#
# try:
#     import setproctitle
#     setproctitle.setproctitle(os.path.basename(os.getcwd()))
# except:
#     pass
#
# weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
#
# # init
# caffe.set_device(int(sys.argv[1]))
# caffe.set_mode_gpu()
#
# solver = caffe.SGDSolver('solver.prototxt')
# solver.net.copy_from(weights)
#
# # surgeries
# interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
# surgery.interp(solver.net, interp_layers)
#
# # scoring
# val = np.loadtxt('../data/segvalid11.txt', dtype=str)
#
# for _ in range(25):
#     solver.step(4000)
#     score.seg_tests(solver, False, val, layer='score')

import sys
import caffe
import surgery, score

import numpy as np
import os

try:
    import setproctitle
    setproctitle.setproctitle(os.path.basename(os.getcwd()))
except:
    pass

vgg_weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'

# init
#caffe.set_device(int(sys.argv[1]))
caffe.set_device(0)
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
surgery.transplant(solver.net, vgg_net)
del vgg_net

# surgeries
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
val = np.loadtxt('../data/sbdd/dataset/val.txt', dtype=str)

for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')
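solve.py refers to several files by relative path, so it is worth confirming they exist before starting a long GPU run. The sketch below does that with the paths from the script above and also spells out the training length implied by its loop. `preflight` and `REQUIRED` are hypothetical names introduced here for illustration.

```python
import os

# Paths taken from solve.py above, relative to the directory it is run from.
REQUIRED = [
    "solver.prototxt",
    "../ilsvrc-nets/vgg16-fcn.caffemodel",
    "../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt",
    "../data/sbdd/dataset/val.txt",
]


def preflight(workdir, required=REQUIRED):
    """Return the required paths that do not exist relative to workdir."""
    return [p for p in required
            if not os.path.exists(os.path.normpath(os.path.join(workdir, p)))]


# The loop runs 25 rounds of solver.step(4000), scoring after each round.
total_iterations = 25 * 4000
print(total_iterations)  # 100000
```

Calling `preflight(".")` from the model directory should return an empty list; anything it does return is a file that still needs to be downloaded or copied into place.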
crop.py
infer.py
score.py
surgery.py
voc_layers.py
voc_helper.py