A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images

Abstract
I. Introduction
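- The method has three steps
- Contributions of this paper
II. Related Works
- Volume-based Generator
- Surface-based Generator
- Structure inference
III. The Proposed Approach
- Stage 1: from the RGB image $I$ to the 3D point set $K$
- Stage 2: from skeleton to base mesh
  - Global-guided subvolume generation
  - Image-guided volume correction
  - Base mesh extraction
- Stage 3: mesh refinement
  - GCNN mesh deformation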

Abstract

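The goal is to learn the 3D surface of an object from a single RGB image.
Existing algorithms recover complex topologies poorly.
This paper proposes a skeleton-bridged, stage-wise learning approach to address this problem.
Why a skeleton: (1) it preserves the topology well; (2) it is easier to learn.
To learn the skeleton from the input image, the authors design a deep architecture whose decoder follows a novel parallel-stream design, with separate streams that synthesize curve-like and surface-like skeleton points.
The input image is also reused at multiple stages to correct prediction errors that may accumulate at each stage.
ShapeNet-Skeleton dataset (not available yet)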

I. Introduction

This is a challenging inverse problem.
Recent work reconstructs meshes by mesh deformation, which fails on complex topologies.

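The method has three steps

Generate skeleton points from the input image: CurSkeNet, SurSkeNet
Generate a base mesh by first converting the obtained skeleton into a coarse volumetric representation and then refining the coarse volume with a trained 3D CNN.
Starting from the graph of the extracted base mesh, with its vertices as the nodes to deform, use a GCNN to produce the final mesh.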

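Contributions of this paper

A stage-wise learning approach that exploits the respective strengths of point clouds, volumes, and meshes
The skeleton-bridged approach
Ablation studies that verify the effectiveness of the algorithm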

II. Related Works

Volume-based Generator

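Because volume-based generators are efficient and insensitive to topology, the paper also uses one to convert the inferred skeleton point cloud into a solid volume, which effectively bridges the gap between the skeleton and the surface mesh.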

Surface-based Generator

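Point clouds are samples of the surface, but they are sparse and scattered.
Meshes are the most natural discretization of a manifold surface, yet CNNs have difficulty learning to generate meshes directly.
Some methods approximate the mesh by learning a deformation instead.
The paper first borrows the idea of AtlasNet to infer the meso-skeleton points, then converts them into a base mesh. Finally, it adopts the Pixel2Mesh approach to generate geometric details.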

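Structure inference

Rather than estimating geometry, more recent work aims to recover 3D structure.
Cuboids are hard to fit to curved surfaces.
These methods need large amounts of manually labeled data.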

III. The Proposed Approach

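The method has three stages.

Stage 1: from the RGB image $I$ to the 3D point set $K$

A pretrained ResNet serves as the encoder.
The decoder is a new design.
CurSkeNet and SurSkeNet are both based on multilayer perceptrons (MLPs) with the same settings as AtlasNet: four fully connected layers of sizes 1024, 512, 256, and 3, with ReLU on the first three layers and tanh on the last.
The paper does not seem to spell out how the latent vector is combined with the 1D and 2D data, but the AtlasNet implementation gives a hint: the latent vector is presumably concatenated with each individual sampled point (1D or 2D).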

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    import resnet  # local ResNet module from the AtlasNet repository


    class PointGenCon(nn.Module):
        """Per-point decoder MLP, written as 1x1 Conv1d layers."""

        def __init__(self, bottleneck_size=2500):
            super(PointGenCon, self).__init__()
            self.bottleneck_size = bottleneck_size
            self.conv1 = nn.Conv1d(bottleneck_size, bottleneck_size, 1)
            self.conv2 = nn.Conv1d(bottleneck_size, bottleneck_size // 2, 1)
            self.conv3 = nn.Conv1d(bottleneck_size // 2, bottleneck_size // 4, 1)
            self.conv4 = nn.Conv1d(bottleneck_size // 4, 3, 1)
            self.th = nn.Tanh()
            self.bn1 = nn.BatchNorm1d(bottleneck_size)
            self.bn2 = nn.BatchNorm1d(bottleneck_size // 2)
            self.bn3 = nn.BatchNorm1d(bottleneck_size // 4)

        def forward(self, x):
            # x: (batch, 2 + latent_size, num_points)
            x = F.relu(self.bn1(self.conv1(x)))
            x = F.relu(self.bn2(self.conv2(x)))
            x = F.relu(self.bn3(self.conv3(x)))
            return self.th(self.conv4(x))  # (batch, 3, num_points), values in [-1, 1]


    class SVR_AtlasNet(nn.Module):
        """Single-view reconstruction: ResNet-18 encoder plus one decoder per primitive."""

        def __init__(self, num_points=2048, bottleneck_size=1024, nb_primitives=5,
                     pretrained_encoder=False, cuda=True):
            super(SVR_AtlasNet, self).__init__()
            self.usecuda = cuda  # kept for API compatibility; tensors follow x.device
            self.num_points = num_points
            self.bottleneck_size = bottleneck_size
            self.nb_primitives = nb_primitives
            self.pretrained_encoder = pretrained_encoder
            self.encoder = resnet.resnet18(pretrained=self.pretrained_encoder, num_classes=1024)
            self.decoder = nn.ModuleList(
                [PointGenCon(bottleneck_size=2 + self.bottleneck_size)
                 for _ in range(self.nb_primitives)])

        def _random_grid(self, x):
            # Uniform 2D samples in the unit square, created on the same device as x.
            return torch.rand(x.size(0), 2, self.num_points // self.nb_primitives,
                              device=x.device)

        def _fixed_grid(self, x, grid_i):
            # grid_i: (num_points, 2) samples supplied by the caller.
            g = torch.as_tensor(grid_i, dtype=torch.float32, device=x.device)
            g = g.transpose(0, 1).contiguous().unsqueeze(0)
            return g.expand(x.size(0), g.size(1), g.size(2)).contiguous()

        def _decode_primitive(self, i, x, grid):
            # Repeat the latent vector for every grid point and concatenate the two.
            y = x.unsqueeze(2).expand(x.size(0), x.size(1), grid.size(2)).contiguous()
            y = torch.cat((grid, y.type_as(grid)), 1).contiguous()
            return self.decoder[i](y)

        def decode(self, x):
            outs = [self._decode_primitive(i, x, self._random_grid(x))
                    for i in range(self.nb_primitives)]
            return torch.cat(outs, 2).contiguous().transpose(2, 1).contiguous()

        def forward(self, x):
            x = x[:, :3, :, :].contiguous()  # keep the RGB channels only
            return self.decode(self.encoder(x))

        def forward_inference_from_latent_space(self, x, grid):
            outs = [self._decode_primitive(i, x, self._fixed_grid(x, grid[i]))
                    for i in range(self.nb_primitives)]
            return torch.cat(outs, 2).contiguous().transpose(2, 1).contiguous()

        def forward_inference(self, x, grid):
            return self.forward_inference_from_latent_space(self.encoder(x), grid)
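The AtlasNet listing only samples 2D squares. As a rough illustration of the parallel-stream idea, the NumPy sketch below decodes one stream from 1D parameters (curve-like points) and one from 2D parameters (surface-like points), each concatenated with the same latent vector. The layer sizes follow the notes; the random weights, the point counts, and the helper names (`make_mlp`, `mlp_forward`, `decode_stream`) are assumptions made for illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_mlp(in_dim, sizes=(1024, 512, 256, 3)):
    """Random weights for an MLP with the layer sizes quoted in the notes."""
    dims = (in_dim,) + tuple(sizes)
    return [(rng.normal(0.0, 0.01, (d_in, d_out)), np.zeros(d_out))
            for d_in, d_out in zip(dims[:-1], dims[1:])]


def mlp_forward(layers, x):
    """ReLU after every layer except the last, which uses tanh."""
    for w, b in layers[:-1]:
        x = np.maximum(x @ w + b, 0.0)
    w, b = layers[-1]
    return np.tanh(x @ w + b)


def decode_stream(latent, param_dim, n_points, layers):
    """Sample primitive parameters, attach the latent code to each, decode to 3D."""
    params = rng.uniform(0.0, 1.0, (n_points, param_dim))  # 1D line or 2D square
    tiled = np.tile(latent, (n_points, 1))                  # same code for every sample
    return mlp_forward(layers, np.concatenate([params, tiled], axis=1))


latent = rng.normal(size=1024)       # stand-in for the ResNet image feature
cur_layers = make_mlp(1 + 1024)      # curve stream: 1D parameter + latent vector
sur_layers = make_mlp(2 + 1024)      # surface stream: 2D parameter + latent vector
curve_pts = decode_stream(latent, 1, 200, cur_layers)
surface_pts = decode_stream(latent, 2, 300, sur_layers)
skeleton = np.concatenate([curve_pts, surface_pts], axis=0)
print(skeleton.shape)  # (500, 3)
```

The only difference between the two streams in this sketch is the dimension of the sampled parameter, which is what makes the concatenation reading of the notes plausible.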

$K$ is the 3D point set.
Training uses the Chamfer Distance (CD) between $K$ and the target point set $K^{*}$:
$$\mathcal{L}_{cd}=\sum_{x \in K} \min_{y \in K^{*}}\|x-y\|_{2}^{2}+\sum_{y \in K^{*}} \min_{x \in K}\|x-y\|_{2}^{2}$$
A Laplacian smoothness term encourages continuity, where $\mathcal{N}(x)$ is the set of neighbors of $x$:
$$\mathcal{L}_{lap}=\sum_{x \in K}\left\|x-\frac{1}{|\mathcal{N}(x)|} \sum_{p \in \mathcal{N}(x)} p\right\|_{2}$$
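Both losses can be written in a few lines of NumPy. The sketch below is a direct, brute-force reading of the two formulas; the neighbor lists are a hypothetical chain chosen for the toy data, since the notes do not say how $\mathcal{N}(x)$ is built.

```python
import numpy as np


def chamfer_distance(K, K_star):
    """Sum of squared nearest-neighbor distances, taken in both directions."""
    diff = K[:, None, :] - K_star[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)            # pairwise squared distances, (|K|, |K*|)
    return d2.min(axis=1).sum() + d2.min(axis=0).sum()


def laplacian_loss(K, neighbors):
    """Sum over points of the distance to the centroid of their neighbors."""
    total = 0.0
    for i, nbrs in enumerate(neighbors):
        total += np.linalg.norm(K[i] - K[nbrs].mean(axis=0))
    return total


K = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
K_star = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
neighbors = [[1], [0, 2], [1]]                 # hypothetical chain 0 - 1 - 2

print(chamfer_distance(K, K_star))             # 1.0
print(laplacian_loss(K, neighbors))            # 2.0
```

In the toy data only the middle point of $K$ has no exact match in $K^{*}$, so it alone contributes to the Chamfer term, while the two end points are the ones displaced from their neighbor centroid.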

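Stage 2: from skeleton to base mesh

First, convert the 3D skeleton point set into a volumetric representation $V_k$.
A 3D CNN then refines $V_k$ into $V$; the paper also uses the original image $I$ to correct prediction errors accumulated over the stages.
Finally, Marching Cubes converts $V$ into the base mesh $M_b$.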
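The first of these steps, turning skeleton points into a coarse volume, can be sketched as a plain occupancy voxelization. This assumes the points lie in $[-1, 1]^3$ (consistent with a tanh output) and that a voxel is marked occupied when at least one point falls inside it; the paper may use a different discretization.

```python
import numpy as np


def voxelize(points, resolution=64):
    """Mark every voxel that contains at least one point of a [-1, 1]^3 point set."""
    volume = np.zeros((resolution,) * 3, dtype=bool)
    # Map [-1, 1] to [0, resolution) and clip points on the upper boundary.
    idx = np.floor((points + 1.0) / 2.0 * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    volume[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return volume


points = np.array([
    [-1.0, -1.0, -1.0],
    [0.0, 0.0, 0.0],
    [0.01, 0.01, 0.01],   # falls into the same voxel as the origin
    [1.0, 1.0, 1.0],
])
V_k = voxelize(points, 64)
print(V_k.sum())  # 3
```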

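Global-guided subvolume generation

The paper also addresses low- versus high-resolution volumes. The low-resolution $64^3$ volume is processed globally, while the high-resolution $128^3$ volume is processed locally, with each $64^3$ subvolume handled separately.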
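A minimal sketch of the local processing: cut a $128^3$ volume into $64^3$ blocks, process each block, and stitch the results back together. Non-overlapping blocks and the identity stand-in for the per-block network are assumptions; the paper may crop the subvolumes differently.

```python
import numpy as np


def split_into_subvolumes(volume, size=64):
    """Cut a cubic volume into non-overlapping size^3 blocks, keyed by their corner."""
    n = volume.shape[0]
    blocks = {}
    for x in range(0, n, size):
        for y in range(0, n, size):
            for z in range(0, n, size):
                blocks[(x, y, z)] = volume[x:x + size, y:y + size, z:z + size]
    return blocks


def merge_subvolumes(blocks, n=128):
    """Write every block back to its corner position in an n^3 volume."""
    first = next(iter(blocks.values()))
    volume = np.zeros((n, n, n), dtype=first.dtype)
    for (x, y, z), block in blocks.items():
        s = block.shape[0]
        volume[x:x + s, y:y + s, z:z + s] = block
    return volume


rng = np.random.default_rng(0)
high_res = rng.random((128, 128, 128)) > 0.5

blocks = split_into_subvolumes(high_res, 64)
# Placeholder for the per-block 3D CNN: here each block is passed through unchanged.
refined = {corner: block.copy() for corner, block in blocks.items()}
restored = merge_subvolumes(refined, 128)
print(len(blocks))  # 8
```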

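Image-guided volume correction

A separate encoder-decoder network uses ResNet-18 as the encoder and several 3D deconvolution layers as the decoder.
It generates a $32^3$ volume.
Since the 3D CNN is already trained, the output of this decoder is presumably inserted into the 3D deconvolution part of that network.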

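Base mesh extraction

Marching Cubes extracts the base mesh.
The QEM (quadric error metrics) algorithm then simplifies the mesh to reduce the computational load.
The base mesh is stored as a vertex set $P=\{I \in \mathbb{R}^3\}$ plus a face set $S=\{Q \in \mathbb{R}^3\}$. Each face is represented by its normal vector: in 3D, a plane through the origin with normal $(a, b, c)$ is the set of points satisfying $ax+by+cz=0$. A general plane needs an extra offset $d$, giving $ax+by+cz+d=0$.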
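To make the plane representation concrete, the sketch below computes the coefficients $(a, b, c, d)$ of a triangle's supporting plane and the quadric error $v^{T}(pp^{T})v$ that QEM sums over the planes around a vertex. It assumes faces are stored as triples of vertex indices, which is the common mesh layout but not necessarily the one used in the paper's code.

```python
import numpy as np


def face_plane(vertices, face):
    """Coefficients (a, b, c, d) of the plane ax + by + cz + d = 0 through a triangle."""
    p0, p1, p2 = vertices[list(face)]
    normal = np.cross(p1 - p0, p2 - p0)
    normal = normal / np.linalg.norm(normal)   # unit normal (a, b, c)
    d = -np.dot(normal, p0)                    # offset so that p0 lies on the plane
    return np.append(normal, d)


def quadric_error(plane, point):
    """Squared distance from point to plane, written as v^T (p p^T) v."""
    v = np.append(point, 1.0)                  # homogeneous coordinates
    Q = np.outer(plane, plane)                 # 4x4 fundamental error quadric
    return float(v @ Q @ v)


vertices = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
faces = [(0, 1, 2)]

plane = face_plane(vertices, faces[0])
print(plane)                                               # plane z = 1, i.e. (0, 0, 1, -1)
print(quadric_error(plane, np.array([0.0, 0.0, 3.0])))     # 4.0
```

The triangle lies in the plane $z=1$, which does not pass through the origin, so the offset $d=-1$ is needed in addition to the normal.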

Stage 3: mesh refinement

The base mesh $M_b$ lacks surface details.

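GCNN mesh deformation

A graph-based convolutional network (GCNN) deforms the mesh. Each layer updates the feature $h_{p}^{l}$ of vertex $p$ from its own feature and those of its neighbors $\mathcal{N}(p)$:
$$h_{p}^{l+1}=w_{0} h_{p}^{l}+\sum_{q \in \mathcal{N}(p)} w_{1} h_{q}^{l}$$
See Pixel2Mesh for details on graph convolution and on how VGG16 is combined with it.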
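In matrix form the layer update is a single expression. The sketch below stores one feature row per vertex and applies the weights on the right, so the update reads $H^{l+1} = H^{l} W_0 + A H^{l} W_1$ with $A$ the adjacency matrix; the tiny graph and scalar weights are made up for illustration, and no nonlinearity is applied because the formula in the notes has none.

```python
import numpy as np


def graph_conv(H, adjacency, W0, W1):
    """Per vertex p: W0 applied to h_p plus W1 applied to the sum of neighbor features."""
    return H @ W0 + adjacency @ H @ W1


# Chain graph 0 - 1 - 2 with one scalar feature per vertex.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
H = np.array([[1.0], [2.0], [3.0]])
W0 = np.array([[2.0]])   # weight on the vertex's own feature
W1 = np.array([[1.0]])   # weight shared by all neighbors

out = graph_conv(H, adjacency, W0, W1)
print(out.ravel())  # [4. 8. 8.]
```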

[Paper notes] [CVPR 2019] A Skeleton-bridged DL Approach for Generating Meshes of Complex Topo from 1 RGB Img
