Vision GNN: An Image Is Worth a Graph of Nodes

Paper: Vision GNN: An Image is Worth Graph of Nodes (official PyTorch code available)

This article covers a general-purpose GNN vision model, a valuable exploration of universal vision backbones by researchers from the University of Chinese Academy of Sciences and Huawei Noah's Ark Lab, Beijing.

1. Background and Motivation

In modern computer vision, general-purpose vision models were initially dominated by CNNs. Recent progress on new backbones, represented by Vision Transformers and Vision MLPs, has pushed universal vision models to unprecedented heights.

Different backbones process the input image in different ways. The figure below shows the grid, sequence, and graph representations of an image. Image data is usually represented as a regular pixel grid in Euclidean space; CNNs introduce translation invariance and locality by sliding windows over the image. Newer backbones such as Vision Transformer and Vision MLP instead treat the image as a sequence of patches, e.g. a 224×224 image is typically split into 196 patches of size 16×16.
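For concreteness, the patch arithmetic above can be sketched in a few lines of numpy (a toy illustration with made-up variable names, not the paper's code):

```python
import numpy as np

image = np.random.rand(224, 224, 3)  # H x W x C dummy image
patch = 16
n_side = 224 // patch                # 14 patches per side
# split into non-overlapping 16x16 patches
patches = image.reshape(n_side, patch, n_side, patch, 3)
patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch, patch, 3)
print(patches.shape)  # (196, 16, 16, 3): 14 * 14 = 196 patches
```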

However, in both the grid and the sequence representation, the image is modeled in a rigid way: the "connections" between patches are fixed in advance. For example, the head of the fish in Figure 1 may span several patches; under a grid or sequence representation these patches have no special relation to one another, even though semantically they all depict the fish head. This is arguably the imperfection of traditional image modeling.

2. Approach

This paper proposes a more flexible way to handle images. A fundamental task of computer vision is recognizing objects in an image. Since objects are usually not regular rectangles, the classic grid or sequence representation is redundant and inflexible when processing them. An object can instead be viewed as a composition of parts: a person, for instance, can be roughly decomposed into head, upper body, arms, and legs, and these joint-connected parts naturally form a graph.

In the grid representation, pixels or patches are ordered only by spatial position. In the sequence representation, the 2D image is cut into a series of patches. In the graph representation, nodes are linked by their content, unconstrained by local position. Grid and sequence representations can both be viewed as special cases of the graph representation, so treating an image as a graph is more flexible and effective than either.

Building on this view, the paper proposes ViG, a new general-purpose vision architecture based on the graph representation. The input image is split into patches and each patch is treated as a node in a graph. After constructing the graph representation of the input image, the ViG model exchanges information among all nodes. A ViG block consists of two parts: a GCN (graph convolutional network) module for graph information processing and an FFN (feed-forward network) module for node feature transformation. The method's effectiveness is demonstrated on vision tasks such as image recognition and object detection.
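A minimal sketch of how such a patch graph could be built, assuming a simple k-nearest-neighbor rule in feature space (all names here are illustrative; the actual edge construction lives inside the repo's DyGraphConv2d):

```python
import numpy as np

def knn_graph(features, k):
    """features: (N, D) node features; returns (N, k) neighbor indices."""
    # pairwise squared Euclidean distances between all nodes
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-loops
    return np.argsort(d2, axis=1)[:, :k]  # k closest nodes per node

nodes = np.random.rand(196, 48)  # 196 patch embeddings, 48-dim each
edges = knn_graph(nodes, k=9)    # k=9, matching ViG's default kernel_size
print(edges.shape)  # (196, 9)
```

Because edges are chosen by feature similarity rather than spatial adjacency, patches that all depict the "fish head" can be linked even when they are far apart on the grid.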

3. Method

3.1 Representing an image as a graph

Since the author cannot typeset formulas in AI Studio's Markdown, the principles are shown as images, as in the figure below.
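As a rough stand-in for the missing formulas, the image-to-graph step can be written as follows (a hedged reconstruction of the paper's notation, not a verbatim copy):

```latex
% N patches, each embedded as a feature vector x_i \in \mathbb{R}^D
\mathcal{V} = \{x_1, x_2, \dots, x_N\} \quad \text{(nodes: one per patch)}
\qquad
\mathcal{E} = \{(i, j) \mid x_j \in \mathrm{KNN}(x_i)\} \quad \text{(edges: k nearest neighbors in feature space)}

% A graph convolution then aggregates neighbor features and updates each node:
\mathcal{G}' = F(\mathcal{G}, \mathcal{W})
             = \mathrm{Update}\big(\mathrm{Aggregate}(\mathcal{G}, W_{\mathrm{agg}}), \, W_{\mathrm{update}}\big)
```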

3.2 A graph convolutional network as the backbone

3.3 Structure of each GNN block

4. Model architecture

Since this article only reproduces the ViG-S model, which has a pyramid structure, we show the structure of ViG-S together with its other variants here.

Enough theory; let's get to the code!

This article reproduces the ViG-S model based on PaddleViT; readers interested in PaddleViT can browse it on GitHub.

Code location: PaddleViT/image_classification/VIG

5. Code walkthrough of ViG's core blocks

The core of ViG is the graph neural network block; below is its Paddle implementation.

class GCN_block(nn.Layer):
    def __init__(self, in_channels, kernel_size=9, dilation=1, conv='edge',
                 act='relu', norm=None, bias=True, stochastic=False,
                 epsilon=0.0, r=1, n=196, drop_path=0.0, relative_pos=False):
        super().__init__()
        self.channels = in_channels
        self.n = n  # number of nodes
        self.r = r
        self.fc1 = nn.Sequential(
            nn.Conv2D(in_channels, in_channels, 1, stride=1, padding=0),
            nn.BatchNorm2D(in_channels),
        )  # input projection
        self.graph_conv = DyGraphConv2d(in_channels, in_channels * 2,
                                        kernel_size, dilation, conv, act,
                                        norm, bias, stochastic, epsilon, r)  # graph convolution
        self.fc2 = nn.Sequential(
            nn.Conv2D(in_channels * 2, in_channels, 1, stride=1, padding=0),
            nn.BatchNorm2D(in_channels),
        )  # output projection
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()  # DropPath, as in ViT
        self.relative_pos = None
        if relative_pos:  # whether to use relative positional relations
            print('using relative_pos')
            relative_pos_tensor = paddle.to_tensor(np.float32(
                get_2d_relative_pos_embed(in_channels, int(n ** 0.5)))
            ).unsqueeze(0).unsqueeze(1)
            relative_pos_tensor = F.interpolate(
                relative_pos_tensor, size=(n, n // (r * r)),
                mode='bicubic', align_corners=False)
            self.relative_pos = add_parameter(self, -relative_pos_tensor.squeeze(1))
            self.relative_pos.stop_gradient = True

    def _get_relative_pos(self, relative_pos, H, W):
        """Get each node's relative position within the whole graph."""
        if relative_pos is None or H * W == self.n:
            return relative_pos
        else:
            N = H * W
            N_reduced = N // (self.r * self.r)
            return F.interpolate(relative_pos.unsqueeze(0),
                                 size=(N, N_reduced), mode="bicubic").squeeze(0)

    def forward(self, x):
        """Forward pass of the GNN block; compare with the principles above."""
        _tmp = x
        x = self.fc1(x)
        B, C, H, W = x.shape
        relative_pos = self._get_relative_pos(self.relative_pos, H, W)
        x = self.graph_conv(x, relative_pos)
        x = self.fc2(x)
        x = self.drop_path(x) + _tmp
        return x
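The heavy lifting above happens inside DyGraphConv2d, which is defined elsewhere in the repo. As a self-contained illustration of the "mr" (max-relative) graph convolution that ViG's config selects, here is a hedged numpy sketch; the function name, random neighbors, and plain matrix projection are simplified stand-ins, not the repo's actual code:

```python
import numpy as np

def max_relative_conv(x, neighbors, weight):
    """Max-relative graph convolution (simplified).
    x: (N, D) node features; neighbors: (N, k) neighbor indices;
    weight: (2*D, D_out) projection applied after aggregation."""
    # for each node i, element-wise max of (x_j - x_i) over its neighbors j
    diff = x[neighbors] - x[:, None, :]      # (N, k, D)
    agg = diff.max(axis=1)                   # (N, D)
    out = np.concatenate([x, agg], axis=-1)  # (N, 2*D), node + aggregated
    return out @ weight                      # (N, D_out)

x = np.random.rand(196, 48)                      # 196 nodes, 48-dim
nbrs = np.random.randint(0, 196, size=(196, 9))  # dummy k=9 neighbor lists
w = np.random.rand(96, 48)
y = max_relative_conv(x, nbrs, w)
print(y.shape)  # (196, 48)
```

Note how the output concatenates each node's own feature with the aggregated relative feature before projecting; this mirrors the doubling from in_channels to in_channels * 2 between fc1 and fc2 in the block above.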

Next comes the full ViG network. Apart from the graph neural network blocks, the rest is similar to a Vision Transformer.

Each individual module can be found under PaddleViT/image_classification/VIG.

class DeepGCN(nn.Layer):
    def __init__(self, layers, k=9, conv='mr', act='gelu', norm='batch',
                 bias=True, dropout=0.0, use_dilation=True, epsilon=0.2,
                 use_stochastic=False, drop_path=0.0,
                 channels=[48, 96, 240, 384], n_classes=1000,
                 emb_dims=1024, **kwargs):
        super().__init__()
        self.n_blocks = sum(layers)
        dpr = [x.item() for x in paddle.linspace(0, drop_path, self.n_blocks)]
        num_knn = [int(x.item()) for x in paddle.linspace(k, k, self.n_blocks)]
        max_dilation = 49 // max(num_knn)
        reduce_ratios = [4, 2, 1, 1]
        self.stem = Stem(out_dim=channels[0], act=act)
        self.pos_embed = add_parameter(
            self, paddle.zeros((1, channels[0], 224 // 4, 224 // 4)))
        HW = 224 // 4 * 224 // 4
        self.backbone = []
        idx = 0
        for i in range(len(layers)):
            if i > 0:
                self.backbone.append(Downsample(channels[i - 1], channels[i]))
                HW = HW // 4  # each downsample quarters the number of nodes
            for j in range(layers[i]):
                self.backbone += [nn.Sequential(
                    GCN_block(channels[i], num_knn[idx],
                              min(idx // 4 + 1, max_dilation), conv, act,
                              norm, bias, use_stochastic, epsilon,
                              reduce_ratios[i], n=HW, drop_path=dpr[idx],
                              relative_pos=True),
                    FFN(channels[i], channels[i] * 4, act=act,
                        drop_path=dpr[idx]))]
                idx += 1
        self.backbone = nn.Sequential(*self.backbone)
        self.prediction = nn.Sequential(
            nn.Conv2D(channels[-1], 1024, 1),
            nn.BatchNorm2D(1024),
            act_layer(act),
            nn.Dropout(dropout),
            nn.Conv2D(1024, n_classes, 1))
        self.apply(self.cls_init_weights)

    def cls_init_weights(self, m):
        if isinstance(m, nn.Conv2D):
            kaiming(m.weight)
        if isinstance(m, nn.Conv2D) and m.bias is not None:
            zeros_(m.bias)

    def forward(self, inputs):
        x = self.stem(inputs) + self.pos_embed
        B, C, H, W = x.shape
        for i in range(len(self.backbone)):
            x = self.backbone[i](x)
        x = F.adaptive_avg_pool2d(x, 1)
        return self.prediction(x).squeeze(-1).squeeze(-1)
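To make the pyramid concrete, here is a small hedged calculation of each stage's geometry for ViG-S, using the LAYERS [2, 2, 6, 2] and CHANNELS [80, 160, 400, 640] reported in the vig_s.yaml config dump in the evaluation logs further below (the stem downsamples by 4x, and each Downsample halves the spatial side, i.e. HW // 4 in the code above):

```python
# Stage geometry of the ViG-S pyramid (illustrative calculation only).
layers = [2, 2, 6, 2]
channels = [80, 160, 400, 640]
side = 224 // 4  # stem reduces 224x224 to 56x56
sides = []
for i, (depth, c) in enumerate(zip(layers, channels)):
    if i > 0:
        side //= 2  # each Downsample halves H and W (node count // 4)
    sides.append(side)
    print(f"stage {i}: {depth} blocks, {c} channels, "
          f"{side}x{side} = {side * side} nodes")
```

So the graph shrinks from 3136 nodes in stage 0 down to 49 nodes in stage 3 while the channel width grows, the usual pyramid trade-off.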

6. Validation accuracy on ImageNet

After all this, what ultimately matters is model performance, so we evaluate accuracy on the ImageNet validation set. The code is as follows.

# Run once to extract the validation set
%cd data/
!tar -xf data105740/ILSVRC2012_val.tar
%cd /home/aistudio/PaddleViT/image_classification/VIG/
!pip install yacs pyyaml
!python -m paddle.distributed.launch --gpus 0 main_multi_gpu.py -cfg='./configs/vig_s.yaml' -dataset='imagenet2012' -batch_size=256 -data_path='/home/aistudio/data/ILSVRC2012_val' -pretrained='./vig_s.pdparams' -eval -amp
LAUNCH INFO 2022-07-02 23:17:43,153 -----------  Configuration  ----------------------
LAUNCH INFO 2022-07-02 23:17:43,153 devices: None
LAUNCH INFO 2022-07-02 23:17:43,154 elastic_level: -1
LAUNCH INFO 2022-07-02 23:17:43,154 elastic_timeout: 30
LAUNCH INFO 2022-07-02 23:17:43,154 gloo_port: 6767
LAUNCH INFO 2022-07-02 23:17:43,154 host: None
LAUNCH INFO 2022-07-02 23:17:43,154 job_id: default
LAUNCH INFO 2022-07-02 23:17:43,154 legacy: False
LAUNCH INFO 2022-07-02 23:17:43,154 log_dir: log
LAUNCH INFO 2022-07-02 23:17:43,154 log_level: INFO
LAUNCH INFO 2022-07-02 23:17:43,154 master: None
LAUNCH INFO 2022-07-02 23:17:43,154 max_restart: 3
LAUNCH INFO 2022-07-02 23:17:43,154 nnodes: 1
LAUNCH INFO 2022-07-02 23:17:43,154 nproc_per_node: None
LAUNCH INFO 2022-07-02 23:17:43,154 rank: -1
LAUNCH INFO 2022-07-02 23:17:43,154 run_mode: collective
LAUNCH INFO 2022-07-02 23:17:43,154 server_num: None
LAUNCH INFO 2022-07-02 23:17:43,154 servers:
LAUNCH INFO 2022-07-02 23:17:43,154 trainer_num: None
LAUNCH INFO 2022-07-02 23:17:43,154 trainers:
LAUNCH INFO 2022-07-02 23:17:43,154 training_script: 0
LAUNCH INFO 2022-07-02 23:17:43,154 training_script_args: ['main_multi_gpu.py', '-cfg=./configs/vig_s.yaml', '-dataset=imagenet2012', '-batch_size=256', '-data_path=/home/aistudio/data/ILSVRC2012_val', '-pretrained=./vig_s.pdparams', '-eval', '-amp']
LAUNCH INFO 2022-07-02 23:17:43,154 with_gloo: 0
LAUNCH INFO 2022-07-02 23:17:43,154 --------------------------------------------------
LAUNCH WARNING 2022-07-02 23:17:43,154 Compatible mode enable with args ['--gpus']
-----------  Configuration Arguments -----------
backend: auto
cluster_topo_path: None
elastic_pre_hook: None
elastic_server: None
enable_auto_mapping: False
force: False
gpus: 0
heter_devices:
heter_worker_num: None
heter_workers:
host: None
http_port: None
ips: 127.0.0.1
job_id: None
log_dir: log
np: None
nproc_per_node: None
rank_mapping_path: None
run_mode: None
scale: 0
server_num: None
servers:
training_script: main_multi_gpu.py
training_script_args: ['-cfg=./configs/vig_s.yaml', '-dataset=imagenet2012', '-batch_size=256', '-data_path=/home/aistudio/data/ILSVRC2012_val', '-pretrained=./vig_s.pdparams', '-eval', '-amp']
worker_num: None
workers:
------------------------------------------------
WARNING 2022-07-02 23:17:43,155 launch.py:519] Not found distinct arguments and compiled with cuda or xpu or npu or mlu. Default use collective mode
launch train in GPU mode!
INFO 2022-07-02 23:17:43,157 launch_utils.py:561] Local start 1 processes. First process distributed environment info (Only For Debug):
+=======================================================================================+
|                        Distributed Envs                      Value                    |
+---------------------------------------------------------------------------------------+
|                       PADDLE_TRAINER_ID                        0                      |
|                 PADDLE_CURRENT_ENDPOINT                 127.0.0.1:33289               |
|                     PADDLE_TRAINERS_NUM                        1                      |
|                PADDLE_TRAINER_ENDPOINTS                 127.0.0.1:33289               |
|                     PADDLE_RANK_IN_NODE                        0                      |
|                 PADDLE_LOCAL_DEVICE_IDS                        0                      |
|                 PADDLE_WORLD_DEVICE_IDS                        0                      |
|                     FLAGS_selected_gpus                        0                      |
|             FLAGS_selected_accelerators                        0                      |
+=======================================================================================+
INFO 2022-07-02 23:17:43,157 launch_utils.py:566] details about PADDLE_TRAINER_ENDPOINTS can be found in log/endpoints.log, and detail running logs maybe found in log/workerlog.0
launch proc_id:4013 idx:0
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  from collections import Sized
----- Imagenet2012 val_list.txt len = 50000
INFO 2022-07-02 23:17:45,105 cloud_utils.py:122] get cluster from args:job_server:None pods:['rank:0 id:None addr:127.0.0.1 port:None visible_gpu:[] trainers:["gpu:[\'0\'] endpoint:127.0.0.1:35773 rank:0"]'] job_stage_flag:None hdfs:None
2022-07-02 23:17:47,117 MASTER_LOG ----- world_size = 1, local_rank = 0
----- AMP: False
BASE: ['']
DATA:
  BATCH_SIZE: 256
  BATCH_SIZE_EVAL: 256
  CROP_PCT: 0.9
  DATASET: imagenet2012
  DATA_PATH: /home/aistudio/data/ILSVRC2012_val
  IMAGENET_MEAN: [0.485, 0.456, 0.406]
  IMAGENET_STD: [0.229, 0.224, 0.225]
  IMAGE_CHANNELS: 3
  IMAGE_SIZE: 224
  NUM_WORKERS: 1
EVAL: True
MODEL:
  ATTENTION_DROPOUT: 0.0
  CHANNELS: [80, 160, 400, 640]
  DOWNSAMPLES: [True, True, True, True]
  DROPOUT: 0.0
  DROPPATH: 0.1
  EMBED_DIMS: 1024
  LAYERS: [2, 2, 6, 2]
  LAYER_SCALE_INIT_VALUE: 1e-05
  MLP_RATIOS: [4, 4, 4, 4]
  NAME: DeepGCN_s
  NUM_CLASSES: 1000
  PRETRAINED: ./vig_s.pdparams
  RESUME: None
  TYPE: DeepGCN
REPORT_FREQ: 20
SAVE: ./output/eval-20220702-23-17
SAVE_FREQ: 10
SEED: 0
TRAIN:
  ACCUM_ITER: 1
  AUTO_AUGMENT: False
  BASE_LR: 0.001
  COLOR_JITTER: 0.4
  CUTMIX_ALPHA: 1.0
  CUTMIX_MINMAX: None
  END_LR: 1e-05
  GRAD_CLIP: None
  LAST_EPOCH: 0
  LINEAR_SCALED_LR: 1024
  MIXUP_ALPHA: 0.8
  MIXUP_MODE: batch
  MIXUP_PROB: 1.0
  MIXUP_SWITCH_PROB: 0.5
  MODEL_EMA: False
  MODEL_EMA_DECAY: 0.99996
  MODEL_EMA_FORCE_CPU: True
  NUM_EPOCHS: 300
  OPTIMIZER:
    BETAS: (0.9, 0.999)
    EPS: 1e-08
    NAME: AdamW
  RANDOM_ERASE_COUNT: 1
  RANDOM_ERASE_MODE: pixel
  RANDOM_ERASE_PROB: 0.25
  RANDOM_ERASE_SPLIT: False
  RAND_AUGMENT: True
  RAND_AUGMENT_LAYERS: 2
  RAND_AUGMENT_MAGNITUDE: 9
  SMOOTHING: 0.1
  WARMUP_EPOCHS: 5
  WARMUP_START_LR: 1e-06
  WEIGHT_DECAY: 0.05
VALIDATE_FREQ: 1
W0702 23:17:47.117699  4028 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
W0702 23:17:47.121506  4028 gpu_context.cc:306] device: 0, cuDNN Version: 7.6.
using relative_pos
2022-07-02 23:17:51,458 MASTER_LOG ----- Total # of val batch (single gpu): 196
2022-07-02 23:17:51,737 MASTER_LOG ----- Pretrained: Load model state from ./vig_s.pdparams
2022-07-02 23:17:51,737 MASTER_LOG ----- Start Validation
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/distributed/parallel.py:158: UserWarning: Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/parallel.py:631: UserWarning: The program will return to single-card operation. Please check 1, whether you use spawn or fleetrun to start the program. 2, Whether it is a multi-card program. 3, Is the current environment multi-card.
2022-07-02 23:17:57,053 MASTER_LOG Step[0000/0196], Avg Loss: 1.0437, Avg Acc@1: 0.7734, Avg Acc@5: 0.9414
2022-07-02 23:18:45,671 MASTER_LOG Step[0020/0196], Avg Loss: 0.9593, Avg Acc@1: 0.8013, Avg Acc@5: 0.9526
2022-07-02 23:19:33,569 MASTER_LOG Step[0040/0196], Avg Loss: 0.9620, Avg Acc@1: 0.8043, Avg Acc@5: 0.9524
2022-07-02 23:20:23,634 MASTER_LOG Step[0060/0196], Avg Loss: 0.9678, Avg Acc@1: 0.8021, Avg Acc@5: 0.9513
2022-07-02 23:21:14,096 MASTER_LOG Step[0080/0196], Avg Loss: 0.9643, Avg Acc@1: 0.8029, Avg Acc@5: 0.9518
2022-07-02 23:22:03,238 MASTER_LOG Step[0100/0196], Avg Loss: 0.9651, Avg Acc@1: 0.8027, Avg Acc@5: 0.9516
2022-07-02 23:22:53,329 MASTER_LOG Step[0120/0196], Avg Loss: 0.9654, Avg Acc@1: 0.8034, Avg Acc@5: 0.9512
2022-07-02 23:23:43,297 MASTER_LOG Step[0140/0196], Avg Loss: 0.9659, Avg Acc@1: 0.8028, Avg Acc@5: 0.9513
2022-07-02 23:24:33,482 MASTER_LOG Step[0160/0196], Avg Loss: 0.9675, Avg Acc@1: 0.8027, Avg Acc@5: 0.9509
2022-07-02 23:25:23,225 MASTER_LOG Step[0180/0196], Avg Loss: 0.9677, Avg Acc@1: 0.8025, Avg Acc@5: 0.9509
/opt/conda/envs/python35-paddle120-env/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
2022-07-02 23:25:59,995 MASTER_LOG ----- Validation: Validation Loss: 0.9671, Validation Acc@1: 0.8030, Validation Acc@5: 0.9511, time: 488.26
INFO 2022-07-02 23:26:01,674 launch.py:402] Local processes completed.

From the validation results above, the ViG-S model reaches Validation Acc@1: 0.8030 and Validation Acc@5: 0.9511 on ImageNet.

There is still a gap to the official 82.1%, but the overall performance is solid.

7. Training code

Of course, evaluation code alone is not enough, so here is the training command. Since the full ImageNet dataset is very large, only an example is provided; to actually train, prepare the ImageNet dataset in the layout the code expects.

!python -m paddle.distributed.launch --gpus 0 main_multi_gpu.py -cfg='./configs/vig_s.yaml' -dataset='imagenet2012' -batch_size=256 -data_path='/home/aistudio/data/ILSVRC2012_val' -pretrained='./vig_s.pdparams' -amp

8. Summary

General-purpose vision models usually process image information as sequences or grids; the authors innovatively propose treating an image as a graph. A fundamental task of computer vision is recognizing objects in an image, and since objects are usually not regular rectangles, the classic grid or sequence representations are redundant and inflexible. This paper proposes ViG, a new graph-based universal vision architecture: the input image is split into patches, each patch becomes a node, and building a graph over these nodes better represents irregular, complex objects.

After constructing the graph representation of the input image, the ViG model exchanges information among all nodes. A ViG block has two parts: a GCN (graph convolutional network) module for graph information processing and an FFN (feed-forward network) module for node feature transformation. Applying graph convolution directly on the image graph suffers from over-smoothing and performs poorly, so the authors introduce more feature transformations within each node to promote feature diversity. The method's effectiveness is demonstrated on vision tasks such as image recognition and object detection.

Some personal reflections

1. Whether the graph structure is actually superior to self-attention remains open to discussion.

2. It is a novel and original application of graph neural networks to feature representation.

References

图神经网络试图打入CV主流?中科大华为等联合开源ViG:首次用于视觉任务的GNN (in Chinese: "Are GNNs breaking into mainstream CV? ViG, jointly open-sourced by CAS, Huawei, et al.: the first GNN for vision tasks")

Open-source link: original project at https://aistudio.baidu.com/aistudio/projectdetail/4288323
