CVPR 2018


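  1. This year 3,309 papers were submitted and 979 were accepted, an acceptance rate of about 29.6%.
  2. Accepted papers are divided into three tiers according to reviewer scores: Poster, Spotlight, and Oral.
  3. There are 224 Spotlight papers, 22.88% of all accepted papers (224/979).
  4. There are 70 Oral papers, 7.15% of all accepted papers (70/979).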
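
The percentages above can be checked with a few lines of Python. This is only a quick sketch: the variable names and the `pct` helper are made up for illustration, and the poster count assumes the three tiers do not overlap.

```python
# Counts taken from the statistics listed above.
submitted = 3309
accepted = 979
spotlight = 224
oral = 70


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


acceptance_rate = pct(accepted, submitted)  # accepted / submitted
spotlight_share = pct(spotlight, accepted)  # spotlight / accepted
oral_share = pct(oral, accepted)            # oral / accepted

# Assumption: every accepted paper is exactly one of Poster, Spotlight, Oral.
poster = accepted - spotlight - oral

print(f"acceptance rate: {acceptance_rate}%")
print(f"spotlight share: {spotlight_share}%")
print(f"oral share:      {oral_share}%")
print(f"posters:         {poster}")
```

Under that assumption, the remaining 685 accepted papers are posters.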


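So getting a paper into CVPR is hard, getting a spotlight is harder, and getting an oral is extremely hard. Put it this way: this year, all the universities in China combined had only a single-digit number of CVPR orals.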

Of course, the most prestigious awards are Best Paper and Best Student Paper, and only one paper is selected for each.

This year's best paper went to a collaboration between Stanford and Berkeley, titled:

Taskonomy: Disentangling Task Transfer Learning



The best student paper was:

Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies



Honorable mentions:

| Title | First affiliation | Download link |
| --- | --- | --- |
| Deep Learning of Graph Matching | Lund University | /content_cvpr_2018 /CameraReady/1830.pdf |
| SPLATNet: Sparse Lattice Networks for Point Cloud Processing | UMass Amherst | |
| CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM | Imperial College London | |
| Efficient Optimization for Rank-based Loss Functions | IIIT Hyderabad | |

Best paper (2 papers) > honorable mention (4 papers) > Oral (70 papers) > Spotlight (224 papers) > Poster (the rest)


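  1. Start with high-quality papers; at the very least, read spotlight and oral papers first.
  2. Look for papers in your own field, and don't try to master all of CVPR. If you are already a CVPR-oral-level expert, pretend I never said this.
  3. Go to CVPR paper-sharing sessions wherever they are held. Listening to the original authors talk for an hour is more useful than reading on your own for a week. If you can't attend in person, watching the video is also good.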

Citation counts are another indicator of paper quality. As of March 2019, the CVPR 2018 papers with the most Google Scholar citations were:

| Rank | Title | Citations | Level |
| --- | --- | --- | --- |
| 1 | Squeeze-and-Excitation Networks | 554 | Oral |
| 2 | Learning Transferable Architectures for Scalable Image Recognition | 335 | Spotlight |
| 3 | ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | 332 | Poster |
| 4 | MobileNetV2: Inverted Residuals and Linear Bottlenecks | 256 | Poster |
| 5 | Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering | 227 | Oral |



Oral papers:

1 DensePose: Multi-Person Dense Human Pose Estimation In The Wild
2 Context Encoding for Semantic Segmentation
3 Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation
4 Semi-parametric Image Synthesis
5 Practical Block-wise Neural Network Architecture Generation
6 Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
7 PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
8 Illuminant Spectra-based Source Separation Using Flash Photography
9 SPLATNet: Sparse Lattice Networks for Point Cloud Processing
10 Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
11 Deep Layer Aggregation
12 Left-Right Comparative Recurrent Model for Stereo Matching
13 Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input
14 An Analysis of Scale Invariance in Object Detection - SNIP
15 Finding Tiny Faces in the Wild with Generative Adversarial Network
16 Taskonomy: Disentangling Task Transfer Learning
17 High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
18 Finding “It”: Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video
19 Unsupervised Discovery of Object Landmarks as Structural Representations
20 Rotation Averaging and Strong Duality
21 Im2Flow: Motion Hallucination from Static Images for Action Recognition
22 Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification
23 3D-RCNN: Instance-level 3D Scene Understanding via Render-and-Compare
24 Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
25 Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation
26 Squeeze-and-Excitation Networks
27 DoubleFusion: Real-time Capture of Human Performance with Inner Body Shape from a Single Depth Sensor
28 Learning to Find Good Correspondences
29 Actor and Action Video Segmentation from a Sentence
30 Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
31 Detail-Preserving Pooling in Deep Networks
32 Convolutional Neural Networks with Alternately Updated Clique
33 Deep Learning of Graph Matching
34 Synthesizing Images of Humans in Unseen Poses
35 Neural Inverse Kinematics for Unsupervised Motion Retargetting
36 Direction-aware Spatial Context Features for Shadow Detection
37 Density Adaptive Point Set Registration
38 Hybrid Camera Pose Estimation
39 Relation Networks for Object Detection
40 Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects
41 Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View
42 Polarimetric Dense Monocular SLAM
43 Wasserstein Introspective Neural Networks
44 The Perception-Distortion Tradeoff
45 Discriminative Learning of Latent Features for Zero-Shot Recognition
46 Photometric Stereo in Participating Media Considering Shape-Dependent Forward Scatter
47 Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
48 Trapping Light for Time of Flight
49 Feature Space Transfer for Data Augmentation
50 Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250Hz
51 CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM
52 FlipDial: A Generative Model for Two-Way Visual Dialogue
53 OATM: Occlusion Aware Template Matching by Consensus Set Maximization
54 Surface Networks
55 VirtualHome: Simulating Household Activities via Programs
56 Egocentric Activity Recognition on a Budget
57 Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
58 Efficient Optimization for Rank-based Loss Functions
59 MakeupGAN: Makeup Transfer via Cycle-Consistent Adversarial Networks
60 Revisiting Deep Intrinsic Image Decompositions
61 StarGAN: Unified Generative Adversarial Networks for Controllable Multi-Domain Image-to-Image Translation
62 Ordinal Depth Supervision for 3D Human Pose Estimation
63 Multi-Cell Classification by Convolutional Dictionary Learning with Class Proportion Priors
64 Accurate and Diverse Sampling of Sequences based on a "Best of Many" Sample Objective
65 MapNet: An Allocentric Spatial Memory for Mapping Environments
66 A Globally Optimal Solution to the Non-Minimal Relative Pose Problem
67 A Volumetric Descriptive Network for 3D Object Synthesis
68 Learning Face Age Progression: A Pyramid Architecture of GANs



