Abstract

The Segment Anything (SA) project introduces a new task, model, and dataset for image segmentation.

We built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy-respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. The Segment Anything Model (SAM) and the corresponding dataset (SA-1B) are being released as part of the SA project to foster research into foundation models for computer vision.

Introduction

Large language models pre-trained on web-scale datasets are revolutionizing NLP with strong zero-shot and few-shot generalization. These "foundation models" can generalize to tasks and data distributions beyond those seen during training.

Foundation models have also been explored in computer vision, albeit to a lesser extent.

Our goal is to build a foundation model for image segmentation. That is, we seek to develop a promptable model and pre-train it on a broad dataset using a task that enables powerful generalization. With this model, we aim to solve a range of downstream segmentation problems on new data distributions using prompt engineering. 

The success of this plan hinges on three components: task, model, and data. To develop them, we address the following questions about image segmentation:

1. What task will enable zero-shot generalization?

2. What is the corresponding model architecture?

3. What data can power this task and model?

These questions are entangled and require a comprehensive solution.

Surprisingly, we find that a simple design satisfies all three constraints: a powerful image encoder computes an image embedding, a prompt encoder embeds prompts, and then the two information sources are combined in a lightweight mask decoder that predicts segmentation masks. We refer to this model as the Segment Anything Model, or SAM.
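The data flow between the three components can be sketched in a few lines of Python. This is only a toy illustration under heavy assumptions, not SAM's actual networks or API: the function names `image_encoder`, `prompt_encoder`, `mask_decoder`, and `segment` are hypothetical, the "embedding" is just the pixel intensity, the only prompt type is a single point, and the "decoder" is a simple intensity threshold.

```python
from typing import List, Tuple

Image = List[List[float]]   # toy grayscale image, values in [0, 1]
Mask = List[List[bool]]
Point = Tuple[int, int]     # (row, col) point prompt


def image_encoder(image: Image) -> Image:
    """Stand-in for the heavy image encoder (assumption: identity features)."""
    return [[float(v) for v in row] for row in image]


def prompt_encoder(point: Point) -> Point:
    """Stand-in for the prompt encoder (assumption: point prompts only)."""
    row, col = point
    return (int(row), int(col))


def mask_decoder(embedding: Image, prompt: Point, tol: float = 0.1) -> Mask:
    """Stand-in for the lightweight decoder: combine embedding and prompt.

    Marks every pixel whose feature is within `tol` of the feature
    at the prompted point.
    """
    r, c = prompt
    ref = embedding[r][c]
    return [[abs(v - ref) <= tol for v in row] for row in embedding]


def segment(image: Image, points: List[Point]) -> List[Mask]:
    """Encode the image once, then decode one mask per prompt."""
    embedding = image_encoder(image)
    return [mask_decoder(embedding, prompt_encoder(p)) for p in points]


if __name__ == "__main__":
    image = [
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
    ]
    dark_mask, bright_mask = segment(image, [(0, 0), (1, 3)])
    print(dark_mask)
    print(bright_mask)
```

In this sketch the image embedding is computed once and reused for every prompt, so only the small decoder step runs per prompt. That is one plausible reading of why the design pairs a powerful encoder with a lightweight decoder.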

The data engine has three stages:

  1. assisted-manual
  2. semi-automatic
  3. fully automatic

2. Segment Anything Task
