I. Useful links

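  • Dataset home page: http://cocodataset.org/#home

  • Dataset downloads:

    • The official links can be downloaded with Xunlei (Thunder), which is usually fast. If it is slow, you may need to find the "right version" of Xunlei.
    • You can also use this download site set up by a high-school student: http://bendfunction.f3322.net:666/share/. Its home page is http://bendfunction.f3322.net:666/
    • https://pjreddie.com/projects/coco-mirror/
  • Dataset format description: http://cocodataset.org/#format-data. This post is based on it.

  • Important links
    While studying, I found some more detailed and complete write-ups, recorded here for reference:
    COCO dataset annotation format (Zhihu column)
    Understanding the COCO dataset format (CSDN blog)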

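II. Dataset overview

COCO is a large, richly annotated dataset for object detection, segmentation, and captioning. It targets scene understanding: the images are mostly taken from complex everyday scenes, and objects are localized with precise segmentation masks. The original paper reports 91 object categories, 328,000 images, and 2,500,000 labeled instances. The official site lists 80 object categories and more than 330,000 images, of which over 200,000 are labeled, with more than 1.5 million object instances in total. At the time of writing it was the largest dataset with semantic segmentation annotations.

The MS COCO dataset comes in many splits. As of June 26, 2019, they are: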

2014 Train/Val: Detection 2015, Captioning 2015, Detection 2016, Keypoints 2016
2014 Testing: Captioning 2015
2015 Testing: Detection 2015, Detection 2016, Keypoints 2016
2017 Train/Val/Test: Detection 2017, Keypoints 2017, Stuff 2017, Detection 2018, Keypoints 2018, Stuff 2018, Panoptic 2018
2017 Unlabeled: [optional data for any competition]

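Which split should you download?

If you care about the 2017 or 2018 tasks, you only need to download the 2017 images and can ignore the other splits.

COCO 2017 contains the following files

Dataset format

COCO has five annotation types: object detection, keypoint detection, stuff segmentation, panoptic segmentation, and image captioning. Each corresponds to a JSON file. Every JSON file is one large dictionary containing the following keys: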

{
"info" : info,
"images" : [image],
"annotations" : [annotation],
"licenses" : [license],
}
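As a minimal sketch, the top-level structure can be inspected with Python's standard json module. The sample string and the helper name summarize_coco are made up for illustration; a real annotation file would be read with json.load instead.

```python
import json

# A tiny hand-written stand-in for a COCO annotation file (not real data).
SAMPLE = """
{
  "info": {"year": 2017, "version": "1.0", "description": "toy example"},
  "images": [{"id": 1, "width": 640, "height": 480, "file_name": "a.jpg"}],
  "annotations": [{"id": 10, "image_id": 1, "caption": "a toy caption"}],
  "licenses": [{"id": 1, "name": "toy license", "url": ""}]
}
"""

def summarize_coco(data):
    """Return {key: number of entries} for the list-valued top-level keys."""
    return {key: len(value) for key, value in data.items() if isinstance(value, list)}

coco = json.loads(SAMPLE)
summary = summarize_coco(coco)
print(sorted(coco.keys()))
print(summary)
```

Counting entries per key is a quick sanity check before writing any task-specific parsing.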

The content of info is as follows:

info{
"year" : int,
"version" : str,
"description" : str,
"contributor" : str,
"url" : str,
"date_created" : datetime,
}

images is a list covering many images. Each element of the list is a dictionary describing one image, in the following format:

image{
"id" : int,
"width" : int,
"height" : int,
"file_name" : str,
"license" : int,
"flickr_url" : str,
"coco_url" : str,
"date_captured" : datetime,
}
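Because annotations refer to images by id, it is often convenient to index the image list first. The sketch below assumes records shaped like the image struct above; the helper name index_images and the two sample records are invented for illustration.

```python
def index_images(images):
    """Map image id -> image record for constant-time lookup."""
    return {img["id"]: img for img in images}

# Toy image records (values are invented, optional fields omitted).
images = [
    {"id": 1, "width": 640, "height": 480, "file_name": "a.jpg"},
    {"id": 2, "width": 500, "height": 375, "file_name": "b.jpg"},
]

by_id = index_images(images)
print(by_id[2]["file_name"], by_id[2]["width"], by_id[2]["height"])
```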

The content of license is as follows:

license{
"id" : int,
"name" : str,
"url" : str,
}

Although every JSON file has the "info", "images", "annotations", and "licenses" keys, the form of each annotation differs by task, as described below.

Object detection

Each object instance annotation contains a series of fields, including the category id and segmentation mask of the object. The segmentation format depends on whether the instance represents a single object (iscrowd=0 in which case polygons are used) or a collection of objects (iscrowd=1 in which case RLE is used). Note that a single object (iscrowd=0) may require multiple polygons, for example if occluded. Crowd annotations (iscrowd=1) are used to label large groups of objects (e.g. a crowd of people). In addition, an enclosing bounding box is provided for each object (box coordinates are measured from the top left image corner and are 0-indexed). Finally, the categories field of the annotation structure stores the mapping of category id to category and supercategory names. See also the detection task.

annotation{
"id" : int,
"image_id" : int,
"category_id" : int,
"segmentation" : RLE or [polygon],
"area" : float,
"bbox" : [x,y,width,height],
"iscrowd" : 0 or 1,
}

categories[{
"id" : int,
"name" : str,
"supercategory" : str,
}]
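A minimal sketch of working with detection annotations: converting the [x,y,width,height] box to corner form, grouping annotations per image, and mapping category ids to names. The helper names and the toy records are assumptions for illustration, and the segmentation field is left out to keep the example short.

```python
from collections import defaultdict

def xywh_to_xyxy(bbox):
    """Convert COCO [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

def group_by_image(annotations):
    """Collect annotations under their image_id."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["image_id"]].append(ann)
    return dict(grouped)

# Toy records shaped like the structs above (values are invented).
categories = [
    {"id": 1, "name": "person", "supercategory": "person"},
    {"id": 18, "name": "dog", "supercategory": "animal"},
]
annotations = [
    {"id": 101, "image_id": 1, "category_id": 1,
     "bbox": [10.0, 20.0, 30.0, 40.0], "area": 900.0, "iscrowd": 0},
    {"id": 102, "image_id": 1, "category_id": 18,
     "bbox": [50.0, 60.0, 20.0, 10.0], "area": 150.0, "iscrowd": 0},
    {"id": 103, "image_id": 2, "category_id": 1,
     "bbox": [0.0, 0.0, 5.0, 5.0], "area": 25.0, "iscrowd": 1},
]

id_to_name = {cat["id"]: cat["name"] for cat in categories}
per_image = group_by_image(annotations)
for ann in per_image[1]:
    print(id_to_name[ann["category_id"]], xywh_to_xyxy(ann["bbox"]))
```

Many detection frameworks expect corner-form boxes, so a conversion like this is usually one of the first steps after loading.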

Keypoint detection

A keypoint annotation contains all the data of the object annotation
(including id, bbox, etc.) and two additional fields. First,
“keypoints” is a length 3k array where k is the total number of
keypoints defined for the category. Each keypoint has a 0-indexed
location x,y and a visibility flag v defined as v=0: not labeled (in
which case x=y=0), v=1: labeled but not visible, and v=2: labeled and
visible. A keypoint is considered visible if it falls inside the
object segment. “num_keypoints” indicates the number of labeled
keypoints (v>0) for a given object (many objects, e.g. crowds and
small objects, will have num_keypoints=0). Finally, for each category,
the categories struct has two additional fields: “keypoints,” which is
a length k array of keypoint names, and “skeleton”, which defines
connectivity via a list of keypoint edge pairs and is used for
visualization. Currently keypoints are only labeled for the person
category (for most medium/large non-crowd person instances). See also
the keypoint task.

annotation{
"keypoints" : [x1,y1,v1,...],
"num_keypoints" : int,
"[cloned]" : ...,
}

categories[{
"keypoints" : [str],
"skeleton" : [edge],
"[cloned]" : ...,
}]

"[cloned]": denotes fields copied from object detection annotations defined above.
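The flat length-3k "keypoints" array can be split into (x, y, v) triples, and "num_keypoints" can be recomputed by counting entries with v>0. The following is a sketch under those definitions; the function names and the sample array are invented for illustration.

```python
def split_keypoints(flat):
    """Turn [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints array length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

def count_labeled(flat):
    """Count keypoints with visibility flag v > 0 (labeled, visible or not)."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)

# Three toy keypoints: unlabeled, labeled and visible, labeled but not visible.
keypoints = [0, 0, 0, 120, 80, 2, 130, 85, 1]
triples = split_keypoints(keypoints)
num_keypoints = count_labeled(keypoints)
print(triples, num_keypoints)
```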

Stuff segmentation

The stuff annotation format is identical and fully compatible to the
object detection format above (except iscrowd is unnecessary and set
to 0 by default). We provide annotations in both JSON and png format
for easier access, as well as conversion scripts between the two
formats. In the JSON format, each category present in an image is
encoded with a single RLE annotation (see the Mask API for more
details). The category_id represents the id of the current stuff
category. For more details on stuff categories and supercategories see
the stuff evaluation page. See also the stuff task.
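To illustrate what an RLE mask encodes, here is a pure-Python sketch that decodes the uncompressed form, where "counts" is a plain list of integers and "size" is [height, width]. It assumes the runs alternate between 0 and 1 starting with 0 and that pixels are laid out column by column. Compressed RLE, where "counts" is a string, is not handled here and needs the Mask API. The function name and the toy mask are invented for illustration.

```python
def decode_uncompressed_rle(rle):
    """Decode {"size": [h, w], "counts": [int, ...]} into h rows of w 0/1 values."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with zeros
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not add up to height * width")
    # flat is column-major: index col * height + row addresses pixel (row, col).
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]

# Toy 2x3 mask: one zero, two ones, three zeros, read down each column in turn.
toy_rle = {"size": [2, 3], "counts": [1, 2, 3]}
mask = decode_uncompressed_rle(toy_rle)
for row in mask:
    print(row)
```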

Panoptic segmentation

For the panoptic task, each annotation struct is a per-image
annotation rather than a per-object annotation. Each per-image
annotation has two parts: (1) a PNG that stores the class-agnostic
image segmentation and (2) a JSON struct that stores the semantic
information for each image segment. In more detail:

To match an annotation with an image, use the image_id field (that is
annotation.image_id == image.id). For each annotation, per-pixel
segment ids are stored as a single PNG at annotation.file_name. The
PNGs are in a folder with the same name as the JSON, i.e.,
annotations/name/ for annotations/name.json. Each segment (whether
it’s a stuff or thing segment) is assigned a unique id. Unlabeled
pixels (void) are assigned a value of 0. Note that when you load the
PNG as an RGB image, you will need to compute the ids via
ids=R+G*256+B*256^2. For each annotation, per-segment info is stored
in annotation.segments_info. segment_info.id stores the unique id of
the segment and is used to retrieve the corresponding mask from the
PNG (ids==segment_info.id). category_id gives the semantic category
and iscrowd indicates the segment encompasses a group of objects
(relevant for thing categories only). The bbox and area fields provide
additional info about the segment. The COCO panoptic task has the same
thing categories as the detection task, whereas the stuff categories
differ from those in the stuff task (for details see the panoptic
evaluation page). Finally, each category struct has two additional
fields: isthing that distinguishes stuff and thing categories and
color that is useful for consistent visualization.

annotation{
"image_id" : int,
"file_name" : str,
"segments_info" : [segment_info],
}

segment_info{
"id" : int,
"category_id" : int,
"area" : int,
"bbox" : [x,y,width,height],
"iscrowd" : 0 or 1,
}

categories[{
"id" : int,
"name" : str,
"supercategory" : str,
"isthing" : 0 or 1,
"color" : [R,G,B],
}]
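The id computation ids=R+G*256+B*256^2 and the mask lookup ids==segment_info.id can be sketched in plain Python as follows. The pixel grid is a hand-written stand-in for a decoded PNG, and the function names are assumptions for illustration; in practice the PNG would be loaded with an image library.

```python
def rgb_to_id(r, g, b):
    """Combine the three 8-bit channels into one segment id."""
    return r + g * 256 + b * 256 ** 2

def segment_mask(pixels, segment_id):
    """Return a boolean grid that is True where the pixel belongs to segment_id."""
    return [[rgb_to_id(*px) == segment_id for px in row] for row in pixels]

# Toy 2x2 "PNG": (0, 0, 0) is void, the other pixels belong to two segments.
pixels = [
    [(0, 0, 0), (1, 2, 3)],
    [(1, 2, 3), (5, 0, 0)],
]
target = rgb_to_id(1, 2, 3)
mask = segment_mask(pixels, target)
print(target, mask)
```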

Image captioning

These annotations are used to store image captions. Each caption
describes the specified image and each image has at least 5 captions
(some images have more). See also the captioning task.

annotation{
"id" : int,
"image_id" : int,
"caption" : str,
}
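Since each image has several captions, a common first step is to gather them by image_id. This is a minimal sketch; the helper name captions_by_image and the sample captions are invented for illustration.

```python
from collections import defaultdict

def captions_by_image(annotations):
    """Map image_id -> list of caption strings, in file order."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["image_id"]].append(ann["caption"])
    return dict(grouped)

# Toy caption annotations (text is invented).
annotations = [
    {"id": 1, "image_id": 7, "caption": "A dog runs on the grass."},
    {"id": 2, "image_id": 7, "caption": "A brown dog playing outside."},
    {"id": 3, "image_id": 9, "caption": "Two people ride bicycles."},
]

captions = captions_by_image(annotations)
for image_id, texts in captions.items():
    print(image_id, texts)
```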
