The Quick, Draw! Dataset

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!.

First, let's look at the formats the dataset is provided in:

Full dataset separated by categories

Sketch-RNN QuickDraw Dataset (the format used in the Sketch-RNN paper)

This data is also used for training the Sketch-RNN model. An open source, TensorFlow implementation of this model is available in the Magenta Project, (link to GitHub repo). You can also read more about this model in this Google Research blog post. The data is stored in compressed .npz files, in a format suitable for inputs into a recurrent neural network.

In this dataset, 75K samples (70K Training, 2.5K Validation, 2.5K Test) have been randomly selected from each category and processed with RDP line simplification with an epsilon parameter of 2.0. Each category is stored in its own .npz file, for example, cat.npz.

We have also provided the full data for each category, if you want to use more than 70K training examples. These are stored with the .full.npz extension.

Note that the .npz files store coordinate offsets, not absolute coordinates:

Each example in the dataset is stored as a list of coordinate offsets, ∆x and ∆y, plus a binary value representing whether the pen is lifted away from the paper. This format, which we refer to as stroke-3, is described in this paper. Note that the data format described in the paper has 5 elements (stroke-5 format), and this conversion is done automatically inside the DataLoader. Below is an example sketch of a turtle using this format:
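Recovering absolute coordinates from stroke-3 amounts to a running sum over the offsets. The following is a minimal sketch of that idea, not the Sketch-RNN DataLoader itself. It assumes each row is (∆x, ∆y, pen_lifted), that the pen starts at the origin, and that pen_lifted = 1 marks the last point of a stroke. The function name stroke3_to_polylines is hypothetical.

```python
def stroke3_to_polylines(sketch):
    """Turn stroke-3 rows (dx, dy, pen_lifted) into lists of absolute points.

    Assumptions: the pen starts at (0, 0), and pen_lifted == 1 marks
    the last point of a stroke.
    """
    polylines = []
    current = []
    x = y = 0
    for dx, dy, pen_lifted in sketch:
        # Accumulate the offsets to get the absolute position
        x += dx
        y += dy
        current.append((x, y))
        if pen_lifted:
            # The pen leaves the paper, so the current stroke ends here
            polylines.append(current)
            current = []
    if current:
        # Keep a trailing stroke that was never explicitly closed
        polylines.append(current)
    return polylines


# Two short made-up strokes, purely for illustration
example = [(10, 0, 0), (0, 10, 1), (5, 5, 0), (-5, 0, 1)]
polylines = stroke3_to_polylines(example)
```

For whole .npz files, NumPy's cumsum over the first two columns performs the same accumulation in one vectorized call.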

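So converting these to images would mean accumulating the offsets point by point... Sorry, I tried it, and it took far too long.

I went on to look at the other storage formats...

The .ndjson files store actual coordinates, so simply joining consecutive points (linear interpolation) gives you the lines: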

Each line contains one drawing. Here's an example of a single drawing:

{
  "key_id":"5891796615823360",
  "word":"nose",
  "countrycode":"AE",
  "timestamp":"2017-03-01 20:41:36.70725 UTC",
  "recognized":true,
  "drawing":[[[129,128,129,129,130,130,131,132,132,133,133,133,133,...]]]
}

The format of the drawing array is as follows:

[
  [  // First stroke
    [x0, x1, x2, x3, ...],
    [y0, y1, y2, y3, ...],
    [t0, t1, t2, t3, ...]
  ],
  [  // Second stroke
    [x0, x1, x2, x3, ...],
    [y0, y1, y2, y3, ...],
    [t0, t1, t2, t3, ...]
  ],
  ... // Additional strokes
]

So how do you parse an ndjson file?

There is an example in examples/nodejs/simplified-parser.js showing how to read ndjson files in NodeJS.
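Because ndjson is just one JSON object per line, it can also be read from Python with the standard json module, without going through Node. The following is a minimal sketch under that assumption. The helper name iter_ndjson and the two sample lines are made up for illustration.

```python
import json


def iter_ndjson(lines):
    """Yield one parsed object per non-empty line of ndjson text."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Two made-up lines in the simplified layout (x array and y array per stroke)
sample_lines = [
    '{"word": "nose", "recognized": true, "drawing": [[[129, 128], [50, 52]]]}',
    '',
    '{"word": "cat", "recognized": false, "drawing": [[[76, 28, 7], [136, 128, 128]]]}',
]
drawings = list(iter_ndjson(sample_lines))

# With a real file, pass the open file object instead of a list:
#   with open("dataset_path/full_simplified_cat.ndjson") as f:
#       for drawing in iter_ndjson(f):
#           ...
```

Iterating over the file object keeps only one drawing in memory at a time, which helps with the larger category files.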

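A quick detour on installing Node (I installed it with nvm; a Baidu search turns up plenty of tutorials):

git clone https://github.com/creationix/nvm.git
cd nvm
./install.sh
source ./nvm.sh
nvm install v6.2.2
node

The provided simplified-parser.js can parse the file directly. I save the parsed data as JSON so that Python can read it. Run it with: node simplified-parser.js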

var fs = require('fs');
var ndjson = require('ndjson'); // npm install ndjson

function parseSimplifiedDrawings(fileName, callback) {
  var drawings = [];
  var fileStream = fs.createReadStream(fileName);
  fileStream
    .pipe(ndjson.parse())
    .on('data', function(obj) {
      drawings.push(obj);
    })
    .on("error", callback)
    .on("end", function() {
      callback(null, drawings);
    });
}

parseSimplifiedDrawings("dataset_path/full_simplified_cat.ndjson", function(err, drawings) {
  if (err) return console.error(err);
  drawings.forEach(function(d) {
    // Do something with the drawing
    console.log(d.key_id, d.countrycode);
  });
  console.log("# of drawings:", drawings.length);
  // Save the parsed drawings as a single JSON file
  var filename = "dataset_path/full_simplified_cat.json";
  fs.writeFileSync(filename, JSON.stringify(drawings));
});

With the JSON file in hand, I convert it to images with Python.

import json
import pylab as pl

with open("dataset_path/full_simplified_cat.json") as f:
    setting = json.load(f)

for j in range(0, 10):  # try the first 10 drawings
    for stroke in setting[j]['drawing']:
        x = stroke[0]
        y = stroke[1]
        # plot() joins consecutive points with straight segments (linear
        # interpolation), so no separate interpolation call is needed
        pl.plot(x, y, 'k')  # all strokes of one cat go on the same figure
    ax = pl.gca()
    ax.xaxis.set_ticks_position('top')  # without these ax calls the cat comes out upside down
    ax.invert_yaxis()
    pl.axis('off')
    pl.savefig("dataset_path/cat/%d.png" % j)
    pl.close()  # otherwise every drawing ends up on the same figure

Before conversion, the data looks like this (the first drawing, for example):

[{"word":"cat","countrycode":"VE","timestamp":"2017-03-02 23:25:10.07453 UTC","recognized":true,"key_id":"5201136883597312","drawing":[[[130,113,99,109,76,64,55,48,48,51,59,86,133,154,170,203,214,217,215,208,186,176,162,157,132],[72,40,27,79,82,88,100,120,134,152,165,184,189,186,179,152,131,114,100,89,76,0,31,65,70]],[[76,28,7],[136,128,128]],[[76,23,0],[160,164,175]],[[87,52,37],[175,191,204]],[[174,220,246,251],[134,132,136,139]],[[175,255],[147,168]],[[171,208,215],[164,198,210]],[[130,110,108,111,130,139,139,119],[129,134,137,144,148,144,136,130]],[[107,106],[96,113]]]},

A quick explanation of the coordinates:

The x and y arrays of the first stroke draw the cat's outline

[[130,113,99,109,76,64,55,48,48,51,59,86,133,154,170,203,214,217,215,208,186,176,162,157,132],[72,40,27,79,82,88,100,120,134,152,165,184,189,186,179,152,131,114,100,89,76,0,31,65,70]]

The second stroke looks like a whisker

[[76,28,7],[136,128,128]]

and so on, stroke by stroke, until the drawing is complete.
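To relate the two storage formats, the per-stroke coordinate arrays can be re-encoded as stroke-3 style offsets by differencing consecutive points. The following is a rough sketch of the offset encoding only. It does not reproduce the actual Sketch-RNN preprocessing (which also applies RDP simplification), it assumes the pen starts at the origin, and the function name drawing_to_stroke3 is hypothetical.

```python
def drawing_to_stroke3(drawing):
    """Encode strokes of [xs, ys] arrays as (dx, dy, pen_lifted) rows.

    Assumption: offsets are measured from the previous point, starting
    at (0, 0), and pen_lifted == 1 marks the last point of each stroke.
    """
    rows = []
    prev_x = prev_y = 0
    for stroke in drawing:
        xs, ys = stroke[0], stroke[1]
        last = len(xs) - 1
        for i, (x, y) in enumerate(zip(xs, ys)):
            pen_lifted = 1 if i == last else 0
            rows.append((x - prev_x, y - prev_y, pen_lifted))
            prev_x, prev_y = x, y
    return rows


# The whisker stroke from the cat above
whisker = [[[76, 28, 7], [136, 128, 128]]]
rows = drawing_to_stroke3(whisker)
```

Here the first row carries the absolute start position (76, 136), and the following rows carry the offsets (-48, -8) and (-21, 0), with the pen lifted on the last one.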
