Contents

  • 1. exercise01: update/delete by query
  • 2. exercise02: index template
  • 3. exercise03: alias, reindex, pipeline use

1. exercise01: update/delete by query

# ** EXAM OBJECTIVE: INDEXING DATA **
# GOAL: Create, update and delete indices while satisfying a given
#   set of requirements
# REQUIRED SETUP:
#   (i) a running Elasticsearch cluster with at least one node
#       and a Kibana instance,
#   (ii) the cluster has no index with name `hamlet`,
#   (iii) the cluster has no template that applies to indices
#       starting by `hamlet`

# Create the index `hamlet-raw` with 1 primary shard and 3 replicas
# Add a document to `hamlet-raw`, so that the document (i) has id "1",
#   (ii) has default type, (iii) has one field named `line` with value
#   "To be, or not to be: that is the question"
# Update the document with id "1" by adding a field named `line_number`
#   with value "3.1.64"
# Add a new document to `hamlet-raw`, so that the document (i) has the
#   id automatically assigned by Elasticsearch, (ii) has default type,
#   (iii) has a field named `text_entry` with value "Whether tis nobler
#   in the mind to suffer", (iv) has a field named `line_number` with
#   value "3.1.66"
# Update the last document by setting the value of `line_number` to
#   "3.1.65"
# In one request, update all documents in `hamlet-raw` by adding a new
#   field named `speaker` with value "Hamlet"
# Update the document with id "1" by renaming the field `line` into
#   `text_entry`

Solution


PUT hamlet-raw
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 3
  }
}

PUT hamlet-raw/_doc/1
{
  "line": "To be, or not to be: that is the question"
}

POST hamlet-raw/_update/1
{
  "doc": { "line_number": "3.1.64" }
}

GET hamlet-raw/_doc/1

POST hamlet-raw/_doc
{
  "text_entry": "Whether tis nobler in the mind to suffer",
  "line_number": "3.1.66"
}

# operate on the auto-generated id returned by the previous request
POST hamlet-raw/_update/2uDDLHYBznFAtuOD6g0k
{
  "doc": { "line_number": "3.1.65" }
}

POST hamlet-raw/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.speaker='Hamlet'"
  }
}

GET hamlet-raw/_search

One way to rename the field is an ingest pipeline:

PUT _ingest/pipeline/rename_field
{
  "description": "rename field",
  "processors": [
    {
      "rename": {
        "field": "line",
        "target_field": "text_entry"
      }
    }
  ]
}

POST hamlet-raw/_update_by_query?pipeline=rename_field
{
  "query": { "ids": { "values": ["1"] } }
}

GET hamlet-raw/_search

A script works just as well:

POST hamlet-raw/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.text_entry=ctx._source.remove('line')"
  }
}

Part 2

# Create the index `hamlet` and add some documents by running the
#   following _bulk command

PUT hamlet/_bulk
{"index":{"_index":"hamlet","_id":0}}
{"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
{"index":{"_index":"hamlet","_id":1}}
{"line_number":"1.1.2","speaker":"FRANCISCO","text_entry":"Nay, answer me: stand, and unfold yourself."}
{"index":{"_index":"hamlet","_id":2}}
{"line_number":"1.1.3","speaker":"BERNARDO","text_entry":"Long live the king!"}
{"index":{"_index":"hamlet","_id":3}}
{"line_number":"1.2.1","speaker":"KING CLAUDIUS","text_entry":"Though yet of Hamlet our dear brothers death"}
{"index":{"_index":"hamlet","_id":4}}
{"line_number":"1.2.2","speaker":"KING CLAUDIUS","text_entry":"The memory be green, and that it us befitted"}
{"index":{"_index":"hamlet","_id":5}}
{"line_number":"1.3.1","speaker":"LAERTES","text_entry":"My necessaries are embarkd: farewell:"}
{"index":{"_index":"hamlet","_id":6}}
{"line_number":"1.3.4","speaker":"LAERTES","text_entry":"But let me hear from you."}
{"index":{"_index":"hamlet","_id":7}}
{"line_number":"1.3.5","speaker":"OPHELIA","text_entry":"Do you doubt that?"}
{"index":{"_index":"hamlet","_id":8}}
{"line_number":"1.4.1","speaker":"HAMLET","text_entry":"The air bites shrewdly; it is very cold."}
{"index":{"_index":"hamlet","_id":9}}
{"line_number":"1.4.2","speaker":"HORATIO","text_entry":"It is a nipping and an eager air."}
{"index":{"_index":"hamlet","_id":10}}
{"line_number":"1.4.3","speaker":"HAMLET","text_entry":"What hour now?"}
{"index":{"_index":"hamlet","_id":11}}
{"line_number":"1.5.2","speaker":"Ghost","text_entry":"Mark me."}
{"index":{"_index":"hamlet","_id":12}}
{"line_number":"1.5.3","speaker":"HAMLET","text_entry":"I will."}

# Create a script named `set_is_hamlet` and save it into the cluster
#   state. The script (i) adds a field named `is_hamlet` to each
#   document, (ii) sets the field to "true" if the document has
#   `speaker` equals to "HAMLET", (iii) sets the field to "false"
#   otherwise
# Update all documents in `hamlet` by running the `set_is_hamlet` script

Pretty convenient the "update_by_query" API, don't you think? Do you also
know how to use its counterpart for deletion?

# Remove from `hamlet` the documents that have either "KING CLAUDIUS" or
#   "LAERTES" as the value of `speaker`

Note the pattern here: the script is stored first and then referenced by id. It is easy to forget, since inline scripts are far more common in day-to-day use.


# First, the inline version of the script
POST hamlet/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": """
      if (ctx._source.speaker.equals('HAMLET')) {
        ctx._source.is_hamlet = true;
      } else {
        ctx._source.is_hamlet = false;
      }
    """
  }
}

Now store the same script in the cluster state (search templates can be stored this way too):

PUT _scripts/set_is_hamlet
{
  "script": {
    "lang": "painless",
    "source": """
      if (ctx._source.speaker.equals('HAMLET')) {
        ctx._source.is_hamlet = true;
      } else {
        ctx._source.is_hamlet = false;
      }
    """
  }
}

Run the stored script:

POST hamlet/_update_by_query
{
  "script": { "id": "set_is_hamlet" }
}

GET hamlet/_search
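Once stored, a script can also be fetched or removed by its id via the stored-scripts API, which is handy for double-checking what actually made it into the cluster state:

GET _scripts/set_is_hamlet

# only delete it once you no longer need it
DELETE _scripts/set_is_hamlet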

Delete operation


POST hamlet/_delete_by_query
{
  "query": {
    "terms": {
      "speaker.keyword": ["KING CLAUDIUS", "LAERTES"]
    }
  }
}

2. exercise02: index template

# ** EXAM OBJECTIVE: INDEXING DATA **
# GOAL: Create index templates that satisfy a given set of
#   requirements
# REQUIRED SETUP:
#   (i) a running Elasticsearch cluster with at least one node
#       and a Kibana instance,
#   (ii) the cluster has no index with name `hamlet`,
#   (iii) the cluster has no template that applies to indices
#       starting by `hamlet`

# Create the index template `hamlet_template`, so that the template
#   (i) matches any index that starts by "hamlet_" or "hamlet-",
#   (ii) allocates one primary shard and no replicas for each
#   matching index
# Create the indices `hamlet2` and `hamlet_test`
# Verify that only `hamlet_test` applies the settings defined in
#   `hamlet_template`

A template cannot be partially updated: updating works exactly like creating, and the new body completely replaces the old one.
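To illustrate the overwrite behaviour with a hypothetical example (assuming the stored template already carries both a settings and a mappings section): re-PUTting it with only a settings section silently discards the mappings.

PUT _template/hamlet_template
{
  "index_patterns": ["hamlet_*", "hamlet-*"],
  "settings": { "number_of_shards": 1 }
}

# the mappings section is now gone from the stored template
GET _template/hamlet_template

The safe pattern is therefore: GET the current template, edit the full body, and PUT it back in its entirety.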


DELETE hamlet*
DELETE _template/hamlet*

PUT _template/hamlet_template
{
  "index_patterns": ["hamlet_*", "hamlet-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

PUT hamlet2
PUT hamlet_test

GET _cat/shards/hamlet2?v
GET _cat/shards/hamlet_test?v
# Update `hamlet_template` by defining a mapping for the type
# "_doc", so that (i) the type has three fields, named `speaker`,
# `line_number`, and `text_entry`, (ii) `text_entry` uses an
# "english" analyzer
Updates to an index template are not automatically reflected on the matching
indices that already exist. This is because index templates are only applied
once at index creation time.
# Verify that the updates in `hamlet_template` did not apply to the
# existing indices
# In one request, delete both `hamlet2` and `hamlet_test`

GET _template/hamlet_template

PUT _template/hamlet_template
{
  "index_patterns": ["hamlet_*", "hamlet-*"],
  "settings": {
    "index": {
      "number_of_shards": "1",
      "number_of_replicas": "0"
    }
  },
  "mappings": {
    "properties": {
      "speaker": { "type": "text" },
      "line_number": { "type": "text" },
      "text_entry": { "type": "text", "analyzer": "english" }
    }
  }
}

GET hamlet_test

DELETE hamlet2,hamlet_test
# Create the index `hamlet-1` and add some documents by running the
#   following _bulk command

PUT hamlet-1/_bulk
{"index":{"_index":"hamlet-1","_id":0}}
{"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
{"index":{"_index":"hamlet-1","_id":1}}
{"line_number":"1.1.2","speaker":"FRANCISCO","text_entry":"Nay, answer me: stand, and unfold yourself."}
{"index":{"_index":"hamlet-1","_id":2}}
{"line_number":"1.1.3","speaker":"BERNARDO","text_entry":"Long live the king!"}
{"index":{"_index":"hamlet-1","_id":3}}
{"line_number":"1.2.1","speaker":"KING CLAUDIUS","text_entry":"Though yet of Hamlet our dear brothers death"}

# Verify that the mapping of `hamlet-1` is consistent with what defined
#   in `hamlet_template`
# Update `hamlet_template` so as to reject any document having a field
#   that is not defined in the mapping
# Verify that you cannot index the following document in `hamlet-1`

POST hamlet-1/_doc
{
  "author": "Shakespeare"
}

To make an updated `hamlet_template` take effect on `hamlet-1`, you would have to delete `hamlet-1` and recreate it; here we take a shortcut and set `dynamic: strict` directly on the existing index mapping instead.
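For reference, a sketch of the template-based route the exercise actually asks for (field definitions copied from the template defined earlier): add "dynamic": "strict" to the template's mappings, then recreate the index so the updated template is applied.

PUT _template/hamlet_template
{
  "index_patterns": ["hamlet_*", "hamlet-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "speaker": { "type": "text" },
      "line_number": { "type": "text" },
      "text_entry": { "type": "text", "analyzer": "english" }
    }
  }
}

DELETE hamlet-1
# ...then recreate `hamlet-1` and re-run the _bulk command above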


PUT hamlet-1/_mapping
{
  "dynamic": "strict"
}

POST hamlet-1/_doc
{
  "author": "Shakespeare"
}
# Update `hamlet_template` so as to enable dynamic mapping again
# Update `hamlet_template` so as to (i) dynamically map to an integer
#   any field that starts by "number_", (ii) dynamically map to
#   unanalysed text any string field
# Create the index `hamlet-2` and add a document by running the
#   following command

POST hamlet-2/_doc/4
{
  "text_entry": "With turbulent and dangerous lunacy?",
  "line_number": "3.1.4",
  "number_act": "3",
  "speaker": "KING CLAUDIUS"
}

# Verify that the mapping of `hamlet-2` is consistent with what
#   defined in `hamlet_template`

GET _template/hamlet_template

PUT _template/hamlet_template
{
  "order": 0,
  "index_patterns": ["hamlet_*", "hamlet-*"],
  "settings": {
    "index": {
      "number_of_shards": "1",
      "number_of_replicas": "0"
    }
  },
  "mappings": {
    "dynamic": true,
    "dynamic_templates": [
      {
        "numbers_as_integers": {
          "match": "number_*",
          "mapping": { "type": "integer" }
        }
      },
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": { "type": "keyword" }
        }
      }
    ],
    "properties": {
      "line_number": { "type": "text" },
      "text_entry": { "analyzer": "english", "type": "text" },
      "speaker": { "type": "text" }
    }
  },
  "aliases": {}
}

POST hamlet-2/_doc/4
{
  "text_entry": "With turbulent and dangerous lunacy?",
  "line_number": "3.1.4",
  "number_act": "3",
  "speaker": "KING CLAUDIUS"
}

GET hamlet-2/_mapping

3. exercise03: alias, reindex, pipeline use

# ** EXAM OBJECTIVE: INDEXING DATA **
# GOAL: Create an alias, reindex indices, and create data pipelines
# REQUIRED SETUP:
#   (i) a running Elasticsearch cluster with at least one node
#       and a Kibana instance,
#   (ii) the cluster has no index with name `hamlet`,
#   (iii) the cluster has no template that applies to indices
#       starting by `hamlet`

As usual, let's begin by indexing some data.

# Create the indices `hamlet-1` and `hamlet-2`, each with two
#   primary shards and no replicas
# Add some documents to `hamlet-1` by running the following command

PUT hamlet-1/_bulk
{"index":{"_index":"hamlet-1","_id":0}}
{"line_number":"1.1.1","speaker":"BERNARDO","text_entry":"Whos there?"}
{"index":{"_index":"hamlet-1","_id":1}}
{"line_number":"1.1.2","speaker":"FRANCISCO","text_entry":"Nay, answer me: stand, and unfold yourself."}
{"index":{"_index":"hamlet-1","_id":2}}
{"line_number":"1.1.3","speaker":"BERNARDO","text_entry":"Long live the king!"}
{"index":{"_index":"hamlet-1","_id":3}}
{"line_number":"1.2.1","speaker":"KING CLAUDIUS","text_entry":"Though yet of Hamlet our dear brothers death"}

# Add some documents to `hamlet-2` by running the following command

PUT hamlet-2/_bulk
{"index":{"_index":"hamlet-2","_id":4}}
{"line_number":"2.1.1","speaker":"LORD POLONIUS","text_entry":"Give him this money and these notes, Reynaldo."}
{"index":{"_index":"hamlet-2","_id":5}}
{"line_number":"2.1.2","speaker":"REYNALDO","text_entry":"I will, my lord."}
{"index":{"_index":"hamlet-2","_id":6}}
{"line_number":"2.1.3","speaker":"LORD POLONIUS","text_entry":"You shall do marvellous wisely, good Reynaldo,"}
{"index":{"_index":"hamlet-2","_id":7}}
{"line_number":"2.1.4","speaker":"LORD POLONIUS","text_entry":"Before you visit him, to make inquire"}
# Create the alias `hamlet` that maps both `hamlet-1` and `hamlet-2`
# Verify that the documents grouped by `hamlet` are 8
By default, if your alias includes more than one index, you cannot index
documents using the alias name. But defaults can be overwritten, if you know
how.
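A quick sanity check of that default first (assuming the alias already spans both indices and no write index is flagged): indexing through the alias is rejected with an illegal_argument_exception stating that no write index is defined for the alias.

POST hamlet/_doc
{
  "speaker": "HAMLET"
}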
# Configure `hamlet-1` to be the write index of the `hamlet` alias

DELETE hamlet*

PUT hamlet-1
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}

PUT hamlet-2
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}

It had been a while since I last worked with aliases, and I almost slipped up here. Stay calm and consult the documentation.

# Create the alias `hamlet` that maps both `hamlet-1` and `hamlet-2`
# Verify that the documents grouped by `hamlet` are 8
# Configure `hamlet-1` to be the write index of the `hamlet` alias

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "hamlet-1",
        "alias": "hamlet",
        "is_write_index": true
      }
    },
    {
      "add": {
        "index": "hamlet-2",
        "alias": "hamlet"
      }
    }
  ]
}

PUT hamlet/_doc/1
{
  "message": "you want to be stronger"
}

GET hamlet/_count

# Add a document to `hamlet`, so that the document (i) has id "8",
#   (ii) has "_doc" type, (iii) has a field `text_entry` with value
#   "With turbulent and dangerous lunacy?", (iv) has a field
#   `line_number` with value "3.1.4", (v) has a field `speaker` with
#   value "KING CLAUDIUS"
# Create a script named `control_reindex_batch` and save it into the
#   cluster state. The script checks whether a document has the field
#   `reindexBatch`, and (i) in the affirmative case, it increments the
#   field value by a script parameter named `increment`, (ii) otherwise,
#   the script adds the field to the document setting its value to "1"

Practice this store-the-script-first pattern; for the scripting API, see the Painless guide in the Elasticsearch reference.


PUT _scripts/control_reindex_batch
{
  "script": {
    "lang": "painless",
    "source": """
      if (ctx._source.containsKey('reindexBatch')) {
        ctx._source.reindexBatch += params.increment;
      } else {
        ctx._source.reindexBatch = 1;
      }
    """
  }
}

POST hamlet-1/_update_by_query
{
  "script": {
    "id": "control_reindex_batch",
    "params": { "increment": 3 }
  }
}

GET hamlet-1/_search

# Create the index `hamlet-new` with 2 primary shards and no replicas
# Reindex `hamlet` into `hamlet-new`, while satisfying the following
#   criteria: (i) apply the `control_reindex_batch` script with the
#   `increment` parameter set to "1", (ii) reindex using two parallel
#   slices
# In one request, add `hamlet-new` to the alias `hamlet` and delete
#   the `hamlet` and `hamlet-2` indices

PUT hamlet-new
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}

POST _reindex?slices=2
{
  "source": { "index": "hamlet" },
  "dest": { "index": "hamlet-new" },
  "script": {
    "id": "control_reindex_batch",
    "params": { "increment": 1 }
  }
}

GET hamlet-new/_search

Note that when a `remove` action targets several indices, the key is `indices` (plural); for a single index the key is `index`.

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "hamlet-new",
        "alias": "hamlet"
      }
    },
    {
      "remove": {
        "indices": ["hamlet-1", "hamlet-2"],
        "alias": "hamlet"
      }
    }
  ]
}

GET hamlet/_search
# Create a pipeline named `split_act_scene_line`. The pipeline splits
#   the value of `line_number` using the dots as a separator, and
#   stores the split values into three new fields named `number_act`,
#   `number_scene`, and `number_line`, respectively
# Test the pipeline on the following document

{
  "_source": {
    "line_number": "1.2.3"
  }
}

Satisfied with the outcome? Go update your documents, then!

# Update all documents in `hamlet-new` by using the
#   `split_act_scene_line` pipeline

Combine a split processor with a script processor (plus a remove processor to drop the temporary field).


POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "string split by dot",
    "processors": [
      {
        "split": {
          "field": "line_number",
          "separator": "\\.",
          "target_field": "temp_array"
        }
      },
      {
        "script": {
          "lang": "painless",
          "source": """
            ctx.number_act = ctx.temp_array[0];
            ctx.number_scene = ctx.temp_array[1];
            ctx.number_line = ctx.temp_array[2];
          """
        }
      },
      {
        "remove": { "field": "temp_array" }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "line_number": "1.1.3",
        "text_entry": "Long live the king!",
        "reindexBatch": 2,
        "speaker": "BERNARDO"
      }
    }
  ]
}

PUT _ingest/pipeline/split_act_scene_line
{
  "description": "string split by dot",
  "processors": [
    {
      "split": {
        "field": "line_number",
        "separator": "\\.",
        "target_field": "temp_array"
      }
    },
    {
      "script": {
        "lang": "painless",
        "source": """
          ctx.number_act = ctx.temp_array[0];
          ctx.number_scene = ctx.temp_array[1];
          ctx.number_line = ctx.temp_array[2];
        """
      }
    },
    {
      "remove": { "field": "temp_array" }
    }
  ]
}

POST hamlet-new/_update_by_query?pipeline=split_act_scene_line

GET hamlet-new/_search
