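Elasticsearch curl Operations

Performing Elasticsearch (ES) operations with curl

-X specifies the HTTP request method: HEAD, GET, POST, PUT, or DELETE
-d specifies the data to send in the request body
-H specifies an HTTP request header

Contents

1. Adding records
2. Deleting records
3. Updating records
4. Querying records
5. Checking cluster health

1. Adding records

Create an index named product in ES: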
```
  curl -XPUT http://hadoop01:9200/product
```
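Notes on creating indices and indexing documents in ES
1. Index names must be entirely lowercase, must not start with an underscore, and must not contain commas.
2. If you do not specify a document ID explicitly, ES generates a random ID automatically; in that case the request must use POST.
  
  Differences between POST and PUT:
  
  PUT is idempotent and POST is not, so PUT is better suited to updates and POST to creation.
  
  Both PUT and DELETE are idempotent: no matter how many times the operation is repeated, the outcome is the same. For example, if you use PUT to modify an article and then send the same request again, the resulting state is no different after each call. The same holds for DELETE.
  
  POST is not idempotent. The familiar duplicate-submission problem is an example: sending the same POST request several times creates several resources.
  Note also that a resource can be created with either POST or PUT. The difference is that POST acts on a collection resource (/articles), while PUT acts on a specific resource (/articles/123). Use PUT when the client already knows the identifier of the resource. When the identifier can only be supplied by the server, as with a database auto-increment primary key, POST must be used.
  
  ES does not auto-increment document IDs. If you want sequential IDs (primary keys), you must specify each ID yourself.
  
  If you do not need sequential IDs, you can omit the ID.
  
  Note, however, that in that case only POST can be used, and the ID will be a random string.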
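
The rule above can be sketched as a small helper that picks the HTTP method from whether an ID is supplied. This is a minimal illustration, not part of ES or curl: the function name `doc_request` is hypothetical, and it only prints the request line rather than sending anything. ES also accepts POST with an explicit ID, which is what the examples in this article use.

```shell
# Hypothetical helper: print the HTTP method and path for indexing a document.
# With an ID the client names the resource (PUT); without one, ES assigns the ID (POST).
doc_request() {
  local index="$1" type="$2" id="${3:-}"
  if [ -n "$id" ]; then
    printf 'PUT /%s/%s/%s\n' "$index" "$type" "$id"
  else
    printf 'POST /%s/%s\n' "$index" "$type"
  fi
}

doc_request product hadoop 1   # prints: PUT /product/hadoop/1
doc_request product hadoop     # prints: POST /product/hadoop
```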

Add a record

Format

curl -XPOST -H 'Content-Type: application/json' http://hadoop01:9200/{index}/{type}/{id} -d 'JSON data'

eg.

Specify the ID manually

curl -XPOST -H 'Content-Type: application/json' http://hadoop01:9200/product/hadoop/1 -d '{"name": "hadoop", "version": "2.7.6", "author": "apache"}'

Let ES generate the ID

curl -XPOST -H 'Content-Type: application/json' http://hadoop01:9200/product/hadoop?pretty -d '{"name": "hbase", "version": "1.1.5", "author": "apache"}'

{"_index" : "product","_type" : "hadoop","_id" : "Pe-uVHUBpUX7OVeo0RgI","_version" : 1,"result" : "created","_shards" : {"total" : 2,"successful" : 2,"failed" : 0},"_seq_no" : 1,"_primary_term" : 2
}

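Bulk operations

The Bulk API lets us execute multiple requests in a single call.

Difference between create and index: if the document already exists, a create action fails with a "document already exists" error, whereas an index action succeeds and replaces the existing document.

Create an index to hold the bulk data (index=account, type=bank):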
```
curl -XPUT -H "Content-Type: application/json" http://hadoop01:9200/account?pretty
```

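Data format

action: [index|create|update|delete] (the available actions)
metadata: _index, _type, _id (optional metadata)
request body: _source (not needed for delete)
{action:{metadata}}\n(action line of the first record)
{request body}\n(content of the first record)
{action:{metadata}}\n(action line of the second record)
{request body}\n(content of the second record)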
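
To make the two-lines-per-record layout concrete, here is a minimal sketch that generates bulk entries with printf. The helper name `bulk_index_entry`, the sample documents, and the temporary file path are illustrative assumptions, not part of the original data set.

```shell
# Hypothetical helper: emit one bulk "index" entry (action line + source line).
# printf ends both lines with \n, which the bulk format requires.
bulk_index_entry() {
  local id="$1" source_json="$2"
  printf '{"index":{"_id":"%s"}}\n%s\n' "$id" "$source_json"
}

# Write two entries to a file (the path is a placeholder).
bulk_file="${TMPDIR:-/tmp}/bulk_demo.json"
{
  bulk_index_entry 1 '{"name":"hadoop","version":"2.7.6"}'
  bulk_index_entry 2 '{"name":"hbase","version":"1.1.5"}'
} > "$bulk_file"

cat "$bulk_file"   # two action lines, each followed by its source line
```

Because every entry ends with a newline, a file built this way also ends with the trailing newline that the bulk endpoint expects.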

accounts.json

{"index":{"_id":"456"}}
{"account_number":456,"balance":21419,"firstname":"Solis","lastname":"Kline","age":33,"gender":"M","address":"818 Ashford Street","employer":"Vetron","email":"soliskline@vetron.com","city":"Ruffin","state":"NY"}
{"index":{"_id":"463"}}
{"account_number":463,"balance":36672,"firstname":"Heidi","lastname":"Acosta","age":20,"gender":"F","address":"692 Kenmore Terrace","employer":"Elpro","email":"heidiacosta@elpro.com","city":"Ezel","state":"SD"}
{"index":{"_id":"468"}}
{"account_number":468,"balance":18400,"firstname":"Foreman","lastname":"Fowler","age":40,"gender":"M","address":"443 Jackson Court","employer":"Zillactic","email":"foremanfowler@zillactic.com","city":"Wakarusa","state":"WA"}

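Run the bulk operation. Note that the file must end with a newline character.

Format (the _bulk endpoint accepts both POST and PUT)

curl -XPOST -H "Content-Type: application/json" http://hadoop01:9200/{index}/{type}/_bulk --data-binary @path

eg.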

curl -XPUT -H "Content-Type: application/json" http://hadoop01:9200/account/bank/_bulk?pretty --data-binary @/home/hadoop/datas/accounts.json
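
Since a missing final newline is an easy mistake to make, a quick check before sending can help. The sketch below assumes a hypothetical helper name, `ends_with_newline`, and a throwaway demo file. Note that the file is sent with --data-binary because curl's -d option strips newlines when it reads data from a file.

```shell
# Hypothetical helper: succeed only if the file is non-empty and its last byte is a newline.
# Command substitution strips a trailing newline, so the result is empty in that case.
ends_with_newline() {
  [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Demo file written WITHOUT a trailing newline (the path is a placeholder).
f="${TMPDIR:-/tmp}/bulk_check_demo.json"
printf '{"index":{"_id":"1"}}\n{"name":"hadoop"}' > "$f"

ends_with_newline "$f" || echo >> "$f"      # append the missing newline
ends_with_newline "$f" && echo "ok to send"
```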

2. Deleting records

Delete a single record (deletion here is by ID only)

Format

curl -XDELETE http://hadoop01:9200/{index}/{type}/{id}

eg.

curl -XDELETE http://hadoop01:9200/product/hadoop/Pe-uVHUBpUX7OVeo0RgI?pretty

As the response shows, each delete also changes the document's _version:

{"_index" : "product","_type" : "hadoop","_id" : "Pe-uVHUBpUX7OVeo0RgI","_version" : 2,"result" : "deleted","_shards" : {"total" : 2,"successful" : 2,"failed" : 0},"_seq_no" : 2,"_primary_term" : 2
}

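Deleting a document does not remove it from disk immediately; it is only marked as deleted. ES cleans up deleted documents in the background as you continue to index more data.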

3. Updating records

Format

curl -XPOST -H "Content-Type: application/json" http://hadoop01:9200/{index}/{type}/{id}/_update -d '{"doc":{xxx}}'

eg. (update name and version)

curl -XPOST -H "Content-Type: application/json" http://hadoop01:9200/product/hadoop/1/_update?pretty -d'{"doc": {"name":"spark", "version":"2.7.6"}}'

4. Querying records

Retrieve a single document

Format

curl -XGET http://hadoop01:9200/{index}/{type}/{id}

eg.

curl -XGET http://hadoop01:9200/product/hadoop/1

Pretty-printed output:

curl -XGET http://hadoop01:9200/{index}/{type}/{id}?pretty

eg.

curl -XGET http://hadoop01:9200/product/hadoop/1?pretty

{"_index" : "product","_type" : "hadoop","_id" : "1","_version" : 2,"found" : true,"_source" : {"name" : "spark","version" : "2.7.6","author" : "apache"}
}

Query all records

Format

curl -XGET http://hadoop01:9200/{index}/_search
curl -XGET http://hadoop01:9200/{index}/{type}/_search

eg.

curl -XGET http://hadoop01:9200/product/hadoop/_search?pretty
curl -XGET http://hadoop01:9200/product/_search?pretty

Conditional queries

Find documents whose author is apache

curl -XGET 'http://hadoop01:9200/product/_search?q=author:apache&pretty'

Return only the name and author fields

curl -XGET 'http://hadoop01:9200/product/_search?_source=name,author&q=author:apache&pretty'

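Find documents whose name is spark and whose author is apache. Both conditions go into a single q parameter joined with AND (spaces URL-encoded as %20):

curl -XGET 'http://hadoop01:9200/product/_search?_source=name,author&q=name:spark%20AND%20author:apache&pretty'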
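
When several field conditions are needed, one option is to join them into a single q value before building the URL. The following is a minimal sketch under stated assumptions: `build_q` is a hypothetical helper, and it only encodes the spaces around AND, so values that themselves contain spaces or special characters would need proper URL encoding.

```shell
# Hypothetical helper: join field:value conditions with AND, encoding spaces as %20.
build_q() {
  local q="" cond
  for cond in "$@"; do
    if [ -z "$q" ]; then
      q="$cond"
    else
      q="$q%20AND%20$cond"
    fi
  done
  printf '%s\n' "$q"
}

q=$(build_q name:spark author:apache)
# Print the resulting search URL; pass it to curl inside quotes so the shell leaves & alone.
echo "http://hadoop01:9200/product/_search?_source=name,author&q=${q}&pretty"
```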

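Paging through large result sets
- Format
```
curl 'http://hadoop01:9200/account/bank/_search?pretty&from={num}&size={size}'
```
  from is the zero-based offset of the first record to return, and size is the number of records to return. The URL is quoted so that the shell does not interpret the & characters.
- Paging formula
  
  To fetch page N with size records per page (10 by default), the starting offset is (N-1) * size
  - Fetch the third page
```
curl 'http://hadoop01:9200/account/bank/_search?pretty&from=20&size=10'
```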
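
The offset arithmetic can be wrapped in a small function. This is a sketch only: `page_from` is a hypothetical name, pages are assumed to be numbered from 1, and the script prints the URL instead of calling curl.

```shell
# from = (page - 1) * size, with pages numbered from 1 and a default size of 10.
page_from() {
  local page="$1" size="${2:-10}"
  echo $(( (page - 1) * size ))
}

page=3
size=10
from=$(page_from "$page" "$size")
echo "http://hadoop01:9200/account/bank/_search?pretty&from=${from}&size=${size}"
```

By default ES rejects requests where from + size exceeds the index.max_result_window setting (10,000), so this style of paging suits shallow pages rather than a full scan of a large index.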

5. Checking cluster health

curl http://hadoop01:9200/_cluster/health?pretty

{"cluster_name" : "bde-es","status" : "green","timed_out" : false,"number_of_nodes" : 3,"number_of_data_nodes" : 3,"active_primary_shards" : 10,"active_shards" : 20,"relocating_shards" : 0,"initializing_shards" : 0,"unassigned_shards" : 0,"delayed_unassigned_shards" : 0,"number_of_pending_tasks" : 0,"number_of_in_flight_fetch" : 0,"task_max_waiting_in_queue_millis" : 0,"active_shards_percent_as_number" : 100.0
}

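Meaning of the response fields

cluster_name: the name of the cluster
status: the cluster status, one of green, yellow, or red. green is the healthiest state: all primary shards and all replica shards are available. yellow means all data is available but some replicas are not allocated (the cluster is still fully functional). red means some data is unavailable for some reason
timed_out: whether this health request timed out (false means the response was returned within the timeout period)
number_of_nodes: the number of nodes in the cluster
number_of_data_nodes: the number of data nodes
active_primary_shards: the number of active primary shards in the cluster
active_shards: the number of active shards in the cluster (primaries and replicas)
relocating_shards: the number of shards currently being moved from one node to another; usually 0, and it rises when a node joins or leaves
initializing_shards: the number of shards being initialized
unassigned_shards: the number of unassigned shards; usually 0, and it rises when replica shards are lost because a node goes away
delayed_unassigned_shards: the number of unassigned shards whose allocation is being delayed by timeout settings; if this value stays high and does not decrease, the cluster may be unstable
number_of_pending_tasks: the number of cluster-level tasks not yet executed. Pending tasks can only be processed by the master node, and include creating indices and assigning shards to nodes
number_of_in_flight_fetch: the number of shard information fetches still in progress
task_max_waiting_in_queue_millis: the longest time, in milliseconds, that a task has been waiting in the queue
active_shards_percent_as_number: shard health of the cluster, the percentage of active shards out of all shards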
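
For scripting, the status field is usually the first thing to check. The sketch below extracts it with sed from a response supplied on stdin. The helper name `health_status` and the shortened sample response are assumptions for illustration; a JSON-aware tool would be more robust than a regular expression.

```shell
# Hypothetical helper: read a _cluster/health response on stdin and print the status value.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Shortened sample response; in practice pipe in the output of
#   curl -s http://hadoop01:9200/_cluster/health
sample='{"cluster_name" : "bde-es","status" : "green","timed_out" : false}'
status=$(printf '%s\n' "$sample" | health_status)

case "$status" in
  green)  echo "all primary and replica shards are available" ;;
  yellow) echo "all data available, some replicas not allocated" ;;
  red)    echo "some data is unavailable" ;;
  *)      echo "unknown status: $status" ;;
esac
```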
