Ways to implement multi-field aggregation in Elasticsearch
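Contents
- 1. Background
- 2. Approaches to multi-field aggregation
- 3. Requirement
- 4. Data preparation
- 4.1 Create the index
- 4.2 Prepare the data
- 5. Implementations
- 5.1 Using multi_terms
- 5.1.1 DSL
- 5.1.2 Java code
- 5.1.3 Result
- 5.2 Using a script
- 5.2.1 DSL
- 5.2.2 Java code
- 5.2.3 Result
- 5.3 Using copy_to
- 5.4 Using an ingest pipeline
- 5.4.1 Create the mapping
- 5.4.2 Create the pipeline
- 5.4.3 Insert the data
- 5.4.4 Aggregation DSL
- 5.4.5 Result
- 6. Source code
- 7. References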
1. Background
In SQL we can write group by fieldA, fieldB. How can the same effect be achieved in Elasticsearch? This article records three ways to do it.
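2. Approaches to multi-field aggregation
Image source: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
As the figure above (taken from the linked documentation page) shows, aggregating on multiple fields can be implemented in three ways.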
3. Requirement
Aggregate by province (province) and sex (sex), then sort the resulting buckets in descending order of the maximum age (age) within each bucket.
4. Data preparation
4.1 Create the index
PUT /index_person
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "id": { "type": "long" },
      "name": { "type": "keyword" },
      "province": { "type": "keyword" },
      "sex": { "type": "keyword" },
      "age": { "type": "integer" },
      "address": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}
4.2 Prepare the data
PUT /_bulk
{"create":{"_index":"index_person","_id":1}}
{"id":1,"name":"张三","sex":"男","age":20,"province":"湖北","address":"湖北省黄冈市罗田县匡河镇"}
{"create":{"_index":"index_person","_id":2}}
{"id":2,"name":"李四","sex":"男","age":19,"province":"江苏","address":"江苏省南京市"}
{"create":{"_index":"index_person","_id":3}}
{"id":3,"name":"王武","sex":"女","age":25,"province":"湖北","address":"湖北省武汉市江汉区"}
{"create":{"_index":"index_person","_id":4}}
{"id":4,"name":"赵六","sex":"女","age":30,"province":"北京","address":"北京市东城区"}
{"create":{"_index":"index_person","_id":5}}
{"id":5,"name":"钱七","sex":"女","age":16,"province":"北京","address":"北京市西城区"}
{"create":{"_index":"index_person","_id":6}}
{"id":6,"name":"王八","sex":"女","age":45,"province":"北京","address":"北京市朝阳区"}
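As a cross-check on the requirement from section 3, the expected grouping can be sketched in plain Java over the six documents above. This is not Elasticsearch code. The class `GroupByProvinceSex`, its `Person` type and the method names are hypothetical and exist only for this illustration. The sketch groups by province and sex, keeps the maximum age per group, and sorts the groups by that maximum in descending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByProvinceSex {

    /** Minimal stand-in for one document of index_person (only the fields used here). */
    static final class Person {
        final String province;
        final String sex;
        final int age;

        Person(String province, String sex, int age) {
            this.province = province;
            this.sex = sex;
            this.age = age;
        }
    }

    /** The six documents from section 4.2. */
    static List<Person> sampleData() {
        return Arrays.asList(
                new Person("湖北", "男", 20),
                new Person("江苏", "男", 19),
                new Person("湖北", "女", 25),
                new Person("北京", "女", 30),
                new Person("北京", "女", 16),
                new Person("北京", "女", 45));
    }

    /** Groups by "province|sex", keeps the max age per group, sorts by max age descending. */
    static List<Map.Entry<String, Integer>> maxAgeByProvinceAndSex(List<Person> people) {
        Map<String, Integer> maxAge = new LinkedHashMap<>();
        for (Person p : people) {
            // merge keeps the larger of the stored age and the current age
            maxAge.merge(p.province + "|" + p.sex, p.age, Integer::max);
        }
        List<Map.Entry<String, Integer>> buckets = new ArrayList<>(maxAge.entrySet());
        buckets.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return buckets;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> bucket : maxAgeByProvinceAndSex(sampleData())) {
            System.out.println(bucket.getKey() + " -> max_age=" + bucket.getValue());
        }
    }
}
```

For this data set the sketch yields four groups with maximum ages 45 (北京|女), 25 (湖北|女), 20 (湖北|男) and 19 (江苏|男). The aggregations in section 5 should produce buckets in the same order.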
5. Implementations
5.1 Using multi_terms
5.1.1 DSL
GET /index_person/_search
{
  "size": 0,
  "aggs": {
    "agg_province_sex": {
      "multi_terms": {
        "size": 10,
        "shard_size": 25,
        "order": { "max_age": "desc" },
        "terms": [
          { "field": "province", "missing": "defaultProvince" },
          { "field": "sex" }
        ]
      },
      "aggs": {
        "max_age": {
          "max": { "field": "age" }
        }
      }
    }
  }
}
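5.1.2 Java code
import co.elastic.clients.elasticsearch._types.SortOrder;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.util.NamedValue;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import java.io.IOException;

// Test method; `client` is the ElasticsearchClient held by the test class.
@Test
@DisplayName("Multi-term aggregation: aggregate by province and sex, then sort by max age descending")
public void agg01() throws IOException {
    SearchRequest searchRequest = new SearchRequest.Builder()
            .size(0)
            .index("index_person")
            .aggregations("agg_province_sex", agg ->
                    agg.multiTerms(multiTerms ->
                            multiTerms.terms(term -> term.field("province"))
                                    .terms(term -> term.field("sex"))
                                    // sort buckets by the max_age sub-aggregation, descending
                                    .order(new NamedValue<>("max_age", SortOrder.Desc)))
                            .aggregations("max_age", ageAgg ->
                                    ageAgg.max(max -> max.field("age")))
            )
            .build();
    System.out.println(searchRequest);
    SearchResponse<Object> response = client.search(searchRequest, Object.class);
    System.out.println(response);
}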
5.1.3 Result
5.2 Using a script
5.2.1 DSL
GET /index_person/_search
{
  "size": 0,
  "runtime_mappings": {
    "runtime_province_sex": {
      "type": "keyword",
      "script": """
        String province = doc['province'].value;
        String sex = doc['sex'].value;
        emit(province + '|' + sex);
      """
    }
  },
  "aggs": {
    "agg_province_sex": {
      "terms": {
        "field": "runtime_province_sex",
        "size": 10,
        "shard_size": 25,
        "order": { "max_age": "desc" }
      },
      "aggs": {
        "max_age": {
          "max": { "field": "age" }
        }
      }
    }
  }
}
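5.2.2 Java code
import co.elastic.clients.elasticsearch._types.InlineScript;
import co.elastic.clients.elasticsearch._types.ScriptLanguage;
import co.elastic.clients.elasticsearch._types.SortOrder;
import co.elastic.clients.elasticsearch._types.mapping.RuntimeFieldType;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.util.NamedValue;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import java.io.IOException;

// Test method; `client` is the ElasticsearchClient held by the test class.
@Test
@DisplayName("Multi-term aggregation: aggregate by province and sex, then sort by max age descending")
public void agg02() throws IOException {
    SearchRequest searchRequest = new SearchRequest.Builder()
            .size(0)
            .index("index_person")
            // runtime field that concatenates province and sex at query time
            .runtimeMappings("runtime_province_sex", field -> {
                field.type(RuntimeFieldType.Keyword);
                field.script(script -> script.inline(new InlineScript.Builder()
                        .lang(ScriptLanguage.Painless)
                        .source("String province = doc['province'].value;\n"
                                + " String sex = doc['sex'].value;\n"
                                + " emit(province + '|' + sex);")
                        .build()));
                return field;
            })
            .aggregations("agg_province_sex", agg ->
                    agg.terms(terms ->
                            terms.field("runtime_province_sex")
                                    .size(10)
                                    .shardSize(25)
                                    .order(new NamedValue<>("max_age", SortOrder.Desc)))
                            .aggregations("max_age", maxAgg ->
                                    maxAgg.max(max -> max.field("age")))
            )
            .build();
    System.out.println(searchRequest);
    SearchResponse<Object> response = client.search(searchRequest, Object.class);
    System.out.println(response);
}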
5.2.3 Result
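With the script approach each bucket key is a single string such as 湖北|男, whereas multi_terms returns the key as an array with one element per field. A caller that needs the two values back has to split the string. The helper below is a minimal sketch of that step. `CompositeKey` and `split` are hypothetical names, and the sketch assumes that province values never contain the `|` separator.

```java
import java.util.regex.Pattern;

class CompositeKey {

    /**
     * Splits "province|sex" into its two parts.
     * Pattern.quote is needed because '|' is a regex metacharacter;
     * the limit of 2 leaves any further '|' inside the second part.
     */
    static String[] split(String key) {
        return key.split(Pattern.quote("|"), 2);
    }

    public static void main(String[] args) {
        String[] parts = split("湖北|男");
        System.out.println("province=" + parts[0] + ", sex=" + parts[1]);
    }
}
```

If a field value could itself contain the separator, the combined key becomes ambiguous. In that case a separator that cannot occur in the data would be a safer choice.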
5.3 Using copy_to
I tested this locally and could not get it to work with copy_to, so it is not covered here.
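A likely explanation, as far as I understand copy_to: it adds each source value to the target field as a separate value instead of concatenating them. A terms aggregation on that field therefore produces one bucket per individual value (湖北, 男, ...) rather than one bucket per combination. The plain-Java sketch below simulates the difference on the sample data. It is not Elasticsearch code, and `CopyToVersusJoin` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CopyToVersusJoin {

    /** (province, sex) pairs of the six sample documents. */
    static List<List<String>> sampleDocs() {
        return Arrays.asList(
                Arrays.asList("湖北", "男"),
                Arrays.asList("江苏", "男"),
                Arrays.asList("湖北", "女"),
                Arrays.asList("北京", "女"),
                Arrays.asList("北京", "女"),
                Arrays.asList("北京", "女"));
    }

    /** copy_to-like field: the values stay separate, so each one lands in its own bucket. */
    static Map<String, Integer> bucketsOverSeparateValues(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> values : docs) {
            // a document is counted once per distinct value it holds
            for (String value : new LinkedHashSet<>(values)) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Joined field (script or pipeline approach): one combined value per document. */
    static Map<String, Integer> bucketsOverJoinedValue(List<List<String>> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> values : docs) {
            counts.merge(String.join("|", values), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(bucketsOverSeparateValues(sampleDocs()));
        System.out.println(bucketsOverJoinedValue(sampleDocs()));
    }
}
```

On the sample data the first method gives five buckets (three provinces plus two sexes), while the second gives the four province and sex combinations that group by would return.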
5.4 Using an ingest pipeline
Approach:
When creating the mapping, add one extra field, pipeline_province_sex, whose value is generated by a pipeline specified when the data is indexed.
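5.4.1 Create the mapping
PUT /index_person
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "id": { "type": "long" },
      "name": { "type": "keyword" },
      "province": { "type": "keyword" },
      "sex": { "type": "keyword" },
      "age": { "type": "integer" },
      "pipeline_province_sex": { "type": "keyword" },
      "address": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}
This mapping defines the extra field pipeline_province_sex, whose value will be filled in by the pipeline. If index_person from section 4.1 still exists, delete it first (DELETE /index_person), because creating an index that already exists fails.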
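5.4.2 Create the pipeline
PUT _ingest/pipeline/pipeline_index_person_provice_sex
{
  "description": "Concatenate the values of province and sex",
  "processors": [
    {
      "set": {
        "field": "pipeline_province_sex",
        "value": ["{{province}}", "{{sex}}"]
      }
    },
    {
      "join": {
        "field": "pipeline_province_sex",
        "separator": "|"
      }
    }
  ]
}
The two processors are written as separate entries so that set always runs before join.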
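To make the effect of the two processors concrete, here is a plain-Java sketch of what they do to a single document: set stores the two values as a list, and join collapses that list into one string. This is an illustration only, not the ingest API. `PipelineSketch` and its methods are hypothetical, and rendering a missing field as an empty string is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class PipelineSketch {

    static final String TARGET = "pipeline_province_sex";

    /** Mimics the set processor: stores [province, sex] as a list in the target field. */
    static void set(Map<String, Object> doc) {
        // assumption: a missing field is rendered as an empty string
        doc.put(TARGET, Arrays.asList(
                Objects.toString(doc.get("province"), ""),
                Objects.toString(doc.get("sex"), "")));
    }

    /** Mimics the join processor: joins the list in the target field with '|'. */
    @SuppressWarnings("unchecked")
    static void join(Map<String, Object> doc) {
        List<String> parts = (List<String>) doc.get(TARGET);
        doc.put(TARGET, String.join("|", parts));
    }

    /** Runs set then join on a copy, leaving the caller's map untouched. */
    static Map<String, Object> run(Map<String, Object> source) {
        Map<String, Object> doc = new LinkedHashMap<>(source);
        set(doc);
        join(doc);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("province", "湖北");
        source.put("sex", "男");
        System.out.println(run(source).get(TARGET));
    }
}
```

Because the combined value is computed once at index time, the aggregation can later run on an ordinary keyword field. The trade-off against the runtime-field script is that existing documents must be reindexed through the pipeline.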
5.4.3 Insert the data
PUT /_bulk?pipeline=pipeline_index_person_provice_sex
{"create":{"_index":"index_person","_id":1}}
{"id":1,"name":"张三","sex":"男","age":20,"province":"湖北","address":"湖北省黄冈市罗田县匡河镇"}
{"create":{"_index":"index_person","_id":2}}
{"id":2,"name":"李四","sex":"男","age":19,"province":"江苏","address":"江苏省南京市"}
{"create":{"_index":"index_person","_id":3}}
{"id":3,"name":"王武","sex":"女","age":25,"province":"湖北","address":"湖北省武汉市江汉区"}
{"create":{"_index":"index_person","_id":4}}
{"id":4,"name":"赵六","sex":"女","age":30,"province":"北京","address":"北京市东城区"}
{"create":{"_index":"index_person","_id":5}}
{"id":5,"name":"钱七","sex":"女","age":16,"province":"北京","address":"北京市西城区"}
{"create":{"_index":"index_person","_id":6}}
{"id":6,"name":"王八","sex":"女","age":45,"province":"北京","address":"北京市朝阳区"}
Note: the bulk request must name the pipeline created in the previous step:
PUT /_bulk?pipeline=pipeline_index_person_provice_sex
5.4.4 Aggregation DSL
GET /index_person/_search
{
  "size": 0,
  "aggs": {
    "agg_province_sex": {
      "terms": {
        "field": "pipeline_province_sex",
        "size": 10,
        "shard_size": 25,
        "order": { "max_age": "desc" }
      },
      "aggs": {
        "max_age": {
          "max": { "field": "age" }
        }
      }
    }
  }
}
5.4.5 Result
6. Source code
https://gitee.com/huan1993/spring-cloud-parent/blob/master/es/es8-api/src/main/java/com/huan/es8/aggregations/bucket/MultiTermsAggs.java
7. References
- https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html