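Table of contents

  • 1. Background
  • 2. Approaches to multi-field aggregation
  • 3. Requirement
  • 4. Data preparation
    • 4.1 Create the index
    • 4.2 Prepare the data
  • 5. Implementations
    • 5.1 Using multi_terms
      • 5.1.1 DSL
      • 5.1.2 Java code
      • 5.1.3 Result
    • 5.2 Using a script
      • 5.2.1 DSL
      • 5.2.2 Java code
      • 5.2.3 Result
    • 5.3 Using copy_to
    • 5.4 Using an ingest pipeline
      • 5.4.1 Create the mapping
      • 5.4.2 Create the pipeline
      • 5.4.3 Insert the data
      • 5.4.4 Aggregation DSL
      • 5.4.5 Result
  • 6. Source code
  • 7. References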

1. Background

In SQL, grouping by several columns is as simple as GROUP BY field_a, field_b. How can the same effect be achieved in Elasticsearch? This article records three ways of doing it.

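2. Approaches to multi-field aggregation

Figure source: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
The figure, taken from the Elasticsearch documentation on the terms aggregation, shows that a multi-field aggregation can be implemented in three ways: a script, a copy_to field, or the multi_terms aggregation.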

3. Requirement

Aggregate by province (province) and sex (sex), then sort the resulting buckets in descending order of the maximum age (age) found in each bucket.

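4. Data preparation

4.1 Create the index

The address field uses the ik_max_word analyzer, which is provided by the IK analysis plugin and must be installed beforehand.

PUT /index_person
{
  "settings": { "number_of_shards": 1 },
  "mappings": {
    "properties": {
      "id": { "type": "long" },
      "name": { "type": "keyword" },
      "province": { "type": "keyword" },
      "sex": { "type": "keyword" },
      "age": { "type": "integer" },
      "address": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}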

4.2 Prepare the data

PUT /_bulk
{"create":{"_index":"index_person","_id":1}}
{"id":1,"name":"张三","sex":"男","age":20,"province":"湖北","address":"湖北省黄冈市罗田县匡河镇"}
{"create":{"_index":"index_person","_id":2}}
{"id":2,"name":"李四","sex":"男","age":19,"province":"江苏","address":"江苏省南京市"}
{"create":{"_index":"index_person","_id":3}}
{"id":3,"name":"王武","sex":"女","age":25,"province":"湖北","address":"湖北省武汉市江汉区"}
{"create":{"_index":"index_person","_id":4}}
{"id":4,"name":"赵六","sex":"女","age":30,"province":"北京","address":"北京市东城区"}
{"create":{"_index":"index_person","_id":5}}
{"id":5,"name":"钱七","sex":"女","age":16,"province":"北京","address":"北京市西城区"}
{"create":{"_index":"index_person","_id":6}}
{"id":6,"name":"王八","sex":"女","age":45,"province":"北京","address":"北京市朝阳区"}
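
As a cross-check before running any query, the buckets required by section 3 can be worked out from these six documents alone. The following plain-Java sketch does not talk to Elasticsearch, and its class and method names are made up for illustration. It groups the sample data by province and sex, keeps the maximum age per group, and sorts the groups in descending order of that maximum.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory emulation of
// "group by province, sex; order by max(age) desc".
// Nothing here is part of an Elasticsearch API.
class ExpectedBuckets {

    // One sample document, reduced to the three fields the aggregation uses.
    static final class Person {
        final String province;
        final String sex;
        final int age;

        Person(String province, String sex, int age) {
            this.province = province;
            this.sex = sex;
            this.age = age;
        }
    }

    // The six documents from section 4.2.
    static List<Person> sampleData() {
        List<Person> people = new ArrayList<>();
        people.add(new Person("湖北", "男", 20));
        people.add(new Person("江苏", "男", 19));
        people.add(new Person("湖北", "女", 25));
        people.add(new Person("北京", "女", 30));
        people.add(new Person("北京", "女", 16));
        people.add(new Person("北京", "女", 45));
        return people;
    }

    // Returns "province|sex" -> max age, ordered by max age descending.
    static Map<String, Integer> maxAgeByProvinceAndSex(List<Person> people) {
        Map<String, Integer> maxAge = new LinkedHashMap<>();
        for (Person p : people) {
            // Keep the larger of the stored age and the current one.
            maxAge.merge(p.province + "|" + p.sex, p.age, Integer::max);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(maxAge.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder()));
        Map<String, Integer> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        maxAgeByProvinceAndSex(sampleData())
                .forEach((key, age) -> System.out.println(key + " -> " + age));
    }
}
```

For the sample data this yields four groups with maximum ages 45, 25, 20 and 19. The 北京/女 group comes first because it contains the 45-year-old. Any of the aggregations in section 5 should return buckets in that order.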

5. Implementations

5.1 Using multi_terms

5.1.1 DSL

GET /index_person/_search
{
  "size": 0,
  "aggs": {
    "agg_province_sex": {
      "multi_terms": {
        "size": 10,
        "shard_size": 25,
        "order": { "max_age": "desc" },
        "terms": [
          { "field": "province", "missing": "defaultProvince" },
          { "field": "sex" }
        ]
      },
      "aggs": {
        "max_age": { "max": { "field": "age" } }
      }
    }
  }
}
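
To make the bucket key of this request more concrete, here is a small plain-Java sketch of how one bucket per distinct (province, sex) tuple comes about and what the missing setting changes. It is an emulation under stated assumptions, not Elasticsearch code, and the class and method names are hypothetical. It assumes that a document lacking a field without a missing value is skipped, which is how the multi_terms documentation describes the default.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: emulates how multi_terms builds one bucket per distinct
// tuple of values. These are not Elasticsearch client classes.
class MultiTermsKeySketch {

    // Builds the bucket key for one document. A null province is replaced by
    // the configured "missing" value; a null sex (no "missing" configured)
    // means the document is skipped, signalled here by returning null.
    static List<String> bucketKey(String province, String sex, String missingProvince) {
        if (sex == null) {
            return null;
        }
        List<String> key = new ArrayList<>();
        key.add(province != null ? province : missingProvince);
        key.add(sex);
        return key;
    }

    // Counts documents per tuple; each document is a {province, sex} pair.
    static Map<List<String>, Integer> countByKey(List<String[]> docs, String missingProvince) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (String[] doc : docs) {
            List<String> key = bucketKey(doc[0], doc[1], missingProvince);
            if (key != null) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> docs = new ArrayList<>();
        docs.add(new String[] {"北京", "女"});
        docs.add(new String[] {"北京", "女"});
        docs.add(new String[] {null, "女"}); // hypothetical document without a province
        System.out.println(countByKey(docs, "defaultProvince"));
    }
}
```

With the three hypothetical documents in main, the two 北京/女 documents share one bucket and the document without a province lands in a separate defaultProvince/女 bucket.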

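5.1.2 Java code

import co.elastic.clients.elasticsearch._types.SortOrder;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.util.NamedValue;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import java.io.IOException;

    // client is an ElasticsearchClient created elsewhere in the test class.
    @Test
    @DisplayName("multi_terms aggregation: group by province and sex, order by max age descending")
    public void agg01() throws IOException {
        SearchRequest searchRequest = new SearchRequest.Builder()
                .size(0)
                .index("index_person")
                .aggregations("agg_province_sex", agg -> agg
                        // Bucket by the (province, sex) tuple and order by the max_age sub-aggregation.
                        .multiTerms(multiTerms -> multiTerms
                                .terms(term -> term.field("province"))
                                .terms(term -> term.field("sex"))
                                .order(new NamedValue<>("max_age", SortOrder.Desc)))
                        .aggregations("max_age", ageAgg -> ageAgg.max(max -> max.field("age"))))
                .build();
        System.out.println(searchRequest);
        SearchResponse<Object> response = client.search(searchRequest, Object.class);
        System.out.println(response);
    }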

5.1.3 Result

5.2 Using a script

5.2.1 DSL

GET /index_person/_search
{
  "size": 0,
  "runtime_mappings": {
    "runtime_province_sex": {
      "type": "keyword",
      "script": """
        String province = doc['province'].value;
        String sex = doc['sex'].value;
        emit(province + '|' + sex);
      """
    }
  },
  "aggs": {
    "agg_province_sex": {
      "terms": {
        "field": "runtime_province_sex",
        "size": 10,
        "shard_size": 25,
        "order": { "max_age": "desc" }
      },
      "aggs": {
        "max_age": { "max": { "field": "age" } }
      }
    }
  }
}
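
Because the runtime field emits a single string of the form province|sex, a client that needs the two values separately has to split the bucket key again. A minimal sketch of that step in plain Java follows. The class and method names are hypothetical, and it assumes that neither value contains the separator itself.

```java
import java.util.regex.Pattern;

// Illustrative only: client-side handling of the "province|sex" keys that the
// runtime field emits. Not part of any Elasticsearch API.
class CompositeKeySketch {

    static final String SEPARATOR = "|";

    // Same concatenation the Painless script performs inside emit(...).
    static String join(String province, String sex) {
        return province + SEPARATOR + sex;
    }

    // Splits a bucket key back into its two parts. Pattern.quote is needed
    // because '|' is a regex metacharacter; the limit of 2 keeps any further
    // separators inside the second part.
    static String[] split(String key) {
        return key.split(Pattern.quote(SEPARATOR), 2);
    }

    public static void main(String[] args) {
        String key = join("湖北", "男");
        String[] parts = split(key);
        System.out.println(key + " -> province=" + parts[0] + ", sex=" + parts[1]);
    }
}
```

If a province name could ever contain the | character, the split would be ambiguous. In that case a separator that cannot occur in the data is the safer choice, both in the script and on the client.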

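5.2.2 Java code

import co.elastic.clients.elasticsearch._types.InlineScript;
import co.elastic.clients.elasticsearch._types.ScriptLanguage;
import co.elastic.clients.elasticsearch._types.SortOrder;
import co.elastic.clients.elasticsearch._types.mapping.RuntimeFieldType;
import co.elastic.clients.elasticsearch.core.SearchRequest;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import co.elastic.clients.util.NamedValue;

    // client is an ElasticsearchClient created elsewhere in the test class.
    @Test
    @DisplayName("runtime field aggregation: group by province and sex, order by max age descending")
    public void agg02() throws IOException {
        SearchRequest searchRequest = new SearchRequest.Builder()
                .size(0)
                .index("index_person")
                // Runtime keyword field that concatenates province and sex at query time.
                .runtimeMappings("runtime_province_sex", field -> {
                    field.type(RuntimeFieldType.Keyword);
                    field.script(script -> script.inline(new InlineScript.Builder()
                            .lang(ScriptLanguage.Painless)
                            .source("String province = doc['province'].value;\n"
                                    + "String sex = doc['sex'].value;\n"
                                    + "emit(province + '|' + sex);")
                            .build()));
                    return field;
                })
                .aggregations("agg_province_sex", agg -> agg
                        .terms(terms -> terms
                                .field("runtime_province_sex")
                                .size(10)
                                .shardSize(25)
                                .order(new NamedValue<>("max_age", SortOrder.Desc)))
                        .aggregations("max_age", maxAgg -> maxAgg.max(max -> max.field("age"))))
                .build();
        System.out.println(searchRequest);
        SearchResponse<Object> response = client.search(searchRequest, Object.class);
        System.out.println(response);
    }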

5.2.3 Result

5.3 Using copy_to

In my local tests I could not get this to work with copy_to, so this approach is not covered here.
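
One possible explanation, offered as an assumption and not something the local test established: copy_to adds each copied value to the target field as a separate value; it does not concatenate them. A terms aggregation on such a field would then produce one bucket per province and one per sex, never one per pair. The plain-Java sketch below emulates that behaviour; the names are hypothetical and no Elasticsearch code is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: emulates a terms aggregation over a field that was
// populated via copy_to from province and sex.
class CopyToSketch {

    // copy_to is modelled as adding each source value to the target field
    // as its own value, so the target ends up multi-valued.
    static List<String> copyTo(String province, String sex) {
        List<String> target = new ArrayList<>();
        target.add(province);
        target.add(sex);
        return target;
    }

    // A terms aggregation is modelled as one bucket per distinct field value.
    static Map<String, Integer> termsCounts(List<List<String>> fieldValuesPerDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> values : fieldValuesPerDoc) {
            for (String value : values) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = new ArrayList<>();
        docs.add(copyTo("湖北", "男"));
        docs.add(copyTo("湖北", "女"));
        docs.add(copyTo("北京", "女"));
        // Buckets appear for single values only, none for a province/sex pair.
        System.out.println(termsCounts(docs));
    }
}
```

Under this model the output contains separate counts for 湖北, 北京, 男 and 女, which is not the group-by-two-fields result that section 3 asks for.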

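5.4 Using an ingest pipeline

Idea:
When creating the mapping, add one extra field, pipeline_province_sex. Its value is generated by an ingest pipeline that is specified when the documents are indexed.

5.4.1 Create the mapping

If index_person from section 4.1 still exists, delete it first (DELETE /index_person), because creating an index that already exists fails.

PUT /index_person
{
  "settings": { "number_of_shards": 1 },
  "mappings": {
    "properties": {
      "id": { "type": "long" },
      "name": { "type": "keyword" },
      "province": { "type": "keyword" },
      "sex": { "type": "keyword" },
      "age": { "type": "integer" },
      "pipeline_province_sex": { "type": "keyword" },
      "address": {
        "type": "text",
        "analyzer": "ik_max_word",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}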

This mapping adds the field pipeline_province_sex, whose value is filled in by the pipeline.

5.4.2 Create the pipeline

PUT _ingest/pipeline/pipeline_index_person_provice_sex
{
  "description": "Concatenate the values of province and sex",
  "processors": [
    {
      "set": {
        "field": "pipeline_province_sex",
        "value": ["{{province}}", "{{sex}}"]
      }
    },
    {
      "join": {
        "field": "pipeline_province_sex",
        "separator": "|"
      }
    }
  ]
}
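
The pipeline works in two steps: set writes the array [province, sex] into pipeline_province_sex, and join then collapses that array into one string using | as the separator. The plain-Java sketch below emulates the two steps on a single document. It is not Elasticsearch code, all names are hypothetical, and it assumes that a field absent from the document renders as an empty string in the template.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustrative only: emulates the effect of the set and join processors
// on one document held as a Map.
class PipelineSketch {

    static final String TARGET = "pipeline_province_sex";

    // set: writes [province, sex] into the target field.
    // A missing source field is assumed to become an empty string.
    static void applySet(Map<String, Object> doc) {
        List<String> values = new ArrayList<>();
        values.add(Objects.toString(doc.get("province"), ""));
        values.add(Objects.toString(doc.get("sex"), ""));
        doc.put(TARGET, values);
    }

    // join: replaces the array in the target field with its elements
    // joined by the separator.
    static void applyJoin(Map<String, Object> doc, String separator) {
        List<?> values = (List<?>) doc.get(TARGET);
        List<String> parts = new ArrayList<>();
        for (Object value : values) {
            parts.add(String.valueOf(value));
        }
        doc.put(TARGET, String.join(separator, parts));
    }

    // Runs both steps on a copy, leaving the caller's map untouched.
    static Map<String, Object> run(Map<String, Object> source) {
        Map<String, Object> doc = new HashMap<>(source);
        applySet(doc);
        applyJoin(doc, "|");
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("province", "湖北");
        source.put("sex", "男");
        source.put("age", 20);
        System.out.println(run(source).get(TARGET));
    }
}
```

Since the combined value is computed once at index time, the aggregation can later run on an ordinary keyword field, in contrast to the runtime field of section 5.2, which is evaluated at query time.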

5.4.3 Insert the data

PUT /_bulk?pipeline=pipeline_index_person_provice_sex
{"create":{"_index":"index_person","_id":1}}
{"id":1,"name":"张三","sex":"男","age":20,"province":"湖北","address":"湖北省黄冈市罗田县匡河镇"}
{"create":{"_index":"index_person","_id":2}}
{"id":2,"name":"李四","sex":"男","age":19,"province":"江苏","address":"江苏省南京市"}
{"create":{"_index":"index_person","_id":3}}
{"id":3,"name":"王武","sex":"女","age":25,"province":"湖北","address":"湖北省武汉市江汉区"}
{"create":{"_index":"index_person","_id":4}}
{"id":4,"name":"赵六","sex":"女","age":30,"province":"北京","address":"北京市东城区"}
{"create":{"_index":"index_person","_id":5}}
{"id":5,"name":"钱七","sex":"女","age":16,"province":"北京","address":"北京市西城区"}
{"create":{"_index":"index_person","_id":6}}
{"id":6,"name":"王八","sex":"女","age":45,"province":"北京","address":"北京市朝阳区"}

Note: the bulk request must name the pipeline created in the previous step:
PUT /_bulk?pipeline=pipeline_index_person_provice_sex

5.4.4 Aggregation DSL

GET /index_person/_search
{
  "size": 0,
  "aggs": {
    "agg_province_sex": {
      "terms": {
        "field": "pipeline_province_sex",
        "size": 10,
        "shard_size": 25,
        "order": { "max_age": "desc" }
      },
      "aggs": {
        "max_age": { "max": { "field": "age" } }
      }
    }
  }
}

5.4.5 Result

6. Source code

https://gitee.com/huan1993/spring-cloud-parent/blob/master/es/es8-api/src/main/java/com/huan/es8/aggregations/bucket/MultiTermsAggs.java

7. References

  1. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
