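Elasticsearch: Using field collapsing to reduce search results based on a single field
Field collapsing lets you collapse search results based on the values of a field. Collapsing works by selecting only the top-sorted document for each collapse key. The idea is not hard to grasp. Take a Baidu Music page as an example.
Such a page displays many popular songs, and each song may well be the standout track of an album. When building the page, there is no need to put every song of an album in the cover slot. We may only want to show the song with the highest click-through rate, or the most popular one, as the album's representative. When we click on the album, we can then see the other songs it contains.
Field collapsing was made for exactly this. The same applies to news headlines shown in a title bar: after clicking through, we can see more news from the related category.
Let's walk through an example to show how to use it.
Preparing the data
The data we use today is a dataset of the best games. It can be downloaded from my GitHub project: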
git clone https://github.com/liu-xiao-guo/best_games_json_data
Then we import the downloaded JSON data into Elasticsearch.
We name the index best_games.
The data is now ready. The index contains 500 documents in total. Each document in the index looks like this:
{"id":"madden-nfl-2002-ps2-2001","name":"Madden NFL 2002","year":2001,"platform":"PS2","genre":"Sports","publisher":"Electronic Arts","global_sales":3.08,"critic_score":94,"user_score":7,"developer":"EA Sports","image_url":"http://www.mobygames.com/images/covers/l/202684-madden-nfl-2002-playstation-2-back-cover.png"}
Its mapping is:
{"best_games" : {"mappings" : {"_meta" : {"created_by" : "ml-file-data-visualizer"},"properties" : {"critic_score" : {"type" : "long"},"developer" : {"type" : "text"},"genre" : {"type" : "keyword"},"global_sales" : {"type" : "double"},"id" : {"type" : "keyword"},"image_url" : {"type" : "keyword"},"name" : {"type" : "text"},"platform" : {"type" : "keyword"},"publisher" : {"type" : "keyword"},"user_score" : {"type" : "long"},"year" : {"type" : "long"}}}}
}
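The `_meta.created_by` value in the mapping suggests the data was loaded through Kibana's file Data Visualizer. As an alternative, the documents could be indexed through the `_bulk` API. The following is a minimal sketch, not the method used in this article: it only builds the NDJSON request body, and the function name `build_bulk_body` and the shortened sample document are assumptions made for illustration.

```python
import json


def build_bulk_body(docs, index_name="best_games"):
    """Build an NDJSON payload for the Elasticsearch _bulk API.

    Each document becomes two lines: an action line naming the target
    index, followed by the document source itself.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# A shortened version of the sample document shown above.
sample = [
    {
        "id": "madden-nfl-2002-ps2-2001",
        "name": "Madden NFL 2002",
        "year": 2001,
        "platform": "PS2",
        "genre": "Sports",
        "publisher": "Electronic Arts",
    },
]

body = build_bulk_body(sample)
print(body)
```

The resulting string could then be sent with `POST /_bulk` and the `Content-Type: application/x-ndjson` header, using whichever HTTP client is at hand.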
Field collapsing
Now let's search our data using collapsing:
GET best_games/_search
{
  "query": { "match": { "name": "Final Fantasy" } },
  "collapse": { "field": "publisher" },
  "sort": [ { "critic_score": { "order": "desc" } } ]
}
The search result is:
{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 11,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"fields" : {"publisher" : ["SquareSoft"]},"sort" : [94]},{"_index" : "best_games","_type" : "_doc","_id" : "wnJzF28BjrINWI3xtt40","_score" : null,"_source" : {"id" : "final-fantasy-vii-ps-1997","name" : "Final Fantasy VII","year" : 1997,"platform" : "PS","genre" : "Role-Playing","publisher" : "Sony Computer Entertainment","global_sales" : 9.72,"critic_score" : 92,"user_score" : 9,"developer" : "SquareSoft","image_url" : "https://r.hswstatic.com/w_907/gif/finalfantasyvii-MAIN.jpg"},"fields" : {"publisher" : ["Sony Computer Entertainment"]},"sort" : [92]},{"_index" : "best_games","_type" : "_doc","_id" : "_nJzF28BjrINWI3xtt40","_score" : null,"_source" : {"id" : "final-fantasy-xii-ps2-2006","name" : "Final Fantasy XII","year" : 2006,"platform" : "PS2","genre" : "Role-Playing","publisher" : "Square Enix","global_sales" : 5.95,"critic_score" : 92,"user_score" : 7,"developer" : "Square Enix","image_url" : "https://m.media-amazon.com/images/M/MV5BM2I4MDMyMDQtNjM2OC00ZWNkLTg0ODQtNzYxZjY0M2QxODQyXkEyXkFqcGdeQXVyNjY5NTM5MjA@._V1_.jpg"},"fields" : {"publisher" : ["Square Enix"]},"sort" : [92]},{"_index" : "best_games","_type" : "_doc","_id" : "FXJzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-x-2-ps2-2003","name" : "Final Fantasy X-2","year" : 2003,"platform" : "PS2","genre" : "Role-Playing","publisher" : 
"Electronic Arts","global_sales" : 5.29,"critic_score" : 85,"user_score" : 6,"developer" : "SquareSoft","image_url" : "https://upload.wikimedia.org/wikipedia/en/thumb/6/6c/FFX-2_box.jpg/220px-FFX-2_box.jpg"},"fields" : {"publisher" : ["Electronic Arts"]},"sort" : [85]}]}
}
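The result above shows that:
- We search for all games whose name matches "Final Fantasy" and sort them by critic_score in descending order.
- Because we collapse on the publisher field, each publisher contributes only one result, even though a publisher may have many matching games.
For example, we can find as many as three games whose publisher is SquareSoft and whose name contains Final Fantasy: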
GET best_games/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "Final Fantasy" } },
        { "match": { "publisher": "SquareSoft" } }
      ]
    }
  },
  "sort": [ { "critic_score": { "order": "desc" } } ]
}
The result of the query above:
"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"sort" : [94]},{"_index" : "best_games","_type" : "_doc","_id" : "0nJzF28BjrINWI3xtt40","_score" : null,"_source" : {"id" : "final-fantasy-viii-ps-1999","name" : "Final Fantasy VIII","year" : 1999,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 7.86,"critic_score" : 90,"user_score" : 8,"developer" : "SquareSoft","image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"},"sort" : [90]},{"_index" : "best_games","_type" : "_doc","_id" : "SHJzF28BjrINWI3xtuA1","_score" : null,"_source" : {"id" : "final-fantasy-tactics-ps-1997","name" : "Final Fantasy Tactics","year" : 1997,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 2.45,"critic_score" : 83,"user_score" : 8,"developer" : "SquareSoft","image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"},"sort" : [83]}]}
But because we used collapse, only one of these games is returned for this publisher: the one with the highest critic_score.
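To make the behaviour concrete, collapsing can be imitated in plain Python: sort the hits, then keep the first hit seen for each collapse key. This is only a sketch of the semantics, not how Elasticsearch implements it. The `collapse` function is hypothetical, and the list holds just the six matching games visible in the outputs of this article (the full query matches 11 documents).

```python
def collapse(hits, field, sort_key):
    """Keep only the top-sorted hit for each distinct value of `field`."""
    ranked = sorted(hits, key=lambda h: h[sort_key], reverse=True)
    seen = set()
    collapsed = []
    for hit in ranked:
        key = hit[field]
        if key not in seen:
            # First time this collapse key appears, so this is its best hit.
            seen.add(key)
            collapsed.append(hit)
    return collapsed


# Subset of the matching games shown in this article.
games = [
    {"name": "Final Fantasy VIII", "publisher": "SquareSoft", "critic_score": 90},
    {"name": "Final Fantasy X-2", "publisher": "Electronic Arts", "critic_score": 85},
    {"name": "Final Fantasy IX", "publisher": "SquareSoft", "critic_score": 94},
    {"name": "Final Fantasy XII", "publisher": "Square Enix", "critic_score": 92},
    {"name": "Final Fantasy Tactics", "publisher": "SquareSoft", "critic_score": 83},
    {"name": "Final Fantasy VII", "publisher": "Sony Computer Entertainment", "critic_score": 92},
]

for hit in collapse(games, "publisher", "critic_score"):
    print(hit["publisher"], "->", hit["name"], hit["critic_score"])
```

Only Final Fantasy IX survives for SquareSoft, matching the collapsed result above. The relative order of hits with equal scores in this sketch depends on the input order and need not match what Elasticsearch returns.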
Note: the field used for collapsing must be a single-valued numeric or keyword field with doc_values enabled.
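This rule can be checked against the best_games mapping shown earlier. The sketch below is a rough helper, not part of any Elasticsearch client: `collapsible_fields` and `COLLAPSIBLE_TYPES` are hypothetical names, the type list covers only common numeric types, and it assumes doc_values is on unless the mapping sets `"doc_values": false`.

```python
# Field types treated as collapsible in this sketch (not an exhaustive list).
COLLAPSIBLE_TYPES = {
    "keyword", "long", "integer", "short", "byte",
    "double", "float", "half_float", "scaled_float",
}


def collapsible_fields(properties):
    """Return the names of fields that are keyword or numeric with doc_values."""
    names = []
    for name, spec in properties.items():
        # Assumed default: doc_values is enabled unless explicitly disabled.
        has_doc_values = spec.get("doc_values", True)
        if spec.get("type") in COLLAPSIBLE_TYPES and has_doc_values:
            names.append(name)
    return sorted(names)


# The "properties" section of the best_games mapping.
properties = {
    "critic_score": {"type": "long"},
    "developer": {"type": "text"},
    "genre": {"type": "keyword"},
    "global_sales": {"type": "double"},
    "id": {"type": "keyword"},
    "image_url": {"type": "keyword"},
    "name": {"type": "text"},
    "platform": {"type": "keyword"},
    "publisher": {"type": "keyword"},
    "user_score": {"type": "long"},
    "year": {"type": "long"},
}

print(collapsible_fields(properties))
```

The two text fields, name and developer, are left out, while publisher (a keyword field) qualifies, which is why the queries in this article collapse on publisher.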
Expanding collapse results
We can also expand each collapsed top hit with the inner_hits option:
GET best_games/_search
{
  "query": { "match": { "name": "Final Fantasy" } },
  "collapse": {
    "field": "publisher",
    "inner_hits": {
      "name": "top 3 games",
      "size": 3,
      "sort": [ { "user_score": "desc" } ]
    }
  },
  "sort": [ { "critic_score": { "order": "desc" } } ]
}
Running it gives the following result:
"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"fields" : {"publisher" : ["SquareSoft"]},"sort" : [94],"inner_hits" : {"top 3 games" : {"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "0nJzF28BjrINWI3xtt40","_score" : null,"_source" : {"id" : "final-fantasy-viii-ps-1999","name" : "Final Fantasy VIII","year" : 1999,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 7.86,"critic_score" : 90,"user_score" : 8,"developer" : "SquareSoft","image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"},"sort" : [8]},{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"sort" : [8]},{"_index" : "best_games","_type" : "_doc","_id" : "SHJzF28BjrINWI3xtuA1","_score" : null,"_source" : {"id" : "final-fantasy-tactics-ps-1997","name" : "Final Fantasy Tactics","year" : 1997,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 2.45,"critic_score" : 83,"user_score" : 8,"developer" : "SquareSoft","image_url" : 
"https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"},"sort" : [8]}]}}}},
We can see that for each publisher, inner_hits holds up to three games under the name "top 3 games", sorted by user_score in descending order.
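Conceptually, inner_hits keeps the best few documents of every collapsed group rather than just one. A minimal Python sketch of that idea follows. The function `top_hits_per_group` is hypothetical, and the list again contains only the six games visible in this article's outputs.

```python
from collections import defaultdict


def top_hits_per_group(hits, field, sort_key, size=3):
    """Group hits by `field` and keep the `size` best of each group by `sort_key`."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[field]].append(hit)
    return {
        value: sorted(members, key=lambda h: h[sort_key], reverse=True)[:size]
        for value, members in groups.items()
    }


# Subset of the matching games shown in this article.
games = [
    {"name": "Final Fantasy IX", "publisher": "SquareSoft", "user_score": 8},
    {"name": "Final Fantasy VIII", "publisher": "SquareSoft", "user_score": 8},
    {"name": "Final Fantasy Tactics", "publisher": "SquareSoft", "user_score": 8},
    {"name": "Final Fantasy VII", "publisher": "Sony Computer Entertainment", "user_score": 9},
    {"name": "Final Fantasy XII", "publisher": "Square Enix", "user_score": 7},
    {"name": "Final Fantasy X-2", "publisher": "Electronic Arts", "user_score": 6},
]

inner = top_hits_per_group(games, "publisher", "user_score", size=3)
for publisher, members in inner.items():
    print(publisher, [m["name"] for m in members])
```

All three SquareSoft games share a user_score of 8, so their relative order is a tie. This sketch keeps the input order, which may differ from the order Elasticsearch returns.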
It is also possible to request multiple inner_hits for each collapsed hit. This is useful when you want several representations of the collapsed hits.
GET best_games/_search
{
  "query": { "match": { "name": "Final Fantasy" } },
  "collapse": {
    "field": "publisher",
    "inner_hits": [
      { "name": "top user liked", "size": 3, "sort": [ { "user_score": "desc" } ] },
      { "name": "top most recent games", "size": 3, "sort": [ { "year": "desc" } ] }
    ]
  },
  "sort": [ { "critic_score": { "order": "desc" } } ]
}
The result is:
"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"fields" : {"publisher" : ["SquareSoft"]},"sort" : [94],"inner_hits" : {"top user liked" : {"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "0nJzF28BjrINWI3xtt40","_score" : null,"_source" : {"id" : "final-fantasy-viii-ps-1999","name" : "Final Fantasy VIII","year" : 1999,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 7.86,"critic_score" : 90,"user_score" : 8,"developer" : "SquareSoft","image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"},"sort" : [8]},{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"sort" : [8]},{"_index" : "best_games","_type" : "_doc","_id" : "SHJzF28BjrINWI3xtuA1","_score" : null,"_source" : {"id" : "final-fantasy-tactics-ps-1997","name" : "Final Fantasy Tactics","year" : 1997,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 2.45,"critic_score" : 83,"user_score" : 8,"developer" : "SquareSoft","image_url" : 
"https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"},"sort" : [8]}]}},"top most recent games" : {"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "best_games","_type" : "_doc","_id" : "E3JzF28BjrINWI3xtt80","_score" : null,"_source" : {"id" : "final-fantasy-ix-ps-2000","name" : "Final Fantasy IX","year" : 2000,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 5.3,"critic_score" : 94,"user_score" : 8,"developer" : "SquareSoft","image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"},"sort" : [2000]},{"_index" : "best_games","_type" : "_doc","_id" : "0nJzF28BjrINWI3xtt40","_score" : null,"_source" : {"id" : "final-fantasy-viii-ps-1999","name" : "Final Fantasy VIII","year" : 1999,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 7.86,"critic_score" : 90,"user_score" : 8,"developer" : "SquareSoft","image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"},"sort" : [1999]},{"_index" : "best_games","_type" : "_doc","_id" : "SHJzF28BjrINWI3xtuA1","_score" : null,"_source" : {"id" : "final-fantasy-tactics-ps-1997","name" : "Final Fantasy Tactics","year" : 1997,"platform" : "PS","genre" : "Role-Playing","publisher" : "SquareSoft","global_sales" : 2.45,"critic_score" : 83,"user_score" : 8,"developer" : "SquareSoft","image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"},"sort" : [1997]}]}}}},
This way, for each publisher we get both the three games users like most and the three most recent games.
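When several inner_hits variants are needed, the request body can be assembled programmatically instead of written by hand. The sketch below builds the same body as the query above using plain dictionaries. The helper names `inner_hits_spec` and `collapse_query` are hypothetical and do not belong to any Elasticsearch client library.

```python
import json


def inner_hits_spec(name, sort_field, size=3, order="desc"):
    """Describe one named inner_hits block sorted by a single field."""
    return {"name": name, "size": size, "sort": [{sort_field: order}]}


def collapse_query(match_field, text, collapse_field, inner_hits, sort_field):
    """Build a search body that matches, collapses, and sorts descending."""
    return {
        "query": {"match": {match_field: text}},
        "collapse": {"field": collapse_field, "inner_hits": inner_hits},
        "sort": [{sort_field: {"order": "desc"}}],
    }


body = collapse_query(
    "name",
    "Final Fantasy",
    "publisher",
    [
        inner_hits_spec("top user liked", "user_score"),
        inner_hits_spec("top most recent games", "year"),
    ],
    "critic_score",
)

print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after `GET best_games/_search` in the Kibana console, or passed as the request body by an HTTP client.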
Reference:
[1] Request body search | Elasticsearch Guide [7.16] | Elastic