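In short: if the mappings of two kinds of documents are fairly similar, use types (two types under the same index); otherwise, two separate indices are probably the better choice.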

https://www.elastic.co/blog/index-vs-type

Pay attention to the parts highlighted in red:

Who has never wondered whether new data should be put into a new type of an existing index, or into a new index? This is a recurring question for new users that can’t be answered without understanding how both are implemented.

In the past we tried to make Elasticsearch easier to understand by building an analogy with relational databases: indices would be like a database, and types like a table in a database. This was a mistake: the way data is stored is so different that any comparison hardly makes sense, and this ultimately led to an overuse of types in cases where they were more harmful than helpful.

What is an index?

An index is stored in a set of shards, which are themselves Lucene indices. This already gives you a glimpse of the limits of using a new index all the time: Lucene indices have a small yet fixed overhead in terms of disk space, memory usage and file descriptors used. For that reason, a single large index is more efficient than several small indices: the fixed cost of the Lucene index is better amortized across many documents.

Another important factor is how you plan to search your data. While each shard is searched independently, Elasticsearch eventually needs to merge results from all the searched shards. For instance if you search across 10 indices that have 5 shards each, the node that coordinates the execution of a search request will need to merge 5x10=50 shard results. Here again you need to be careful: if there are too many shard results to merge and/or if you run a heavy request that produces large shard responses (which can easily happen with aggregations), the task of merging all these shard results can become very resource-intensive, both in terms of CPU and memory. Again this would advocate for having fewer indices.
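The 5x10=50 arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting, not of how Elasticsearch coordinates a search; the index names and the `shard_results_to_merge` helper are made up for the example.

```python
def shard_results_to_merge(shards_per_index):
    """Return how many shard-level results the coordinating node must merge.

    `shards_per_index` maps a (hypothetical) index name to the number of
    shards that are searched for that index.
    """
    return sum(shards_per_index.values())


# Ten indices with five shards each, as in the example above.
many_indices = {f"logs-{i}": 5 for i in range(10)}
# The same data folded into one index with five shards.
single_index = {"logs": 5}

print(shard_results_to_merge(many_indices))  # 50
print(shard_results_to_merge(single_index))  # 5
```

The number of documents is the same in both layouts; only the number of partial results that have to be merged changes.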

What is a type?

This is where types help: types are a convenient way to store several types of data in the same index, in order to keep the total number of indices low for the reasons explained above. In terms of implementation it works by adding a “_type” field to every document that is automatically used for filtering when searching on a specific type. One nice property of types is that searching across several types of the same index comes with no overhead compared to searching a single type: it does not change how many shard results need to be merged.
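To make the "_type is just a filtered field" idea concrete, here is a toy model in plain Python. It is a sketch of the concept only, not of Lucene or Elasticsearch internals; the documents, type names and the `search` helper are invented for illustration.

```python
# A toy "index": every document carries a _type field next to its own fields.
index = [
    {"_type": "tweet", "user": "alice", "text": "hello"},
    {"_type": "tweet", "user": "bob", "text": "search is fun"},
    {"_type": "user", "user": "alice", "age": 30},
]


def search(docs, predicate, types=None):
    """Return docs matching `predicate`, optionally restricted to `types`.

    Restricting to types is just one more filter on the `_type` field; the
    same list of documents is scanned either way.
    """
    allowed = set(types) if types is not None else None
    return [
        doc
        for doc in docs
        if (allowed is None or doc["_type"] in allowed) and predicate(doc)
    ]


def by_alice(doc):
    return doc.get("user") == "alice"


print(len(search(index, by_alice)))                   # 2
print(len(search(index, by_alice, types=["tweet"])))  # 1
```

Searching one type, several types, or all types goes over the same underlying storage, which is why the number of shard results to merge does not change.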

However, this comes with limitations as well (the limitations of types):

  • Fields need to be consistent across types. For instance if two fields have the same name in different types of the same index, they need to be of the same field type (string, date, etc.) and have the same configuration.
  • Fields that exist in one type will also consume resources for documents of types where this field does not exist. This is a general issue with Lucene indices: they don’t like sparsity. Sparse postings lists can’t be compressed efficiently because of high deltas between consecutive matches. And the issue is even worse with doc values: for speed reasons, doc values often reserve a fixed amount of disk space for every document, so that values can be addressed efficiently. This means that if Lucene establishes that it needs one byte to store all values of a given numeric field, it will also consume one byte for documents that don’t have a value for this field. Future versions of Elasticsearch will have improvements in this area but I would still advise you to model your data in a way that will limit sparsity as much as possible.
  • Scores use index-wide statistics, so scores of documents in one type can be impacted by documents from other types.
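The first limitation (same field name, same field type across all types of an index) can be checked before deciding to share an index. The sketch below works on a simplified `{type: {field: field_type}}` dictionary that is an assumption of this example, not the real Elasticsearch mapping format, and `find_field_conflicts` is a hypothetical helper.

```python
def find_field_conflicts(mappings):
    """Return {field: {type_name: field_type}} for fields whose types differ.

    `mappings` maps a type name to a simplified {field_name: field_type} dict.
    """
    seen = {}
    for type_name, fields in mappings.items():
        for field, field_type in fields.items():
            seen.setdefault(field, {})[type_name] = field_type
    return {
        field: by_type
        for field, by_type in seen.items()
        if len(set(by_type.values())) > 1
    }


mappings = {
    "order": {"created": "date", "id": "string"},
    "invoice": {"created": "string", "total": "double"},
}
print(find_field_conflicts(mappings))
# {'created': {'order': 'date', 'invoice': 'string'}}
```

A non-empty result means the two types cannot live in the same index as they stand: either the field is renamed or retyped, or the data goes into separate indices.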

This means types can be helpful, but only if all types from a given index have mappings that are similar. Otherwise, the fact that fields also consume resources in documents where they don’t exist could make things worse than if the data had been stored in separate indices.
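As a rough back-of-the-envelope illustration of the sparsity cost, the following sketch applies the fixed-width model described above (every document pays for the field, whether it has a value or not). The document counts and the `doc_values_bytes` helper are made up, and real Lucene storage is more nuanced than this.

```python
def doc_values_bytes(doc_counts, field_owner, bytes_per_value=1):
    """Estimate doc-values disk usage for one numeric field.

    `doc_counts` maps a type name to its number of documents and
    `field_owner` is the only type that actually has the field.
    Returns (bytes if all types share one index, bytes if the owner
    has its own index), under the simplified fixed-width model.
    """
    shared = sum(doc_counts.values()) * bytes_per_value
    separate = doc_counts[field_owner] * bytes_per_value
    return shared, separate


counts = {"event": 9_000_000, "user": 1_000_000}
shared, separate = doc_values_bytes(counts, field_owner="user")
print(shared, separate)  # 10000000 1000000
```

In this made-up case a field that only exists on 10% of the documents costs ten times more in the shared index than it would in an index of its own.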

Which one should I use?

This is a tough question, and the answer will depend on your hardware, data and use-case. First it is important to realize that types are useful because they can help reduce the number of Lucene indices that Elasticsearch needs to manage. But there is another way that you can reduce this number: creating indices that have fewer shards. For instance, instead of folding 5 types into the same index, you could create 5 indices with 1 primary shard each.
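For instance, the "5 indices with 1 primary shard each" alternative boils down to sending one create-index request per index with `number_of_shards` set to 1. The sketch below only builds the request paths and JSON bodies and does not talk to a cluster; the index names and the `create_index_requests` helper are hypothetical.

```python
import json


def create_index_requests(names, primary_shards=1):
    """Build (path, body) pairs for creating one index per name.

    Each body sets index.number_of_shards, so five names with
    primary_shards=1 give five Lucene indices in total.
    """
    body = {"settings": {"index": {"number_of_shards": primary_shards}}}
    return [(f"/{name}", json.dumps(body)) for name in names]


for path, body in create_index_requests(["orders", "invoices"]):
    # Each pair would be sent as an HTTP PUT to the cluster.
    print("PUT", path, body)
```

Either way Elasticsearch ends up managing the same number of primary shards as a single 5-shard index, but each kind of data keeps its own mapping.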

I will try to summarize the questions you should ask yourself to make a decision:

  • Are you using parent/child? If yes this can only be done with two types in the same index.
  • Do your documents have similar mappings? If no, use different indices.
  • If you have many documents for each type, then the overhead of Lucene indices will be easily amortized so you can safely use indices, with fewer shards than the default of 5 if necessary.
  • Otherwise you can consider putting documents in different types of the same index. Or even in the same type.
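The checklist above can be written down as a small decision function. This is just one reading of the four questions, in the order they are asked; the function name, parameter names and returned strings are invented for the example.

```python
def choose_layout(uses_parent_child, similar_mappings, many_docs_per_type):
    """Suggest a layout by walking the four questions in order."""
    if uses_parent_child:
        # Parent/child requires both types to live in the same index.
        return "same index, separate types"
    if not similar_mappings:
        return "separate indices"
    if many_docs_per_type:
        # The fixed Lucene overhead is easily amortized.
        return "separate indices (fewer shards if needed)"
    return "same index (separate types or a single type)"


print(choose_layout(False, False, False))  # separate indices
print(choose_layout(False, True, False))   # same index (separate types or a single type)
```

Note that the parent/child question comes first: when it applies, it overrides every other consideration.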

In conclusion, you may be surprised that there are not as many use cases for types as you expected. And this is right: there are actually few use cases for having several types in the same index for the reasons that we mentioned above. Don’t hesitate to allocate different indices for data that would have different mappings, but still keep in mind that you should keep a reasonable number of shards in your cluster, which can be achieved by reducing the number of shards for indices that don’t require a high write throughput and/or will store low numbers of documents.
