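Postmortem: an embedded Elasticsearch service crashed because GC could not free memory
Scenario
Our e-commerce service runs Elasticsearch as an embedded service. After a faulty code commit, Elasticsearch searches pulled in so much data that the memory could not be released. The service ended up stuck in stop-the-world pauses and went down.
Root cause analysis
A search online suggested that heavy GC activity in the Elasticsearch service was the likely cause, so we started by analyzing the logs with this command: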
cat xxx.log |grep "INFO elasticsearch\[estore\]\[scheduler\]\[T#1\]"
The command above extracts the GC log entries of the Elasticsearch service. It returned:
16:06:47 1.6-2018-02-07 16:06:47,609 INFO elasticsearch[estore][scheduler][T#1] - [estore] [gc][young][319][334] duration [832ms], collections [1]/[1.2s], total [832ms]/[7.1s], memory [530.6mb]->[653.8mb]/[7.9gb], all_pools {[young] [4.1mb]->[4mb]/[123.5mb]}{[survivor] [0b]->[47.4mb]/[47.5mb]}{[old] [526.5mb]->[602.2mb]/[7.7gb]}
16:07:50 1.6-2018-02-07 16:07:50,065 INFO elasticsearch[estore][scheduler][T#1] - [estore] [gc][old][363][10] duration [6.2s], collections [1]/[6.9s], total [6.2s]/[23.9s], memory [5.4gb]->[5.3gb]/[7.9gb], all_pools {[young] [2.9mb]->[3.7mb]/[99mb]}{[survivor] [39.6mb]->[0b]/[80mb]}{[old] [5.3gb]->[5.3gb]/[7.7gb]}
16:08:02 1.6-2018-02-07 16:08:02,274 INFO elasticsearch[estore][scheduler][T#1] - [estore] [gc][old][368][11] duration [7.8s], collections [1]/[8s], total [7.8s]/[31.8s], memory [6.3gb]->[6.3gb]/[7.9gb], all_pools {[young] [32.1mb]->[2.2mb]/[121.5mb]}{[survivor] [55.9mb]->[0b]/[67.5mb]}{[old] [6.2gb]->[6.3gb]/[7.7gb]}
16:08:15 1.6-2018-02-07 16:08:15,852 INFO elasticsearch[estore][scheduler][T#1] - [estore] [gc][old][373][12] duration [8.2s], collections [1]/[9.1s], total [8.2s]/[40s], memory [7.2gb]->[7.4gb]/[7.9gb], all_pools {[young] [2.2mb]->[2.3mb]/[118mb]}{[survivor] [54.4mb]->[0b]/[67mb]}{[old] [7.2gb]->[7.4gb]/[7.7gb]}
16:08:50 1.6-2018-02-07 16:08:50,028 INFO elasticsearch[estore][scheduler][T#1] - [estore] [gc][old][377][15] duration [8.4s], collections [1]/[8.4s], total [8.4s]/[1.2m], memory [7.8gb]->[7.7gb]/[7.9gb], all_pools {[young] [57.9mb]->[49.6mb]/[124.5mb]}{[survivor] [0b]->[0b]/[66.5mb]}{[old] [7.7gb]->[7.7gb]/[7.7gb]}
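Look at the last entry, {[old] [7.7gb]->[7.7gb]/[7.7gb]}: the old generation was completely full, and an old-generation collection that took 8.4s freed nothing. Elasticsearch could no longer reclaim memory through GC, the heap stayed exhausted, and the service went down.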
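The same reading of the GC entries can be automated. Below is a minimal Python sketch, assuming the log format shown above; the function names (`old_gen_usage`, `summarize`) are made up for this illustration and are not part of Elasticsearch.

```python
import re

# Matches one memory pool entry such as "{[old] [5.3gb]->[5.3gb]/[7.7gb]}".
POOL_RE = re.compile(
    r"\{\[(?P<pool>\w+)\] "
    r"\[(?P<before>[\d.]+)(?P<bu>b|kb|mb|gb)\]->"
    r"\[(?P<after>[\d.]+)(?P<au>b|kb|mb|gb)\]/"
    r"\[(?P<max>[\d.]+)(?P<mu>b|kb|mb|gb)\]\}"
)

UNIT = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def to_bytes(value, unit):
    """Convert a logged size such as ("7.7", "gb") to bytes."""
    return float(value) * UNIT[unit]


def old_gen_usage(line):
    """Return (before, after, max) in bytes for the old pool, or None."""
    for m in POOL_RE.finditer(line):
        if m["pool"] == "old":
            return (
                to_bytes(m["before"], m["bu"]),
                to_bytes(m["after"], m["au"]),
                to_bytes(m["max"], m["mu"]),
            )
    return None


def summarize(line):
    """Report how full the old generation is and how much the GC reclaimed."""
    usage = old_gen_usage(line)
    if usage is None:
        return None
    before, after, maximum = usage
    return {"occupancy": after / maximum, "reclaimed": before - after}
```

For the last entry above, `summarize` reports an occupancy of 1.0 and 0 bytes reclaimed, which is the "full heap, nothing freed" pattern. A negative `reclaimed` value means the old generation grew during the collection.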
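That explains the crash, but what drove GC usage so high?
Looking further at the output, the first entry ends with {[old] [526.5mb]->[602.2mb]/[7.7gb]}, while the last one ends with {[old] [7.7gb]->[7.7gb]/[7.7gb]}: the old generation grew from roughly 0.6 GB to its 7.7 GB limit in about two minutes.
This points to an unsuitable search against the index, so we kept going through the logs: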
cat xxx.log | grep "elasticsearch\[estore\]\[search\]"
Output:
16:03:04 1.6-2018-02-07 16:03:04,331 TRACE elasticsearch[estore][search][T#3] - [estore] [blanksimple][2] took[575.1ms], took_millis[575], types[goodsType], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{"must":[{"term":{"bill.GOODSSTATUS":1}},{"term":{"bill.GOODSTYPE":1}},{"term":{"bill.STOREID":6635387}},{"bool":{"should":[{"term":{"bill.STATUS":"0"}},{"term":{"bill.STATUS":"1"}}]}},{"bool":{"should":{"term":{"bill.STOREGOODSTYPEID":"6636222"}}}},{"range":{"bill.RELEASEFROM":{"from":null,"to":"2018-02-07T16:03:03","include_lower":true,"include_upper":false}}},{"range":{"bill.RELEASETO":{"from":"2018-02-07T16:03:03","to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":0.0,"to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":null,"to":1.7976931348623157E308,"include_lower":true,"include_upper":false}}}]}},"explain":false,"_source":{"includes":["id"],"excludes":[]},"sort":[{"_score":{"order":"desc"}},{"bill.MAINSCORE":{"order":"desc"}},{"bill.UPDATETIME":{"order":"desc"}},{"_score":{"order":"desc"}}]}], extra_source[],
16:03:04 1.6-2018-02-07 16:03:04,407 TRACE elasticsearch[estore][search][T#4] - [estore] [blanksimple][3] took[652.8ms], took_millis[652], types[goodsType], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{"must":[{"term":{"bill.GOODSSTATUS":1}},{"term":{"bill.GOODSTYPE":1}},{"term":{"bill.STOREID":6635387}},{"bool":{"should":[{"term":{"bill.STATUS":"0"}},{"term":{"bill.STATUS":"1"}}]}},{"bool":{"should":{"term":{"bill.STOREGOODSTYPEID":"6636222"}}}},{"range":{"bill.RELEASEFROM":{"from":null,"to":"2018-02-07T16:03:03","include_lower":true,"include_upper":false}}},{"range":{"bill.RELEASETO":{"from":"2018-02-07T16:03:03","to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":0.0,"to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":null,"to":1.7976931348623157E308,"include_lower":true,"include_upper":false}}}]}},"explain":false,"_source":{"includes":["id"],"excludes":[]},"sort":[{"_score":{"order":"desc"}},{"bill.MAINSCORE":{"order":"desc"}},{"bill.UPDATETIME":{"order":"desc"}},{"_score":{"order":"desc"}}]}], extra_source[],
16:03:04 1.6-2018-02-07 16:03:04,424 TRACE elasticsearch[estore][search][T#2] - [estore] [blanksimple][1] took[669.4ms], took_millis[669], types[goodsType], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{"must":[{"term":{"bill.GOODSSTATUS":1}},{"term":{"bill.GOODSTYPE":1}},{"term":{"bill.STOREID":6635387}},{"bool":{"should":[{"term":{"bill.STATUS":"0"}},{"term":{"bill.STATUS":"1"}}]}},{"bool":{"should":{"term":{"bill.STOREGOODSTYPEID":"6636222"}}}},{"range":{"bill.RELEASEFROM":{"from":null,"to":"2018-02-07T16:03:03","include_lower":true,"include_upper":false}}},{"range":{"bill.RELEASETO":{"from":"2018-02-07T16:03:03","to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":0.0,"to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":null,"to":1.7976931348623157E308,"include_lower":true,"include_upper":false}}}]}},"explain":false,"_source":{"includes":["id"],"excludes":[]},"sort":[{"_score":{"order":"desc"}},{"bill.MAINSCORE":{"order":"desc"}},{"bill.UPDATETIME":{"order":"desc"}},{"_score":{"order":"desc"}}]}], extra_source[],
16:03:04 1.6-2018-02-07 16:03:04,439 TRACE elasticsearch[estore][search][T#5] - [estore] [blanksimple][4] took[684.4ms], took_millis[684], types[goodsType], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{"must":[{"term":{"bill.GOODSSTATUS":1}},{"term":{"bill.GOODSTYPE":1}},{"term":{"bill.STOREID":6635387}},{"bool":{"should":[{"term":{"bill.STATUS":"0"}},{"term":{"bill.STATUS":"1"}}]}},{"bool":{"should":{"term":{"bill.STOREGOODSTYPEID":"6636222"}}}},{"range":{"bill.RELEASEFROM":{"from":null,"to":"2018-02-07T16:03:03","include_lower":true,"include_upper":false}}},{"range":{"bill.RELEASETO":{"from":"2018-02-07T16:03:03","to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":0.0,"to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":null,"to":1.7976931348623157E308,"include_lower":true,"include_upper":false}}}]}},"explain":false,"_source":{"includes":["id"],"excludes":[]},"sort":[{"_score":{"order":"desc"}},{"bill.MAINSCORE":{"order":"desc"}},{"bill.UPDATETIME":{"order":"desc"}},{"_score":{"order":"desc"}}]}], extra_source[],
16:03:04 1.6-2018-02-07 16:03:04,441 TRACE elasticsearch[estore][search][T#1] - [estore] [blanksimple][0] took[687.1ms], took_millis[687], types[goodsType], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{"must":[{"term":{"bill.GOODSSTATUS":1}},{"term":{"bill.GOODSTYPE":1}},{"term":{"bill.STOREID":6635387}},{"bool":{"should":[{"term":{"bill.STATUS":"0"}},{"term":{"bill.STATUS":"1"}}]}},{"bool":{"should":{"term":{"bill.STOREGOODSTYPEID":"6636222"}}}},{"range":{"bill.RELEASEFROM":{"from":null,"to":"2018-02-07T16:03:03","include_lower":true,"include_upper":false}}},{"range":{"bill.RELEASETO":{"from":"2018-02-07T16:03:03","to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":0.0,"to":null,"include_lower":false,"include_upper":true}}},{"range":{"bill.ES_MINPRICE":{"from":null,"to":1.7976931348623157E308,"include_lower":true,"include_upper":false}}}]}},"explain":false,"_source":{"includes":["id"],"excludes":[]},"sort":[{"_score":{"order":"desc"}},{"bill.MAINSCORE":{"order":"desc"}},{"bill.UPDATETIME":{"order":"desc"}},{"_score":{"order":"desc"}}]}], extra_source[],
16:06:48 1.6-2018-02-07 16:06:48,327 DEBUG elasticsearch[estore][search][T#14] - [estore] [blanksimple][0] took[4.2s], took_millis[4288], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:06:48 1.6-2018-02-07 16:06:48,331 DEBUG elasticsearch[estore][search][T#20] - [estore] [blanksimple][1] took[4.2s], took_millis[4292], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:06:48 1.6-2018-02-07 16:06:48,332 DEBUG elasticsearch[estore][search][T#21] - [estore] [blanksimple][2] took[4.2s], took_millis[4294], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:06:48 1.6-2018-02-07 16:06:48,343 DEBUG elasticsearch[estore][search][T#19] - [estore] [blanksimple][3] took[4.3s], took_millis[4304], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:06:48 1.6-2018-02-07 16:06:48,347 DEBUG elasticsearch[estore][search][T#22] - [estore] [blanksimple][4] took[4.3s], took_millis[4308], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
...
There were many entries, but the important detail was that some of them were logged at WARN level, so we refined the command:
cat xxx.log |grep WARN | grep "elasticsearch\[estore\]\[search\]"
Result:
16:07:38 1.6-2018-02-07 16:07:38,981 WARN elasticsearch[estore][search][T#2] - [estore] [blanksimple][2] took[49.8s], took_millis[49814], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:07:39 1.6-2018-02-07 16:07:39,090 WARN elasticsearch[estore][search][T#24] - [estore] [blanksimple][3] took[49.9s], took_millis[49924], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:07:39 1.6-2018-02-07 16:07:39,101 WARN elasticsearch[estore][search][T#18] - [estore] [blanksimple][0] took[49.9s], took_millis[49934], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:07:39 1.6-2018-02-07 16:07:39,201 WARN elasticsearch[estore][search][T#25] - [estore] [blanksimple][4] took[50s], took_millis[50035], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
16:07:39 1.6-2018-02-07 16:07:39,469 WARN elasticsearch[estore][search][T#23] - [estore] [xblanksimple][1] took[50.3s], took_millis[50303], types[errotype], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"from":0,"size":2147483647,"query":{"bool":{}},"explain":false,"_source":{"includes":["bill.OECODE","bill.NAME","bill.PRODUCT"],"excludes":[]},"sort":[{"_score":{"order":"desc"}}]}], extra_source[],
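Slow-log entries like these can also be screened by a script rather than by eye. The following is a rough Python sketch, assuming the line layout shown above; `parse_slowlog`, `suspicious`, and the `size_limit` threshold of 10000 are arbitrary choices made for this illustration.

```python
import re

# Pulls the log level, duration, type and requested size out of one
# search slow-log line in the format shown above.
SLOWLOG_RE = re.compile(
    r"(?P<level>TRACE|DEBUG|INFO|WARN) elasticsearch\[[^\]]+\]\[search\]"
    r".*?took_millis\[(?P<took>\d+)\], types\[(?P<types>[^\]]*)\]"
    r'.*?"size":(?P<size>\d+)'
)


def parse_slowlog(line):
    """Return the interesting fields of a slow-log line, or None."""
    m = SLOWLOG_RE.search(line)
    if m is None:
        return None
    return {
        "level": m["level"],
        "took_millis": int(m["took"]),
        "types": m["types"],
        "size": int(m["size"]),
    }


def suspicious(lines, size_limit=10000):
    """Keep entries whose requested size exceeds size_limit (an assumed threshold)."""
    entries = (parse_slowlog(line) for line in lines)
    return [e for e in entries if e is not None and e["size"] > size_limit]
```

Sorting the result by `took_millis` would put the roughly 50-second `errotype` searches at the top. Note that the fast `goodsType` searches are flagged too, since they also request `size` 2147483647.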
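The errotype query in these WARN entries is the problem: it has no conditions at all (an empty bool query), yet its size is set to the maximum Integer value (2147483647). That type holds a large amount of data, so every request tried to fetch all of it, which drove the GC pressure. This is a bug in the business code.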
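One way to keep such a request from reaching Elasticsearch is to validate the search body in the application before sending it. This is only a sketch of that idea in Python, not the fix the team actually applied; `MAX_PAGE_SIZE`, `UnboundedSearchError`, `is_unfiltered`, and `check_search_source` are hypothetical names, and the cap of 1000 is an assumption.

```python
MAX_PAGE_SIZE = 1000  # hypothetical cap; choose one that fits the application


class UnboundedSearchError(ValueError):
    """Raised when a search body asks for more hits than allowed."""


def is_unfiltered(query):
    """True when the query has no conditions, e.g. {} or {"bool": {}}."""
    if not query:
        return True
    bool_clause = query.get("bool")
    if bool_clause is not None and len(query) == 1:
        clauses = ("must", "should", "must_not", "filter")
        return not any(bool_clause.get(key) for key in clauses)
    return False


def check_search_source(source, max_size=MAX_PAGE_SIZE):
    """Return the search body unchanged, or raise if its size is too large."""
    size = source.get("size", 10)  # Elasticsearch returns 10 hits by default
    if size > max_size:
        scope = "unfiltered" if is_unfiltered(source.get("query")) else "filtered"
        raise UnboundedSearchError(
            f"{scope} search requests size={size}, limit is {max_size}"
        )
    return source
```

With this guard, the body from the WARN entries (`"size":2147483647` with `{"bool":{}}`) is rejected, while a filtered query with a small page size passes through. When every document really is needed, reading them in batches, for example through Elasticsearch's scroll API, avoids holding the whole result set in memory at once.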
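The second factor: the index data had grown to 12 GB, while the JVM of the embedded elasticsearch-web service had only 8 GB. For the relationship between index size and JVM heap, please look up sharding strategies and related material yourself. In short, our conclusion was that the JVM heap should be larger than the total data size of all the shards the service uses.
Summary
These two mistakes occurred together, so GC could not free memory and the service went down.
Reference: http://blog.csdn.net/quicknet/article/details/45148447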