Elasticsearch and Hadoop
1. Install SDKMAN
yum -y install unzip
yum -y install zip
curl -s "https://get.sdkman.io" | bash
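In a new terminal, run: source "$HOME/.sdkman/bin/sdkman-init.sh"
Check that the installation succeeded:
(1) sdk version
(2) sdk help
Addendum: uninstalling SDKMAN (back up the directory, then delete it)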
tar zcvf ~/sdkman-backup_$(date +%F-%kh%M).tar.gz -C ~/ .sdkman
rm -rf ~/.sdkman
2. Install Gradle
sdk install gradle
3. Download es-hadoop
cd /data/tools
git clone https://github.com/elastic/elasticsearch-hadoop.git
4. Build es-hadoop
cd /data/tools/elasticsearch-hadoop
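vi gradle.properties
(lines marked + below are the entries added or changed to match the cluster's Hadoop, Hive and Spark versions)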
+hadoopversion 2.6.0
+hiveversion 1.1.0
+sparkversion 2.1.0
./gradlew distZip
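The notes do not say where the build leaves the jar that the next step copies. As a minimal sketch, assuming the checkout lives in /data/tools/elasticsearch-hadoop as above, the following Python helper (the name find_built_jars is hypothetical) searches the whole project tree, since the exact output directory depends on the project version:

```python
from pathlib import Path


def find_built_jars(project_dir, prefix="elasticsearch-hadoop-"):
    """Return every jar under project_dir whose file name starts with prefix."""
    root = Path(project_dir)
    # rglob walks the tree recursively, so the build layout does not matter.
    return sorted(p for p in root.rglob("*.jar") if p.name.startswith(prefix))
```

Calling find_built_jars("/data/tools/elasticsearch-hadoop") after the build should list the candidate jars; pick the one whose version matches the file name used in the copy step.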
5. Copy the jar into the Hive lib directory (locally and on the other nodes)
cp elasticsearch-hadoop-7.0.0-alpha1-SNAPSHOT.jar /opt/cloudera/parcels/CDH/lib/hive/lib
scp elasticsearch-hadoop-7.0.0-alpha1-SNAPSHOT.jar root@ctdn-1:/opt/cloudera/parcels/CDH/lib/hive/lib
6. Create an Elasticsearch-backed external table in Hive and load data into it
References:
https://github.com/elastic/elasticsearch-hadoop
https://www.elastic.co/guide/en/elasticsearch/hadoop/current/hive.html#hive
https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html
hive> add jar /opt/cloudera/parcels/CDH/lib/hive/lib/elasticsearch-hadoop-7.0.0-alpha1-SNAPSHOT.jar;
CREATE EXTERNAL TABLE ext_es_org_info (
`orgid` string,
`investorg` string,
`orgname` string,
`logo` string,
`weburl` string,
`orgdesc` string,
`founddate` string,
`district` string,
`investtotal` int,
`investstage` string,
`prov` string,
`city` string,
`focusdomain` string,
`investproj` string,
`investamount` string)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
'es.nodes' = '10.11.8.32:9200',
'es.index.auto.create' = 'true',
'es.resource' = 'org/org_info',
'es.mapping.id' = 'orgid',
'es.mapping.names' = 'investorg:investorg,
orgname:orgname,
logo:logo,
weburl:weburl,
orgdesc:orgdesc,
founddate:founddate,
district:district,
investtotal:investtotal,
investstage:investstage,
prov:prov,
city:city,
focusdomain:focusdomain,
investproj:investproj,
investamount:investamount');
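The es.mapping.names value above is a comma-separated list of hive_column:es_field pairs. Here every column maps to a field of the same name, so the entry is most likely only needed when the names differ. A minimal Python sketch for generating the value from a column list rather than typing it by hand (mapping_names is a hypothetical helper, not part of es-hadoop):

```python
def mapping_names(pairs):
    """Build an es.mapping.names value from (hive_column, es_field) pairs."""
    return ",".join(f"{hive}:{es}" for hive, es in pairs)


# Identity mapping for the first few columns of ext_es_org_info.
columns = ["investorg", "orgname", "logo"]
value = mapping_names((c, c) for c in columns)
print(value)  # investorg:investorg,orgname:orgname,logo:logo
```

Generating the string on a single line also avoids embedding line breaks inside the quoted property value.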
SET hive.mapred.reduce.tasks.speculative.execution = false;
SET mapreduce.map.speculative = false;
SET mapreduce.reduce.speculative = false;
INSERT overwrite TABLE ext_es_org_info
SELECT orgid
,investorg
,orgname
,logo
,weburl
,orgdesc
,founddate
,district
,investtotal
,investstage
,prov
,city
,focusdomain
,investproj
,investamount
FROM es_org_info;
curl -XGET http://10.11.8.32:9200/yelpindex/1
7. Register the jar permanently in hive-site.xml
cd /opt/cloudera/parcels/CDH/lib/hive/conf
vi hive-site.xml
Add the following property:
<property>
<name>hive.aux.jars.path</name>
<value>/opt/cloudera/parcels/CDH/lib/hive/lib/elasticsearch-hadoop-7.0.0-alpha1-SNAPSHOT.jar</value>
<description>A comma separated list (with no spaces) of the jar files</description>
</property>
scp hive-site.xml root@ctdn-6:/opt/cloudera/parcels/CDH/lib/hive/conf
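The description in the property above notes that hive.aux.jars.path takes a comma-separated list with no spaces. As a rough sketch, the element can be generated in Python so that the separator rule is enforced when more than one jar is registered; aux_jars_property and the second jar path are hypothetical and only illustrate the format:

```python
import xml.etree.ElementTree as ET


def aux_jars_property(jar_paths):
    """Return a <property> element for hive.aux.jars.path from a list of jar paths."""
    for path in jar_paths:
        # The value is split on commas, so neither commas nor spaces may appear in a path.
        if "," in path or " " in path:
            raise ValueError(f"jar path must not contain commas or spaces: {path!r}")
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hive.aux.jars.path"
    ET.SubElement(prop, "value").text = ",".join(jar_paths)
    return prop


jars = [
    "/opt/cloudera/parcels/CDH/lib/hive/lib/elasticsearch-hadoop-7.0.0-alpha1-SNAPSHOT.jar",
    "/opt/example/lib/another-udf.jar",  # hypothetical second jar
]
print(ET.tostring(aux_jars_property(jars), encoding="unicode"))
```

The printed element can be pasted into hive-site.xml in place of the hand-written one.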
curl -XGET 'http://10.11.8.32:9200/yelpindex/yelp/_search?q=id:1'
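The same URI search can be issued from Python with only the standard library. This is a sketch under the assumption that the cluster at 10.11.8.32:9200 is reachable without authentication; search_url and search are hypothetical helper names:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(host, index, doc_type, field, value):
    """Build a URI-search URL of the form http://host/index/type/_search?q=field:value."""
    # urlencode percent-encodes the colon, which Elasticsearch decodes on receipt.
    query = urlencode({"q": f"{field}:{value}"})
    return f"http://{host}/{index}/{doc_type}/_search?{query}"


def search(host, index, doc_type, field, value, timeout=10):
    """Run the query and return the decoded JSON response (needs a reachable cluster)."""
    with urlopen(search_url(host, index, doc_type, field, value), timeout=timeout) as resp:
        return json.load(resp)
```

For example, search("10.11.8.32:9200", "yelpindex", "yelp", "id", 1) sends the same request as the curl command above and returns the response as a dict.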
See also:
https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/index.html
https://github.com/medcl/elasticsearch-analysis-ik
https://github.com/elastic/elasticsearch-py
http://qbox.io/blog/elasticsearch-in-apache-spark-python
https://www.yelp.com/dataset
http://blog.csdn.net/xmo_jiao/article/details/73251937
https://www.elastic.co/guide/en/elasticsearch/reference/6.1/query-dsl-mlt-query.html