Hadoop Basics: Configuring the History Server
Author: 尹正杰 (Yin Zhengjie)
Copyright notice: original work; reproduction without permission is prohibited and will be pursued legally.
Hadoop ships with a history server through which you can review the records of completed MapReduce jobs: how many map tasks and reduce tasks were used, when the job was submitted, when it started, when it finished, and so on. By default the history server is not running; it can be started with the bundled mr-jobhistory-daemon.sh command.
I. Running a MapReduce job on YARN
1>. Start the cluster
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
3043 ResourceManager
2507 NameNode
3389 Jps
2814 DFSZKFailoverController
Command succeeded
============= s102 jps ============
2417 DataNode
2484 JournalNode
2664 NodeManager
2828 Jps
2335 QuorumPeerMain
Command succeeded
============= s103 jps ============
2421 DataNode
2488 JournalNode
2666 NodeManager
2333 QuorumPeerMain
2830 Jps
Command succeeded
============= s104 jps ============
2657 NodeManager
2818 Jps
2328 QuorumPeerMain
2410 DataNode
2477 JournalNode
Command succeeded
============= s105 jps ============
2688 Jps
2355 NameNode
2424 DFSZKFailoverController
Command succeeded
[yinzhengjie@s101 ~]$
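The xcall.sh helper used above is not part of Hadoop; it is a small user script that runs one command on every node over ssh and labels each host's output. A hypothetical POSIX-shell sketch of such a helper (the host list, the use of ssh, and the "Command succeeded" marker mirror the output above; all are assumptions to adjust for your own cluster):

```shell
#!/bin/sh
# Hypothetical xcall.sh-style helper: run the same command on every node
# and label each host's output. HOSTS and REMOTE_SHELL are assumptions.
HOSTS="s101 s102 s103 s104 s105"
REMOTE_SHELL="ssh"    # kept in a variable so the loop can be tested locally

run_on_all() {
    for host in $HOSTS; do
        echo "============= $host $* ============"
        if $REMOTE_SHELL "$host" "$*"; then
            echo "Command succeeded"
        else
            echo "Command failed on $host"
        fi
    done
}

# Usage on the cluster: run_on_all jps
```

Keeping the remote-shell command in a variable is just a convenience for testing; a real xcall.sh would typically hard-code ssh.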
2>. Run a MapReduce job on YARN
[yinzhengjie@s101 ~]$ hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /yinzhengjie/data/ /yinzhengjie/data/output
18/08/21 07:37:35 INFO client.RMProxy: Connecting to ResourceManager at s101/172.30.1.101:8032
18/08/21 07:37:37 INFO input.FileInputFormat: Total input paths to process : 1
18/08/21 07:37:37 INFO mapreduce.JobSubmitter: number of splits:1
18/08/21 07:37:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1534851274873_0001
18/08/21 07:37:37 INFO impl.YarnClientImpl: Submitted application application_1534851274873_0001
18/08/21 07:37:37 INFO mapreduce.Job: The url to track the job: http://s101:8088/proxy/application_1534851274873_0001/
18/08/21 07:37:37 INFO mapreduce.Job: Running job: job_1534851274873_0001
18/08/21 07:37:55 INFO mapreduce.Job: Job job_1534851274873_0001 running in uber mode : false
18/08/21 07:37:55 INFO mapreduce.Job:  map 0% reduce 0%
18/08/21 07:38:13 INFO mapreduce.Job:  map 100% reduce 0%
18/08/21 07:38:31 INFO mapreduce.Job:  map 100% reduce 100%
18/08/21 07:38:32 INFO mapreduce.Job: Job job_1534851274873_0001 completed successfully
18/08/21 07:38:32 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=4469
		FILE: Number of bytes written=249719
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=3925
		HDFS: Number of bytes written=3315
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=15295
		Total time spent by all reduces in occupied slots (ms)=15161
		Total time spent by all map tasks (ms)=15295
		Total time spent by all reduce tasks (ms)=15161
		Total vcore-milliseconds taken by all map tasks=15295
		Total vcore-milliseconds taken by all reduce tasks=15161
		Total megabyte-milliseconds taken by all map tasks=15662080
		Total megabyte-milliseconds taken by all reduce tasks=15524864
	Map-Reduce Framework
		Map input records=104
		Map output records=497
		Map output bytes=5733
		Map output materialized bytes=4469
		Input split bytes=108
		Combine input records=497
		Combine output records=288
		Reduce input groups=288
		Reduce shuffle bytes=4469
		Reduce input records=288
		Reduce output records=288
		Spilled Records=576
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=163
		CPU time spent (ms)=1430
		Physical memory (bytes) snapshot=439443456
		Virtual memory (bytes) snapshot=4216639488
		Total committed heap usage (bytes)=286785536
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=3817
	File Output Format Counters
		Bytes Written=3315
[yinzhengjie@s101 ~]$
3>. Check in the web UI that output data appeared on HDFS
4>. View the job entry in YARN
5>. Try to view the history logs: access fails
II. Configuring the YARN history server
1>. Edit the mapred-site.xml configuration file
[yinzhengjie@s101 ~]$ more /soft/hadoop/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>s101:10020</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>s101:19888</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.done-dir</name>
                <value>${yarn.app.mapreduce.am.staging-dir}/done</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.intermediate-done-dir</name>
                <value>${yarn.app.mapreduce.am.staging-dir}/done_intermediate</value>
        </property>

        <property>
                <name>yarn.app.mapreduce.am.staging-dir</name>
                <value>/yinzhengjie/logs/hdfs/history</value>
        </property>
</configuration>

<!--
What mapred-site.xml does:
    MapReduce-related settings, such as the default number of reduce tasks
    and the upper and lower memory limits tasks may use; parameters defined
    here override the defaults in mapred-default.xml.

mapreduce.framework.name:
    Selects the MapReduce execution framework. Three values are possible:
    local (run locally), classic (the first-generation MRv1 framework),
    and yarn (the second-generation framework). We use yarn here.

mapreduce.jobhistory.address:
    The RPC address of the job history server.

mapreduce.jobhistory.webapp.address:
    The address and port of the history server's web UI.

mapreduce.jobhistory.done-dir:
    Where records of jobs that have finished running are stored.

mapreduce.jobhistory.intermediate-done-dir:
    Where records of jobs that are still running are stored.

yarn.app.mapreduce.am.staging-dir:
    The staging directory holding application IDs, the job's jar files, etc.
-->
[yinzhengjie@s101 ~]$
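Beyond the addresses and directories above, the history server's retention behaviour can also be tuned in mapred-site.xml. A minimal sketch; the values shown are the Hadoop 2.7 defaults, spelled out here only as an illustration:

```xml
<!-- Optional retention settings for the job history server.
     The values below are the Hadoop 2.7 defaults, shown as an example. -->
<property>
        <name>mapreduce.jobhistory.cleaner.enable</name>
        <value>true</value>   <!-- periodically delete old job history -->
</property>

<property>
        <name>mapreduce.jobhistory.max-age-ms</name>
        <value>604800000</value>   <!-- keep job history for 7 days -->
</property>

<property>
        <name>mapreduce.jobhistory.cleaner.interval-ms</name>
        <value>86400000</value>   <!-- run the cleaner once a day -->
</property>
```

These go inside the same `<configuration>` element as the properties above and take effect after the history server is restarted.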
2>. Start the history server
[yinzhengjie@s101 ~]$ hdfs dfs -mkdir /yinzhengjie/logs/hdfs/history          # create the directory that will hold the history logs
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh start historyserver             # start the history service
starting historyserver, logging to /soft/hadoop-2.7.3/logs/mapred-yinzhengjie-historyserver-s101.out
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ jps
3043 ResourceManager
4009 JobHistoryServer          # note: this is the history server process
2507 NameNode
4045 Jps
2814 DFSZKFailoverController
[yinzhengjie@s101 ~]$
3>. Run the MapReduce job on YARN again
[yinzhengjie@s101 ~]$ hdfs dfs -rm -R /yinzhengjie/data/output          # remove the previous output directory
18/08/21 08:43:34 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /yinzhengjie/data/output
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /yinzhengjie/data/input /yinzhengjie/data/output
18/08/21 08:44:58 INFO client.RMProxy: Connecting to ResourceManager at s101/172.30.1.101:8032
18/08/21 08:44:58 INFO input.FileInputFormat: Total input paths to process : 1
18/08/21 08:44:58 INFO mapreduce.JobSubmitter: number of splits:1
18/08/21 08:44:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1534851274873_0002
18/08/21 08:44:59 INFO impl.YarnClientImpl: Submitted application application_1534851274873_0002
18/08/21 08:44:59 INFO mapreduce.Job: The url to track the job: http://s101:8088/proxy/application_1534851274873_0002/
18/08/21 08:44:59 INFO mapreduce.Job: Running job: job_1534851274873_0002
18/08/21 08:45:15 INFO mapreduce.Job: Job job_1534851274873_0002 running in uber mode : false
18/08/21 08:45:15 INFO mapreduce.Job:  map 0% reduce 0%
18/08/21 08:45:30 INFO mapreduce.Job:  map 100% reduce 0%
18/08/21 08:45:45 INFO mapreduce.Job:  map 100% reduce 100%
18/08/21 08:45:45 INFO mapreduce.Job: Job job_1534851274873_0002 completed successfully
18/08/21 08:45:46 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=4469
		FILE: Number of bytes written=249693
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=3931
		HDFS: Number of bytes written=3315
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=12763
		Total time spent by all reduces in occupied slots (ms)=12963
		Total time spent by all map tasks (ms)=12763
		Total time spent by all reduce tasks (ms)=12963
		Total vcore-milliseconds taken by all map tasks=12763
		Total vcore-milliseconds taken by all reduce tasks=12963
		Total megabyte-milliseconds taken by all map tasks=13069312
		Total megabyte-milliseconds taken by all reduce tasks=13274112
	Map-Reduce Framework
		Map input records=104
		Map output records=497
		Map output bytes=5733
		Map output materialized bytes=4469
		Input split bytes=114
		Combine input records=497
		Combine output records=288
		Reduce input groups=288
		Reduce shuffle bytes=4469
		Reduce input records=288
		Reduce output records=288
		Spilled Records=576
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=139
		CPU time spent (ms)=1610
		Physical memory (bytes) snapshot=439873536
		Virtual memory (bytes) snapshot=4216696832
		Total committed heap usage (bytes)=281018368
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=3817
	File Output Format Counters
		Bytes Written=3315
[yinzhengjie@s101 ~]$
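When a counter dump like the one above has been captured to a terminal log, individual values can be pulled out with ordinary text tools. A small sketch (the sample lines are copied from the wordcount run above; which counter to extract is just an example):

```shell
# Pull one counter value out of captured job output.
# The here-doc reproduces a few counter lines from the wordcount run above.
counters=$(cat <<'EOF'
Map input records=104
Map output records=497
Reduce output records=288
EOF
)
# sed strips the counter name and prints only the matching line's value.
reduce_out=$(printf '%s\n' "$counters" | sed -n 's/^Reduce output records=//p')
echo "Reduce output records: $reduce_out"    # prints: Reduce output records: 288
```

In practice you would replace the here-doc with the saved log file, e.g. `sed -n 's/^.*Reduce output records=//p' job.log`.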
4>. Check in the web UI that output data appeared on HDFS
5>. View the finished job in the YARN web UI's history list
6>. View the history record
7>. Configure log aggregation
For details, see: https://www.cnblogs.com/yinzhengjie/p/9471921.html
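Section 7 only links out to the log-aggregation article. As a quick orientation: log aggregation is enabled in yarn-site.xml rather than mapred-site.xml. A minimal sketch (the host name matches this cluster and the retention period is only an example; YARN must be restarted after the change):

```xml
<!-- Sketch of yarn-site.xml log-aggregation settings; the host and the
     retention value are examples, not requirements. -->
<property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
</property>

<property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>   <!-- keep aggregated logs for 7 days -->
</property>

<property>
        <name>yarn.log.server.url</name>
        <value>http://s101:19888/jobhistory/logs</value>
</property>
```

With aggregation enabled, the NodeManagers upload container logs to HDFS when an application finishes, and the "Logs" links in the history server's web UI resolve through yarn.log.server.url.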
Reprinted from: https://www.cnblogs.com/yinzhengjie/p/9466159.html