Hadoop Basics - Configuring the History Server

                                    Author: Yin Zhengjie (尹正杰)

Copyright notice: Original work, reposting is declined! Violators will be held legally liable.

   Hadoop ships with a history server. Through it you can view the records of completed MapReduce jobs, such as how many map tasks and how many reduce tasks were used, the job submission time, the job start time, the job completion time, and so on. By default the Hadoop history server is not started; it can be started with the bundled command (mr-jobhistory-daemon.sh).
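
   For reference, the daemon script is used as follows (a sketch only; it assumes $HADOOP_HOME/sbin is on the PATH, as it is in the environment used below):

[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh start historyserver      #start the JobHistoryServer daemon
[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh stop historyserver       #stop it again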

I. Running a MapReduce job on YARN

1>. Start the cluster

[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
3043 ResourceManager
2507 NameNode
3389 Jps
2814 DFSZKFailoverController
Command executed successfully
============= s102 jps ============
2417 DataNode
2484 JournalNode
2664 NodeManager
2828 Jps
2335 QuorumPeerMain
Command executed successfully
============= s103 jps ============
2421 DataNode
2488 JournalNode
2666 NodeManager
2333 QuorumPeerMain
2830 Jps
Command executed successfully
============= s104 jps ============
2657 NodeManager
2818 Jps
2328 QuorumPeerMain
2410 DataNode
2477 JournalNode
Command executed successfully
============= s105 jps ============
2688 Jps
2355 NameNode
2424 DFSZKFailoverController
Command executed successfully
[yinzhengjie@s101 ~]$ 

2>. Run a MapReduce job on YARN

[yinzhengjie@s101 ~]$ hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /yinzhengjie/data/ /yinzhengjie/data/output
18/08/21 07:37:35 INFO client.RMProxy: Connecting to ResourceManager at s101/172.30.1.101:8032
18/08/21 07:37:37 INFO input.FileInputFormat: Total input paths to process : 1
18/08/21 07:37:37 INFO mapreduce.JobSubmitter: number of splits:1
18/08/21 07:37:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1534851274873_0001
18/08/21 07:37:37 INFO impl.YarnClientImpl: Submitted application application_1534851274873_0001
18/08/21 07:37:37 INFO mapreduce.Job: The url to track the job: http://s101:8088/proxy/application_1534851274873_0001/
18/08/21 07:37:37 INFO mapreduce.Job: Running job: job_1534851274873_0001
18/08/21 07:37:55 INFO mapreduce.Job: Job job_1534851274873_0001 running in uber mode : false
18/08/21 07:37:55 INFO mapreduce.Job:  map 0% reduce 0%
18/08/21 07:38:13 INFO mapreduce.Job:  map 100% reduce 0%
18/08/21 07:38:31 INFO mapreduce.Job:  map 100% reduce 100%
18/08/21 07:38:32 INFO mapreduce.Job: Job job_1534851274873_0001 completed successfully
18/08/21 07:38:32 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=4469
        FILE: Number of bytes written=249719
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=3925
        HDFS: Number of bytes written=3315
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=15295
        Total time spent by all reduces in occupied slots (ms)=15161
        Total time spent by all map tasks (ms)=15295
        Total time spent by all reduce tasks (ms)=15161
        Total vcore-milliseconds taken by all map tasks=15295
        Total vcore-milliseconds taken by all reduce tasks=15161
        Total megabyte-milliseconds taken by all map tasks=15662080
        Total megabyte-milliseconds taken by all reduce tasks=15524864
    Map-Reduce Framework
        Map input records=104
        Map output records=497
        Map output bytes=5733
        Map output materialized bytes=4469
        Input split bytes=108
        Combine input records=497
        Combine output records=288
        Reduce input groups=288
        Reduce shuffle bytes=4469
        Reduce input records=288
        Reduce output records=288
        Spilled Records=576
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=163
        CPU time spent (ms)=1430
        Physical memory (bytes) snapshot=439443456
        Virtual memory (bytes) snapshot=4216639488
        Total committed heap usage (bytes)=286785536
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3817
    File Output Format Counters
        Bytes Written=3315
[yinzhengjie@s101 ~]$ 
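
Besides the web UI checks in the next steps, the wordcount result itself can be inspected from the command line; a minimal sketch (part-r-00000 is the usual name of the single reducer's output file and is not taken from the session above):

[yinzhengjie@s101 ~]$ hdfs dfs -ls /yinzhengjie/data/output                    #list the output directory
[yinzhengjie@s101 ~]$ hdfs dfs -cat /yinzhengjie/data/output/part-r-00000      #print the word counts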

3>. Check via the web UI whether output data has been produced on HDFS
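
A typical way to do this is through the NameNode file browser; a sketch, assuming the default NameNode HTTP port (50070) of Hadoop 2.7.3 and its standard explorer path:

http://s101:50070/explorer.html#/yinzhengjie/data/output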

4>. View the job record in YARN
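
The finished application shows up in the ResourceManager web UI at http://s101:8088 (the tracking URL printed in the job output above). It can also be queried from the command line; a minimal sketch with the standard yarn CLI, using the application ID from the run above:

[yinzhengjie@s101 ~]$ yarn application -list -appStates FINISHED                        #list finished applications
[yinzhengjie@s101 ~]$ yarn application -status application_1534851274873_0001          #details of the job above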

5>. Try to view the history log and find it cannot be accessed
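
The "History" link in the ResourceManager UI points at the JobHistoryServer's web address, and no such server is running yet, so the page cannot be opened. A quick way to confirm this before doing the configuration below (a sketch):

[yinzhengjie@s101 ~]$ jps | grep JobHistoryServer      #no output is expected at this point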

II. Configuring the YARN history server

1>. Modify the "mapred-site.xml" configuration file

[yinzhengjie@s101 ~]$ more /soft/hadoop/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>s101:10020</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>s101:19888</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.done-dir</name>
                <value>${yarn.app.mapreduce.am.staging-dir}/done</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.intermediate-done-dir</name>
                <value>${yarn.app.mapreduce.am.staging-dir}/done_intermediate</value>
        </property>

        <property>
                <name>yarn.app.mapreduce.am.staging-dir</name>
                <value>/yinzhengjie/logs/hdfs/history</value>
        </property>

</configuration>

<!--
Purpose of the mapred-site.xml configuration file:
        # MapReduce-related settings, such as the default number of reduce tasks and the
default memory limits for tasks. Parameters defined here override the defaults in
mapred-default.xml.

Purpose of the mapreduce.framework.name parameter:
        # Specifies the MapReduce execution framework. There are three options: local,
classic (the Hadoop 1.x execution framework), and yarn (the 2.x execution framework).
Here we use yarn.

Purpose of the mapreduce.jobhistory.address parameter:
        # Specifies the address (host:port) of the job history server.

Purpose of the mapreduce.jobhistory.webapp.address parameter:
        # Specifies the host and port of the history server web UI.

Purpose of the mapreduce.jobhistory.done-dir parameter:
        # Specifies where the records of completed Hadoop jobs are stored.

Purpose of the mapreduce.jobhistory.intermediate-done-dir parameter:
        # Specifies where the records of running (not yet archived) Hadoop jobs are stored.

Purpose of the yarn.app.mapreduce.am.staging-dir parameter:
        # Specifies the staging directory for application IDs, required jar files, and so on.

-->
[yinzhengjie@s101 ~]$ 
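
If mapred-site.xml is edited only on s101, the change may also need to be distributed to the other nodes so that jobs submitted from them see the same history settings; a minimal sketch using scp (host names follow the cluster listing above):

[yinzhengjie@s101 ~]$ for host in s102 s103 s104 s105; do scp /soft/hadoop/etc/hadoop/mapred-site.xml ${host}:/soft/hadoop/etc/hadoop/; done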

2>. Start the history server service

[yinzhengjie@s101 ~]$ hdfs dfs -mkdir /yinzhengjie/logs/hdfs/history      #Create the directory that will hold the history logs
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh start historyserver      #Start the history server
starting historyserver, logging to /soft/hadoop-2.7.3/logs/mapred-yinzhengjie-historyserver-s101.out
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ jps
3043 ResourceManager
4009 JobHistoryServer        #Note: this process is the history server
2507 NameNode
4045 Jps
2814 DFSZKFailoverController
[yinzhengjie@s101 ~]$ 
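
Whether the two ports configured above are actually listening can also be checked; a sketch using netstat (assuming the net-tools package is installed; 10020 is the RPC port and 19888 the web UI port):

[yinzhengjie@s101 ~]$ netstat -tlnp 2>/dev/null | grep -E '10020|19888'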

3>. Run the MapReduce job on YARN again

[yinzhengjie@s101 ~]$ hdfs dfs -rm -R /yinzhengjie/data/output        #Delete the previous output directory
18/08/21 08:43:34 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /yinzhengjie/data/output
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /yinzhengjie/data/input  /yinzhengjie/data/output
18/08/21 08:44:58 INFO client.RMProxy: Connecting to ResourceManager at s101/172.30.1.101:8032
18/08/21 08:44:58 INFO input.FileInputFormat: Total input paths to process : 1
18/08/21 08:44:58 INFO mapreduce.JobSubmitter: number of splits:1
18/08/21 08:44:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1534851274873_0002
18/08/21 08:44:59 INFO impl.YarnClientImpl: Submitted application application_1534851274873_0002
18/08/21 08:44:59 INFO mapreduce.Job: The url to track the job: http://s101:8088/proxy/application_1534851274873_0002/
18/08/21 08:44:59 INFO mapreduce.Job: Running job: job_1534851274873_0002
18/08/21 08:45:15 INFO mapreduce.Job: Job job_1534851274873_0002 running in uber mode : false
18/08/21 08:45:15 INFO mapreduce.Job:  map 0% reduce 0%
18/08/21 08:45:30 INFO mapreduce.Job:  map 100% reduce 0%
18/08/21 08:45:45 INFO mapreduce.Job:  map 100% reduce 100%
18/08/21 08:45:45 INFO mapreduce.Job: Job job_1534851274873_0002 completed successfully
18/08/21 08:45:46 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=4469
        FILE: Number of bytes written=249693
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=3931
        HDFS: Number of bytes written=3315
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=12763
        Total time spent by all reduces in occupied slots (ms)=12963
        Total time spent by all map tasks (ms)=12763
        Total time spent by all reduce tasks (ms)=12963
        Total vcore-milliseconds taken by all map tasks=12763
        Total vcore-milliseconds taken by all reduce tasks=12963
        Total megabyte-milliseconds taken by all map tasks=13069312
        Total megabyte-milliseconds taken by all reduce tasks=13274112
    Map-Reduce Framework
        Map input records=104
        Map output records=497
        Map output bytes=5733
        Map output materialized bytes=4469
        Input split bytes=114
        Combine input records=497
        Combine output records=288
        Reduce input groups=288
        Reduce shuffle bytes=4469
        Reduce input records=288
        Reduce output records=288
        Spilled Records=576
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=139
        CPU time spent (ms)=1610
        Physical memory (bytes) snapshot=439873536
        Virtual memory (bytes) snapshot=4216696832
        Total committed heap usage (bytes)=281018368
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3817
    File Output Format Counters
        Bytes Written=3315
[yinzhengjie@s101 ~]$ 
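
Once the job completes, its history files should land under the directories configured in mapred-site.xml above (typically .jhist and job configuration XML files inside the done/done_intermediate subdirectories); a sketch for checking this on HDFS:

[yinzhengjie@s101 ~]$ hdfs dfs -ls -R /yinzhengjie/logs/hdfs/history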

4>. Check via the web UI whether output data has been produced on HDFS

5>. View the historical task in the YARN web UI

6>. View the history record
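
With the configuration above, the history server web UI is served from the address in mapreduce.jobhistory.webapp.address; its landing page is normally reachable at:

http://s101:19888/jobhistory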

7>. Configure log aggregation

  For details, see: https://www.cnblogs.com/yinzhengjie/p/9471921.html

Reposted from: https://www.cnblogs.com/yinzhengjie/p/9466159.html
