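Hadoop Primer 8: A Custom OutputFormat That Writes Data to Different Paths as Needed
In some cases the output has to be split, with different records going to paths we choose ourselves. To do that, we write our own OutputFormat.
Test data (excerpt):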
1374609798.19 1374609798.20 1374609798.20 1374609798.51 110 5 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 220.181.112.82 55351 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 mobads-logs.baidu.com http://mobads-logs.baidu.com/ad.log?url2=nH0drHn_PjRsrasvnWc3PHnvQjczrjc_nW0sQjc_nasYmW6kmhubn7qWTZc_PAc3nyFhuj0_RLK-mv-9U7P8whqzRy-dTv9GQZP4UyFGmy3_FBmkPBmhn0&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0Ynj0sn0&rnd=93528145.000000 NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1217 366 1 3 0 0 1 3 0 0 0 0 http://mobads-logs.baidu.com/ad.log?url2=nH0drHn_PjRsrasvnWc3PHnvQjczrjc_nW0sQjc_nasYmW6kmhubn7qWTZc_PAc3nyFhuj0_RLK-mv-9U7P8whqzRy-dTv9GQZP4UyFGmy3_FBmkPBmhn0&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0Ynj0sn0&rnd=93528145.000000 5903897840525807627 5903904035903819787 5915956
1374609778.91 1374609779.15 1374609779.15 1374609779.15 134 591 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 111.13.12.15 55390 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 m.baidu.com http://m.baidu.com/bd_page_type=1/pu=sz%40240%5F320%2Cta%40middle%5F%5F3%2E1%5F1%5F8%2E9%2Cusm%400/uid=DF457D5FC05AAC3ECA12096D8BBFB663/t=wap/w=0_10_%E8%89%B2%E7%9C%AF%E7%9C%AF/ssid=0/from=2001a/l=0/tc?func=nextp&pi=2&m=128&pn=11&src=http%3A%2F%2Fwww%2E0901s%2Ecom%2Farticlelist%2F%3F23%2D7%2Ehtml NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1693 4080 4 6 0 0 4 6 0 0 0 0 http://m.baidu.com/bd_page_type=1/pu=sz%40240%5F320%2Cta%40middle%5F%5F3%2E1%5F1%5F8%2E9%2Cusm%400/uid=DF457D5FC05AAC3ECA12096D8BBFB663/t=wap/w=0_10_%E8%89%B2%E7%9C%AF%E7%9C%AF/ssid=0/from=2001a/l=0/tc?func=nextp&pi=2&m=128&pn=11&src=http%3A%2F%2Fwww%2E09 5903897840525807627 5903904043347443723 5915956
1374609732.67 1374609741.82 1374609741.82 1374609741.82 134 591 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 111.13.12.15 46666 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 m.baidu.com http://m.baidu.com/bd_page_type=1/pu=sz%40240%5F320%2Cta%40middle%5F%5F3%2E1%5F1%5F8%2E9%2Cusm%400/uid=DF457D5FC05AAC3ECA12096D8BBFB663/t=wap/w=0_10_%E8%89%B2%E7%9C%AF%E7%9C%AF/ssid=0/from=2001a/l=0/tc?pn=11&m=128&src=www%2E0901s%2Ecom%2Farticle%2F%3F5616%2Ehtml NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1614 1091 3 3 0 0 3 3 0 0 0 0 http://m.baidu.com/bd_page_type=1/pu=sz%40240%5F320%2Cta%40middle%5F%5F3%2E1%5F1%5F8%2E9%2Cusm%400/uid=DF457D5FC05AAC3ECA12096D8BBFB663/t=wap/w=0_10_%E8%89%B2%E7%9C%AF%E7%9C%AF/ssid=0/from=2001a/l=0/tc?pn=11&m=128&src=www%2E0901s%2Ecom%2Farticle%2F%3F5616 5903897840525807627 5903903844056297483 5915956
1374609771.39 1374609771.57 1374609771.57 1374609771.57 134 591 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 111.13.12.15 40314 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 m.baidu.com http://m.baidu.com/bd_page_type=1/pu=sz%40240%5F320%2Cta%40middle%5F%5F3%2E1%5F1%5F8%2E9%2Cusm%400/uid=DF457D5FC05AAC3ECA12096D8BBFB663/t=wap/w=0_10_%E8%89%B2%E7%9C%AF%E7%9C%AF/ssid=0/from=2001a/l=0/tc?pn=11&m=128&src=www%2E0901s%2Ecom%2Farticlelist%2F%3F23%2D7%2Ehtml NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1660 5832 4 7 0 0 4 7 0 0 0 0 http://m.baidu.com/bd_page_type=1/pu=sz%40240%5F320%2Cta%40middle%5F%5F3%2E1%5F1%5F8%2E9%2Cusm%400/uid=DF457D5FC05AAC3ECA12096D8BBFB663/t=wap/w=0_10_%E8%89%B2%E7%9C%AF%E7%9C%AF/ssid=0/from=2001a/l=0/tc?pn=11&m=128&src=www%2E0901s%2Ecom%2Farticlelist%2F%3F 5903897840525807627 5903904010625413131 5915956
1374609798.85 1374609798.86 1374609798.86 1374609799.03 110 5 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 220.181.112.82 40887 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 mobads-logs.baidu.com http://mobads-logs.baidu.com/ad.log?url2=nHDLP1n_PHD4PBsvn16vnWcvQjcvPHR_nH0sQjc_nasYmW6kmhubn7qWTZc_PAc3nyFhuj0_RLK-mv-9U7P8whqzRy-dTv9GQZP4UyFGmy3_FBmkPBmhn0&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0Ynj0sn0&rnd=56776839.000000 NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1217 366 1 3 0 0 1 3 0 0 0 0 http://mobads-logs.baidu.com/ad.log?url2=nHDLP1n_PHD4PBsvn16vnWcvQjcvPHR_nH0sQjc_nasYmW6kmhubn7qWTZc_PAc3nyFhuj0_RLK-mv-9U7P8whqzRy-dTv9GQZP4UyFGmy3_FBmkPBmhn0&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0Ynj0sn0&rnd=56776839.000000 5903897840525807627 5903904128756236299 5915956
1374609777.43 1374609777.44 1374609777.44 1374609777.61 110 5 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 220.181.112.82 44427 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 mobads-logs.baidu.com http://mobads-logs.baidu.com/ad.log?url2=nHDYnHc_Pjb3rasvnHbYPjcsQjcdPH0_nW0sQjc_QHfdnHRsrjDsPzsYmW6kmhubn7qWTZc_PAc3nyFhuj0_n1csnDFjnYR4nDmkwjKarDNAPH0LwWFAnYRswRujPjn_TL-Vmh-9UBYkQA4Epv-9QHmknWKWpimhnHmhFWcvPBmh&__mobads_ta=mLwzrW0_mywJIgPYrW00&__mobads_qk=51eee16b6e10202ef7e2968dc4708590193811c6&exp_id=gd,zl,&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0Ynjcdnf&rnd=609569073 NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1365 366 1 3 0 0 1 3 0 0 0 0 http://mobads-logs.baidu.com/ad.log?url2=nHDYnHc_Pjb3rasvnHbYPjcsQjcdPH0_nW0sQjc_QHfdnHRsrjDsPzsYmW6kmhubn7qWTZc_PAc3nyFhuj0_n1csnDFjnYR4nDmkwjKarDNAPH0LwWFAnYRswRujPjn_TL-Vmh-9UBYkQA4Epv-9QHmknWKWpimhnHmhFWcvPBmh&__mobads_ta=mLwzrW0_mywJIgPYrW00&__mobads 5903897840525807627 5903904035903823883 5915956
1374609776.77 1374609776.78 1374609776.78 1374609777.07 110 5 8615103869897 460029934830160 3559380454939260 2 460 0 14443 15406 10.184.49.172 220.181.112.82 55105 80 6 cmnet 1 221.177.233.5 221.177.217.145 221.177.233.6 221.177.217.155 mobads-logs.baidu.com http://mobads-logs.baidu.com/ad.log?url2=nH0vnH6_PjRknasvnWc3PHnvQjczrjf_nW0sQjc_QHfdnHRsrjDsPzsYmW6kmhubn7qWTZc_PAc3nyFhuj0_n1csnDFjnYR4nDmkwjKarDNAPH0LwWFAnYRswRujPjn_TL-Vmh-9UBYkQA4Epv-9QHmknWKWpimhnHmhFWcvPBmh&__mobads_ta=mLwzrW0_mywJIgPYrW00&__mobads_qk=51eee16b6e10202ef7e2968dc4708590193811c6&exp_id=gd,zl,&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0Ynjcdnf&rnd=1685946172 NOKIA6120ci/UCWEB8.9.0.253/28/999 GET 200 1366 366 1 3 0 0 1 3 0 0 0 0 http://mobads-logs.baidu.com/ad.log?url2=nH0vnH6_PjRknasvnWc3PHnvQjczrjf_nW0sQjc_QHfdnHRsrjDsPzsYmW6kmhubn7qWTZc_PAc3nyFhuj0_n1csnDFjnYR4nDmkwjKarDNAPH0LwWFAnYRswRujPjn_TL-Vmh-9UBYkQA4Epv-9QHmknWKWpimhnHmhFWcvPBmh&__mobads_ta=mLwzrW0_mywJIgPYrW00&__mobads 5903897840525807627 5903903694712119307 5915956
1374609806.06 1374609806.10 1374609806.10 1374609807.38 110 4 8613526051568 460003760137902 8674910129582223 2 460 0 14254 2844 10.88.83.12 10.0.0.172 40793 80 6 cmwap 1 221.177.217.135 221.177.217.145 221.177.217.136 221.177.217.149 rc.dxsvr.com http://rc.dxsvr.com/get?dv=1.4.5&is=460003760137902&model=8150&op=46000&lp=1&locale=zh_CN&pkg=com.dianxinos.dxbb&net=2&tk=GVTQONILweFpMcxchEcj4g==&h=800&w=480&v=5003&ie=867491012958222&lc=D6PogbkVGkUYW1fJ&sdk=10&dpi=240&rv=1.1 Apache-HttpClient/UNAVAILABLE (java 1.4) GET 200 520 443 3 2 0 0 3 2 0 0 0 0 http://rc.dxsvr.com/get?dv=1.4.5&is=460003760137902&model=8150&op=46000&lp=1&locale=zh_CN&pkg=com.dianxinos.dxbb&net=2&tk=GVTQONILweFpMcxchEcj4g==&h=800&w=480&v=5003&ie=867491012958222&lc=D6PogbkVGkUYW1fJ&sdk=10&dpi=240&rv=1.1 5903899536962572299 5903904150061305867 5926819
1374609808.54 1374609808.61 1374609808.61 1374609809.38 110 362 8613592017377 460003796093791 8606460252897478 2 460 0 18737 26732 10.88.114.208 10.0.0.172 45268 80 6 cmwap 1 221.177.156.71 221.177.217.145 221.177.156.71 221.177.217.150 bufferfly.mqsng.qq.com http://bufferfly.mqsng.qq.com/analytics/upload POST 200 600 535 4 2 0 0 4 2 0 0 0 0 http://bufferfly.mqsng.qq.com/analytics/upload 5903903713112543243 5903904168439287819 5967156
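The mapper shown later reads the request URL from the whitespace-separated field at index 26 of each line. A minimal sketch of that extraction, using only the JDK: the class name `UrlFieldSketch` and the method `extractUrl` are hypothetical, and `String.split("\\s+")` stands in for the `StringUtils.split` call used in the mapper.

```java
import java.util.Optional;

class UrlFieldSketch {
    // Index of the URL column, taken from strs[26] in the mapper.
    static final int URL_INDEX = 26;

    // The last sample line above, used as demo input.
    static final String SAMPLE =
        "1374609808.54 1374609808.61 1374609808.61 1374609809.38 110 362 "
        + "8613592017377 460003796093791 8606460252897478 2 460 0 18737 26732 "
        + "10.88.114.208 10.0.0.172 45268 80 6 cmwap 1 221.177.156.71 "
        + "221.177.217.145 221.177.156.71 221.177.217.150 bufferfly.mqsng.qq.com "
        + "http://bufferfly.mqsng.qq.com/analytics/upload POST 200 600 535 "
        + "4 2 0 0 4 2 0 0 0 0 http://bufferfly.mqsng.qq.com/analytics/upload "
        + "5903903713112543243 5903904168439287819 5967156";

    // Returns the URL field, or empty when the line has too few fields.
    static Optional<String> extractUrl(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length <= URL_INDEX) {
            return Optional.empty();
        }
        return Optional.of(fields[URL_INDEX]);
    }

    public static void main(String[] args) {
        System.out.println(extractUrl(SAMPLE).orElse("<malformed line>"));
        System.out.println(extractUrl("too short").orElse("<malformed line>"));
    }
}
```

Returning an empty `Optional` for short lines plays the role of the mapper's catch block, which counts malformed lines instead of failing the task.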
Data in the database (excerpt):
INSERT INTO `url_rule` VALUES ('http://timg01.baidu-1img.cn/timg?imagewise_list&size=w100&quality=60&sec=1374609621&di=66db0b184da63c1c76d87f8b243d07c9&src=http://i3.baidu.com/it/u=1069655089,3312248484&fm=21&gp=0.jpg', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://timg01.baidu-2img.cn/timg?tc&size=w304&sec=1374609877&di=5da5f6be8f6aa1605fe035eacaafc235&imgtype=0&quality=80&src=http%3A%2F%2Fpics%2Ewajiw%2Ecom%2Fimg%5F02%2Fa%5F8%2F235097135%5F1%5F0%2Ejpg', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://timg01.baidu-2img.cn/timg?tc&size=w304&sec=1374609877&di=b4d535480eb486257dbb05a0b9c986f3&imgtype=0&quality=80&src=http%3A%2F%2Fpics%2Ewajiw%2Ecom%2Fimg%5F02%2Fa%5F8%2F775906177%5F1%5F0%2Ejpg', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://timg01.baidu-2img.cn/timg?tc&size=w304&sec=1374609877&di=da5b8374285c90c693ff1916e42b12f4&imgtype=0&quality=80&src=http%3A%2F%2Fpics%2Ewajiw%2Ecom%2Fimg%5F02%2Fa%5F8%2F518671053%5F1%5F0%2Ejpg', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://wap.wapreach.com/upload/view/2013/06/20130621194639803.png', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://www.17caifu.com/BestAugury/bazi/Images/2007_Q&A_job.gif', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://house60.3g.qq.com/g/s?sid=AULxRr6syQFZz-P3wHdxoEZG&3G_UIN=1324479092&saveURL=0&aid=home_self&g_f=595', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://m1.baidu.com/bd_page_type=1/pu=sz%40224%5F220%2Cta%40middle%5F%5F%5F%5F%2Cusm%400/uid=CBF7BD93377D6EDDE086B9855274FEA8/t=wap/w=0_10_Www%2ECcc36%2ECom/ssid=0/from=643e/l=0/tc?pn=11&m=0&src=www%2Eccc36%2Ecom%2Fqiangjian%2F20130118%2F133645%2Ehtml', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://mobads-logs.baidu.com:80/ad.log?url2=nHD3n1R_PHcvrisdPWRLPWn4QjcvrHD_nW0sQjc_QHndPWmLPH63PisYmW6kmhubn7qWTZc_PAc3nyFhuj0_wbfdn1bvPW0Yf1RkwHuDrHNjfWKDn16vnYRknWcLfRn_TL-Vmh-9UBYkQA4Epv-9QHRzn1nhFWDvFBmzPHnhFBfb&__mobads_ta=mLwzrW0_mywJIgPYrW00&__mobads_qk=51eee0fcf01684baf7e2968db627d5d410ae827f&exp_id=gd,zl,&extra2=nj0snjDsnj0snj0snj0snisznjD1njTzPj0YnjDsn0&rnd=1386322966', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://r3.sinaimg.cn/10170/2013/0723/8d/3/52441965/360x532x75x0.jpg', 'somecontent');
INSERT INTO `url_rule` VALUES ('http://r3.sinaimg.cn/10170/2013/0723/b6/e/53444068/360x532x75x0.jpg', 'somecontent');
Code:
Custom OutputFormat:
package com.zsy.mr.logenhance;

import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogEnhanceOutPutFormat extends FileOutputFormat<Text, NullWritable> {

    @Override
    public RecordWriter<Text, NullWritable> getRecordWriter(TaskAttemptContext job)
            throws IOException, InterruptedException {
        FileSystem fs = FileSystem.get(job.getConfiguration());
        // Fixed paths: fs.create(Path) overwrites an existing file,
        // so this assumes a single task writes the output.
        Path enHancePath = new Path("hdfs://hadoop01:9000/logenhance/echancelog/log.data");
        Path urlPath = new Path("hdfs://hadoop01:9000/logenhance/NeedGrabBag/url.data");
        FSDataOutputStream enHanceOs = fs.create(enHancePath);
        FSDataOutputStream urlOs = fs.create(urlPath);
        return new LogEnhanceRecordWriter(enHanceOs, urlOs);
    }

    static class LogEnhanceRecordWriter extends RecordWriter<Text, NullWritable> {
        FSDataOutputStream enHanceOs = null;
        FSDataOutputStream urlOs = null;

        public LogEnhanceRecordWriter() {
        }

        public LogEnhanceRecordWriter(FSDataOutputStream enHanceOs, FSDataOutputStream urlOs) {
            this.enHanceOs = enHanceOs;
            this.urlOs = urlOs;
        }

        @Override
        public void write(Text key, NullWritable value) throws IOException, InterruptedException {
            String data = key.toString();
            if (data.contains("NeedGrabBag")) {
                // Records tagged "NeedGrabBag" go to
                // hdfs://hadoop01:9000/logenhance/NeedGrabBag/url.data
                urlOs.write(data.getBytes());
            } else {
                // Everything else goes to
                // hdfs://hadoop01:9000/logenhance/echancelog/log.data
                enHanceOs.write(data.getBytes());
            }
        }

        @Override
        public void close(TaskAttemptContext context) throws IOException, InterruptedException {
            if (urlOs != null) {
                urlOs.close();
            }
            if (enHanceOs != null) {
                enHanceOs.close();
            }
        }
    }
}
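The routing rule inside `LogEnhanceRecordWriter.write` can be tried without a cluster. The sketch below is only an illustration: it swaps the two `FSDataOutputStream` handles for in-memory `ByteArrayOutputStream`s, and the names `RoutingSketch` and `text` are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class RoutingSketch {
    // Marker the mapper appends to URLs that have no rule yet.
    static final String MARKER = "NeedGrabBag";

    // In-memory stand-ins for the log.data and url.data streams.
    final ByteArrayOutputStream enhancedOut = new ByteArrayOutputStream();
    final ByteArrayOutputStream urlOut = new ByteArrayOutputStream();

    // Same rule as the record writer: marked records go to the URL stream,
    // everything else goes to the enhanced-log stream.
    void write(String record) {
        byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream target = record.contains(MARKER) ? urlOut : enhancedOut;
        target.write(bytes, 0, bytes.length);
    }

    // Decodes what a stream has collected so far.
    static String text(ByteArrayOutputStream out) {
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        RoutingSketch writer = new RoutingSketch();
        writer.write("http://example.com/a\tNeedGrabBag\n");
        writer.write("raw log line\tsomecontent\n");
        System.out.print("url.data -> " + text(writer.urlOut));
        System.out.print("log.data -> " + text(writer.enhancedOut));
    }
}
```

Because the decision is a plain substring check, any enhanced log line that happens to contain the text `NeedGrabBag` would also be routed to the URL stream.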
DBUtils code:
package com.zsy.mr.utils;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

public class DBUtils {

    public static Connection getConn() {
        String driver = "com.mysql.jdbc.Driver";
        String url = "jdbc:mysql://192.168.31.11:3306/mytest";
        String username = "root";
        String password = "123456";
        Connection conn = null;
        try {
            Class.forName(driver); // load the JDBC driver through the class loader
            conn = DriverManager.getConnection(url, username, password);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            e.printStackTrace();
        }
        return conn;
    }

    // Loads every row of url_rule into a url -> content map.
    public static Map<String, String> getUrlRuleData() {
        Connection connection = getConn();
        Statement statement = null;
        ResultSet resultSet = null;
        Map<String, String> result = new HashMap<String, String>(128);
        try {
            statement = connection.createStatement();
            String sql = "select url,content from url_rule";
            resultSet = statement.executeQuery(sql);
            while (resultSet.next()) {
                result.put(resultSet.getString(1), resultSet.getString(2));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        } finally {
            if (resultSet != null) {
                try {
                    resultSet.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }
            if (statement != null) {
                try {
                    statement.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }
            if (connection != null) {
                try {
                    connection.close();
                } catch (SQLException e) {
                    e.printStackTrace();
                }
            }
        }
        return result;
    }
}
MapReduce code:
package com.zsy.mr.logenhance;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import com.zsy.mr.utils.DBUtils;

public class LogEnhance {

    static class LogenhanceMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        Map<String, String> ruleMap = new HashMap<String, String>();
        Text k = new Text();
        NullWritable v = NullWritable.get();

        @Override
        protected void setup(Mapper<LongWritable, Text, Text, NullWritable>.Context context)
                throws IOException, InterruptedException {
            // Query the database once per task to load the url -> content pairs
            ruleMap = DBUtils.getUrlRuleData();
        }

        @Override
        protected void map(LongWritable key, Text value,
                Mapper<LongWritable, Text, Text, NullWritable>.Context context)
                throws IOException, InterruptedException {
            // Counts malformed lines ("feifa" means "illegal")
            Counter counter = context.getCounter("feifa", "feifaline");
            String line = value.toString();
            String[] strs = StringUtils.split(line);
            try {
                String url = strs[26];
                String content = ruleMap.get(url);
                // Check whether the URL has a rule in the database:
                // if not, emit it as pending data; if so, enrich the log line
                if (StringUtils.isBlank(content)) {
                    k.set(url + "\t" + "NeedGrabBag" + "\n");
                    context.write(k, v);
                } else {
                    k.set(line + "\t" + content + "\n");
                    context.write(k, v);
                }
            } catch (Exception e) {
                counter.increment(1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        /*
         * conf.set("mapreduce.framework.name", "yarn");
         * conf.set("yarn.resourcemanager.hostname", "hadoop01");
         */
        Job job = Job.getInstance(conf);
        job.setJarByClass(LogEnhance.class);
        // Mapper class used by this job
        job.setMapperClass(LogenhanceMapper.class);
        // job.setReducerClass(LogEnhanceReducer.class);
        // Map output key/value types; if they match the final output types,
        // setting the final output types is enough
        // job.setMapOutputKeyClass(Text.class);
        // job.setMapOutputValueClass(IntWritable.class);
        // Final output key/value types (the reduce output types)
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // A custom OutputFormat lets different records go to different
        // destinations (a database, HDFS, and so on)
        job.setOutputFormatClass(LogEnhanceOutPutFormat.class);
        // Input directory of the job
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        // Output directory of the job. Even with a custom OutputFormat it must
        // still be set, because the _SUCCESS file is written there
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // No reduce phase is needed, so set the number of reduce tasks to 0
        job.setNumReduceTasks(0);
        // Submit the job configuration and the jar containing the job's classes to YARN
        // job.submit() returns no result, so waitForCompletion is preferred
        boolean res = job.waitForCompletion(true);
        System.exit(res ? 0 : 1);
    }
}
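The decision made in `map` (enrich the line when its URL has a rule, otherwise emit the URL tagged for later processing) can be sketched as a pure function. This is an illustration under assumptions: `EnhanceSketch` and `enhance` are hypothetical names, a plain `Map` stands in for the result of `DBUtils.getUrlRuleData()`, and `trim().isEmpty()` stands in for `StringUtils.isBlank`.

```java
import java.util.HashMap;
import java.util.Map;

class EnhanceSketch {
    // Builds the record to emit for one log line whose URL field is `url`.
    static String enhance(String line, String url, Map<String, String> rules) {
        String content = rules.get(url);
        if (content == null || content.trim().isEmpty()) {
            // No rule yet: emit only the URL, tagged so the OutputFormat can route it.
            return url + "\t" + "NeedGrabBag" + "\n";
        }
        // Rule found: append its content to the original line.
        return line + "\t" + content + "\n";
    }

    public static void main(String[] args) {
        Map<String, String> rules = new HashMap<>();
        rules.put("http://known.example/page", "somecontent");
        System.out.print(enhance("log line A", "http://known.example/page", rules));
        System.out.print(enhance("log line B", "http://unknown.example/page", rules));
    }
}
```

Keeping this logic free of Hadoop types makes it easy to unit test before wiring it into the mapper.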
Results:
url.data:
log.data:
Files under outputPath:
That covers writing a custom OutputFormat in Hadoop.