During the mapper phase we sometimes need to know the name of the HDFS file (or its parent directory) that is currently being processed. We can get this through the `Context`: cast the `InputSplit` returned by `context.getInputSplit()` to a `FileSplit` and inspect its `Path`. The code looks like this:
```java
// Inside a Mapper's map() or setup() method; requires
// org.apache.hadoop.mapreduce.lib.input.FileSplit
FileSplit fileSplit = (FileSplit) context.getInputSplit();
System.out.println("========> getPath.getName = " + fileSplit.getPath().getName());
System.out.println("========> getPath = " + fileSplit.getPath().toString());
System.out.println("========> getPath.getParent = " + fileSplit.getPath().getParent().toString());
System.out.println("========> getPath.getParent.getName() = " + fileSplit.getPath().getParent().getName());
```
The resulting log output looks like this:
```
========> getPath.getName = fatal_2015-02-05-04.log
========> getPath = hdfs://mycluster/user/micmiu/demo/nsplog/2015/02/05/7/fatal_2015-02-05-04.log
========> getPath.getParent = hdfs://mycluster/user/micmiu/demo/nsplog/2015/02/05/7
========> getPath.getParent.getName() = 7
```
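The `getName()`/`getParent()` behavior shown above is plain path-segment splitting, so it can be reproduced and tested without a Hadoop cluster. Below is a minimal sketch in plain Java; the class `PathParts` and its helpers are hypothetical stand-ins for illustration, not part of the Hadoop API:

```java
import java.net.URI;

public class PathParts {

    // Mimics Hadoop's Path#getName(): the last path segment (file or dir name).
    static String name(String path) {
        String p = URI.create(path).getPath();
        return p.substring(p.lastIndexOf('/') + 1);
    }

    // Mimics Path#getParent().toString(): everything before the last segment.
    static String parent(String path) {
        return path.substring(0, path.lastIndexOf('/'));
    }

    public static void main(String[] args) {
        String p = "hdfs://mycluster/user/micmiu/demo/nsplog/2015/02/05/7/fatal_2015-02-05-04.log";
        System.out.println(name(p));          // fatal_2015-02-05-04.log
        System.out.println(parent(p));        // hdfs://mycluster/user/micmiu/demo/nsplog/2015/02/05/7
        System.out.println(name(parent(p)));  // 7
    }
}
```

The last line shows why `getPath().getParent().getName()` is handy: when logs are laid out as `.../year/month/day/hour/file`, it extracts the hour directory ("7" here) directly from the split's path.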
—————– EOF @Michael Sun —————–
Original article; when reposting please credit: reposted from micmiu – software development + life notes [ http://www.micmiu.com/ ]
Permalink: http://www.micmiu.com/bigdata/hadoop/hadoop-map-inputsplit-filepath/