Posted to common-issues@hadoop.apache.org by "Mullangi (JIRA)" <ji...@apache.org> on 2013/07/23 10:32:48 UTC
[jira] [Created] (HADOOP-9764) MapReduce output issue
Mullangi created HADOOP-9764:
--------------------------------
Summary: MapReduce output issue
Key: HADOOP-9764
URL: https://issues.apache.org/jira/browse/HADOOP-9764
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 1.0.3
Environment: ubuntu
Reporter: Mullangi
Hi,
I am new to Hadoop.
While practicing with a custom MapReduce program, I found that the result is not as expected when the job runs against a file on HDFS. Note that when I run the same program against a local Unix file, I get the expected result.
Below are the details of my code.
MapReduce in java
==================
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class WordCount1 {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            // Tokenize the line and emit (word, 1) for every token.
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            // Emit only words that occur more than once.
            if (sum > 1) {
                output.collect(key, new IntWritable(sum));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        conf.setJarByClass(WordCount1.class);
        conf.setJobName("wordcount1");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class); // Reduce also runs as the combiner
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
Input file
==========
test my program
during test and my hadoop
your during
get program
Hadoop-generated output file on HDFS file system
================================================
during 2
my 2
test 2
Hadoop-generated output file on local file system
=================================================
during 2
my 2
program 2
test 2
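The missing "program 2" is consistent with `Reduce` also being registered as the combiner: the combiner runs per map task on partial counts, so a word that occurs once in each of two splits is filtered out by `sum > 1` before the reducer ever sees it. The following plain-Java sketch (no Hadoop required; the class name and the split boundary are assumptions for illustration, since the real boundary depends on the cluster) reproduces both observed outputs:

```java
import java.util.*;

public class CombinerFilterDemo {

    // Map phase for one split: tokenize and group each word with a list of 1s.
    static Map<String, List<Integer>> mapSplit(String text) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String w : text.trim().split("\\s+")) {
            grouped.computeIfAbsent(w, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // The reporter's Reduce logic: sum the values, emit only when sum > 1.
    static Map<String, Integer> filteringReduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) sum += v;
            if (sum > 1) out.put(e.getKey(), sum); // words seen once are dropped
        }
        return out;
    }

    // Reduce also used as combiner: the filter runs once per split, then again.
    static Map<String, Integer> runWithCombiner(String... splits) {
        Map<String, List<Integer>> shuffled = new TreeMap<>();
        for (String split : splits) {
            for (Map.Entry<String, Integer> e : filteringReduce(mapSplit(split)).entrySet()) {
                shuffled.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }
        return filteringReduce(shuffled);
    }

    // No combiner: the filter runs once, over the complete counts.
    static Map<String, Integer> runWithoutCombiner(String... splits) {
        Map<String, List<Integer>> shuffled = new TreeMap<>();
        for (String split : splits) {
            for (Map.Entry<String, List<Integer>> e : mapSplit(split).entrySet()) {
                shuffled.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        return filteringReduce(shuffled);
    }

    public static void main(String[] args) {
        // Assumed split boundary: the first three input lines in one map task,
        // the last line ("get program") in another.
        String split1 = "test my program during test and my hadoop your during";
        String split2 = "get program";
        // "program" occurs once per split, so the combiner drops it;
        // "during" occurs twice within split1, so it survives.
        System.out.println("with combiner:    " + runWithCombiner(split1, split2));
        System.out.println("without combiner: " + runWithoutCombiner(split1, split2));
    }
}
```

If this is indeed the cause, the usual fix is to keep the `sum > 1` filter out of the combiner: register a separate combiner class that only sums and always forwards every partial count, and apply the filter solely in the reducer.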
Please help me with this issue.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira