Posted to common-issues@hadoop.apache.org by "Suresh Srinivas (JIRA)" <ji...@apache.org> on 2013/07/23 16:20:48 UTC

[jira] [Commented] (HADOOP-9764) MapReduce output issue

    [ https://issues.apache.org/jira/browse/HADOOP-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716415#comment-13716415 ] 

Suresh Srinivas commented on HADOOP-9764:
-----------------------------------------

Please use the Hadoop user mailing list to ask these types of questions. JIRA is for reporting bugs.
                
> MapReduce output issue
> ----------------------
>
>                 Key: HADOOP-9764
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9764
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 1.0.3
>         Environment: ubuntu
>            Reporter: Mullangi
>
> Hi,
> I am new to Hadoop concepts. 
> While practicing with a custom MapReduce program, I found that the result is not as expected when the job reads its input from HDFS. Please note that when I run the same program on a file on the local (Unix) file system, I get the expected result.
> Below are the details of my code.
> MapReduce in Java
> ==================
> import java.io.IOException;
> import java.util.*;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.conf.*;
> import org.apache.hadoop.io.*;
> import org.apache.hadoop.mapred.*;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.util.*;
> public class WordCount1 {
>     public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
>       private final static IntWritable one = new IntWritable(1);
>       private Text word = new Text();
>       public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
>         String line = value.toString();
>         String tokenedZone=null;
>         StringTokenizer tokenizer = new StringTokenizer(line);
>         while (tokenizer.hasMoreTokens()) {
>           tokenedZone=tokenizer.nextToken();
>           word.set(tokenedZone);
>           output.collect(word, one);
>         }
>       }
>     }
>     public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
>       public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
>         int sum = 0;
>         int val = 0;
>         while (values.hasNext()) {
>         	val = values.next().get();
>         	sum += val;
>         }
>         if (sum > 1)
>         	output.collect(key, new IntWritable(sum));
>       }
>     }
>     public static void main(String[] args) throws Exception {
>       JobConf conf = new JobConf();
>       conf.setJarByClass(WordCount1.class);
>       conf.setJobName("wordcount1");
>       
>       conf.setOutputKeyClass(Text.class);
>       conf.setOutputValueClass(IntWritable.class);
>       conf.setMapperClass(Map.class);
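>       // The same Reduce class is registered both as the combiner (run on each
>       // map task's output) and as the final reducer.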
>       conf.setCombinerClass(Reduce.class);
>       conf.setReducerClass(Reduce.class);
>       conf.setInputFormat(TextInputFormat.class);
>       conf.setOutputFormat(TextOutputFormat.class);
>       
>       Path inPath = new Path(args[0]);
>       Path outPath = new Path(args[1]);
>       FileInputFormat.setInputPaths(conf, inPath);
>       FileOutputFormat.setOutputPath(conf, outPath);
>       JobClient.runJob(conf);
>     }
>   
> }
> Input file
> ===========
> test my program
> during test and my hadoop 
> your during
> get program
> Hadoop-generated output file on HDFS
> =====================================
> during	2
> my	2
> test	2
> Hadoop-generated output file on the local file system
> ======================================================
> during	2
> my	2
> program	2
> test	2
> Please help me with this issue.
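
A plausible explanation for the difference shown above (an inference from the posted code, not confirmed in this thread): the Reduce class is registered as the combiner as well as the reducer, so the "sum > 1" filter is also applied to the partial counts produced by each map task. A combiner may run zero, one, or several times on partial data, so any word whose partial sum is 1 in a given map task's output (for example "program", if the HDFS input is read as more than one split) is discarded before the final reducer ever sees it, while a single-map local run sums both occurrences in one combine call and keeps the word. Below is a minimal sketch of a non-filtering combiner, assuming the same imports and the old mapred API used in the listing above; the class name Combine is illustrative.

    // Combiner: sums partial counts but never filters, so no word is dropped
    // before the final reduce.
    public static class Combine extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {
      public void reduce(Text key, Iterator<IntWritable> values,
                         OutputCollector<Text, IntWritable> output, Reporter reporter)
          throws IOException {
        int sum = 0;
        while (values.hasNext()) {
          sum += values.next().get();
        }
        // Always emit the partial sum; the "sum > 1" filter belongs only in Reduce.
        output.collect(key, new IntWritable(sum));
      }
    }

    // In main(), the registration would then become:
    //   conf.setCombinerClass(Combine.class);
    //   conf.setReducerClass(Reduce.class);   // Reduce keeps the sum > 1 filter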

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira