Posted to common-user@hadoop.apache.org by Matt Tanquary <ma...@gmail.com> on 2010/10/19 18:44:55 UTC

Simple Reduce not working

I have set up a simple MapReduce job, but it's not working as I would expect.
The input is a simple pipe-delimited file. The map is generating keys/values
exactly as expected, but the reduce is not grouping the values per key.

A simplified example:

1|5
1|6
2|7
2|8

On input of line one, the map key is 1 and the value is 5, and so forth. I
expect that the sets will be 1 [5, 6] and 2 [7, 8]. Seems simple enough. But
my output is not creating the sets as expected.

This is literally my output:
2000000002      810
2000000002      815
2000000004      591
2000000004      818
2000000006      140
2000000006      821

Where I would have expected:
2000000002      810|815
2000000004      591|818
2000000006      140|821


Here are my relevant code snippets:

public static class InitialProductMapper extends Mapper<Object, Text, Text, Text>
    {
        private Text basketKey = new Text();
        private Text item = new Text();

        public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {

            String[] record = value.toString().split("\\|");
            //String newWord = record[TRANSACTION_DATE_INDEX]+"|"+record[TRANSACTION_KEY_INDEX]+"|"+record[STORE_KEY_INDEX];
            String newWord = record[TRANSACTION_KEY_INDEX];
            basketKey.set(newWord);
            logger.info("Mapper KEY: " + basketKey.toString());
            item.set(record[PRODUCT_KEY_INDEX]);
            logger.info("Mapper ITEM: " + item);
            context.write(basketKey, item);

        }
    }
.
.
.
public static class InitialProductReducer extends Reducer<Text, Text, Text, Text>
    {
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {

            String itemList = "";
            Text items = new Text();

            for (IntWritable val : values) {
                itemList = val + "|";
            }

            itemList = itemList.substring(0, itemList.lastIndexOf("|")-1);
            items.set(itemList);

            context.write(key, items);
        }
    }
.
.
.

public static void main(String[] args)
        throws IOException, InterruptedException, ClassNotFoundException {
        // TODO Auto-generated method stub
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args)
                        .getRemainingArgs();
        if (otherArgs.length != 2) {
                System.err.println("Usage: ProductCount <in> <out>");
                System.exit(2);
        }

        Job job= new Job(conf, "MyJob");

        job.setJarByClass(ProductCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setMapperClass(InitialProductMapper.class);
        //job.setCombinerClass(InitialProductReducer.class);
        job.setReducerClass(InitialProductReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        //Initial results output
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]+"/initial"));

        if (job.waitForCompletion(true))
        {
.
.
.

-- 
Have you thanked a teacher today? ---> http://www.liftateacher.org
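
A hedged observation on the reducer above, not a confirmed diagnosis: the class extends Reducer<Text, Text, Text, Text>, but its method is declared as reduce(Text, Iterable<IntWritable>, Context). In Java, a method whose parameter types differ from the inherited one is an overload rather than an override, so a framework that calls the inherited method never reaches it. Hadoop's default Reducer.reduce writes every value through unchanged, which would be consistent with the one-value-per-line output shown. The sketch below illustrates the effect with hypothetical stand-in classes (BaseReducer, MismatchedReducer, MatchingReducer); none of them are Hadoop types, and plain String and Integer stand in for Text and IntWritable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OverloadDemo {
    // Hypothetical stand-in for a framework base class; this is NOT Hadoop's Reducer.
    static class BaseReducer<K, V> {
        final List<String> out = new ArrayList<>();

        // Default behaviour: pass every value through unchanged.
        protected void reduce(K key, Iterable<V> values) {
            for (V v : values) {
                out.add(key + "\t" + v);
            }
        }

        // The "framework" only ever calls the method declared above.
        void run(K key, Iterable<V> values) {
            reduce(key, values);
        }
    }

    // Mirrors the mismatch in the post: the class is parameterised with String
    // values, but the method takes Iterable<Integer>. This overloads reduce,
    // so run() never calls it. Adding @Override here would be a compile error.
    static class MismatchedReducer extends BaseReducer<String, String> {
        protected void reduce(String key, Iterable<Integer> values) {
            out.add(key + "\tnever reached via run()");
        }
    }

    // Parameter types match the class's type arguments, so this overrides.
    static class MatchingReducer extends BaseReducer<String, String> {
        @Override
        protected void reduce(String key, Iterable<String> values) {
            out.add(key + "\t" + String.join("|", values));
        }
    }

    // Feeds one key with two values through the given reducer.
    static List<String> runWith(BaseReducer<String, String> reducer) {
        reducer.run("2000000002", Arrays.asList("810", "815"));
        return reducer.out;
    }

    public static void main(String[] args) {
        // Mismatched: two records, one per value (base implementation ran).
        runWith(new MismatchedReducer()).forEach(System.out::println);
        // Matching: one record with the values joined by '|'.
        runWith(new MatchingReducer()).forEach(System.out::println);
    }
}
```

Annotating the reduce method with @Override is a cheap guard: the compiler then rejects a signature that does not actually override the inherited method.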

Re: Simple Reduce not working

Posted by Matt Tanquary <ma...@gmail.com>.
Solved the problem.
It was a typo in my reducer loop.

I meant itemList += val + "|"; not itemList = val + "|"; (note the added
'+'). With plain assignment, each pass through the loop overwrites itemList
instead of appending to it.
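
The accumulation fix can be sketched in plain Java, outside Hadoop. This is a minimal illustration under stated assumptions: String stands in for the Text values, and the method names (joinAsPosted, joinWithPlusEquals, joinWithBuilder) are hypothetical. It also shows that the posted substring(0, itemList.lastIndexOf("|")-1) call drops one character more than the trailing delimiter, so the "-1" may need to go as well.

```java
import java.util.Arrays;
import java.util.List;

public class JoinItems {
    // The loop as posted: '=' overwrites itemList on every pass.
    static String joinAsPosted(Iterable<String> values) {
        String itemList = "";
        for (String val : values) {
            itemList = val + "|";
        }
        return itemList.substring(0, itemList.lastIndexOf("|") - 1);
    }

    // The '+=' fix from the reply, keeping the posted substring call.
    // The "-1" still removes the last character of the final item.
    static String joinWithPlusEquals(Iterable<String> values) {
        String itemList = "";
        for (String val : values) {
            itemList += val + "|";
        }
        return itemList.substring(0, itemList.lastIndexOf("|") - 1);
    }

    // One alternative: add the delimiter only between items, so nothing
    // has to be trimmed afterwards.
    static String joinWithBuilder(Iterable<String> values) {
        StringBuilder itemList = new StringBuilder();
        for (String val : values) {
            if (itemList.length() > 0) {
                itemList.append('|');
            }
            itemList.append(val);
        }
        return itemList.toString();
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("810", "815");
        System.out.println(joinAsPosted(values));       // prints 81
        System.out.println(joinWithPlusEquals(values)); // prints 810|81
        System.out.println(joinWithBuilder(values));    // prints 810|815
    }
}
```

The builder variant also avoids an exception on an empty value list, where lastIndexOf returns -1 and the substring call would be given a negative end index.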
