Posted to common-user@hadoop.apache.org by Erik Test <er...@gmail.com> on 2010/07/28 20:44:56 UTC

Distance Calculation problem

Hello,

I've implemented a MapReduce program for a simple distance calculation
between two 2D points. I've set up my input so that every calculation
should give the same result, but they do not. This makes me think I'm
doing something wrong in the map and/or reduce function. Here is my
output.

1,7,13,19       18.973665961010276
10,16,22,28     8.48528137423857
11,17,23,29     8.48528137423857
12,18,24,30     8.48528137423857
13,19,25,31     8.48528137423857
14,20,26,32     8.48528137423857
15,21,27,33     18.973665961010276
16,22,28,34     18.973665961010276
17,23,29,35     18.973665961010276
18,24,30,36     8.48528137423857
19,25,31,37     18.973665961010276
2,8,14,20       8.48528137423857
20,26,32,38     8.48528137423857
21,27,33,39     8.48528137423857
22,28,34,40     8.48528137423857
23,29,35,41     8.48528137423857
24,30,36,42     8.48528137423857
25,31,37,43     18.973665961010276
26,32,38,44     18.973665961010276
27,33,39,45     18.973665961010276
28,34,40,46     16.97056274847714
29,35,41,47     8.48528137423857
3,9,15,21       8.48528137423857
30,36,42,48     18.973665961010276
31,37,43,49     18.973665961010276
32,38,44,50     8.48528137423857
33,39,45,51     8.48528137423857
34,40,46,52     8.48528137423857
35,41,47,53     18.973665961010276
4,10,16,22      18.973665961010276
5,11,17,23      18.973665961010276
6,12,18,24      8.48528137423857
7,13,19,25      8.48528137423857
8,14,20,26      8.48528137423857
9,15,21,27      18.973665961010276

The key is the entire input line and the value is the calculated
distance. Every distance should be 8.48528137423857; any value that
doesn't begin with an 8 is wrong.

This is what I do in the mapper.

public static class Map extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, DoubleWritable>
{
  private Text word = new Text();

  public void map(LongWritable key, Text value,
      OutputCollector<Text, DoubleWritable> output,
      Reporter reporter) throws IOException
  {
    String line = value.toString();
    String[] tokens = line.split("[,]");

    for(int i = 0; i < tokens.length; i++)
    {
      word.set(line);
      output.collect(word,
          new DoubleWritable(Double.parseDouble(tokens[i])));
    }
  }//public void map
}//public static class Map

And this is what I do in the reducer.

public static class Reduce extends MapReduceBase
    implements Reducer<Text, DoubleWritable, Text, DoubleWritable>
{
  public void reduce(Text key, Iterator<DoubleWritable> values,
      OutputCollector<Text, DoubleWritable> output, Reporter reporter)
      throws IOException
  {
    double distance = 0;
    double x1 = 0;
    double x2 = 0;
    double y1 = 0;
    double y2 = 0;

    if(values.hasNext())
    {
      x1 = values.next().get();
    }

    if(values.hasNext())
    {
      x2 = values.next().get();
    }

    if(values.hasNext())
    {
      y1 = values.next().get();
    }

    if(values.hasNext())
    {
      y2 = values.next().get();
    }

    distance = StrictMath.sqrt(StrictMath.pow(x2 - x1, 2.0) +
        StrictMath.pow(y2 - y1, 2.0));
    output.collect(key, new DoubleWritable(distance));
  }
}//public static class Reduce

Any suggestions would be appreciated.

Erik

Re: Distance Calculation problem

Posted by Steve Lewis <lo...@gmail.com>.
Well, my first suggestion: since the original String is used as the key,
repeat the Mapper's parse steps in the reduce code using the text from
the key, then verify that the values you get are the ones in the key.
Also, if you know the answer, validate that you get the right answer when
you compute it from the values in the map phase.
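
The check suggested above could be sketched roughly as follows. This is
plain Java without the Hadoop classes so it can be run on its own; the
class KeyCheck and its methods are hypothetical names, and
Iterator<Double> stands in for the reducer's Iterator<DoubleWritable>.
The arrival order used in main is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Hadoop or of the original program.
final class KeyCheck {
    // Split the key text on commas, as the mapper does with the input line.
    static List<Double> parse(String key) {
        List<Double> out = new ArrayList<>();
        for (String token : key.split(",")) {
            out.add(Double.parseDouble(token.trim()));
        }
        return out;
    }

    // Drain the reducer's value iterator into a list so it can be inspected.
    static List<Double> drain(Iterator<Double> values) {
        List<Double> out = new ArrayList<>();
        while (values.hasNext()) {
            out.add(values.next());
        }
        return out;
    }

    // True if the values are the numbers in the key, in any order.
    static boolean sameValues(String key, List<Double> values) {
        List<Double> a = parse(key);
        List<Double> b = new ArrayList<>(values);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }

    // True only if the values also arrive in the key's order.
    static boolean sameOrder(String key, List<Double> values) {
        return parse(key).equals(values);
    }

    public static void main(String[] args) {
        String key = "1,7,13,19";
        // Arrival order chosen for illustration; the real order is up to the framework.
        List<Double> arrived = drain(Arrays.asList(1.0, 19.0, 7.0, 13.0).iterator());
        System.out.println("same values: " + sameValues(key, arrived));
        System.out.println("same order:  " + sameOrder(key, arrived));
    }
}
```

If sameValues holds but sameOrder does not, the numbers reaching the
reducer are right and only their order differs from the key.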




-- 
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA

Re: Distance Calculation problem

Posted by Erik Test <er...@gmail.com>.
Thank you Alex. I took your advice and implemented it. All the mapper
calculations are being determined correctly now.
Erik


On 28 July 2010 14:56, Alex Kozlov <al...@cloudera.com> wrote:

> Hi Erik,
>
> Your assumption is that the values are coming in the same order as they
> were emitted by the mapper.  This is not part of the MapReduce contract.
>
> MapReduce just gives you all the values for a given key.
>
> For your particular problem I don't think you need a reducer at all: just
> compute the distance in the mapper and set the # of reducers to 0
> (mapred.reduce.tasks).  You do not need the reduce step.
>
> Alex K
>

Re: Distance Calculation problem

Posted by Alex Kozlov <al...@cloudera.com>.
Hi Erik,

Your assumption is that the values are coming in the same order as they were
emitted by the mapper.  This is not part of the MapReduce contract.

MapReduce just gives you all the values for a given key.
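
To see why the order matters here, the small sketch below (plain Java, no
Hadoop; OrderDemo and distinctDistances are hypothetical names) feeds the
four numbers from the line 1,7,13,19 into the reducer's formula in every
possible order and collects the distinct results. It assumes the
reducer's assignment order x1, x2, y1, y2 from the original post. Only
three different distances come out, roughly 8.49, 16.97 and 18.97, which
are the three values that appear in the posted output.

```java
import java.util.TreeSet;

// Hypothetical demo class, not part of the original program.
final class OrderDemo {
    // Same formula as the reducer in the original post.
    static double distance(double x1, double x2, double y1, double y2) {
        return StrictMath.sqrt(StrictMath.pow(x2 - x1, 2.0)
                + StrictMath.pow(y2 - y1, 2.0));
    }

    // Distinct results over every order in which four values could arrive.
    static TreeSet<Double> distinctDistances(double[] v) {
        if (v.length != 4) {
            throw new IllegalArgumentException("expected exactly 4 values");
        }
        TreeSet<Double> seen = new TreeSet<>();
        for (int a = 0; a < 4; a++) {
            for (int b = 0; b < 4; b++) {
                for (int c = 0; c < 4; c++) {
                    for (int d = 0; d < 4; d++) {
                        boolean distinct = a != b && a != c && a != d
                                && b != c && b != d && c != d;
                        if (distinct) {
                            seen.add(distance(v[a], v[b], v[c], v[d]));
                        }
                    }
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(distinctDistances(new double[] {1, 7, 13, 19}));
    }
}
```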

For your particular problem I don't think you need a reducer at all: just
compute the distance in the mapper and set the # of reducers to 0
(mapred.reduce.tasks).  You do not need the reduce step.

Alex K
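
A minimal sketch of the per-line work that a map-only job would do,
written as plain Java so it runs without a cluster. MapOnlyDistance and
distanceOfLine are hypothetical names, and the field layout x1,x2,y1,y2
is an assumption taken from the reducer's assignment order in the
original post. In the real job this logic would sit inside map(), which
would emit the line and its distance once per input line, and the reducer
count would be set to 0, for example with JobConf.setNumReduceTasks(0) in
the old mapred API.

```java
// Hypothetical stand-in for the body of map(); no Hadoop classes are used.
final class MapOnlyDistance {
    // Parse one input line and compute the distance directly from it,
    // so no value ordering across the shuffle is involved.
    // Assumed layout: x1,x2,y1,y2.
    static double distanceOfLine(String line) {
        String[] t = line.trim().split(",");
        if (t.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        double x1 = Double.parseDouble(t[0].trim());
        double x2 = Double.parseDouble(t[1].trim());
        double y1 = Double.parseDouble(t[2].trim());
        double y2 = Double.parseDouble(t[3].trim());
        return StrictMath.sqrt(StrictMath.pow(x2 - x1, 2.0)
                + StrictMath.pow(y2 - y1, 2.0));
    }

    public static void main(String[] args) {
        // Three lines from the posted output that originally gave three
        // different results.
        String[] lines = {"1,7,13,19", "28,34,40,46", "10,16,22,28"};
        for (String line : lines) {
            // Key, tab, value: the layout of the posted output.
            System.out.println(line + "\t" + distanceOfLine(line));
        }
    }
}
```

Because each distance is computed from a single line inside one call,
all three sample lines give the same result.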
