Posted to user@mrunit.apache.org by 岸本忠士 <ki...@gmail.com> on 2013/05/31 07:53:20 UTC

The order of map-reduce results on MRUnit 1.0

Hi,

I'm trying MRUnit 1.0 on CDH4.

With the old MRUnit that was included in CDH3, the map-reduce results
came out sorted in ascending character order.

With MRUnit 1.0 they are no longer sorted in ascending character order.
Why does this happen?

Thanks,

Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

Hi,

I understand.

Thank you very much for your advice.

Re: The order of map-reduce results on MRUnit 1.0

Posted by Brock Noland <br...@cloudera.com>.

Hi,

Your reducer stores its outputs in a HashMap, which does not guarantee any
iteration order. That is what causes your issue. I'd also note that
buffering output as in your example is both an anti-pattern and
unnecessary. Here is a correct implementation:

http://github.mtv.cloudera.com/CDH/hadoop/blob/cdh4-2.0.0_4.3.0/hadoop-mapreduce1-project/src/mapred/org/apache/hadoop/mapred/lib/LongSumReducer.java

Brock
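
The advice above (emit each sum directly from reduce() instead of buffering it in a map) can be sketched without Hadoop on the classpath. This is only an illustration under stated assumptions: DirectEmitSketch, Emitter and run() are hypothetical names, Emitter stands in for the write() call on Hadoop's reducer context, and the TreeMap merely simulates the framework handing keys to the reducer in sorted order. It is not the LongSumReducer source linked above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DirectEmitSketch {

    // Hypothetical stand-in for the reducer context's write(); not a Hadoop type.
    interface Emitter {
        void write(String key, int value);
    }

    // Same shape as a reduce() call: sum the values of one key and emit at once.
    static void reduce(String key, Iterable<Integer> values, Emitter out) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        out.write(key, sum);
    }

    // Simulates the shuffle: group the mapper's keys in sorted key order,
    // then call reduce() once per group.
    static List<String> run(List<String> mapOutputKeys) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String key : mapOutputKeys) {
            List<Integer> ones = grouped.get(key);
            if (ones == null) {
                ones = new ArrayList<Integer>();
                grouped.put(key, ones);
            }
            ones.add(1); // the mapper in this thread emits (line, 1)
        }

        final List<String> emitted = new ArrayList<String>();
        Emitter out = new Emitter() {
            @Override
            public void write(String key, int value) {
                emitted.add(key + ", " + value);
            }
        };
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            reduce(group.getKey(), group.getValue(), out);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Same inputs as the test in this thread: a, b, a, a, b.
        for (String line : run(Arrays.asList("a", "b", "a", "a", "b"))) {
            System.out.println(line); // prints "a, 3" then "b, 2"
        }
    }
}
```

Because nothing is buffered, the output order is simply the order in which keys reach reduce(), and no per-key state has to fit in memory.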



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org

Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

This is an example.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map.Entry;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MapReduce {
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit each input line as a key with a count of 1.
            context.write(new Text(value.toString()), new IntWritable(1));
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Per-key sums are buffered here and only written out in cleanup().
        private HashMap<String, Integer> maps = null;

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            maps = new HashMap<String, Integer>();
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            for (Entry<String, Integer> map : maps.entrySet()) {
                context.write(new Text(map.getKey()), new IntWritable(map.getValue()));
            }
        }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }

            maps.put(key.toString(), sum);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "mrunittest");
        job.setJarByClass(MapReduce.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setNumReduceTasks(1);
        FileInputFormat.setInputPaths(job, "");
        FileOutputFormat.setOutputPath(job, new Path(""));

        job.waitForCompletion(true);
    }
}


import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class MapReduceTest {
    protected Mapper<LongWritable, Text, Text, IntWritable> mapper;
    protected Reducer<Text, IntWritable, Text, IntWritable> reducer;
    protected MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> driver;

    @Before
    public void setUp() throws Exception {
        mapper = new MapReduce.Map();
        reducer = new MapReduce.Reduce();
        driver = new MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable>(
                mapper, reducer);
    }

    @Test
    public void mapReduceTest() throws Exception {
        // Five input lines: "a" three times, "b" twice.
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("b"));
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("b"));

        // Expected outputs, listed in ascending key order.
        driver.withOutput(new Text("a"), new IntWritable(3));
        driver.withOutput(new Text("b"), new IntWritable(2));

        driver.runTest();
    }
}

I expect the following result:
a, 3
b, 2

But the actual result is:
b, 2
a, 3

If I use LinkedHashMap instead of HashMap for the maps variable, it works
correctly.
But I think it should also work correctly with HashMap.
Is this incorrect?

Thanks,
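
For readers comparing the two map types mentioned above, here is a small plain-JDK sketch with no Hadoop involved. The class name MapOrderSketch and its helper methods are made up for illustration. It puts the same two entries into a HashMap, a LinkedHashMap and a TreeMap and prints the key order each one iterates in. Only the last two give an order you can rely on: the HashMap order is unspecified and may differ between JDK versions, so no output is claimed for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MapOrderSketch {

    // Inserts "b" first and "a" second so that insertion order and
    // sorted order are visibly different.
    static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("b", 2);
        map.put("a", 3);
        return map;
    }

    // Returns the keys in the order the map's iterator yields them.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        // HashMap: iteration order is unspecified.
        System.out.println("HashMap:       "
                + keysInIterationOrder(fill(new HashMap<String, Integer>())));

        // LinkedHashMap: insertion order, so this prints [b, a].
        System.out.println("LinkedHashMap: "
                + keysInIterationOrder(fill(new LinkedHashMap<String, Integer>())));

        // TreeMap: natural (sorted) key order, so this prints [a, b].
        System.out.println("TreeMap:       "
                + keysInIterationOrder(fill(new TreeMap<String, Integer>())));
    }
}
```

In the reducer above the keys reach reduce() already sorted, so a LinkedHashMap would replay them in that same order, which is consistent with the test passing after the swap.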



Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

Ok, I'm preparing an example. Just a moment please.



Re: The order of map-reduce results on MRUnit 1.0

Posted by Brock Noland <br...@cloudera.com>.

Can you share an example test?



Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

The charset is UTF-8.
It contains ASCII characters only.

Thanks,

Re: The order of map-reduce results on MRUnit 1.0

Posted by Brock Noland <br...@cloudera.com>.

What charset sort order are you expecting?

