Posted to user@mrunit.apache.org by Brock Noland <br...@cloudera.com> on 2013/06/03 18:23:19 UTC

Re: The order of map-reduce results on MRUnit 1.0

What charset sort order are you expecting?


On Fri, May 31, 2013 at 12:53 AM, 岸本忠士 <ki...@gmail.com> wrote:

> Hi,
>
> I'm trying MRUnit 1.0 on CDH4.
>
> With the old MRUnit that was included in CDH3, map-reduce results were
> sorted in ascending character order.
>
> With MRUnit 1.0 they are no longer sorted that way.
> Why does this happen?
>
> Thanks,
>



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org

Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

Hi,

I understand.

Thank you very much for your advice.

Re: The order of map-reduce results on MRUnit 1.0

Posted by Brock Noland <br...@cloudera.com>.

Hi,

Your reducer stores its outputs in a HashMap, which does not guarantee any
iteration order. That is what is causing your issue. I'd also note that
buffering output in the reducer, as your example does, is both an
anti-pattern and unnecessary. Here is a correct implementation:

http://github.mtv.cloudera.com/CDH/hadoop/blob/cdh4-2.0.0_4.3.0/hadoop-mapreduce1-project/src/mapred/org/apache/hadoop/mapred/lib/LongSumReducer.java

Brock
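
To illustrate the point about buffering, here is a minimal plain-Java sketch
with no Hadoop or MRUnit dependencies. It assumes, as is the case for a single
reducer, that keys reach reduce() already sorted. A TreeMap stands in for that
sort step, and each sum is emitted as soon as it is computed rather than held
until cleanup(). The class and method names are made up for illustration and
are not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the sort + reduce phases; not Hadoop or MRUnit API.
class StreamingSumSketch {

    // Groups the mapper output keys in sorted order, then emits "key, sum"
    // for each key immediately, the way a reducer that writes inside
    // reduce() would.
    static List<String> sumPerKey(List<String> mapperKeys) {
        // TreeMap plays the role of the framework's sort: ascending key order.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : mapperKeys) {
            // Each mapper record contributes a count of 1.
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }

        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            // Emit right away; nothing is buffered in an unordered map.
            output.add(entry.getKey() + ", " + sum);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "a", "b");
        for (String line : sumPerKey(input)) {
            System.out.println(line);
        }
    }
}
```

With the five inputs from the test in this thread, the sketch prints "a, 3"
followed by "b, 2", because output order simply follows the order in which
keys are processed.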


On Mon, Jun 3, 2013 at 10:06 PM, 岸本忠士 <ki...@gmail.com> wrote:

> I expect the following result:
> a, 3
> b, 2
>
> But the actual result is:
> b, 2
> a, 3
>
> If I used LinkedHashMap instead of HashMap for the maps variable, it would
> work correctly.
> But I think it should also work correctly with HashMap.
> Is that wrong?
>
> Thanks,
>

Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

This is an example.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map.Entry;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MapReduce {
    // Emits (line, 1) for every input line.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text(value.toString()), new IntWritable(1));
        }
    }

    // Sums the counts per key, buffers them in a HashMap, and writes them in cleanup().
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        private HashMap<String, Integer> maps = null;

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            maps = new HashMap<String, Integer>();
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            for (Entry<String, Integer> map : maps.entrySet()) {
                context.write(new Text(map.getKey()), new IntWritable(map.getValue()));
            }
        }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }

            maps.put(key.toString(), sum);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "mrunittest");
        job.setJarByClass(MapReduce.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setNumReduceTasks(1);
        FileInputFormat.setInputPaths(job, "");
        FileOutputFormat.setOutputPath(job, new Path(""));

        job.waitForCompletion(true);
    }
}


import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class MapReduceTest {
    protected Mapper<LongWritable, Text, Text, IntWritable> mapper;
    protected Reducer<Text, IntWritable, Text, IntWritable> reducer;
    protected MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> driver;

    @Before
    public void setUp() throws Exception {
        mapper = new MapReduce.Map();
        reducer = new MapReduce.Reduce();
        driver = new MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable>(
                mapper, reducer);
    }

    @Test
    public void mapReduceTest() throws Exception {
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("b"));
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("b"));

        // Expected outputs, in the order the test expects them.
        driver.withOutput(new Text("a"), new IntWritable(3));
        driver.withOutput(new Text("b"), new IntWritable(2));

        driver.runTest();
    }
}

I expect the following result:
a, 3
b, 2

But the actual result is:
b, 2
a, 3

If I used LinkedHashMap instead of HashMap for the maps variable, it would work
correctly.
But I think it should also work correctly with HashMap.
Is that wrong?

Thanks,
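
On the HashMap question above: java.util.HashMap makes no guarantee about
iteration order, and the order may differ between JDK versions.
LinkedHashMap iterates in insertion order, which matches the expected result
here only because the reducer sees keys that are already sorted. TreeMap
iterates in key order regardless of insertion order. A small plain-Java sketch
of the difference follows; the class and method names are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; not part of MRUnit or Hadoop.
class MapOrderSketch {

    // Inserts "b" before "a" on purpose, then returns the keys in the
    // map's own iteration order.
    static List<String> keysOf(Map<String, Integer> map) {
        map.put("b", 2);
        map.put("a", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is preserved: [b, a]
        System.out.println("LinkedHashMap: " + keysOf(new LinkedHashMap<String, Integer>()));
        // Natural key order: [a, b]
        System.out.println("TreeMap:       " + keysOf(new TreeMap<String, Integer>()));
        // Order is unspecified by the HashMap contract; do not rely on it.
        System.out.println("HashMap:       " + keysOf(new HashMap<String, Integer>()));
    }
}
```

If buffered output has to come out in key order, TreeMap is the variant whose
contract promises it. Writing each result directly from reduce() avoids the
question altogether.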


2013/6/4 岸本忠士 <ki...@gmail.com>

> Ok, I'm preparing an example. Just a moment please.

Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

Ok, I'm preparing an example. Just a moment please.


2013/6/4 Brock Noland <br...@cloudera.com>

> Can you share an example test?

Re: The order of map-reduce results on MRUnit 1.0

Posted by Brock Noland <br...@cloudera.com>.

Can you share an example test?


On Mon, Jun 3, 2013 at 8:36 PM, 岸本忠士 <ki...@gmail.com> wrote:

> The charset is UTF-8.
> It contains ASCII characters only.
>
> Thanks,
>




Re: The order of map-reduce results on MRUnit 1.0

Posted by 岸本忠士 <ki...@gmail.com>.

The charset is UTF-8.
It contains ASCII characters only.

Thanks,