Posted to user@mrunit.apache.org by Brock Noland <br...@cloudera.com> on 2013/06/03 18:23:19 UTC
Re: The order of map-reduce results on MRUnit 1.0
What charset sort order are you expecting?
On Fri, May 31, 2013 at 12:53 AM, 岸本忠士 <ki...@gmail.com> wrote:
> Hi,
>
> I'm trying MRUnit 1.0 on CDH4.
>
> With the old MRUnit that was included in CDH3, the map-reduce results
> came back sorted in ascending character order.
>
> With MRUnit 1.0 they are no longer sorted that way.
> Why does this happen?
>
> Thanks,
>
--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
Re: The order of map-reduce results on MRUnit 1.0
Posted by 岸本忠士 <ki...@gmail.com>.
Hi,
I understand.
Thank you very much for your advice.
Re: The order of map-reduce results on MRUnit 1.0
Posted by Brock Noland <br...@cloudera.com>.
Hi,
You are storing the outputs in your reducer in a HashMap, which does not
guarantee any iteration order. That is what causes your issue. I'd also note
that buffering output, as your example does, is both an anti-pattern and
unnecessary. Here is a correct implementation:
http://github.mtv.cloudera.com/CDH/hadoop/blob/cdh4-2.0.0_4.3.0/hadoop-mapreduce1-project/src/mapred/org/apache/hadoop/mapred/lib/LongSumReducer.java
Brock
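
The point about writing each sum straight from reduce() can be sketched without Hadoop on the classpath. The class below is a hypothetical stand-in, not MRUnit or Hadoop code: groupSorted() imitates the shuffle handing a single reducer its keys in sorted order, and reduceAndEmit() sums and emits each key as it arrives instead of buffering it. All names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free stand-in for the reduce phase (illustration only).
final class StreamingReduceSketch {

    // Groups map output keys in sorted key order, imitating what the shuffle
    // delivers to a single reducer. Each occurrence carries the value 1.
    static Map<String, List<Integer>> groupSorted(List<String> mapOutputKeys) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String key : mapOutputKeys) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // Sums each group and emits the result immediately, with no buffering,
    // so the output keeps the order in which the keys arrived.
    static List<String> reduceAndEmit(Map<String, List<Integer>> grouped) {
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            int sum = 0;
            for (int value : group.getValue()) {
                sum += value;
            }
            output.add(group.getKey() + ", " + sum);
        }
        return output;
    }

    public static void main(String[] args) {
        // Same five inputs as the test in this thread.
        List<String> mapOutputKeys = Arrays.asList("a", "b", "a", "a", "b");
        for (String line : reduceAndEmit(groupSorted(mapOutputKeys))) {
            System.out.println(line); // prints "a, 3" then "b, 2"
        }
    }
}
```

In the real reducer the equivalent change would be to call context.write(key, new IntWritable(sum)) at the end of reduce() and drop the setup() and cleanup() overrides.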
On Mon, Jun 3, 2013 at 10:06 PM, 岸本忠士 <ki...@gmail.com> wrote:
> This is an example.
>
> [...]
>
> I expect the following result:
> a, 3
> b, 2
>
> But the actual result is:
> b, 2
> a, 3
>
> If I use LinkedHashMap instead of HashMap for the maps variable, it
> works correctly.
> But I think it should also work correctly with HashMap.
> Is that incorrect?
>
> Thanks,
>
>
Re: The order of map-reduce results on MRUnit 1.0
Posted by 岸本忠士 <ki...@gmail.com>.
This is an example.
// MapReduce.java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map.Entry;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MapReduce {
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) {
            try {
                // Emit (line, 1) for every input line.
                context.write(new Text(value.toString()), new IntWritable(1));
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Sums are buffered here and only written out in cleanup().
        private HashMap<String, Integer> maps = null;

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            maps = new HashMap<String, Integer>();
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            for (Entry<String, Integer> map : maps.entrySet()) {
                context.write(new Text(map.getKey()), new IntWritable(map.getValue()));
            }
        }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            maps.put(key.toString(), sum);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "mrunittest");
        job.setJarByClass(MapReduce.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setNumReduceTasks(1);
        FileInputFormat.setInputPaths(job, "");
        FileOutputFormat.setOutputPath(job, new Path(""));
        job.waitForCompletion(true);
    }
}

// MapReduceTest.java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class MapReduceTest {
    protected Mapper<LongWritable, Text, Text, IntWritable> mapper;
    protected Reducer<Text, IntWritable, Text, IntWritable> reducer;
    protected MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> driver;

    @Before
    public void setUp() throws Exception {
        mapper = new MapReduce.Map();
        reducer = new MapReduce.Reduce();
        driver = new MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable>(
                mapper, reducer);
    }

    @Test
    public void mapReduceTest() throws Exception {
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("b"));
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("a"));
        driver.withInput(new LongWritable(1), new Text("b"));

        // Expected outputs, in the order the test expects them.
        driver.withOutput(new Text("a"), new IntWritable(3));
        driver.withOutput(new Text("b"), new IntWritable(2));

        driver.runTest();
    }
}
I expect the following result:
a, 3
b, 2
But the actual result is:
b, 2
a, 3
If I use LinkedHashMap instead of HashMap for the maps variable, it works
correctly.
But I think it should also work correctly with HashMap.
Is that incorrect?
Thanks,
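
The HashMap versus LinkedHashMap question comes down to what each class promises about iteration order. The sketch below uses only java.util and a hypothetical class name. It inserts "b" before "a" so the difference is visible: LinkedHashMap iterates in insertion order, TreeMap iterates in sorted key order, and HashMap makes no promise, so its order can differ between JDK versions and should not be relied on in a test.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the iteration-order contracts of java.util maps.
final class MapOrderSketch {

    // Inserts "b" first, then "a", and returns the keys in iteration order.
    static List<String> keysAfterInserting(Map<String, Integer> counts) {
        counts.put("b", 2);
        counts.put("a", 3);
        return new ArrayList<>(counts.keySet());
    }

    public static void main(String[] args) {
        // Insertion order is preserved: [b, a]
        System.out.println("LinkedHashMap: " + keysAfterInserting(new LinkedHashMap<String, Integer>()));
        // Keys are sorted by natural ordering: [a, b]
        System.out.println("TreeMap:       " + keysAfterInserting(new TreeMap<String, Integer>()));
        // Order is unspecified and may vary by JDK version.
        System.out.println("HashMap:       " + keysAfterInserting(new HashMap<String, Integer>()));
    }
}
```

With LinkedHashMap the reducer happens to work because the keys reach reduce() already sorted, so insertion order equals sorted order. Writing from reduce() directly avoids needing any map at all.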
2013/6/4 岸本忠士 <ki...@gmail.com>
> Ok, I'm preparing an example. Just a moment please.
Re: The order of map-reduce results on MRUnit 1.0
Posted by 岸本忠士 <ki...@gmail.com>.
Ok, I'm preparing an example. Just a moment please.
2013/6/4 Brock Noland <br...@cloudera.com>
> Can you share an example test?
Re: The order of map-reduce results on MRUnit 1.0
Posted by Brock Noland <br...@cloudera.com>.
Can you share an example test?
On Mon, Jun 3, 2013 at 8:36 PM, 岸本忠士 <ki...@gmail.com> wrote:
> The charset is UTF-8.
> It contains ASCII characters only.
>
> Thanks,
>
Re: The order of map-reduce results on MRUnit 1.0
Posted by 岸本忠士 <ki...@gmail.com>.
The charset is UTF-8.
It contains ASCII characters only.
Thanks,