Posted to common-user@hadoop.apache.org by Chaman <cs...@yahoo.com> on 2008/04/15 19:49:17 UTC

MapReduce: Two Reduce Tasks

Hello,

I am developing some applications in which I want to feed the output of Map to
3-4 different Reduce tasks.
What is the best way to accomplish this?

Thanks.

With regards,
csv
-- 
View this message in context: http://www.nabble.com/MapReduce%3A-Two-Reduce-Tasks-tp16703412p16703412.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.


Re: MapReduce: Two Reduce Tasks

Posted by Amar Kamat <am...@yahoo-inc.com>.
Chaman Singh Verma wrote:
> Hello,
>
> I think the question was slightly misinterpreted. What I meant by 3-4
> different tasks is that there are 3 different Reduce functionalities (each
> Reduce functionality could be carried out by many task slaves, maybe 100).
> I want to reuse the output of Map for different types of operations. The
> examples I have come across contain one Map function and one Reduce
> function. I want one Map function and 3-4 Reduce functions that can all
> use the output of the Map function.
>
>   
Your reduce code will generate x key-value pairs, one for each reduce
functionality (x being the number of functionalities). Encode each key with
the functionality information before calling output.collect().
You need to write your own OutputFormat. This output format should
extend MultipleOutputFormat (see
org.apache.hadoop.mapred.lib.MultipleOutputFormat.java) and override
generateFileNameForKeyValue(K key, V value, String name). Use the
functionality information from the key to rename the output file, for
example:

protected String generateFileNameForKeyValue(K key, V value, String name) {
  // decode() figures out the identifier for the functionality
  return name + "_" + decode(key);
}
Note that MultipleOutputFormat is available in Hadoop-0.17.
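The key-encoding idea above can be sketched in plain Java, without any Hadoop dependency, so that it runs on its own. Everything here is an assumption made for illustration: the class name TaggedReduceSketch, the ':' separator, the three functionalities (sum, max, count), and the use of String keys with long values. In a real job the file-name method would live in a subclass of MultipleOutputFormat, and the tagged pairs would be passed to output.collect().

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of tagging reduce output keys with a functionality name.
final class TaggedReduceSketch {

    // Separator between the functionality tag and the original key.
    // Assumption: the tags themselves never contain this character.
    static final char SEP = ':';

    // Prefix the key with the functionality that produced the record.
    static String encode(String functionality, String key) {
        return functionality + SEP + key;
    }

    // Recover the functionality tag from an encoded key.
    static String decode(String encodedKey) {
        int i = encodedKey.indexOf(SEP);
        if (i < 0) {
            throw new IllegalArgumentException("untagged key: " + encodedKey);
        }
        return encodedKey.substring(0, i);
    }

    // Same shape as the method to override in a MultipleOutputFormat subclass.
    static String generateFileNameForKeyValue(String key, long value, String name) {
        return name + "_" + decode(key);
    }

    // One reduce call that computes three functionalities and returns one
    // tagged key-value pair for each of them.
    static Map<String, Long> reduce(String key, List<Long> values) {
        long sum = 0;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            sum += v;
            max = Math.max(max, v);
        }
        Map<String, Long> out = new TreeMap<>();
        out.put(encode("sum", key), sum);
        out.put(encode("max", key), max);
        out.put(encode("count", key), (long) values.size());
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> out = reduce("apple", Arrays.asList(3L, 5L, 1L));
        for (Map.Entry<String, Long> e : out.entrySet()) {
            String file = generateFileNameForKeyValue(e.getKey(), e.getValue(), "part-00000");
            System.out.println(file + "\t" + e.getKey() + "\t" + e.getValue());
        }
    }
}
```

With this layout each functionality ends up in its own set of output files, so a later step can read only the files whose names carry the tag it needs.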
Amar


Re: MapReduce: Two Reduce Tasks

Posted by Chaman Singh Verma <cs...@yahoo.com>.
Hello,

I think the question was slightly misinterpreted. What I meant by 3-4
different tasks is that there are 3 different Reduce functionalities (each
Reduce functionality could be carried out by many task slaves, maybe 100).
I want to reuse the output of Map for different types of operations. The
examples I have come across contain one Map function and one Reduce
function. I want one Map function and 3-4 Reduce functions that can all
use the output of the Map function.

Thanks,

With Regards,
Chaman Singh
 



-----
Chaman Singh Verma
Poona, India


Re: MapReduce: Two Reduce Tasks

Posted by Theodore Van Rooy <mu...@gmail.com>.
I think you just want to set the number of reduce tasks in Hadoop
streaming to 3 or 4, and make sure that none of the other settings push it
over 3 or 4.

Why do you want just 3 or 4? Have you determined that to be the optimal
number of reduces?
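For reference, a rough sketch of what the number of reduce tasks controls. The arithmetic mirrors Hadoop's default HashPartitioner (hash code with the sign bit cleared, modulo the number of reduce tasks). The class name PartitionSketch and the sample keys are made up for illustration. Every reduce task still runs the same reduce function; the setting only decides how the keys are split among them.

```java
// Hypothetical sketch: how hash partitioning assigns map output keys to reduce tasks.
final class PartitionSketch {

    // Clear the sign bit so the result is never negative, then take the
    // remainder modulo the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // the value one would configure for the job
        for (String key : new String[] {"apple", "banana", "cherry", "date"}) {
            System.out.println(key + " -> reduce task " + partitionFor(key, numReduceTasks));
        }
    }
}
```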



-- 
Theodore Van Rooy
http://greentheo.scroggles.com