Posted to common-user@hadoop.apache.org by Joseph Wang <jo...@yahoo.com> on 2006/03/27 05:16:44 UTC

guideline for # of reduce jobs?

Hi,

I've written several MapReduce jobs. However, I
noticed that the jobs take a long time to finish
because of the sort and reduce phases. I run them with
1 reduce task. Is there a guideline on how
many reduce tasks to use? If I'm running a job on
4 boxes, does that mean I should specify
at most 4 reduce tasks? Would the results be
different if the number of reduce tasks
is different?

In Google's implementation, there is a way
to specify a partition function. Does Hadoop
have a similar feature?
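
To be concrete about what I mean by a partition function: something
that decides which reduce task receives a given key, such as the
hash(key) mod R default described in Google's MapReduce paper. Below is
a rough sketch of the idea. The class and method names (PartitionSketch,
partitionFor) are made up for illustration and are not taken from
Hadoop's API.

```java
// Hypothetical illustration only; these names are not Hadoop's API.
public class PartitionSketch {

    // Map a key to one of numReduceTasks partitions, in the style of
    // hash(key) mod R. The same key always lands in the same partition.
    static int partitionFor(String key, int numReduceTasks) {
        // Clear the sign bit so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry", "apple"};
        int numReduceTasks = 4; // assumed: one reduce task per box
        for (String key : keys) {
            System.out.println(key + " -> reduce task "
                    + partitionFor(key, numReduceTasks));
        }
    }
}
```

With a function like this, all values for one key go to a single
reduce task regardless of how many reduce tasks there are, so I would
expect only the split of output across files to change, not the
per-key results.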
