Posted to common-user@hadoop.apache.org by Steven Hafran <sj...@gmail.com> on 2011/08/17 09:36:55 UTC

Hadoop Streaming 0.20.2 and how to specify number of reducers per node -- is it possible?

Hi everyone,

I have a few Hadoop streaming jobs that would benefit from running one
reducer per node instead of two per node, due to heavy CPU and I/O use.
Currently I have a 30-node cluster and specify 30 reducers. Reviewing the
job stats on the JobTracker, I do see 30 reducers queued/executing;
however, those reducers are distributed across only 15 nodes, leaving half
of my cluster unused.

After reviewing the Hadoop docs, I've tried setting the following property
when starting my streaming job; however, it doesn't seem to have any effect:
-jobconf mapred.tasktracker.reduce.tasks.maximum=1
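
For reference, the full invocation looks roughly like this (the jar path,
input/output paths, and script names below are placeholders, not the real
job):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar \
        -input /data/in -output /data/out \
        -mapper map.py -reducer reduce.py \
        -jobconf mapred.reduce.tasks=30 \
        -jobconf mapred.tasktracker.reduce.tasks.maximum=1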

How do I tell Hadoop to run one reducer per node with streaming?

Thanks in advance for your assistance!

Regards,
-Steven

Re: Hadoop Streaming 0.20.2 and how to specify number of reducers per node -- is it possible?

Posted by Allen Wittenauer <aw...@apache.org>.
On Aug 17, 2011, at 12:36 AM, Steven Hafran wrote:
>
> After reviewing the Hadoop docs, I've tried setting the following property
> when starting my streaming job; however, it doesn't seem to have any effect:
> -jobconf mapred.tasktracker.reduce.tasks.maximum=1

	"tasktracker" is the hint:  that's a server side setting.

	You're looking for the mapred.reduce.tasks settings.
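
	A minimal sketch of the server-side change, assuming you want a hard
cap of one reduce slot per node: add the property below to mapred-site.xml
on every TaskTracker node, then restart the TaskTrackers so the daemons
pick it up.

    <!-- mapred-site.xml on each TaskTracker node; read once at daemon
         startup, so a TaskTracker restart is required -->
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>1</value>
    </property>

	The per-job reducer count stays on the streaming command line
(-jobconf mapred.reduce.tasks=30); with the one-slot cap in place, the
JobTracker can schedule at most one of those reduces on each node at a time.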