Posted to common-dev@hadoop.apache.org by 심탁길 <10...@nhncorp.com> on 2007/03/19 03:13:27 UTC

Initial number of Maps on the machine

When I run a simple MR job such as grep (about 200 maps and 4 reduces) on 20 Opteron servers (2-way dual-core, 4 GB RAM), only 2 maps are instantiated on each machine, and each map task takes 5 to 6 seconds to finish.

As a result, about 50% of the CPU is unused during the MR job, and the overall performance is not as good as I expected.
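
A rough back-of-the-envelope check of that 50% figure, sketched in Python. It assumes each map task keeps exactly one core busy, which is an assumption and not something I measured; the function name is only for illustration.

```python
# Rough estimate of per-machine CPU utilization.
# Assumption (not measured): each map task keeps exactly one core busy.
SOCKETS = 2            # "2-way"
CORES_PER_SOCKET = 2   # "dual-core"
CONCURRENT_MAPS = 2    # maps observed running at once on a machine


def cpu_utilization(concurrent_tasks: int, cores: int) -> float:
    """Fraction of cores busy if each task occupies one core."""
    return min(concurrent_tasks, cores) / cores


cores = SOCKETS * CORES_PER_SOCKET
utilization = cpu_utilization(CONCURRENT_MAPS, cores)
print(f"{cores} cores, {CONCURRENT_MAPS} maps -> {utilization:.0%} busy")
```

Under that assumption, 2 concurrent maps on 4 cores leave half of the machine idle, which matches what I see.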

I set "mapred.tasktracker.tasks.maximum" to 10 in the configuration, but it only takes effect when each map task lasts more than 10 seconds.

It seems that the Hadoop framework starts an MR job with a limit of 2 maps per machine.

When I run two similar MR jobs concurrently, the number of maps on each machine is still 2, CPU usage is still about 50%, and each job takes almost twice as long to finish.
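
The doubling is what an idealized model would predict if the cluster really is capped at 2 maps per machine. The sketch below assumes maps run in evenly filled "waves", uses 5.5 seconds as the midpoint of the 5 to 6 seconds I observed, and ignores scheduling delays and the reduce phase. The function and constant names are hypothetical, not part of Hadoop.

```python
import math

MACHINES = 20
MAPS_PER_JOB = 200
MAP_SECONDS = 5.5  # assumed midpoint of the observed 5-6 s per map


def map_phase_seconds(total_maps: int, machines: int,
                      slots_per_machine: int, map_seconds: float) -> float:
    """Idealized map-phase time: number of full waves times the map duration."""
    waves = math.ceil(total_maps / (machines * slots_per_machine))
    return waves * map_seconds


# Observed limit of 2 concurrent maps per machine.
one_job = map_phase_seconds(MAPS_PER_JOB, MACHINES, 2, MAP_SECONDS)
two_jobs = map_phase_seconds(2 * MAPS_PER_JOB, MACHINES, 2, MAP_SECONDS)

# Hypothetical case: one map per core (4 per machine).
one_job_4_slots = map_phase_seconds(MAPS_PER_JOB, MACHINES, 4, MAP_SECONDS)

print(f"1 job,  2 maps/machine: {one_job} s")
print(f"2 jobs, 2 maps/machine: {two_jobs} s")
print(f"1 job,  4 maps/machine: {one_job_4_slots} s")
```

In this model two concurrent jobs share the same 40 map slots, so their combined map phase is twice as long as that of a single job, while raising the per-machine limit to 4 would cut a single job from 5 waves to 3.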

So, how can I change the initial limit on the number of maps per machine?