Posted to common-user@hadoop.apache.org by Chris Quach <qu...@gmail.com> on 2008/11/25 16:38:01 UTC

Hadoop complex calculations

Hi,

I'm testing Hadoop to see if we could use it for complex calculations
alongside the 'standard' implementation. I've set up a grid with 10 nodes,
but when I run the RandomTextWriter example only 2 nodes are used as
mappers, even though I specified 10 mappers. The other nodes are used for
storage, but I want them to execute the map function as well. (I've seen
the same behaviour with my own test program.)

Is there a way to tell the framework to use all available nodes as mappers?
Thanks in advance,

Chris

Re: Hadoop complex calculations

Posted by Karl Anderson <kr...@monkey.org>.
On 25-Nov-08, at 7:38 AM, Chris Quach wrote:

> Hi,
>
> I'm testing Hadoop to see if we could use it for complex calculations
> alongside the 'standard' implementation. I've set up a grid with 10
> nodes, but when I run the RandomTextWriter example only 2 nodes are
> used as mappers, even though I specified 10 mappers. The other nodes
> are used for storage, but I want them to execute the map function as
> well. (I've seen the same behaviour with my own test program.)
>
> Is there a way to tell the framework to use all available nodes as
> mappers?
> Thanks in advance,
>
> Chris


Assuming you have more than two tasks to run in total, you're probably
seeing all nodes being used, but only two at once. If you're only seeing
two *tasks* in total, that's your problem: set mapred.map.tasks and
mapred.reduce.tasks.
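
For example, with the JobConf API (a minimal sketch; the class name and
counts are placeholders, and note that mapred.map.tasks is only a hint,
since the actual number of maps is driven by the input splits, while the
reduce count is honored exactly):

  import org.apache.hadoop.mapred.JobConf;

  public class TaskCountExample {
      public static void main(String[] args) {
          JobConf conf = new JobConf(TaskCountExample.class);
          conf.setNumMapTasks(10);    // hint: sets mapred.map.tasks
          conf.setNumReduceTasks(10); // sets mapred.reduce.tasks exactly
          System.out.println("maps requested:    " + conf.getNumMapTasks());
          System.out.println("reduces requested: " + conf.getNumReduceTasks());
      }
  }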

If that isn't it, make sure mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum are large enough in
hadoop-site.xml on each node; those caps limit how many tasks a single
tasktracker runs at once. AFAIK, setting these parameters within the job
or by command-line flags has no effect, since the tasktracker reads them
at startup. If you use the hadoop-ec2 tools, you can set them with
hadoop-ec2-env.sh.
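
For reference, the hadoop-site.xml entries look like this (the values are
just examples; pick them to match the cores on each node):

  <property>
    <!-- per-node cap on simultaneous map tasks -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <!-- per-node cap on simultaneous reduce tasks -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>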

Karl Anderson
kra@monkey.org
http://monkey.org/~kra