Posted to common-user@hadoop.apache.org by Andrew McNabb <am...@mcnabbs.org> on 2006/03/22 23:53:22 UTC

multiple threads

Does Hadoop run multiple threads on a single slave?  In my current
experiment, I'm running with three slaves, each of which has four
processors.  Hadoop only appears to be using one processor on each
slave.  It's very possible that I've made a configuration mistake.

I have mapred.map.tasks set to 7 (this should be enough to see two jobs
on each client, though I plan on eventually setting it higher).
mapred.tasktracker.tasks.maximum is 3 (eventually this will be higher,
too).
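
(For reference, a minimal sketch of how those two knobs relate to a job's
setup, assuming the old org.apache.hadoop.mapred.JobConf API of this era;
the driver class name is a placeholder, not something from this thread:)

    // Minimal sketch, assuming the JobConf API available around this time.
    import org.apache.hadoop.mapred.JobConf;

    public class ExampleDriver {                  // placeholder driver class
        public static void main(String[] args) {
            JobConf conf = new JobConf(ExampleDriver.class);
            conf.setNumMapTasks(7);   // same effect as mapred.map.tasks=7;
                                      // only a hint (see Doug's reply below)
            // mapred.tasktracker.tasks.maximum is read by each TaskTracker
            // daemon, so it belongs in the slave-side site configuration,
            // not in the per-job JobConf.
        }
    }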

Thanks.

-- 
Andrew McNabb
http://www.mcnabbs.org/andrew/
PGP Fingerprint: 8A17 B57C 6879 1863 DE55  8012 AB4D 6098 8826 6868

Re: multiple threads

Posted by Andrew McNabb <am...@mcnabbs.org>.
On Wed, Mar 22, 2006 at 03:13:40PM -0800, Doug Cutting wrote:
> Yes, although each task runs in a separate JVM, not a thread.  A slave
> will run up to mapred.tasktracker.tasks.maximum map and/or reduce
> tasks at a time.

That makes sense.

> The actual number of map tasks is determined by the number of input
> splits.  Perhaps your input data is not big enough to result in more
> than a few input splits?  A SequenceFile-format input cannot be split
> into chunks smaller than 2k bytes.

It turns out that the expected number of tasks did run on each
machine--they just finished quickly enough that I didn't see them in
top.

Thanks.

-- 
Andrew McNabb
http://www.mcnabbs.org/andrew/
PGP Fingerprint: 8A17 B57C 6879 1863 DE55  8012 AB4D 6098 8826 6868

Re: multiple threads

Posted by Doug Cutting <cu...@apache.org>.
Andrew McNabb wrote:
> Does Hadoop run multiple threads on a single slave?

Yes, although each task runs in a separate JVM, not a thread.  A slave 
will run up to mapred.tasktracker.tasks.maximum map and/or reduce tasks 
at a time.
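
(Putting rough numbers on that for the cluster described earlier--three
slaves, a per-slave maximum of 3, and 7 requested map tasks--a
back-of-the-envelope slot count looks like this; the figures come from the
thread, the class itself is just an illustration:)

    // Back-of-the-envelope concurrency check; numbers taken from this thread.
    public class SlotCheck {
        public static void main(String[] args) {
            int slaves = 3;
            int tasksMaximum = 3;        // mapred.tasktracker.tasks.maximum
            int requestedMaps = 7;       // mapred.map.tasks
            int concurrentSlots = slaves * tasksMaximum;  // up to 9 JVMs at once
            System.out.println(concurrentSlots + " slots for "
                + requestedMaps + " maps");
            // 9 slots for 7 maps: every map task can run at the same time,
            // roughly 2-3 separate child JVMs per slave while the job is active.
        }
    }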

> I have mapred.map.tasks set to 7 (this should be enough to see two jobs
> on each client, though I plan on eventually setting it higher).
> mapred.tasktracker.tasks.maximum is 3 (eventually this will be higher,
> too).

The actual number of map tasks is determined by the number of input 
splits.  Perhaps your input data is not big enough to result in more 
than a few input splits?  A SequenceFile-format input cannot be split 
into chunks smaller than 2k bytes.
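
(A rough illustration of why a small input caps the map count; the split
arithmetic below is an approximation, and the 10 KB input size is made up,
not taken from the thread:)

    // Approximate split math; the 2k-byte floor for SequenceFile input is the
    // one mentioned above, everything else is a simplifying assumption.
    public class SplitEstimate {
        public static void main(String[] args) {
            long inputBytes = 10 * 1024;    // hypothetical 10 KB of input
            int requestedMaps = 7;          // mapred.map.tasks
            long minSplitBytes = 2 * 1024;  // ~2k minimum for SequenceFiles
            long goalSplit = Math.max(minSplitBytes, inputBytes / requestedMaps);
            long numSplits = (inputBytes + goalSplit - 1) / goalSplit;  // ceiling
            System.out.println("splits (and therefore map tasks): " + numSplits);
            // Prints 5: the input is too small to yield 7 splits of at least
            // 2k bytes each, so the job runs 5 map tasks, not the 7 requested.
        }
    }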

Doug