Posted to user@hadoop.apache.org by Chris Embree <ce...@gmail.com> on 2012/12/18 07:12:10 UTC

Re: number of mapred slots

I think the rule of thumb (Hortonworks', at least) is 2x cores for map
tasks and 1x cores for reducers.  I don't have my notes here, so I'm not
100% sure.  It's just a guideline in any event. :)

TEST, TEST, TEST.  :)
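
As a rough illustration of the guideline above, the sketch below turns a core count into per-node slot settings. The helper name `suggested_slots` and its parameters are hypothetical, made up for this example; the numbers it produces are only starting points for testing, not recommendations.

```python
def suggested_slots(cores, map_factor=2, reduce_factor=1):
    """Return (map_slots, reduce_slots) for one TaskTracker node.

    Hypothetical helper: applies the 2x-cores / 1x-cores rule of thumb
    quoted above. It ignores memory and any other daemons on the node.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * map_factor, cores * reduce_factor


# Example core counts: a dual-core and a quad-core node.
for cores in (2, 4):
    maps, reduces = suggested_slots(cores)
    print(f"{cores} cores -> map slots: {maps}, reduce slots: {reduces}")
```

The two values would correspond to mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum on that node.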

On Tue, Dec 18, 2012 at 1:08 AM, <al...@aim.com> wrote:

> Hello,
>
> I was unable to find any information on the net regarding the relationship
> between mapred slots and the number of CPUs. All I found was that it is
> advisable to schedule two processes per CPU.  If this is true, then for
> a slave node with a dual core (two CPUs) that runs datanode, tasktracker,
> hbase regionserver and zookeeper, theoretically there is no room to run
> an additional mapred task. Any comment on this is welcome.
>
> In general, what is a mapred slot and how is it related to the number of
> CPU cores?
>
> Thanks in advance.
> Alex.
>

Re: number of mapred slots

Posted by al...@aim.com.
I have two slave nodes and one master. One slave node has a quad core (4 CPUs, 16 GB mem), the other slave has a dual core (2 CPUs, 16 GB mem), and the master has a dual core with 4 GB mem. I run Hadoop and HBase, so both slaves already have 4 processes (datanode, tasktracker, hbase regionserver and zookeeper), and I have this config in mapred-site.xml

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
  <description>
    the number of available cores on the tasktracker machines for map tasks
  </description>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>7</value>
  <description>
    the number of available cores on the tasktracker machines for reduce tasks
  </description>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>7</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
  
 

To my understanding, this means that the number of reduce tasks must be 7. However, Hadoop scheduled 10 reducers and all of them started at once; there were no pending reducers. Can anyone explain why 10 reducers were running and where those slots came from, given that there were 6 CPUs and 8 processes already running on the slave nodes?

Thanks.
Alex.
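
One possible reading of the numbers above, sketched under assumptions: mapred.tasktracker.reduce.tasks.maximum is a per-TaskTracker limit, so the cluster-wide reduce capacity is that value multiplied by the number of TaskTrackers, and a slot is a configured limit rather than a physical CPU. mapred.reduce.tasks is only the default number of reducers per job, which a job can override. The helper name `cluster_slots` is hypothetical, and whether the job in question actually requested 10 reducers is an assumption that the thread does not confirm.

```python
def cluster_slots(per_tracker_max, tasktrackers):
    """Total slots of one kind (map or reduce) across the cluster.

    Hypothetical helper: assumes every TaskTracker uses the same
    per-node maximum, e.g. because they share one mapred-site.xml.
    """
    return per_tracker_max * tasktrackers


TASKTRACKERS = 2  # two slave nodes, per the message above

reduce_capacity = cluster_slots(7, TASKTRACKERS)   # reduce.tasks.maximum = 7
map_capacity = cluster_slots(10, TASKTRACKERS)     # map.tasks.maximum = 10

requested_reducers = 10  # assumption: the job asked for 10 reducers
pending = max(0, requested_reducers - reduce_capacity)

print(f"reduce slots: {reduce_capacity}, map slots: {map_capacity}")
print(f"reducers left pending: {pending}")
```

Under these assumptions the cluster would have more reduce slots than the 10 reducers need, which would be consistent with all of them starting at once, regardless of how many cores the nodes have.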


  

 
