Posted to common-user@hadoop.apache.org by Iván de Prado <iv...@gmail.com> on 2008/05/09 16:29:19 UTC

Recover the deprecated mapred.tasktracker.tasks.maximum

https://issues.apache.org/jira/browse/HADOOP-1274 replaced the
configuration attribute mapred.tasktracker.tasks.maximum with
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum because it sometimes makes sense
to have more mappers than reducers assigned to each node.

But deprecating mapred.tasktracker.tasks.maximum could be an issue in
some situations. As an example:

I have a 4-node cluster with 8 cores and 4 GB of memory per node. I want
to limit the number of tasks per node to 8, because 8 tasks per node use
almost 100% of the CPU and the 4 GB of memory. I have set:

mapred.tasktracker.map.tasks.maximum -> 8
mapred.tasktracker.reduce.tasks.maximum -> 8 

1) When running only one job at a time, it works smoothly: an average of
8 tasks per node, no swapping on the nodes, almost 4 GB of memory usage
and 100% CPU usage.

2) When running more than one job at the same time, it works really
badly: an average of 16 tasks per node, 8 GB of memory usage (4 GB
swapped), and a lot of system CPU usage.

So I think it makes sense to restore the old attribute
mapred.tasktracker.tasks.maximum, making it compatible with the new
ones.

Task trackers would then not be allowed to:
 - run more than mapred.tasktracker.tasks.maximum tasks per node, 
 - run more than mapred.tasktracker.map.tasks.maximum mappers per
node, 
 - run more than mapred.tasktracker.reduce.tasks.maximum reducers per
node. 
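
To make the three limits concrete, here is a minimal Java sketch of the
check I have in mind. It is not taken from the TaskTracker code: the
class SlotLimits and its methods are hypothetical names, and the only
assumption is that the tracker knows how many mappers and reducers it is
currently running.

```java
// Hypothetical sketch of the proposed check; not actual Hadoop code.
final class SlotLimits {
    final int maxTotal;   // proposed mapred.tasktracker.tasks.maximum
    final int maxMaps;    // mapred.tasktracker.map.tasks.maximum
    final int maxReduces; // mapred.tasktracker.reduce.tasks.maximum

    SlotLimits(int maxTotal, int maxMaps, int maxReduces) {
        this.maxTotal = maxTotal;
        this.maxMaps = maxMaps;
        this.maxReduces = maxReduces;
    }

    // A new mapper may start only if both the map limit and the
    // overall per-node limit still have room.
    boolean canLaunchMap(int runningMaps, int runningReduces) {
        return runningMaps < maxMaps
            && runningMaps + runningReduces < maxTotal;
    }

    // Same rule for reducers, using the reduce limit instead.
    boolean canLaunchReduce(int runningMaps, int runningReduces) {
        return runningReduces < maxReduces
            && runningMaps + runningReduces < maxTotal;
    }
}
```

With new SlotLimits(8, 8, 8), a node already running 4 mappers and 4
reducers would refuse any further task, so the node would stay at 8
tasks instead of growing to 16 when several jobs run at once.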

Should I open a ticket for that?

Thanks!
Iván de Prado
www.ivanprado.es