Posted to mapreduce-user@hadoop.apache.org by Markus Jelsma <ma...@openindex.io> on 2011/12/20 00:02:39 UTC
Variable mapreduce.tasktracker.*.tasks.maximum per job
Hi,
We have many different jobs running on a 0.22.0 cluster, each with its own
memory consumption. Some jobs run fine with a high *.tasks.maximum setting,
while others require much more memory and can only run with a small number
of tasks per node.
Is there any way to reconfigure a running cluster on a per-job basis so we can
set the heap size and the number of map and reduce tasks per node? If not, we
have to force all settings to a level that suits the most demanding jobs,
which will hurt the simpler jobs.
Thoughts?
Thanks
Re: Variable mapreduce.tasktracker.*.tasks.maximum per job
Posted by Markus Jelsma <ma...@openindex.io>.
Thanks! I'll look into it.
On Tuesday 20 December 2011 01:31:17 Arun C Murthy wrote:
> Markus,
>
> The CapacityScheduler in 0.20.205 (in fact since 0.20.203) supports the
> notion of 'high memory jobs' with which you can specify, for each job, the
> number of 'slots' for each map/reduce. For e.g. you can say for job1 that
> each map needs 2 slots and so on.
>
> Unfortunately, I don't know how well this works in 0.22 - I might be wrong,
> but I heavily doubt it's been tested in 0.22. YMMV.
>
> Hope that helps.
>
> Arun
Re: Variable mapreduce.tasktracker.*.tasks.maximum per job
Posted by Arun C Murthy <ac...@hortonworks.com>.
Markus,
The CapacityScheduler in 0.20.205 (in fact since 0.20.203) supports the notion of 'high memory jobs', which lets you specify, for each job, the number of 'slots' each map or reduce task needs. For example, you can say that each map of job1 needs 2 slots, and so on.
Unfortunately, I don't know how well this works in 0.22. I might be wrong, but I strongly doubt it has been tested there. YMMV.
Hope that helps.
Arun
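A rough sketch of the slot accounting described above, for readers who want to see the arithmetic. It assumes that a task's slot count is its memory request rounded up to whole slots, and that a node runs as many tasks as fit in its configured slots. The function names and the numbers are hypothetical. In the 0.20.20x line the inputs are believed to correspond to properties such as mapred.job.map.memory.mb (per job) and mapred.cluster.map.memory.mb (per slot), but check the documentation for your version before relying on those names.

```python
import math


def slots_per_task(task_memory_mb: int, slot_memory_mb: int) -> int:
    """Slots one task occupies, assuming the request is rounded up to whole slots."""
    if task_memory_mb <= 0 or slot_memory_mb <= 0:
        raise ValueError("memory sizes must be positive")
    return math.ceil(task_memory_mb / slot_memory_mb)


def concurrent_tasks_per_node(slots_on_node: int, task_memory_mb: int,
                              slot_memory_mb: int) -> int:
    """Tasks of one job that fit on a node at once, given its configured slots."""
    return slots_on_node // slots_per_task(task_memory_mb, slot_memory_mb)


if __name__ == "__main__":
    SLOT_MB = 1024   # hypothetical memory size of one map slot
    NODE_SLOTS = 8   # hypothetical number of map slots per node

    # Hypothetical per-job memory requests for a single map task.
    for job, request_mb in [("light job", 1024), ("job1", 2048)]:
        per_task = slots_per_task(request_mb, SLOT_MB)
        fit = concurrent_tasks_per_node(NODE_SLOTS, request_mb, SLOT_MB)
        print(f"{job}: {per_task} slot(s) per map, {fit} maps per node")
```

With these numbers the light job keeps all 8 maps per node while job1 drops to 4, so the node-wide maximum does not have to be lowered for everyone.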