Posted to mapreduce-user@hadoop.apache.org by David Parks <da...@yahoo.com> on 2013/05/13 13:21:53 UTC

Using FairScheduler to limit # of tasks

Can I use the FairScheduler to limit the number of map/reduce tasks directly
from the job configuration? E.g., I have one job that I know should run with a
smaller # of map/reduce tasks than the default. I'd like to configure a queue
with a limited # of map/reduce tasks but apply it only to that job; I don't
want to deploy this queue configuration to the whole cluster.

 

Assuming the answer to the above is 'yes': if I were to limit the # of map
tasks to 10 in a cluster of 10 nodes, would the fair scheduler tend to
distribute those 10 map tasks evenly across the nodes (assuming a cluster
that's otherwise idle at the moment), or would it be prone to overloading a
single node just because those are the first open slots it sees?

 

David

 


Re: Using FairScheduler to limit # of tasks

Posted by Michel Segel <mi...@hotmail.com>.
Whether you use the fair scheduler or the capacity scheduler, you are creating a queue that is applied to the cluster.

That said, you can limit who uses the special queue, and you can specify the queue when you start your job as a command-line option.
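As a rough sketch (Hadoop 1.x FairScheduler; the pool name "limited" and the task caps here are just illustrative values, not anything from your cluster), the allocation file on the JobTracker would define a pool with per-pool task caps:

```xml
<!-- fair-scheduler.xml: allocation file read by the FairScheduler -->
<allocations>
  <pool name="limited">
    <maxMaps>10</maxMaps>      <!-- cap on concurrent map tasks in this pool -->
    <maxReduces>5</maxReduces> <!-- cap on concurrent reduce tasks in this pool -->
  </pool>
</allocations>
```

A job can then opt into that pool at submission time with the standard pool property, e.g. `hadoop jar myjob.jar MyDriver -Dmapred.fairscheduler.pool=limited input output` (assuming your driver uses GenericOptionsParser / ToolRunner so `-D` options are picked up). The allocation file itself still has to live on the cluster, though, which is the point above: the queue/pool definition is cluster-side, and only the choice of pool is per-job.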

HTH 

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 13, 2013, at 6:21 AM, "David Parks" <da...@yahoo.com> wrote:

> Can I use the FairScheduler to limit the number of map/reduce tasks directly from the job configuration? E.g. I have 1 job that I know should run a more limited # of map/reduce tasks than is set as the default, I want to configure a queue with a limited # of map/reduce tasks, but only apply it to that job, I don’t want to deploy this queue configuration to the cluster.
>  
> Assuming the above answer is ‘yes’, if I were to limit the # of map tasks to 10 in a cluster of 10 nodes, would the fair scheduler tend to distribute those 10 map tasks evenly across the nodes (assuming a cluster that’s otherwise unused at the moment), or would it be prone to over-loading a single node just because those are the first open slots it sees?
>  
> David
>  
