Posted to hdfs-user@hadoop.apache.org by Lin Ma <li...@gmail.com> on 2013/01/20 15:25:50 UTC

Fair Scheduler of Hadoop

Hi guys,

I have a quick question regarding the fair scheduler of Hadoop. I am reading
this article =>
http://blog.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/, and my
question is about the following statement: "There is currently no support
for preemption of long tasks, but this is being added in
HADOOP-4665 <https://issues.apache.org/jira/browse/HADOOP-4665>,
which will allow you to set how long each pool will wait before preempting
other jobs’ tasks to reach its guaranteed capacity."

My questions are,

1. What does "preemption of long tasks" mean? Does it kill long-running
tasks, pause long-running tasks to give resources to other tasks, or
something else?
2. I am also confused about "set how long each pool will wait before
preempting other jobs’ tasks to reach its guaranteed capacity". What does
"reach its guaranteed capacity" mean? I think that when using the fair
scheduler, each pool has predefined resource allocation settings (and the
settings guarantee that each pool gets the resources as configured), is
that true? In what situations would a pool not have its guaranteed (or
configured) capacity?

regards,
Lin

Re: Fair Scheduler of Hadoop

Posted by Lin Ma <li...@gmail.com>.
Thanks Joep, smart answer! All of my confusion is gone. Have a good
weekend.

regards,
Lin

On Tue, Jan 22, 2013 at 2:00 AM, Joep Rottinghuis <jr...@gmail.com> wrote:

> You could configure it like that if you wanted. Keep in mind that it would
> waste some resources. Imagine a 10-minute task that has been running for 9
> minutes. If you have that task killed immediately, it would have to be
> re-scheduled and redo all 10 minutes.
> Give it another minute and the task is complete and out of the way.
>
> So, consider how busy your cluster is overall and how much you are willing
> to wait for fairness, trading this off against a certain amount of waste.
>
> Cheers,
>
> Joep
>
> Sent from my iPhone
>
> On Jan 21, 2013, at 9:30 AM, Lin Ma <li...@gmail.com> wrote:
>
> Hi Joep,
>
> Excellent answer! I think you have answered my questions. One issue
> remains after reading this document again, even though it is old. :-)
>
> It is mentioned, "which will allow you to set how long each pool will
> wait before preempting other jobs’ tasks to reach its guaranteed capacity",
> and my question is why each pool needs to wait here. If a pool cannot get
> its guaranteed capacity because jobs in other pools overuse the capacity,
> shouldn't we kill such jobs' tasks immediately? I would appreciate it if
> you could elaborate a bit more on why we need to wait even for guaranteed
> capacity.
>
> regards,
> Lin
>
> On Mon, Jan 21, 2013 at 8:24 AM, Joep Rottinghuis <jr...@gmail.com> wrote:
>
>> Lin,
>>
>> The article you are reading is old.
>> Fair scheduler does have preemption.
>> Tasks get killed and rerun later, potentially on a different node.
>>
>> You can set a minimum / guaranteed capacity. The sum of those across
>> pools would typically equal the total capacity of your cluster or less.
>> Then you can configure each pool to go beyond that capacity. That would
>> happen if the cluster is temporarily not used to its full capacity.
>> Then when the demand for capacity increases, and jobs are queued in other
>> pools that are not running at their minimum guaranteed capacity, some long
>> running tasks from jobs in the pool that is using more than its minimum
>> capacity get killed (to be run later again).
>>
>> Does that make sense?
>>
>> Cheers,
>>
>> Joep
>>
>> Sent from my iPhone
>>
>> On Jan 20, 2013, at 6:25 AM, Lin Ma <li...@gmail.com> wrote:
>>
>> Hi guys,
>>
>> I have a quick question regarding the fair scheduler of Hadoop. I am
>> reading this article =>
>> http://blog.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/, and my
>> question is about the following statement: "There is currently no
>> support for preemption of long tasks, but this is being added in
>> HADOOP-4665 <https://issues.apache.org/jira/browse/HADOOP-4665>, which
>> will allow you to set how long each pool will wait before preempting other
>> jobs’ tasks to reach its guaranteed capacity.".
>>
>> My questions are,
>>
>> 1. What does "preemption of long tasks" mean? Does it kill long-running
>> tasks, pause long-running tasks to give resources to other tasks, or
>> something else?
>> 2. I am also confused about "set how long each pool will wait before
>> preempting other jobs’ tasks to reach its guaranteed capacity". What does
>> "reach its guaranteed capacity" mean? I think that when using the fair
>> scheduler, each pool has predefined resource allocation settings (and the
>> settings guarantee that each pool gets the resources as configured), is
>> that true? In what situations would a pool not have its guaranteed (or
>> configured) capacity?
>>
>> regards,
>> Lin
>>
>>
>

Re: Fair Scheduler of Hadoop

Posted by Joep Rottinghuis <jr...@gmail.com>.
You could configure it like that if you wanted. Keep in mind that it would waste some resources. Imagine a 10-minute task that has been running for 9 minutes. If you have that task killed immediately, it would have to be re-scheduled and redo all 10 minutes.
Give it another minute and the task is complete and out of the way.

So, consider how busy your cluster is overall and how much you are willing to wait for fairness, trading this off against a certain amount of waste.
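
For reference, the "wait" being discussed is the preemption timeout from HADOOP-4665. In the MR1-era fair scheduler it is set per pool in the allocations file; a minimal sketch follows, with made-up pool names and timeout values, so double-check the element names against the fair scheduler documentation for your Hadoop version.

<?xml version="1.0"?>
<allocations>
  <pool name="adhoc">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <!-- Seconds this pool may sit below its guaranteed (minimum) share
         before tasks from other pools are killed to make room. -->
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
  </pool>
  <!-- Cluster-wide: seconds a pool may stay well below its fair share
       before preempting. -->
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>

A short timeout gets the pool its share quickly but kills more nearly-finished tasks; a longer one wastes less work at the cost of waiting, which is exactly the trade-off described above.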

Cheers,

Joep

Sent from my iPhone

On Jan 21, 2013, at 9:30 AM, Lin Ma <li...@gmail.com> wrote:

> Hi Joep,
> 
> Excellent answer! I think you have answered my questions. One issue remains after reading this document again, even though it is old. :-)
> 
> It is mentioned, "which will allow you to set how long each pool will wait before preempting other jobs’ tasks to reach its guaranteed capacity", and my question is why each pool needs to wait here. If a pool cannot get its guaranteed capacity because jobs in other pools overuse the capacity, shouldn't we kill such jobs' tasks immediately? I would appreciate it if you could elaborate a bit more on why we need to wait even for guaranteed capacity.
> 
> regards,
> Lin
> 
> On Mon, Jan 21, 2013 at 8:24 AM, Joep Rottinghuis <jr...@gmail.com> wrote:
>> Lin,
>> 
>> The article you are reading is old.
>> Fair scheduler does have preemption.
>> Tasks get killed and rerun later, potentially on a different node.
>> 
>> You can set a minimum / guaranteed capacity. The sum of those across pools would typically equal the total capacity of your cluster or less.
>> Then you can configure each pool to go beyond that capacity. That would happen if the cluster is temporarily not used to its full capacity.
>> Then when the demand for capacity increases, and jobs are queued in other pools that are not running at their minimum guaranteed capacity, some long running tasks from jobs in the pool that is using more than its minimum capacity get killed (to be run later again).
>> 
>> Does that make sense?
>> 
>> Cheers,
>> 
>> Joep
>> 
>> Sent from my iPhone
>> 
>> On Jan 20, 2013, at 6:25 AM, Lin Ma <li...@gmail.com> wrote:
>> 
>>> Hi guys,
>>> 
>>> I have a quick question regarding the fair scheduler of Hadoop. I am reading this article => http://blog.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/, and my question is about the following statement: "There is currently no support for preemption of long tasks, but this is being added in HADOOP-4665, which will allow you to set how long each pool will wait before preempting other jobs’ tasks to reach its guaranteed capacity."
>>> 
>>> My questions are,
>>> 
>>> 1. What does "preemption of long tasks" mean? Does it kill long-running tasks, pause long-running tasks to give resources to other tasks, or something else?
>>> 2. I am also confused about "set how long each pool will wait before preempting other jobs’ tasks to reach its guaranteed capacity". What does "reach its guaranteed capacity" mean? I think that when using the fair scheduler, each pool has predefined resource allocation settings (and the settings guarantee that each pool gets the resources as configured), is that true? In what situations would a pool not have its guaranteed (or configured) capacity?
>>> 
>>> regards,
>>> Lin
> 

Re: Fair Scheduler of Hadoop

Posted by Lin Ma <li...@gmail.com>.
Hi Joep,

Excellent answer! I think you have answered my questions. One issue
remains after reading this document again, even though it is old. :-)

It is mentioned, "which will allow you to set how long each pool will wait
before preempting other jobs’ tasks to reach its guaranteed capacity", and
my question is why each pool needs to wait here. If a pool cannot get its
guaranteed capacity because jobs in other pools overuse the capacity,
shouldn't we kill such jobs' tasks immediately? I would appreciate it if you
could elaborate a bit more on why we need to wait even for guaranteed
capacity.

regards,
Lin

On Mon, Jan 21, 2013 at 8:24 AM, Joep Rottinghuis <jr...@gmail.com> wrote:

> Lin,
>
> The article you are reading is old.
> Fair scheduler does have preemption.
> Tasks get killed and rerun later, potentially on a different node.
>
> You can set a minimum / guaranteed capacity. The sum of those across pools
> would typically equal the total capacity of your cluster or less.
> Then you can configure each pool to go beyond that capacity. That would
> happen if the cluster is temporarily not used to its full capacity.
> Then when the demand for capacity increases, and jobs are queued in other
> pools that are not running at their minimum guaranteed capacity, some long
> running tasks from jobs in the pool that is using more than its minimum
> capacity get killed (to be run later again).
>
> Does that make sense?
>
> Cheers,
>
> Joep
>
> Sent from my iPhone
>
> On Jan 20, 2013, at 6:25 AM, Lin Ma <li...@gmail.com> wrote:
>
> Hi guys,
>
> I have a quick question regarding the fair scheduler of Hadoop. I am
> reading this article =>
> http://blog.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/, and my
> question is about the following statement: "There is currently no support
> for preemption of long tasks, but this is being added in HADOOP-4665 <https://issues.apache.org/jira/browse/HADOOP-4665>,
> which will allow you to set how long each pool will wait before preempting
> other jobs’ tasks to reach its guaranteed capacity.".
>
> My questions are,
>
> 1. What does "preemption of long tasks" mean? Does it kill long-running
> tasks, pause long-running tasks to give resources to other tasks, or
> something else?
> 2. I am also confused about "set how long each pool will wait before
> preempting other jobs’ tasks to reach its guaranteed capacity". What does
> "reach its guaranteed capacity" mean? I think that when using the fair
> scheduler, each pool has predefined resource allocation settings (and the
> settings guarantee that each pool gets the resources as configured), is
> that true? In what situations would a pool not have its guaranteed (or
> configured) capacity?
>
> regards,
> Lin
>
>

Re: Fair Scheduler of Hadoop

Posted by Joep Rottinghuis <jr...@gmail.com>.
Lin,

The article you are reading is old.
Fair scheduler does have preemption.
Tasks get killed and rerun later, potentially on a different node.

You can set a minimum / guaranteed capacity. The sum of those across pools would typically equal the total capacity of your cluster or less.
Then you can configure each pool to go beyond that capacity. That would happen if the cluster is temporarily not used to its full capacity.
Then when the demand for capacity increases, and jobs are queued in other pools that are not running at their minimum guaranteed capacity, some long running tasks from jobs in the pool that is using more than its minimum capacity get killed (to be run later again).
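
To make the minimum / guaranteed capacity part concrete, here is a minimal sketch of how this is typically configured with the MR1-era fair scheduler. The pool names, slot counts, and file path below are made up for illustration, so check the element and property names against the fair scheduler documentation for your Hadoop version.

<?xml version="1.0"?>
<!-- Allocations file, e.g. /etc/hadoop/conf/fair-scheduler.xml (illustrative path) -->
<allocations>
  <!-- Guaranteed (minimum) slots per pool; the sums across pools should
       not exceed the cluster's total map and reduce slots. -->
  <pool name="production">
    <minMaps>60</minMaps>
    <minReduces>30</minReduces>
    <weight>2.0</weight>
  </pool>
  <pool name="research">
    <minMaps>20</minMaps>
    <minReduces>10</minReduces>
  </pool>
</allocations>

<!-- mapred-site.xml: point the scheduler at the file and enable preemption. -->
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>
<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>

Either pool may grow past its minimum while the other is idle; once demand returns and a pool sits below its minimum for the configured timeout, tasks from the over-minimum pool are killed and rerun later.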

Does that make sense?

Cheers,

Joep

Sent from my iPhone

On Jan 20, 2013, at 6:25 AM, Lin Ma <li...@gmail.com> wrote:

> Hi guys,
> 
> I have a quick question regarding the fair scheduler of Hadoop. I am reading this article => http://blog.cloudera.com/blog/2008/11/job-scheduling-in-hadoop/, and my question is about the following statement: "There is currently no support for preemption of long tasks, but this is being added in HADOOP-4665, which will allow you to set how long each pool will wait before preempting other jobs’ tasks to reach its guaranteed capacity."
> 
> My questions are,
> 
> 1. What does "preemption of long tasks" mean? Does it kill long-running tasks, pause long-running tasks to give resources to other tasks, or something else?
> 2. I am also confused about "set how long each pool will wait before preempting other jobs’ tasks to reach its guaranteed capacity". What does "reach its guaranteed capacity" mean? I think that when using the fair scheduler, each pool has predefined resource allocation settings (and the settings guarantee that each pool gets the resources as configured), is that true? In what situations would a pool not have its guaranteed (or configured) capacity?
> 
> regards,
> Lin
