You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-dev@hadoop.apache.org by Karthik Kambatla <ka...@cloudera.com> on 2016/11/11 00:46:38 UTC

[DISCUSS] Merge FairScheduler preemption overhaul (YARN-4752) to trunk

Hi folks

We have been working on overhauling FairScheduler preemption on branch
YARN-4572. It is close to being ready for merge to trunk:

   1. Preemption considers individual ResourceRequests to satisfy.
   (YARN-5605)
   2. Preemption now works within a leaf queue and across sibling leaf
   queues. (YARN-5605)
   3. Comprehensive unit tests for app starvation and preemption - minshare
   and fairshare. (YARN-5783 and YARN-5819, the latter is close to commit)
   4. TODO: Clean up the TODOs to remove my initials and replace them with
   appropriate JIRAs.

There are some unresolved sub-tasks in the umbrella JIRA, but none of them
are regressions in the new implementation.

I just uploaded the cumulative patch to YARN-4752 for Jenkins verification
and will follow up on any issues that come up.

Would like to hear your thoughts on the merge.

Thanks
Karthik

PS: Post facto, I feel a feature branch was unnecessary for this work.
Github PRs with multiple commits for ease of review would have been enough.

Re: [DISCUSS] Merge FairScheduler preemption overhaul (YARN-4752) to trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
Thanks Sangjin, Daniel, and Arun for your positive feedback.

Sangjin - filed HADOOP-13821 to track the findbugs work.

Arun - A synchronous approach would benefit with locking, but might lead to
delay in processing apps that do not require a preemption to satisfy their
request. Offline, you had also suggested using a PriorityQueue (instead of
a plan BlockingQueue) to track starved applications in FSStarvedApps. Filed
YARN-5893 for the same. And, on the nit about method names, I have updated
it on YARN-5885.

Daniel is reviewing YARN-5885. And, Jenkins seems to be okay with the uber
patch on YARN-4752.

I will start the VOTE thread for merge as soon as YARN-5885 goes in.

Thanks
Karthik

On Wed, Nov 16, 2016 at 11:06 AM, Arun Suresh <ar...@gmail.com> wrote:

> +1 from me too.
>
> minor nit:
> * In FSAppAttempt, maybe rename addPreemption(container) /
> removePreemption(container) to something better.
>
> Also, had an offline discussion with Karthik. Suggested to maybe
> experiment with replacing the PreemptionThread with a synchronous
> preemption approach. For eg. Maybe keep adding to the FSStarvedApps and
> when it reaches a threshold, trigger a preemption. It would likely avoid
> race conditions and reduce synchronization costs.. but am ok with trying
> this out and experimenting in follow up JIRAs.
>
> Cheers
> -Arun
>
> On Tue, Nov 15, 2016 at 1:43 PM, Daniel Templeton <da...@cloudera.com>
> wrote:
>
>> +1 from me, but no surprise since I was the branch reviewer.
>>
>> Aside from being functionally superior to the existing code, this branch
>> is also cleaner and better tested code.  We did our best to make sure that
>> the patches were high quality and all the testing bases were covered.
>>
>> To add to Karthik's points, YARN-5819 was committed early this morning,
>> and the JIRA for his TODO bullet is YARN-5885.
>>
>> Daniel
>>
>>
>> On 11/10/16 6:19 PM, Karthik Kambatla wrote:
>>
>>> Forgot to mention
>>>
>>>     - All the patches were reviewed before getting committed to the
>>> branch.
>>>     - The changes are strictly internal to the scheduler. Most of it is
>>>     limited to FairScheduler with minimal changes adding helper methods
>>> to
>>>     common scheduler code.
>>>
>>>
>>>
>>> On Thu, Nov 10, 2016 at 4:46 PM, Karthik Kambatla <ka...@cloudera.com>
>>> wrote:
>>>
>>> Hi folks
>>>>
>>>> We have been working on overhauling FairScheduler preemption on branch
>>>> YARN-4572. It is close to being ready for merge to trunk:
>>>>
>>>>     1. Preemption considers individual ResourceRequests to satisfy.
>>>>     (YARN-5605)
>>>>     2. Preemption now works within a leaf queue and across sibling leaf
>>>>     queues. (YARN-5605)
>>>>     3. Comprehensive unit tests for app starvation and preemption -
>>>>     minshare and fairshare. (YARN-5783 and YARN-5819, the latter is
>>>> close to
>>>>     commit)
>>>>     4. TODO: Clean up the TODOs to remove my initials and replace them
>>>>     with appropriate JIRAs.
>>>>
>>>> There are some unresolved sub-tasks in the umbrella JIRA, but none of
>>>> them
>>>> are regressions in the new implementation.
>>>>
>>>> I just uploaded the cumulative patch to YARN-4752 for Jenkins
>>>> verification
>>>> and will follow up on any issues that come up.
>>>>
>>>> Would like to hear your thoughts on the merge.
>>>>
>>>> Thanks
>>>> Karthik
>>>>
>>>> PS: Post facto, I feel a feature branch was unnecessary for this work.
>>>> Github PRs with multiple commits for ease of review would have been
>>>> enough.
>>>>
>>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>
>>
>

Re: [DISCUSS] Merge FairScheduler preemption overhaul (YARN-4752) to trunk

Posted by Arun Suresh <ar...@gmail.com>.
+1 from me too.

minor nit:
* In FSAppAttempt, maybe rename addPreemption(container) /
removePreemption(container) to something better.

Also, had an offline discussion with Karthik. Suggested to maybe experiment
with replacing the PreemptionThread with a synchronous preemption approach.
For eg. Maybe keep adding to the FSStarvedApps and when it reaches a
threshold, trigger a preemption. It would likely avoid race conditions and
reduce synchronization costs.. but am ok with trying this out and
experimenting in follow up JIRAs.

Cheers
-Arun

On Tue, Nov 15, 2016 at 1:43 PM, Daniel Templeton <da...@cloudera.com>
wrote:

> +1 from me, but no surprise since I was the branch reviewer.
>
> Aside from being functionally superior to the existing code, this branch
> is also cleaner and better tested code.  We did our best to make sure that
> the patches were high quality and all the testing bases were covered.
>
> To add to Karthik's points, YARN-5819 was committed early this morning,
> and the JIRA for his TODO bullet is YARN-5885.
>
> Daniel
>
>
> On 11/10/16 6:19 PM, Karthik Kambatla wrote:
>
>> Forgot to mention
>>
>>     - All the patches were reviewed before getting committed to the
>> branch.
>>     - The changes are strictly internal to the scheduler. Most of it is
>>     limited to FairScheduler with minimal changes adding helper methods to
>>     common scheduler code.
>>
>>
>>
>> On Thu, Nov 10, 2016 at 4:46 PM, Karthik Kambatla <ka...@cloudera.com>
>> wrote:
>>
>> Hi folks
>>>
>>> We have been working on overhauling FairScheduler preemption on branch
>>> YARN-4572. It is close to being ready for merge to trunk:
>>>
>>>     1. Preemption considers individual ResourceRequests to satisfy.
>>>     (YARN-5605)
>>>     2. Preemption now works within a leaf queue and across sibling leaf
>>>     queues. (YARN-5605)
>>>     3. Comprehensive unit tests for app starvation and preemption -
>>>     minshare and fairshare. (YARN-5783 and YARN-5819, the latter is
>>> close to
>>>     commit)
>>>     4. TODO: Clean up the TODOs to remove my initials and replace them
>>>     with appropriate JIRAs.
>>>
>>> There are some unresolved sub-tasks in the umbrella JIRA, but none of
>>> them
>>> are regressions in the new implementation.
>>>
>>> I just uploaded the cumulative patch to YARN-4752 for Jenkins
>>> verification
>>> and will follow up on any issues that come up.
>>>
>>> Would like to hear your thoughts on the merge.
>>>
>>> Thanks
>>> Karthik
>>>
>>> PS: Post facto, I feel a feature branch was unnecessary for this work.
>>> Github PRs with multiple commits for ease of review would have been
>>> enough.
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>
>

Re: [DISCUSS] Merge FairScheduler preemption overhaul (YARN-4752) to trunk

Posted by Daniel Templeton <da...@cloudera.com>.
+1 from me, but no surprise since I was the branch reviewer.

Aside from being functionally superior to the existing code, this branch 
is also cleaner and better tested code.  We did our best to make sure 
that the patches were high quality and all the testing bases were covered.

To add to Karthik's points, YARN-5819 was committed early this morning, 
and the JIRA for his TODO bullet is YARN-5885.

Daniel

On 11/10/16 6:19 PM, Karthik Kambatla wrote:
> Forgot to mention
>
>     - All the patches were reviewed before getting committed to the branch.
>     - The changes are strictly internal to the scheduler. Most of it is
>     limited to FairScheduler with minimal changes adding helper methods to
>     common scheduler code.
>
>
>
> On Thu, Nov 10, 2016 at 4:46 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
>> Hi folks
>>
>> We have been working on overhauling FairScheduler preemption on branch
>> YARN-4572. It is close to being ready for merge to trunk:
>>
>>     1. Preemption considers individual ResourceRequests to satisfy.
>>     (YARN-5605)
>>     2. Preemption now works within a leaf queue and across sibling leaf
>>     queues. (YARN-5605)
>>     3. Comprehensive unit tests for app starvation and preemption -
>>     minshare and fairshare. (YARN-5783 and YARN-5819, the latter is close to
>>     commit)
>>     4. TODO: Clean up the TODOs to remove my initials and replace them
>>     with appropriate JIRAs.
>>
>> There are some unresolved sub-tasks in the umbrella JIRA, but none of them
>> are regressions in the new implementation.
>>
>> I just uploaded the cumulative patch to YARN-4752 for Jenkins verification
>> and will follow up on any issues that come up.
>>
>> Would like to hear your thoughts on the merge.
>>
>> Thanks
>> Karthik
>>
>> PS: Post facto, I feel a feature branch was unnecessary for this work.
>> Github PRs with multiple commits for ease of review would have been enough.
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org


Re: [DISCUSS] Merge FairScheduler preemption overhaul (YARN-4752) to trunk

Posted by Sangjin Lee <sj...@apache.org>.
I did a fairly quick pass on the uber-patch, and it looks good to me.

The only small annoyance is the trivial overrides of equals() and
hashCode() just to keep findbugs quiet. I've seen this before in other
patches, and I just don't understand why findbugs requires you to create
no-op overrides when you have no intention of changing the base behavior.
Perhaps we should file a minor JIRA to fix findbugs rule on this?

On Thu, Nov 10, 2016 at 6:19 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> Forgot to mention
>
>    - All the patches were reviewed before getting committed to the branch.
>    - The changes are strictly internal to the scheduler. Most of it is
>    limited to FairScheduler with minimal changes adding helper methods to
>    common scheduler code.
>
>
>
> On Thu, Nov 10, 2016 at 4:46 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
> > Hi folks
> >
> > We have been working on overhauling FairScheduler preemption on branch
> > YARN-4572. It is close to being ready for merge to trunk:
> >
> >    1. Preemption considers individual ResourceRequests to satisfy.
> >    (YARN-5605)
> >    2. Preemption now works within a leaf queue and across sibling leaf
> >    queues. (YARN-5605)
> >    3. Comprehensive unit tests for app starvation and preemption -
> >    minshare and fairshare. (YARN-5783 and YARN-5819, the latter is close
> to
> >    commit)
> >    4. TODO: Clean up the TODOs to remove my initials and replace them
> >    with appropriate JIRAs.
> >
> > There are some unresolved sub-tasks in the umbrella JIRA, but none of
> them
> > are regressions in the new implementation.
> >
> > I just uploaded the cumulative patch to YARN-4752 for Jenkins
> verification
> > and will follow up on any issues that come up.
> >
> > Would like to hear your thoughts on the merge.
> >
> > Thanks
> > Karthik
> >
> > PS: Post facto, I feel a feature branch was unnecessary for this work.
> > Github PRs with multiple commits for ease of review would have been
> enough.
> >
>

Re: [DISCUSS] Merge FairScheduler preemption overhaul (YARN-4752) to trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
Forgot to mention

   - All the patches were reviewed before getting committed to the branch.
   - The changes are strictly internal to the scheduler. Most of it is
   limited to FairScheduler with minimal changes adding helper methods to
   common scheduler code.



On Thu, Nov 10, 2016 at 4:46 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> Hi folks
>
> We have been working on overhauling FairScheduler preemption on branch
> YARN-4572. It is close to being ready for merge to trunk:
>
>    1. Preemption considers individual ResourceRequests to satisfy.
>    (YARN-5605)
>    2. Preemption now works within a leaf queue and across sibling leaf
>    queues. (YARN-5605)
>    3. Comprehensive unit tests for app starvation and preemption -
>    minshare and fairshare. (YARN-5783 and YARN-5819, the latter is close to
>    commit)
>    4. TODO: Clean up the TODOs to remove my initials and replace them
>    with appropriate JIRAs.
>
> There are some unresolved sub-tasks in the umbrella JIRA, but none of them
> are regressions in the new implementation.
>
> I just uploaded the cumulative patch to YARN-4752 for Jenkins verification
> and will follow up on any issues that come up.
>
> Would like to hear your thoughts on the merge.
>
> Thanks
> Karthik
>
> PS: Post facto, I feel a feature branch was unnecessary for this work.
> Github PRs with multiple commits for ease of review would have been enough.
>