You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-dev@hadoop.apache.org by Jian He <jh...@hortonworks.com> on 2015/09/17 03:30:19 UTC

[VOTE] Merge YARN-1197 container resize into trunk

Hi All,

Thanks Meng Ding and Wangda Tan for all the hard work !

I would like to call a vote to merge YARN-1197 container resize into trunk.

Key idea:
This feature adds the ability for AM to change container resource size at runtime.

Details:
- This feature is tracked at https://issues.apache.org/jira/browse/YARN-1197
- It’s currently developed at a separate branch: https://github.com/apache/hadoop/commits/YARN-1197
- A uber patch(https://issues.apache.org/jira/browse/YARN-4157) generated from YARN-1197 to run against trunk  shows all unit tests have passed.
- This feature now can work end-to-end.
-  All the unresolved jiras under YARN-1197 will be the next step.

Thanks,
Wangda Tan & Meng Ding & Jian He

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
With 6 binding +1s and no -1s, this vote passes. I'm merging YARN-1197 into
trunk now. I will merge it to branch-2 after we get stable trunk build.

Thanks everyone for voting!

Regards,
Wangda

On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Karthik,
>
> Let me elaborate more to make you feel better of this change, don't be
> scared by the size of the patch :)
>
> Common RM/Scheduler part:
> - AbstractYarnScheduler new logic only.
> - AppSchedulingInfo new logic only.
> - RMContainer / RMNode state machine, new logic only
> - SchedulerApplicationAttempt / Allocation, refactoring to existing
> reservation logic so increase request reservation can reuse it, and
> refactored to simply updating container token / pull container part so
> increase/decrease/new-allocation can reuse same code.
>
> FairScheduler:
> - Small change since we updated how to pull container updated token. I
> believe it will be a straightforward change for you if you take a closer
> look at it.
>
> CapacityScheduler:
> - Most changes are separate logic or small refactorings, most complex
> allocation logic stays within IncreaseContainerAllocator.java.
>
> Please let me know where you want to get more details of implementations.
>
> I strongly suggest you to take a glance at the diff, we have already
> worked on the merge for the past one week, and we've paid a lot of extra
> time to keep YARN-1197 sync with trunk in the past several months. After
> this merge finished, a couple of weeks needed to finish end-to-end test and
> some other extra tests, it won't affect our upcoming branch-2 release.
>
> I would not prefer to merge to trunk only, all people working on RM side
> will be affected, we're very carefully avoid such divergence of RM in
> trunk/branch-2. Since nobody wants to create two different patches for
> every RM changes. And also, after this finished, other efforts can happen
> in parallel such as YARN-4091.
>
> Let me know if you have any other questions/concerns.
>
> Thanks,
> Wangda
>
> On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
>> I am sorry, but merging a potentially disruptive change to branch-2
>> without
>> end-to-end tests seems too disruptive to me.
>>
>> I do agree with you on the potential inconvenience of having to post
>> different patches for trunk and branch-2, but I would rather have that
>> inconvenience than the risk of merging something that hasn't been
>> thoroughly tested.
>>
>> On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com> wrote:
>>
>> > Hi Karthik,
>> >
>> > Thanks for comments! However, I think only merge to trunk may not work,
>> > this patch involves thousands lines of code changes in scheduler side,
>> only
>> > putting that to trunk could lead to trunk/branch-2 totally incompatible
>> for
>> > resource manager. I think most of the code changes are new to scheduler
>> > instead of modifying existed logic, they're not very tricky to me. And
>> when
>> > 2.8 will be released is not planned yet, at least we have a couple of
>> > months to make sure this feature becomes usable and not cause existing
>> > behavior regressions.
>> >
>> > Sounds good to you?
>> >
>> > Wangda
>> >
>> >
>> > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <ka...@cloudera.com>
>> > wrote:
>> >
>> > > +1 on merging to trunk. It would be nice to have some amount of
>> testing
>> > > done before the merge, but I understand how merging to trunk would
>> likely
>> > > speed up the testing efforts.
>> > >
>> > > Let us not merge into branch-2 until after we have done a fair bit of
>> > > testing, and are comfortable including it in a release. While the code
>> > > mostly appears to not mess with existing scheduling logic, I am
>> concerned
>> > > about regressions to existing scheduling behavior.
>> > >
>> > >
>> > >
>> > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <kasha@cloudera.com
>> >
>> > > wrote:
>> > >
>> > > > By the way, for the purposes of merge vote, I believe a committer's
>> > vote
>> > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
>> > > >
>> > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com>
>> wrote:
>> > > >
>> > > >> +1 (non-binding)
>> > > >>
>> > > >> thanks
>> > > >> Zhihai Xu
>> > > >>
>> > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xgong@hortonworks.com
>> >
>> > > >> wrote:
>> > > >>
>> > > >> > +1 Binding
>> > > >> >
>> > > >> > Thanks
>> > > >> >
>> > > >> > Xuan Gong
>> > > >> >
>> > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com>
>> > > wrote:
>> > > >> > >
>> > > >> > > +1. (Binding).
>> > > >> > >
>> > > >> > > Thanks,
>> > > >> > >
>> > > >> > > Junping
>> > > >> > > ________________________________________
>> > > >> > > From: Wangda Tan <wh...@gmail.com>
>> > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
>> > > >> > > To: yarn-dev@hadoop.apache.org
>> > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
>> > > >> > >
>> > > >> > > +1 (non-binding),
>> > > >> > >
>> > > >> > > Thanks Jian starting this thread. This can minimize effort of
>> > works
>> > > >> > across branches.
>> > > >> > >
>> > > >> > > To clarify, this feature is end-to-end code  completed, we have
>> > API,
>> > > >> > rm/nm implementations patches committed, but we haven't tested it
>> > > >> > end-to-end. Filed YARN-4175 to create an example program to test
>> it
>> > > >> > end-to-end.
>> > > >> > >
>> > > >> > > Regards,
>> > > >> > > Wangda
>> > > >> > >
>> > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com>
>> > wrote:
>> > > >> > >>
>> > > >> > >> Hi All,
>> > > >> > >>
>> > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
>> > > >> > >>
>> > > >> > >> I would like to call a vote to merge YARN-1197 container
>> resize
>> > > into
>> > > >> > trunk.
>> > > >> > >>
>> > > >> > >> Key idea:
>> > > >> > >> This feature adds the ability for AM to change container
>> resource
>> > > >> size
>> > > >> > at runtime.
>> > > >> > >>
>> > > >> > >> Details:
>> > > >> > >> - This feature is tracked at
>> > > >> > https://issues.apache.org/jira/browse/YARN-1197
>> > > >> > >> - It’s currently developed at a separate branch:
>> > > >> > https://github.com/apache/hadoop/commits/YARN-1197
>> > > >> > >> - A uber patch(
>> https://issues.apache.org/jira/browse/YARN-4157)
>> > > >> > generated from YARN-1197 to run against trunk  shows all unit
>> tests
>> > > have
>> > > >> > passed.
>> > > >> > >> - This feature now can work end-to-end.
>> > > >> > >> -  All the unresolved jiras under YARN-1197 will be the next
>> > step.
>> > > >> > >>
>> > > >> > >> Thanks,
>> > > >> > >> Wangda Tan & Meng Ding & Jian He
>> > > >> > >
>> > > >> > >
>> > > >> >
>> > > >> >
>> > > >>
>> > > >
>> > > >
>> > >
>> >
>>
>
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
Hi Karthik,
Will do that next time, thanks!

Regards,
Wangda

On Thu, Sep 24, 2015 at 4:56 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> By the way, when we do merge votes, I was under the impression that we do
> two separate votes for trunk and branch-2. While we were discussing merging
> to branch-2 in this thread, the vote was only for trunk. I was hoping to
> exercise my opportunity to -0 on the branch-2 vote :)
>
> In the future, can we please stick to our usual approach of filing a vote
> for each branch please?
>
> On Thu, Sep 24, 2015 at 2:46 PM, Wangda Tan <wh...@gmail.com> wrote:
>
> > Thanks Chris for the suggestions,
> > I think we need a couple of weeks to finish end-to-end test, doing this
> is
> > majorly to avoid painful of resolving conflicts in the scheduler side.
> > In any case, will avoid doing this in the future.
> >
> > Regards,
> > Wangda
> >
> >
> > On Thu, Sep 24, 2015 at 2:29 PM, Chris Douglas <cd...@apache.org>
> > wrote:
> >
> > > Catching up on this thread. I'd missed Wangda's comment. This was
> > > merged without testing it end-to-end? And backported to branch-2?
> > >
> > > I share Karthik's reservations. This is a very aggressive merging
> > > strategy. Every other feature merged to branch-2 must also help
> > > stabilize YARN-1197. Going through the patch, the particular changes
> > > are not terribly threatening and the passing unit tests are
> > > encouraging. However, a few days' delay to test the feature is not
> > > asking much, particularly when you have reason to be confident in its
> > > stability.
> > >
> > > -0, after the fact. -C
> > >
> > > On Thu, Sep 24, 2015 at 11:39 AM, Wangda Tan <wh...@gmail.com>
> > wrote:
> > > > Trunk build seems stable now after the merge. A couple of yarn trunk
> > > builds
> > > > also ran fine such as:
> > > https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
> > > > .
> > > >
> > > > I've merged this into branch-2 so as to minimize YARN dev effort
> across
> > > > branches as per the discussion thread of YARN-1197 merge.
> > > >
> > > > Thanks,
> > > > Wangda
> > > >
> > > >
> > > > On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wh...@gmail.com>
> > wrote:
> > > >
> > > >> Thanks Karthik,
> > > >>
> > > >> Doing end-to-end test as well as regression test is the top priority
> > of
> > > >> this feature. Will keep you updated.
> > > >>
> > > >> Regards,
> > > >> Wangda
> > > >>
> > > >> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <
> kasha@cloudera.com
> > >
> > > >> wrote:
> > > >>
> > > >>> I did skim through the patch before voting :)
> > > >>>
> > > >>> As I mentioned earlier, I understand vast majority of the code is
> net
> > > new,
> > > >>> and the likelihood of breaking changes is low. Still, I would like
> > for
> > > us
> > > >>> to be more careful and run some end-to-end and regression tests
> > before
> > > >>> including this in a release. While I won't block the merge to
> > branch-2,
> > > >>> will be a -1 on cutting a release without due testing.
> > > >>>
> > > >>> At the risk of digressing, I also feel branch-2 should be in a
> > > releasable
> > > >>> state more often than not. The only way to bring any cadence and
> > > >>> predictability to our releases is by maintaining branch-2 in such a
> > > state.
> > > >>>
> > > >>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com>
> > > wrote:
> > > >>>
> > > >>> > Hi Karthik,
> > > >>> >
> > > >>> > Let me elaborate more to make you feel better of this change,
> don't
> > > be
> > > >>> > scared by the size of the patch :)
> > > >>> >
> > > >>> > Common RM/Scheduler part:
> > > >>> > - AbstractYarnScheduler new logic only.
> > > >>> > - AppSchedulingInfo new logic only.
> > > >>> > - RMContainer / RMNode state machine, new logic only
> > > >>> > - SchedulerApplicationAttempt / Allocation, refactoring to
> existing
> > > >>> > reservation logic so increase request reservation can reuse it,
> and
> > > >>> > refactored to simply updating container token / pull container
> part
> > > so
> > > >>> > increase/decrease/new-allocation can reuse same code.
> > > >>> >
> > > >>> > FairScheduler:
> > > >>> > - Small change since we updated how to pull container updated
> > token.
> > > I
> > > >>> > believe it will be a straightforward change for you if you take a
> > > closer
> > > >>> > look at it.
> > > >>> >
> > > >>> > CapacityScheduler:
> > > >>> > - Most changes are separate logic or small refactorings, most
> > complex
> > > >>> > allocation logic stays within IncreaseContainerAllocator.java.
> > > >>> >
> > > >>> > Please let me know where you want to get more details of
> > > >>> implementations.
> > > >>> >
> > > >>> > I strongly suggest you to take a glance at the diff, we have
> > already
> > > >>> worked
> > > >>> > on the merge for the past one week, and we've paid a lot of extra
> > > time
> > > >>> to
> > > >>> > keep YARN-1197 sync with trunk in the past several months. After
> > this
> > > >>> merge
> > > >>> > finished, a couple of weeks needed to finish end-to-end test and
> > some
> > > >>> other
> > > >>> > extra tests, it won't affect our upcoming branch-2 release.
> > > >>> >
> > > >>> > I would not prefer to merge to trunk only, all people working on
> RM
> > > side
> > > >>> > will be affected, we're very carefully avoid such divergence of
> RM
> > in
> > > >>> > trunk/branch-2. Since nobody wants to create two different
> patches
> > > for
> > > >>> > every RM changes. And also, after this finished, other efforts
> can
> > > >>> happen
> > > >>> > in parallel such as YARN-4091.
> > > >>> >
> > > >>> > Let me know if you have any other questions/concerns.
> > > >>> >
> > > >>> > Thanks,
> > > >>> > Wangda
> > > >>> >
> > > >>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <
> > > kasha@cloudera.com>
> > > >>> > wrote:
> > > >>> >
> > > >>> > > I am sorry, but merging a potentially disruptive change to
> > branch-2
> > > >>> > without
> > > >>> > > end-to-end tests seems too disruptive to me.
> > > >>> > >
> > > >>> > > I do agree with you on the potential inconvenience of having to
> > > post
> > > >>> > > different patches for trunk and branch-2, but I would rather
> have
> > > that
> > > >>> > > inconvenience than the risk of merging something that hasn't
> been
> > > >>> > > thoroughly tested.
> > > >>> > >
> > > >>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <
> wheeleast@gmail.com
> > >
> > > >>> wrote:
> > > >>> > >
> > > >>> > > > Hi Karthik,
> > > >>> > > >
> > > >>> > > > Thanks for comments! However, I think only merge to trunk may
> > not
> > > >>> work,
> > > >>> > > > this patch involves thousands lines of code changes in
> > scheduler
> > > >>> side,
> > > >>> > > only
> > > >>> > > > putting that to trunk could lead to trunk/branch-2 totally
> > > >>> incompatible
> > > >>> > > for
> > > >>> > > > resource manager. I think most of the code changes are new to
> > > >>> scheduler
> > > >>> > > > instead of modifying existed logic, they're not very tricky
> to
> > > me.
> > > >>> And
> > > >>> > > when
> > > >>> > > > 2.8 will be released is not planned yet, at least we have a
> > > couple
> > > >>> of
> > > >>> > > > months to make sure this feature becomes usable and not cause
> > > >>> existing
> > > >>> > > > behavior regressions.
> > > >>> > > >
> > > >>> > > > Sounds good to you?
> > > >>> > > >
> > > >>> > > > Wangda
> > > >>> > > >
> > > >>> > > >
> > > >>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
> > > >>> kasha@cloudera.com>
> > > >>> > > > wrote:
> > > >>> > > >
> > > >>> > > > > +1 on merging to trunk. It would be nice to have some
> amount
> > of
> > > >>> > testing
> > > >>> > > > > done before the merge, but I understand how merging to
> trunk
> > > would
> > > >>> > > likely
> > > >>> > > > > speed up the testing efforts.
> > > >>> > > > >
> > > >>> > > > > Let us not merge into branch-2 until after we have done a
> > fair
> > > >>> bit of
> > > >>> > > > > testing, and are comfortable including it in a release.
> While
> > > the
> > > >>> > code
> > > >>> > > > > mostly appears to not mess with existing scheduling logic,
> I
> > am
> > > >>> > > concerned
> > > >>> > > > > about regressions to existing scheduling behavior.
> > > >>> > > > >
> > > >>> > > > >
> > > >>> > > > >
> > > >>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
> > > >>> > kasha@cloudera.com>
> > > >>> > > > > wrote:
> > > >>> > > > >
> > > >>> > > > > > By the way, for the purposes of merge vote, I believe a
> > > >>> committer's
> > > >>> > > > vote
> > > >>> > > > > > is binding. So, Wangda and Zhihai's votes should be
> > binding.
> > > :)
> > > >>> > > > > >
> > > >>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <
> > > zxu@cloudera.com>
> > > >>> > > wrote:
> > > >>> > > > > >
> > > >>> > > > > >> +1 (non-binding)
> > > >>> > > > > >>
> > > >>> > > > > >> thanks
> > > >>> > > > > >> Zhihai Xu
> > > >>> > > > > >>
> > > >>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
> > > >>> > xgong@hortonworks.com>
> > > >>> > > > > >> wrote:
> > > >>> > > > > >>
> > > >>> > > > > >> > +1 Binding
> > > >>> > > > > >> >
> > > >>> > > > > >> > Thanks
> > > >>> > > > > >> >
> > > >>> > > > > >> > Xuan Gong
> > > >>> > > > > >> >
> > > >>> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
> > > >>> jdu@hortonworks.com
> > > >>> > >
> > > >>> > > > > wrote:
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > +1. (Binding).
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Thanks,
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Junping
> > > >>> > > > > >> > > ________________________________________
> > > >>> > > > > >> > > From: Wangda Tan <wh...@gmail.com>
> > > >>> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > > >>> > > > > >> > > To: yarn-dev@hadoop.apache.org
> > > >>> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize
> > > into
> > > >>> > trunk
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > +1 (non-binding),
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Thanks Jian starting this thread. This can minimize
> > > effort
> > > >>> of
> > > >>> > > > works
> > > >>> > > > > >> > across branches.
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > To clarify, this feature is end-to-end code
> > completed,
> > > we
> > > >>> > have
> > > >>> > > > API,
> > > >>> > > > > >> > rm/nm implementations patches committed, but we
> haven't
> > > >>> tested
> > > >>> > it
> > > >>> > > > > >> > end-to-end. Filed YARN-4175 to create an example
> program
> > > to
> > > >>> test
> > > >>> > > it
> > > >>> > > > > >> > end-to-end.
> > > >>> > > > > >> > >
> > > >>> > > > > >> > > Regards,
> > > >>> > > > > >> > > Wangda
> > > >>> > > > > >> > >
> > > >>> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <
> > > jhe@hortonworks.com
> > > >>> >
> > > >>> > > > wrote:
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Hi All,
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard
> > work !
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> I would like to call a vote to merge YARN-1197
> > > container
> > > >>> > resize
> > > >>> > > > > into
> > > >>> > > > > >> > trunk.
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Key idea:
> > > >>> > > > > >> > >> This feature adds the ability for AM to change
> > > container
> > > >>> > > resource
> > > >>> > > > > >> size
> > > >>> > > > > >> > at runtime.
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Details:
> > > >>> > > > > >> > >> - This feature is tracked at
> > > >>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > > >>> > > > > >> > >> - It’s currently developed at a separate branch:
> > > >>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > > >>> > > > > >> > >> - A uber patch(
> > > >>> > https://issues.apache.org/jira/browse/YARN-4157
> > > >>> > > )
> > > >>> > > > > >> > generated from YARN-1197 to run against trunk  shows
> all
> > > unit
> > > >>> > > tests
> > > >>> > > > > have
> > > >>> > > > > >> > passed.
> > > >>> > > > > >> > >> - This feature now can work end-to-end.
> > > >>> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be
> > the
> > > >>> next
> > > >>> > > > step.
> > > >>> > > > > >> > >>
> > > >>> > > > > >> > >> Thanks,
> > > >>> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
> > > >>> > > > > >> > >
> > > >>> > > > > >> > >
> > > >>> > > > > >> >
> > > >>> > > > > >> >
> > > >>> > > > > >>
> > > >>> > > > > >
> > > >>> > > > > >
> > > >>> > > > >
> > > >>> > > >
> > > >>> > >
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
By the way, when we do merge votes, I was under the impression that we do
two separate votes for trunk and branch-2. While we were discussing merging
to branch-2 in this thread, the vote was only for trunk. I was hoping to
exercise my opportunity to -0 on the branch-2 vote :)

In the future, can we please stick to our usual approach of filing a vote
for each branch please?

On Thu, Sep 24, 2015 at 2:46 PM, Wangda Tan <wh...@gmail.com> wrote:

> Thanks Chris for the suggestions,
> I think we need a couple of weeks to finish end-to-end test, doing this is
> majorly to avoid painful of resolving conflicts in the scheduler side.
> In any case, will avoid doing this in the future.
>
> Regards,
> Wangda
>
>
> On Thu, Sep 24, 2015 at 2:29 PM, Chris Douglas <cd...@apache.org>
> wrote:
>
> > Catching up on this thread. I'd missed Wangda's comment. This was
> > merged without testing it end-to-end? And backported to branch-2?
> >
> > I share Karthik's reservations. This is a very aggressive merging
> > strategy. Every other feature merged to branch-2 must also help
> > stabilize YARN-1197. Going through the patch, the particular changes
> > are not terribly threatening and the passing unit tests are
> > encouraging. However, a few days' delay to test the feature is not
> > asking much, particularly when you have reason to be confident in its
> > stability.
> >
> > -0, after the fact. -C
> >
> > On Thu, Sep 24, 2015 at 11:39 AM, Wangda Tan <wh...@gmail.com>
> wrote:
> > > Trunk build seems stable now after the merge. A couple of yarn trunk
> > builds
> > > also ran fine such as:
> > https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
> > > .
> > >
> > > I've merged this into branch-2 so as to minimize YARN dev effort across
> > > branches as per the discussion thread of YARN-1197 merge.
> > >
> > > Thanks,
> > > Wangda
> > >
> > >
> > > On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wh...@gmail.com>
> wrote:
> > >
> > >> Thanks Karthik,
> > >>
> > >> Doing end-to-end test as well as regression test is the top priority
> of
> > >> this feature. Will keep you updated.
> > >>
> > >> Regards,
> > >> Wangda
> > >>
> > >> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <kasha@cloudera.com
> >
> > >> wrote:
> > >>
> > >>> I did skim through the patch before voting :)
> > >>>
> > >>> As I mentioned earlier, I understand vast majority of the code is net
> > new,
> > >>> and the likelihood of breaking changes is low. Still, I would like
> for
> > us
> > >>> to be more careful and run some end-to-end and regression tests
> before
> > >>> including this in a release. While I won't block the merge to
> branch-2,
> > >>> will be a -1 on cutting a release without due testing.
> > >>>
> > >>> At the risk of digressing, I also feel branch-2 should be in a
> > releasable
> > >>> state more often than not. The only way to bring any cadence and
> > >>> predictability to our releases is by maintaining branch-2 in such a
> > state.
> > >>>
> > >>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com>
> > wrote:
> > >>>
> > >>> > Hi Karthik,
> > >>> >
> > >>> > Let me elaborate more to make you feel better of this change, don't
> > be
> > >>> > scared by the size of the patch :)
> > >>> >
> > >>> > Common RM/Scheduler part:
> > >>> > - AbstractYarnScheduler new logic only.
> > >>> > - AppSchedulingInfo new logic only.
> > >>> > - RMContainer / RMNode state machine, new logic only
> > >>> > - SchedulerApplicationAttempt / Allocation, refactoring to existing
> > >>> > reservation logic so increase request reservation can reuse it, and
> > >>> > refactored to simply updating container token / pull container part
> > so
> > >>> > increase/decrease/new-allocation can reuse same code.
> > >>> >
> > >>> > FairScheduler:
> > >>> > - Small change since we updated how to pull container updated
> token.
> > I
> > >>> > believe it will be a straightforward change for you if you take a
> > closer
> > >>> > look at it.
> > >>> >
> > >>> > CapacityScheduler:
> > >>> > - Most changes are separate logic or small refactorings, most
> complex
> > >>> > allocation logic stays within IncreaseContainerAllocator.java.
> > >>> >
> > >>> > Please let me know where you want to get more details of
> > >>> implementations.
> > >>> >
> > >>> > I strongly suggest you to take a glance at the diff, we have
> already
> > >>> worked
> > >>> > on the merge for the past one week, and we've paid a lot of extra
> > time
> > >>> to
> > >>> > keep YARN-1197 sync with trunk in the past several months. After
> this
> > >>> merge
> > >>> > finished, a couple of weeks needed to finish end-to-end test and
> some
> > >>> other
> > >>> > extra tests, it won't affect our upcoming branch-2 release.
> > >>> >
> > >>> > I would not prefer to merge to trunk only, all people working on RM
> > side
> > >>> > will be affected, we're very carefully avoid such divergence of RM
> in
> > >>> > trunk/branch-2. Since nobody wants to create two different patches
> > for
> > >>> > every RM changes. And also, after this finished, other efforts can
> > >>> happen
> > >>> > in parallel such as YARN-4091.
> > >>> >
> > >>> > Let me know if you have any other questions/concerns.
> > >>> >
> > >>> > Thanks,
> > >>> > Wangda
> > >>> >
> > >>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <
> > kasha@cloudera.com>
> > >>> > wrote:
> > >>> >
> > >>> > > I am sorry, but merging a potentially disruptive change to
> branch-2
> > >>> > without
> > >>> > > end-to-end tests seems too disruptive to me.
> > >>> > >
> > >>> > > I do agree with you on the potential inconvenience of having to
> > post
> > >>> > > different patches for trunk and branch-2, but I would rather have
> > that
> > >>> > > inconvenience than the risk of merging something that hasn't been
> > >>> > > thoroughly tested.
> > >>> > >
> > >>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wheeleast@gmail.com
> >
> > >>> wrote:
> > >>> > >
> > >>> > > > Hi Karthik,
> > >>> > > >
> > >>> > > > Thanks for comments! However, I think only merge to trunk may
> not
> > >>> work,
> > >>> > > > this patch involves thousands lines of code changes in
> scheduler
> > >>> side,
> > >>> > > only
> > >>> > > > putting that to trunk could lead to trunk/branch-2 totally
> > >>> incompatible
> > >>> > > for
> > >>> > > > resource manager. I think most of the code changes are new to
> > >>> scheduler
> > >>> > > > instead of modifying existed logic, they're not very tricky to
> > me.
> > >>> And
> > >>> > > when
> > >>> > > > 2.8 will be released is not planned yet, at least we have a
> > couple
> > >>> of
> > >>> > > > months to make sure this feature becomes usable and not cause
> > >>> existing
> > >>> > > > behavior regressions.
> > >>> > > >
> > >>> > > > Sounds good to you?
> > >>> > > >
> > >>> > > > Wangda
> > >>> > > >
> > >>> > > >
> > >>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
> > >>> kasha@cloudera.com>
> > >>> > > > wrote:
> > >>> > > >
> > >>> > > > > +1 on merging to trunk. It would be nice to have some amount
> of
> > >>> > testing
> > >>> > > > > done before the merge, but I understand how merging to trunk
> > would
> > >>> > > likely
> > >>> > > > > speed up the testing efforts.
> > >>> > > > >
> > >>> > > > > Let us not merge into branch-2 until after we have done a
> fair
> > >>> bit of
> > >>> > > > > testing, and are comfortable including it in a release. While
> > the
> > >>> > code
> > >>> > > > > mostly appears to not mess with existing scheduling logic, I
> am
> > >>> > > concerned
> > >>> > > > > about regressions to existing scheduling behavior.
> > >>> > > > >
> > >>> > > > >
> > >>> > > > >
> > >>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
> > >>> > kasha@cloudera.com>
> > >>> > > > > wrote:
> > >>> > > > >
> > >>> > > > > > By the way, for the purposes of merge vote, I believe a
> > >>> committer's
> > >>> > > > vote
> > >>> > > > > > is binding. So, Wangda and Zhihai's votes should be
> binding.
> > :)
> > >>> > > > > >
> > >>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <
> > zxu@cloudera.com>
> > >>> > > wrote:
> > >>> > > > > >
> > >>> > > > > >> +1 (non-binding)
> > >>> > > > > >>
> > >>> > > > > >> thanks
> > >>> > > > > >> Zhihai Xu
> > >>> > > > > >>
> > >>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
> > >>> > xgong@hortonworks.com>
> > >>> > > > > >> wrote:
> > >>> > > > > >>
> > >>> > > > > >> > +1 Binding
> > >>> > > > > >> >
> > >>> > > > > >> > Thanks
> > >>> > > > > >> >
> > >>> > > > > >> > Xuan Gong
> > >>> > > > > >> >
> > >>> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
> > >>> jdu@hortonworks.com
> > >>> > >
> > >>> > > > > wrote:
> > >>> > > > > >> > >
> > >>> > > > > >> > > +1. (Binding).
> > >>> > > > > >> > >
> > >>> > > > > >> > > Thanks,
> > >>> > > > > >> > >
> > >>> > > > > >> > > Junping
> > >>> > > > > >> > > ________________________________________
> > >>> > > > > >> > > From: Wangda Tan <wh...@gmail.com>
> > >>> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > >>> > > > > >> > > To: yarn-dev@hadoop.apache.org
> > >>> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize
> > into
> > >>> > trunk
> > >>> > > > > >> > >
> > >>> > > > > >> > > +1 (non-binding),
> > >>> > > > > >> > >
> > >>> > > > > >> > > Thanks Jian starting this thread. This can minimize
> > effort
> > >>> of
> > >>> > > > works
> > >>> > > > > >> > across branches.
> > >>> > > > > >> > >
> > >>> > > > > >> > > To clarify, this feature is end-to-end code
> completed,
> > we
> > >>> > have
> > >>> > > > API,
> > >>> > > > > >> > rm/nm implementations patches committed, but we haven't
> > >>> tested
> > >>> > it
> > >>> > > > > >> > end-to-end. Filed YARN-4175 to create an example program
> > to
> > >>> test
> > >>> > > it
> > >>> > > > > >> > end-to-end.
> > >>> > > > > >> > >
> > >>> > > > > >> > > Regards,
> > >>> > > > > >> > > Wangda
> > >>> > > > > >> > >
> > >>> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <
> > jhe@hortonworks.com
> > >>> >
> > >>> > > > wrote:
> > >>> > > > > >> > >>
> > >>> > > > > >> > >> Hi All,
> > >>> > > > > >> > >>
> > >>> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard
> work !
> > >>> > > > > >> > >>
> > >>> > > > > >> > >> I would like to call a vote to merge YARN-1197
> > container
> > >>> > resize
> > >>> > > > > into
> > >>> > > > > >> > trunk.
> > >>> > > > > >> > >>
> > >>> > > > > >> > >> Key idea:
> > >>> > > > > >> > >> This feature adds the ability for AM to change
> > container
> > >>> > > resource
> > >>> > > > > >> size
> > >>> > > > > >> > at runtime.
> > >>> > > > > >> > >>
> > >>> > > > > >> > >> Details:
> > >>> > > > > >> > >> - This feature is tracked at
> > >>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > >>> > > > > >> > >> - It’s currently developed at a separate branch:
> > >>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > >>> > > > > >> > >> - A uber patch(
> > >>> > https://issues.apache.org/jira/browse/YARN-4157
> > >>> > > )
> > >>> > > > > >> > generated from YARN-1197 to run against trunk  shows all
> > unit
> > >>> > > tests
> > >>> > > > > have
> > >>> > > > > >> > passed.
> > >>> > > > > >> > >> - This feature now can work end-to-end.
> > >>> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be
> the
> > >>> next
> > >>> > > > step.
> > >>> > > > > >> > >>
> > >>> > > > > >> > >> Thanks,
> > >>> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
> > >>> > > > > >> > >
> > >>> > > > > >> > >
> > >>> > > > > >> >
> > >>> > > > > >> >
> > >>> > > > > >>
> > >>> > > > > >
> > >>> > > > > >
> > >>> > > > >
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
Thanks Chris for the suggestions,
I think we need a couple of weeks to finish end-to-end test, doing this is
majorly to avoid painful of resolving conflicts in the scheduler side.
In any case, will avoid doing this in the future.

Regards,
Wangda


On Thu, Sep 24, 2015 at 2:29 PM, Chris Douglas <cd...@apache.org> wrote:

> Catching up on this thread. I'd missed Wangda's comment. This was
> merged without testing it end-to-end? And backported to branch-2?
>
> I share Karthik's reservations. This is a very aggressive merging
> strategy. Every other feature merged to branch-2 must also help
> stabilize YARN-1197. Going through the patch, the particular changes
> are not terribly threatening and the passing unit tests are
> encouraging. However, a few days' delay to test the feature is not
> asking much, particularly when you have reason to be confident in its
> stability.
>
> -0, after the fact. -C
>
> On Thu, Sep 24, 2015 at 11:39 AM, Wangda Tan <wh...@gmail.com> wrote:
> > Trunk build seems stable now after the merge. A couple of yarn trunk
> builds
> > also ran fine such as:
> https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
> > .
> >
> > I've merged this into branch-2 so as to minimize YARN dev effort across
> > branches as per the discussion thread of YARN-1197 merge.
> >
> > Thanks,
> > Wangda
> >
> >
> > On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wh...@gmail.com> wrote:
> >
> >> Thanks Karthik,
> >>
> >> Doing end-to-end test as well as regression test is the top priority of
> >> this feature. Will keep you updated.
> >>
> >> Regards,
> >> Wangda
> >>
> >> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <ka...@cloudera.com>
> >> wrote:
> >>
> >>> I did skim through the patch before voting :)
> >>>
> >>> As I mentioned earlier, I understand vast majority of the code is net
> new,
> >>> and the likelihood of breaking changes is low. Still, I would like for
> us
> >>> to be more careful and run some end-to-end and regression tests before
> >>> including this in a release. While I won't block the merge to branch-2,
> >>> will be a -1 on cutting a release without due testing.
> >>>
> >>> At the risk of digressing, I also feel branch-2 should be in a
> releasable
> >>> state more often than not. The only way to bring any cadence and
> >>> predictability to our releases is by maintaining branch-2 in such a
> state.
> >>>
> >>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com>
> wrote:
> >>>
> >>> > Hi Karthik,
> >>> >
> >>> > Let me elaborate more to make you feel better of this change, don't
> be
> >>> > scared by the size of the patch :)
> >>> >
> >>> > Common RM/Scheduler part:
> >>> > - AbstractYarnScheduler new logic only.
> >>> > - AppSchedulingInfo new logic only.
> >>> > - RMContainer / RMNode state machine, new logic only
> >>> > - SchedulerApplicationAttempt / Allocation, refactoring to existing
> >>> > reservation logic so increase request reservation can reuse it, and
> >>> > refactored to simply updating container token / pull container part
> so
> >>> > increase/decrease/new-allocation can reuse same code.
> >>> >
> >>> > FairScheduler:
> >>> > - Small change since we updated how to pull container updated token.
> I
> >>> > believe it will be a straightforward change for you if you take a
> closer
> >>> > look at it.
> >>> >
> >>> > CapacityScheduler:
> >>> > - Most changes are separate logic or small refactorings, most complex
> >>> > allocation logic stays within IncreaseContainerAllocator.java.
> >>> >
> >>> > Please let me know where you want to get more details of
> >>> implementations.
> >>> >
> >>> > I strongly suggest you to take a glance at the diff, we have already
> >>> worked
> >>> > on the merge for the past one week, and we've paid a lot of extra
> time
> >>> to
> >>> > keep YARN-1197 sync with trunk in the past several months. After this
> >>> merge
> >>> > finished, a couple of weeks needed to finish end-to-end test and some
> >>> other
> >>> > extra tests, it won't affect our upcoming branch-2 release.
> >>> >
> >>> > I would not prefer to merge to trunk only, all people working on RM
> side
> >>> > will be affected, we're very carefully avoid such divergence of RM in
> >>> > trunk/branch-2. Since nobody wants to create two different patches
> for
> >>> > every RM changes. And also, after this finished, other efforts can
> >>> happen
> >>> > in parallel such as YARN-4091.
> >>> >
> >>> > Let me know if you have any other questions/concerns.
> >>> >
> >>> > Thanks,
> >>> > Wangda
> >>> >
> >>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <
> kasha@cloudera.com>
> >>> > wrote:
> >>> >
> >>> > > I am sorry, but merging a potentially disruptive change to branch-2
> >>> > without
> >>> > > end-to-end tests seems too disruptive to me.
> >>> > >
> >>> > > I do agree with you on the potential inconvenience of having to
> post
> >>> > > different patches for trunk and branch-2, but I would rather have
> that
> >>> > > inconvenience than the risk of merging something that hasn't been
> >>> > > thoroughly tested.
> >>> > >
> >>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com>
> >>> wrote:
> >>> > >
> >>> > > > Hi Karthik,
> >>> > > >
> >>> > > > Thanks for comments! However, I think only merge to trunk may not
> >>> work,
> >>> > > > this patch involves thousands lines of code changes in scheduler
> >>> side,
> >>> > > only
> >>> > > > putting that to trunk could lead to trunk/branch-2 totally
> >>> incompatible
> >>> > > for
> >>> > > > resource manager. I think most of the code changes are new to
> >>> scheduler
> >>> > > > instead of modifying existed logic, they're not very tricky to
> me.
> >>> And
> >>> > > when
> >>> > > > 2.8 will be released is not planned yet, at least we have a
> couple
> >>> of
> >>> > > > months to make sure this feature becomes usable and not cause
> >>> existing
> >>> > > > behavior regressions.
> >>> > > >
> >>> > > > Sounds good to you?
> >>> > > >
> >>> > > > Wangda
> >>> > > >
> >>> > > >
> >>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
> >>> kasha@cloudera.com>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > +1 on merging to trunk. It would be nice to have some amount of
> >>> > testing
> >>> > > > > done before the merge, but I understand how merging to trunk
> would
> >>> > > likely
> >>> > > > > speed up the testing efforts.
> >>> > > > >
> >>> > > > > Let us not merge into branch-2 until after we have done a fair
> >>> bit of
> >>> > > > > testing, and are comfortable including it in a release. While
> the
> >>> > code
> >>> > > > > mostly appears to not mess with existing scheduling logic, I am
> >>> > > concerned
> >>> > > > > about regressions to existing scheduling behavior.
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
> >>> > kasha@cloudera.com>
> >>> > > > > wrote:
> >>> > > > >
> >>> > > > > > By the way, for the purposes of merge vote, I believe a
> >>> committer's
> >>> > > > vote
> >>> > > > > > is binding. So, Wangda and Zhihai's votes should be binding.
> :)
> >>> > > > > >
> >>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <
> zxu@cloudera.com>
> >>> > > wrote:
> >>> > > > > >
> >>> > > > > >> +1 (non-binding)
> >>> > > > > >>
> >>> > > > > >> thanks
> >>> > > > > >> Zhihai Xu
> >>> > > > > >>
> >>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
> >>> > xgong@hortonworks.com>
> >>> > > > > >> wrote:
> >>> > > > > >>
> >>> > > > > >> > +1 Binding
> >>> > > > > >> >
> >>> > > > > >> > Thanks
> >>> > > > > >> >
> >>> > > > > >> > Xuan Gong
> >>> > > > > >> >
> >>> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
> >>> jdu@hortonworks.com
> >>> > >
> >>> > > > > wrote:
> >>> > > > > >> > >
> >>> > > > > >> > > +1. (Binding).
> >>> > > > > >> > >
> >>> > > > > >> > > Thanks,
> >>> > > > > >> > >
> >>> > > > > >> > > Junping
> >>> > > > > >> > > ________________________________________
> >>> > > > > >> > > From: Wangda Tan <wh...@gmail.com>
> >>> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> >>> > > > > >> > > To: yarn-dev@hadoop.apache.org
> >>> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize
> into
> >>> > trunk
> >>> > > > > >> > >
> >>> > > > > >> > > +1 (non-binding),
> >>> > > > > >> > >
> >>> > > > > >> > > Thanks Jian starting this thread. This can minimize
> effort
> >>> of
> >>> > > > works
> >>> > > > > >> > across branches.
> >>> > > > > >> > >
> >>> > > > > >> > > To clarify, this feature is end-to-end code  completed,
> we
> >>> > have
> >>> > > > API,
> >>> > > > > >> > rm/nm implementations patches committed, but we haven't
> >>> tested
> >>> > it
> >>> > > > > >> > end-to-end. Filed YARN-4175 to create an example program
> to
> >>> test
> >>> > > it
> >>> > > > > >> > end-to-end.
> >>> > > > > >> > >
> >>> > > > > >> > > Regards,
> >>> > > > > >> > > Wangda
> >>> > > > > >> > >
> >>> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <
> jhe@hortonworks.com
> >>> >
> >>> > > > wrote:
> >>> > > > > >> > >>
> >>> > > > > >> > >> Hi All,
> >>> > > > > >> > >>
> >>> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> >>> > > > > >> > >>
> >>> > > > > >> > >> I would like to call a vote to merge YARN-1197
> container
> >>> > resize
> >>> > > > > into
> >>> > > > > >> > trunk.
> >>> > > > > >> > >>
> >>> > > > > >> > >> Key idea:
> >>> > > > > >> > >> This feature adds the ability for AM to change
> container
> >>> > > resource
> >>> > > > > >> size
> >>> > > > > >> > at runtime.
> >>> > > > > >> > >>
> >>> > > > > >> > >> Details:
> >>> > > > > >> > >> - This feature is tracked at
> >>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> >>> > > > > >> > >> - It’s currently developed at a separate branch:
> >>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> >>> > > > > >> > >> - A uber patch(
> >>> > https://issues.apache.org/jira/browse/YARN-4157
> >>> > > )
> >>> > > > > >> > generated from YARN-1197 to run against trunk  shows all
> unit
> >>> > > tests
> >>> > > > > have
> >>> > > > > >> > passed.
> >>> > > > > >> > >> - This feature now can work end-to-end.
> >>> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be the
> >>> next
> >>> > > > step.
> >>> > > > > >> > >>
> >>> > > > > >> > >> Thanks,
> >>> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
> >>> > > > > >> > >
> >>> > > > > >> > >
> >>> > > > > >> >
> >>> > > > > >> >
> >>> > > > > >>
> >>> > > > > >
> >>> > > > > >
> >>> > > > >
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
> >>
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Chris Douglas <cd...@apache.org>.
Catching up on this thread. I'd missed Wangda's comment. This was
merged without testing it end-to-end? And backported to branch-2?

I share Karthik's reservations. This is a very aggressive merging
strategy. Every other feature merged to branch-2 must also help
stabilize YARN-1197. Going through the patch, the particular changes
are not terribly threatening and the passing unit tests are
encouraging. However, a few days' delay to test the feature is not
asking much, particularly when you have reason to be confident in its
stability.

-0, after the fact. -C

On Thu, Sep 24, 2015 at 11:39 AM, Wangda Tan <wh...@gmail.com> wrote:
> Trunk build seems stable now after the merge. A couple of yarn trunk builds
> also ran fine such as: https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
> .
>
> I've merged this into branch-2 so as to minimize YARN dev effort across
> branches as per the discussion thread of YARN-1197 merge.
>
> Thanks,
> Wangda
>
>
> On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wh...@gmail.com> wrote:
>
>> Thanks Karthik,
>>
>> Doing end-to-end test as well as regression test is the top priority of
>> this feature. Will keep you updated.
>>
>> Regards,
>> Wangda
>>
>> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <ka...@cloudera.com>
>> wrote:
>>
>>> I did skim through the patch before voting :)
>>>
>>> As I mentioned earlier, I understand vast majority of the code is net new,
>>> and the likelihood of breaking changes is low. Still, I would like for us
>>> to be more careful and run some end-to-end and regression tests before
>>> including this in a release. While I won't block the merge to branch-2,
>>> will be a -1 on cutting a release without due testing.
>>>
>>> At the risk of digressing, I also feel branch-2 should be in a releasable
>>> state more often than not. The only way to bring any cadence and
>>> predictability to our releases is by maintaining branch-2 in such a state.
>>>
>>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com> wrote:
>>>
>>> > Hi Karthik,
>>> >
>>> > Let me elaborate more to make you feel better of this change, don't be
>>> > scared by the size of the patch :)
>>> >
>>> > Common RM/Scheduler part:
>>> > - AbstractYarnScheduler new logic only.
>>> > - AppSchedulingInfo new logic only.
>>> > - RMContainer / RMNode state machine, new logic only
>>> > - SchedulerApplicationAttempt / Allocation, refactoring to existing
>>> > reservation logic so increase request reservation can reuse it, and
>>> > refactored to simply updating container token / pull container part so
>>> > increase/decrease/new-allocation can reuse same code.
>>> >
>>> > FairScheduler:
>>> > - Small change since we updated how to pull container updated token. I
>>> > believe it will be a straightforward change for you if you take a closer
>>> > look at it.
>>> >
>>> > CapacityScheduler:
>>> > - Most changes are separate logic or small refactorings, most complex
>>> > allocation logic stays within IncreaseContainerAllocator.java.
>>> >
>>> > Please let me know where you want to get more details of
>>> implementations.
>>> >
>>> > I strongly suggest you to take a glance at the diff, we have already
>>> worked
>>> > on the merge for the past one week, and we've paid a lot of extra time
>>> to
>>> > keep YARN-1197 sync with trunk in the past several months. After this
>>> merge
>>> > finished, a couple of weeks needed to finish end-to-end test and some
>>> other
>>> > extra tests, it won't affect our upcoming branch-2 release.
>>> >
>>> > I would not prefer to merge to trunk only, all people working on RM side
>>> > will be affected, we're very carefully avoid such divergence of RM in
>>> > trunk/branch-2. Since nobody wants to create two different patches for
>>> > every RM changes. And also, after this finished, other efforts can
>>> happen
>>> > in parallel such as YARN-4091.
>>> >
>>> > Let me know if you have any other questions/concerns.
>>> >
>>> > Thanks,
>>> > Wangda
>>> >
>>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com>
>>> > wrote:
>>> >
>>> > > I am sorry, but merging a potentially disruptive change to branch-2
>>> > without
>>> > > end-to-end tests seems too disruptive to me.
>>> > >
>>> > > I do agree with you on the potential inconvenience of having to post
>>> > > different patches for trunk and branch-2, but I would rather have that
>>> > > inconvenience than the risk of merging something that hasn't been
>>> > > thoroughly tested.
>>> > >
>>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com>
>>> wrote:
>>> > >
>>> > > > Hi Karthik,
>>> > > >
>>> > > > Thanks for comments! However, I think only merge to trunk may not
>>> work,
>>> > > > this patch involves thousands lines of code changes in scheduler
>>> side,
>>> > > only
>>> > > > putting that to trunk could lead to trunk/branch-2 totally
>>> incompatible
>>> > > for
>>> > > > resource manager. I think most of the code changes are new to
>>> scheduler
>>> > > > instead of modifying existed logic, they're not very tricky to me.
>>> And
>>> > > when
>>> > > > 2.8 will be released is not planned yet, at least we have a couple
>>> of
>>> > > > months to make sure this feature becomes usable and not cause
>>> existing
>>> > > > behavior regressions.
>>> > > >
>>> > > > Sounds good to you?
>>> > > >
>>> > > > Wangda
>>> > > >
>>> > > >
>>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
>>> kasha@cloudera.com>
>>> > > > wrote:
>>> > > >
>>> > > > > +1 on merging to trunk. It would be nice to have some amount of
>>> > testing
>>> > > > > done before the merge, but I understand how merging to trunk would
>>> > > likely
>>> > > > > speed up the testing efforts.
>>> > > > >
>>> > > > > Let us not merge into branch-2 until after we have done a fair
>>> bit of
>>> > > > > testing, and are comfortable including it in a release. While the
>>> > code
>>> > > > > mostly appears to not mess with existing scheduling logic, I am
>>> > > concerned
>>> > > > > about regressions to existing scheduling behavior.
>>> > > > >
>>> > > > >
>>> > > > >
>>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
>>> > kasha@cloudera.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > By the way, for the purposes of merge vote, I believe a
>>> committer's
>>> > > > vote
>>> > > > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
>>> > > > > >
>>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com>
>>> > > wrote:
>>> > > > > >
>>> > > > > >> +1 (non-binding)
>>> > > > > >>
>>> > > > > >> thanks
>>> > > > > >> Zhihai Xu
>>> > > > > >>
>>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
>>> > xgong@hortonworks.com>
>>> > > > > >> wrote:
>>> > > > > >>
>>> > > > > >> > +1 Binding
>>> > > > > >> >
>>> > > > > >> > Thanks
>>> > > > > >> >
>>> > > > > >> > Xuan Gong
>>> > > > > >> >
>>> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
>>> jdu@hortonworks.com
>>> > >
>>> > > > > wrote:
>>> > > > > >> > >
>>> > > > > >> > > +1. (Binding).
>>> > > > > >> > >
>>> > > > > >> > > Thanks,
>>> > > > > >> > >
>>> > > > > >> > > Junping
>>> > > > > >> > > ________________________________________
>>> > > > > >> > > From: Wangda Tan <wh...@gmail.com>
>>> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
>>> > > > > >> > > To: yarn-dev@hadoop.apache.org
>>> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into
>>> > trunk
>>> > > > > >> > >
>>> > > > > >> > > +1 (non-binding),
>>> > > > > >> > >
>>> > > > > >> > > Thanks Jian starting this thread. This can minimize effort
>>> of
>>> > > > works
>>> > > > > >> > across branches.
>>> > > > > >> > >
>>> > > > > >> > > To clarify, this feature is end-to-end code  completed, we
>>> > have
>>> > > > API,
>>> > > > > >> > rm/nm implementations patches committed, but we haven't
>>> tested
>>> > it
>>> > > > > >> > end-to-end. Filed YARN-4175 to create an example program to
>>> test
>>> > > it
>>> > > > > >> > end-to-end.
>>> > > > > >> > >
>>> > > > > >> > > Regards,
>>> > > > > >> > > Wangda
>>> > > > > >> > >
>>> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jhe@hortonworks.com
>>> >
>>> > > > wrote:
>>> > > > > >> > >>
>>> > > > > >> > >> Hi All,
>>> > > > > >> > >>
>>> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
>>> > > > > >> > >>
>>> > > > > >> > >> I would like to call a vote to merge YARN-1197 container
>>> > resize
>>> > > > > into
>>> > > > > >> > trunk.
>>> > > > > >> > >>
>>> > > > > >> > >> Key idea:
>>> > > > > >> > >> This feature adds the ability for AM to change container
>>> > > resource
>>> > > > > >> size
>>> > > > > >> > at runtime.
>>> > > > > >> > >>
>>> > > > > >> > >> Details:
>>> > > > > >> > >> - This feature is tracked at
>>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
>>> > > > > >> > >> - It’s currently developed at a separate branch:
>>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
>>> > > > > >> > >> - A uber patch(
>>> > https://issues.apache.org/jira/browse/YARN-4157
>>> > > )
>>> > > > > >> > generated from YARN-1197 to run against trunk  shows all unit
>>> > > tests
>>> > > > > have
>>> > > > > >> > passed.
>>> > > > > >> > >> - This feature now can work end-to-end.
>>> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be the
>>> next
>>> > > > step.
>>> > > > > >> > >>
>>> > > > > >> > >> Thanks,
>>> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
>>> > > > > >> > >
>>> > > > > >> > >
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >>
>>> > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
Trunk build seems stable now after the merge. A couple of yarn trunk builds
also ran fine such as: https://builds.apache.org/job/Hadoop-Yarn-trunk/1169/
.

I've merged this into branch-2 so as to minimize YARN dev effort across
branches as per the discussion thread of YARN-1197 merge.

Thanks,
Wangda


On Wed, Sep 23, 2015 at 3:15 PM, Wangda Tan <wh...@gmail.com> wrote:

> Thanks Karthik,
>
> Doing end-to-end test as well as regression test is the top priority of
> this feature. Will keep you updated.
>
> Regards,
> Wangda
>
> On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
>> I did skim through the patch before voting :)
>>
>> As I mentioned earlier, I understand vast majority of the code is net new,
>> and the likelihood of breaking changes is low. Still, I would like for us
>> to be more careful and run some end-to-end and regression tests before
>> including this in a release. While I won't block the merge to branch-2,
>> will be a -1 on cutting a release without due testing.
>>
>> At the risk of digressing, I also feel branch-2 should be in a releasable
>> state more often than not. The only way to bring any cadence and
>> predictability to our releases is by maintaining branch-2 in such a state.
>>
>> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com> wrote:
>>
>> > Hi Karthik,
>> >
>> > Let me elaborate more to make you feel better of this change, don't be
>> > scared by the size of the patch :)
>> >
>> > Common RM/Scheduler part:
>> > - AbstractYarnScheduler new logic only.
>> > - AppSchedulingInfo new logic only.
>> > - RMContainer / RMNode state machine, new logic only
>> > - SchedulerApplicationAttempt / Allocation, refactoring to existing
>> > reservation logic so increase request reservation can reuse it, and
>> > refactored to simply updating container token / pull container part so
>> > increase/decrease/new-allocation can reuse same code.
>> >
>> > FairScheduler:
>> > - Small change since we updated how to pull container updated token. I
>> > believe it will be a straightforward change for you if you take a closer
>> > look at it.
>> >
>> > CapacityScheduler:
>> > - Most changes are separate logic or small refactorings, most complex
>> > allocation logic stays within IncreaseContainerAllocator.java.
>> >
>> > Please let me know where you want to get more details of
>> implementations.
>> >
>> > I strongly suggest you to take a glance at the diff, we have already
>> worked
>> > on the merge for the past one week, and we've paid a lot of extra time
>> to
>> > keep YARN-1197 sync with trunk in the past several months. After this
>> merge
>> > finished, a couple of weeks needed to finish end-to-end test and some
>> other
>> > extra tests, it won't affect our upcoming branch-2 release.
>> >
>> > I would not prefer to merge to trunk only, all people working on RM side
>> > will be affected, we're very carefully avoid such divergence of RM in
>> > trunk/branch-2. Since nobody wants to create two different patches for
>> > every RM changes. And also, after this finished, other efforts can
>> happen
>> > in parallel such as YARN-4091.
>> >
>> > Let me know if you have any other questions/concerns.
>> >
>> > Thanks,
>> > Wangda
>> >
>> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com>
>> > wrote:
>> >
>> > > I am sorry, but merging a potentially disruptive change to branch-2
>> > without
>> > > end-to-end tests seems too disruptive to me.
>> > >
>> > > I do agree with you on the potential inconvenience of having to post
>> > > different patches for trunk and branch-2, but I would rather have that
>> > > inconvenience than the risk of merging something that hasn't been
>> > > thoroughly tested.
>> > >
>> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com>
>> wrote:
>> > >
>> > > > Hi Karthik,
>> > > >
>> > > > Thanks for comments! However, I think only merge to trunk may not
>> work,
>> > > > this patch involves thousands lines of code changes in scheduler
>> side,
>> > > only
>> > > > putting that to trunk could lead to trunk/branch-2 totally
>> incompatible
>> > > for
>> > > > resource manager. I think most of the code changes are new to
>> scheduler
>> > > > instead of modifying existed logic, they're not very tricky to me.
>> And
>> > > when
>> > > > 2.8 will be released is not planned yet, at least we have a couple
>> of
>> > > > months to make sure this feature becomes usable and not cause
>> existing
>> > > > behavior regressions.
>> > > >
>> > > > Sounds good to you?
>> > > >
>> > > > Wangda
>> > > >
>> > > >
>> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
>> kasha@cloudera.com>
>> > > > wrote:
>> > > >
>> > > > > +1 on merging to trunk. It would be nice to have some amount of
>> > testing
>> > > > > done before the merge, but I understand how merging to trunk would
>> > > likely
>> > > > > speed up the testing efforts.
>> > > > >
>> > > > > Let us not merge into branch-2 until after we have done a fair
>> bit of
>> > > > > testing, and are comfortable including it in a release. While the
>> > code
>> > > > > mostly appears to not mess with existing scheduling logic, I am
>> > > concerned
>> > > > > about regressions to existing scheduling behavior.
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
>> > kasha@cloudera.com>
>> > > > > wrote:
>> > > > >
>> > > > > > By the way, for the purposes of merge vote, I believe a
>> committer's
>> > > > vote
>> > > > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
>> > > > > >
>> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com>
>> > > wrote:
>> > > > > >
>> > > > > >> +1 (non-binding)
>> > > > > >>
>> > > > > >> thanks
>> > > > > >> Zhihai Xu
>> > > > > >>
>> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
>> > xgong@hortonworks.com>
>> > > > > >> wrote:
>> > > > > >>
>> > > > > >> > +1 Binding
>> > > > > >> >
>> > > > > >> > Thanks
>> > > > > >> >
>> > > > > >> > Xuan Gong
>> > > > > >> >
>> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
>> jdu@hortonworks.com
>> > >
>> > > > > wrote:
>> > > > > >> > >
>> > > > > >> > > +1. (Binding).
>> > > > > >> > >
>> > > > > >> > > Thanks,
>> > > > > >> > >
>> > > > > >> > > Junping
>> > > > > >> > > ________________________________________
>> > > > > >> > > From: Wangda Tan <wh...@gmail.com>
>> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
>> > > > > >> > > To: yarn-dev@hadoop.apache.org
>> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into
>> > trunk
>> > > > > >> > >
>> > > > > >> > > +1 (non-binding),
>> > > > > >> > >
>> > > > > >> > > Thanks Jian starting this thread. This can minimize effort
>> of
>> > > > works
>> > > > > >> > across branches.
>> > > > > >> > >
>> > > > > >> > > To clarify, this feature is end-to-end code  completed, we
>> > have
>> > > > API,
>> > > > > >> > rm/nm implementations patches committed, but we haven't
>> tested
>> > it
>> > > > > >> > end-to-end. Filed YARN-4175 to create an example program to
>> test
>> > > it
>> > > > > >> > end-to-end.
>> > > > > >> > >
>> > > > > >> > > Regards,
>> > > > > >> > > Wangda
>> > > > > >> > >
>> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jhe@hortonworks.com
>> >
>> > > > wrote:
>> > > > > >> > >>
>> > > > > >> > >> Hi All,
>> > > > > >> > >>
>> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
>> > > > > >> > >>
>> > > > > >> > >> I would like to call a vote to merge YARN-1197 container
>> > resize
>> > > > > into
>> > > > > >> > trunk.
>> > > > > >> > >>
>> > > > > >> > >> Key idea:
>> > > > > >> > >> This feature adds the ability for AM to change container
>> > > resource
>> > > > > >> size
>> > > > > >> > at runtime.
>> > > > > >> > >>
>> > > > > >> > >> Details:
>> > > > > >> > >> - This feature is tracked at
>> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
>> > > > > >> > >> - It’s currently developed at a separate branch:
>> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
>> > > > > >> > >> - A uber patch(
>> > https://issues.apache.org/jira/browse/YARN-4157
>> > > )
>> > > > > >> > generated from YARN-1197 to run against trunk  shows all unit
>> > > tests
>> > > > > have
>> > > > > >> > passed.
>> > > > > >> > >> - This feature now can work end-to-end.
>> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be the
>> next
>> > > > step.
>> > > > > >> > >>
>> > > > > >> > >> Thanks,
>> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
>> > > > > >> > >
>> > > > > >> > >
>> > > > > >> >
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
Thanks Karthik,

Doing end-to-end test as well as regression test is the top priority of
this feature. Will keep you updated.

Regards,
Wangda

On Wed, Sep 23, 2015 at 3:01 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> I did skim through the patch before voting :)
>
> As I mentioned earlier, I understand vast majority of the code is net new,
> and the likelihood of breaking changes is low. Still, I would like for us
> to be more careful and run some end-to-end and regression tests before
> including this in a release. While I won't block the merge to branch-2,
> will be a -1 on cutting a release without due testing.
>
> At the risk of digressing, I also feel branch-2 should be in a releasable
> state more often than not. The only way to bring any cadence and
> predictability to our releases is by maintaining branch-2 in such a state.
>
> On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com> wrote:
>
> > Hi Karthik,
> >
> > Let me elaborate more to make you feel better of this change, don't be
> > scared by the size of the patch :)
> >
> > Common RM/Scheduler part:
> > - AbstractYarnScheduler new logic only.
> > - AppSchedulingInfo new logic only.
> > - RMContainer / RMNode state machine, new logic only
> > - SchedulerApplicationAttempt / Allocation, refactoring to existing
> > reservation logic so increase request reservation can reuse it, and
> > refactored to simply updating container token / pull container part so
> > increase/decrease/new-allocation can reuse same code.
> >
> > FairScheduler:
> > - Small change since we updated how to pull container updated token. I
> > believe it will be a straightforward change for you if you take a closer
> > look at it.
> >
> > CapacityScheduler:
> > - Most changes are separate logic or small refactorings, most complex
> > allocation logic stays within IncreaseContainerAllocator.java.
> >
> > Please let me know where you want to get more details of implementations.
> >
> > I strongly suggest you to take a glance at the diff, we have already
> worked
> > on the merge for the past one week, and we've paid a lot of extra time to
> > keep YARN-1197 sync with trunk in the past several months. After this
> merge
> > finished, a couple of weeks needed to finish end-to-end test and some
> other
> > extra tests, it won't affect our upcoming branch-2 release.
> >
> > I would not prefer to merge to trunk only, all people working on RM side
> > will be affected, we're very carefully avoid such divergence of RM in
> > trunk/branch-2. Since nobody wants to create two different patches for
> > every RM changes. And also, after this finished, other efforts can happen
> > in parallel such as YARN-4091.
> >
> > Let me know if you have any other questions/concerns.
> >
> > Thanks,
> > Wangda
> >
> > On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com>
> > wrote:
> >
> > > I am sorry, but merging a potentially disruptive change to branch-2
> > without
> > > end-to-end tests seems too disruptive to me.
> > >
> > > I do agree with you on the potential inconvenience of having to post
> > > different patches for trunk and branch-2, but I would rather have that
> > > inconvenience than the risk of merging something that hasn't been
> > > thoroughly tested.
> > >
> > > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com>
> wrote:
> > >
> > > > Hi Karthik,
> > > >
> > > > Thanks for comments! However, I think only merge to trunk may not
> work,
> > > > this patch involves thousands lines of code changes in scheduler
> side,
> > > only
> > > > putting that to trunk could lead to trunk/branch-2 totally
> incompatible
> > > for
> > > > resource manager. I think most of the code changes are new to
> scheduler
> > > > instead of modifying existed logic, they're not very tricky to me.
> And
> > > when
> > > > 2.8 will be released is not planned yet, at least we have a couple of
> > > > months to make sure this feature becomes usable and not cause
> existing
> > > > behavior regressions.
> > > >
> > > > Sounds good to you?
> > > >
> > > > Wangda
> > > >
> > > >
> > > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <
> kasha@cloudera.com>
> > > > wrote:
> > > >
> > > > > +1 on merging to trunk. It would be nice to have some amount of
> > testing
> > > > > done before the merge, but I understand how merging to trunk would
> > > likely
> > > > > speed up the testing efforts.
> > > > >
> > > > > Let us not merge into branch-2 until after we have done a fair bit
> of
> > > > > testing, and are comfortable including it in a release. While the
> > code
> > > > > mostly appears to not mess with existing scheduling logic, I am
> > > concerned
> > > > > about regressions to existing scheduling behavior.
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
> > kasha@cloudera.com>
> > > > > wrote:
> > > > >
> > > > > > By the way, for the purposes of merge vote, I believe a
> committer's
> > > > vote
> > > > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
> > > > > >
> > > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com>
> > > wrote:
> > > > > >
> > > > > >> +1 (non-binding)
> > > > > >>
> > > > > >> thanks
> > > > > >> Zhihai Xu
> > > > > >>
> > > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
> > xgong@hortonworks.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > +1 Binding
> > > > > >> >
> > > > > >> > Thanks
> > > > > >> >
> > > > > >> > Xuan Gong
> > > > > >> >
> > > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <
> jdu@hortonworks.com
> > >
> > > > > wrote:
> > > > > >> > >
> > > > > >> > > +1. (Binding).
> > > > > >> > >
> > > > > >> > > Thanks,
> > > > > >> > >
> > > > > >> > > Junping
> > > > > >> > > ________________________________________
> > > > > >> > > From: Wangda Tan <wh...@gmail.com>
> > > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > > > > >> > > To: yarn-dev@hadoop.apache.org
> > > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into
> > trunk
> > > > > >> > >
> > > > > >> > > +1 (non-binding),
> > > > > >> > >
> > > > > >> > > Thanks Jian starting this thread. This can minimize effort
> of
> > > > works
> > > > > >> > across branches.
> > > > > >> > >
> > > > > >> > > To clarify, this feature is end-to-end code  completed, we
> > have
> > > > API,
> > > > > >> > rm/nm implementations patches committed, but we haven't tested
> > it
> > > > > >> > end-to-end. Filed YARN-4175 to create an example program to
> test
> > > it
> > > > > >> > end-to-end.
> > > > > >> > >
> > > > > >> > > Regards,
> > > > > >> > > Wangda
> > > > > >> > >
> > > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com>
> > > > wrote:
> > > > > >> > >>
> > > > > >> > >> Hi All,
> > > > > >> > >>
> > > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> > > > > >> > >>
> > > > > >> > >> I would like to call a vote to merge YARN-1197 container
> > resize
> > > > > into
> > > > > >> > trunk.
> > > > > >> > >>
> > > > > >> > >> Key idea:
> > > > > >> > >> This feature adds the ability for AM to change container
> > > resource
> > > > > >> size
> > > > > >> > at runtime.
> > > > > >> > >>
> > > > > >> > >> Details:
> > > > > >> > >> - This feature is tracked at
> > > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > > > > >> > >> - It’s currently developed at a separate branch:
> > > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > > > > >> > >> - A uber patch(
> > https://issues.apache.org/jira/browse/YARN-4157
> > > )
> > > > > >> > generated from YARN-1197 to run against trunk  shows all unit
> > > tests
> > > > > have
> > > > > >> > passed.
> > > > > >> > >> - This feature now can work end-to-end.
> > > > > >> > >> -  All the unresolved jiras under YARN-1197 will be the
> next
> > > > step.
> > > > > >> > >>
> > > > > >> > >> Thanks,
> > > > > >> > >> Wangda Tan & Meng Ding & Jian He
> > > > > >> > >
> > > > > >> > >
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
I did skim through the patch before voting :)

As I mentioned earlier, I understand vast majority of the code is net new,
and the likelihood of breaking changes is low. Still, I would like for us
to be more careful and run some end-to-end and regression tests before
including this in a release. While I won't block the merge to branch-2,
will be a -1 on cutting a release without due testing.

At the risk of digressing, I also feel branch-2 should be in a releasable
state more often than not. The only way to bring any cadence and
predictability to our releases is by maintaining branch-2 in such a state.

On Tue, Sep 22, 2015 at 8:14 PM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Karthik,
>
> Let me elaborate more to make you feel better of this change, don't be
> scared by the size of the patch :)
>
> Common RM/Scheduler part:
> - AbstractYarnScheduler new logic only.
> - AppSchedulingInfo new logic only.
> - RMContainer / RMNode state machine, new logic only
> - SchedulerApplicationAttempt / Allocation, refactoring to existing
> reservation logic so increase request reservation can reuse it, and
> refactored to simply updating container token / pull container part so
> increase/decrease/new-allocation can reuse same code.
>
> FairScheduler:
> - Small change since we updated how to pull container updated token. I
> believe it will be a straightforward change for you if you take a closer
> look at it.
>
> CapacityScheduler:
> - Most changes are separate logic or small refactorings, most complex
> allocation logic stays within IncreaseContainerAllocator.java.
>
> Please let me know where you want to get more details of implementations.
>
> I strongly suggest you to take a glance at the diff, we have already worked
> on the merge for the past one week, and we've paid a lot of extra time to
> keep YARN-1197 sync with trunk in the past several months. After this merge
> finished, a couple of weeks needed to finish end-to-end test and some other
> extra tests, it won't affect our upcoming branch-2 release.
>
> I would not prefer to merge to trunk only, all people working on RM side
> will be affected, we're very carefully avoid such divergence of RM in
> trunk/branch-2. Since nobody wants to create two different patches for
> every RM changes. And also, after this finished, other efforts can happen
> in parallel such as YARN-4091.
>
> Let me know if you have any other questions/concerns.
>
> Thanks,
> Wangda
>
> On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
> > I am sorry, but merging a potentially disruptive change to branch-2
> without
> > end-to-end tests seems too disruptive to me.
> >
> > I do agree with you on the potential inconvenience of having to post
> > different patches for trunk and branch-2, but I would rather have that
> > inconvenience than the risk of merging something that hasn't been
> > thoroughly tested.
> >
> > On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com> wrote:
> >
> > > Hi Karthik,
> > >
> > > Thanks for comments! However, I think only merge to trunk may not work,
> > > this patch involves thousands lines of code changes in scheduler side,
> > only
> > > putting that to trunk could lead to trunk/branch-2 totally incompatible
> > for
> > > resource manager. I think most of the code changes are new to scheduler
> > > instead of modifying existed logic, they're not very tricky to me. And
> > when
> > > 2.8 will be released is not planned yet, at least we have a couple of
> > > months to make sure this feature becomes usable and not cause existing
> > > behavior regressions.
> > >
> > > Sounds good to you?
> > >
> > > Wangda
> > >
> > >
> > > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <ka...@cloudera.com>
> > > wrote:
> > >
> > > > +1 on merging to trunk. It would be nice to have some amount of
> testing
> > > > done before the merge, but I understand how merging to trunk would
> > likely
> > > > speed up the testing efforts.
> > > >
> > > > Let us not merge into branch-2 until after we have done a fair bit of
> > > > testing, and are comfortable including it in a release. While the
> code
> > > > mostly appears to not mess with existing scheduling logic, I am
> > concerned
> > > > about regressions to existing scheduling behavior.
> > > >
> > > >
> > > >
> > > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <
> kasha@cloudera.com>
> > > > wrote:
> > > >
> > > > > By the way, for the purposes of merge vote, I believe a committer's
> > > vote
> > > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
> > > > >
> > > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com>
> > wrote:
> > > > >
> > > > >> +1 (non-binding)
> > > > >>
> > > > >> thanks
> > > > >> Zhihai Xu
> > > > >>
> > > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <
> xgong@hortonworks.com>
> > > > >> wrote:
> > > > >>
> > > > >> > +1 Binding
> > > > >> >
> > > > >> > Thanks
> > > > >> >
> > > > >> > Xuan Gong
> > > > >> >
> > > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jdu@hortonworks.com
> >
> > > > wrote:
> > > > >> > >
> > > > >> > > +1. (Binding).
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > >
> > > > >> > > Junping
> > > > >> > > ________________________________________
> > > > >> > > From: Wangda Tan <wh...@gmail.com>
> > > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > > > >> > > To: yarn-dev@hadoop.apache.org
> > > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into
> trunk
> > > > >> > >
> > > > >> > > +1 (non-binding),
> > > > >> > >
> > > > >> > > Thanks Jian starting this thread. This can minimize effort of
> > > works
> > > > >> > across branches.
> > > > >> > >
> > > > >> > > To clarify, this feature is end-to-end code  completed, we
> have
> > > API,
> > > > >> > rm/nm implementations patches committed, but we haven't tested
> it
> > > > >> > end-to-end. Filed YARN-4175 to create an example program to test
> > it
> > > > >> > end-to-end.
> > > > >> > >
> > > > >> > > Regards,
> > > > >> > > Wangda
> > > > >> > >
> > > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com>
> > > wrote:
> > > > >> > >>
> > > > >> > >> Hi All,
> > > > >> > >>
> > > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> > > > >> > >>
> > > > >> > >> I would like to call a vote to merge YARN-1197 container
> resize
> > > > into
> > > > >> > trunk.
> > > > >> > >>
> > > > >> > >> Key idea:
> > > > >> > >> This feature adds the ability for AM to change container
> > resource
> > > > >> size
> > > > >> > at runtime.
> > > > >> > >>
> > > > >> > >> Details:
> > > > >> > >> - This feature is tracked at
> > > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > > > >> > >> - It’s currently developed at a separate branch:
> > > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > > > >> > >> - A uber patch(
> https://issues.apache.org/jira/browse/YARN-4157
> > )
> > > > >> > generated from YARN-1197 to run against trunk  shows all unit
> > tests
> > > > have
> > > > >> > passed.
> > > > >> > >> - This feature now can work end-to-end.
> > > > >> > >> -  All the unresolved jiras under YARN-1197 will be the next
> > > step.
> > > > >> > >>
> > > > >> > >> Thanks,
> > > > >> > >> Wangda Tan & Meng Ding & Jian He
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
Hi Karthik,

Let me elaborate more to make you feel better of this change, don't be
scared by the size of the patch :)

Common RM/Scheduler part:
- AbstractYarnScheduler new logic only.
- AppSchedulingInfo new logic only.
- RMContainer / RMNode state machine, new logic only
- SchedulerApplicationAttempt / Allocation, refactoring to existing
reservation logic so increase request reservation can reuse it, and
refactored to simply updating container token / pull container part so
increase/decrease/new-allocation can reuse same code.

FairScheduler:
- Small change since we updated how to pull container updated token. I
believe it will be a straightforward change for you if you take a closer
look at it.

CapacityScheduler:
- Most changes are separate logic or small refactorings, most complex
allocation logic stays within IncreaseContainerAllocator.java.

Please let me know where you want to get more details of implementations.

I strongly suggest you to take a glance at the diff, we have already worked
on the merge for the past one week, and we've paid a lot of extra time to
keep YARN-1197 sync with trunk in the past several months. After this merge
finished, a couple of weeks needed to finish end-to-end test and some other
extra tests, it won't affect our upcoming branch-2 release.

I would not prefer to merge to trunk only, all people working on RM side
will be affected, we're very carefully avoid such divergence of RM in
trunk/branch-2. Since nobody wants to create two different patches for
every RM changes. And also, after this finished, other efforts can happen
in parallel such as YARN-4091.

Let me know if you have any other questions/concerns.

Thanks,
Wangda

On Tue, Sep 22, 2015 at 7:26 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> I am sorry, but merging a potentially disruptive change to branch-2 without
> end-to-end tests seems too disruptive to me.
>
> I do agree with you on the potential inconvenience of having to post
> different patches for trunk and branch-2, but I would rather have that
> inconvenience than the risk of merging something that hasn't been
> thoroughly tested.
>
> On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com> wrote:
>
> > Hi Karthik,
> >
> > Thanks for comments! However, I think only merge to trunk may not work,
> > this patch involves thousands lines of code changes in scheduler side,
> only
> > putting that to trunk could lead to trunk/branch-2 totally incompatible
> for
> > resource manager. I think most of the code changes are new to scheduler
> > instead of modifying existed logic, they're not very tricky to me. And
> when
> > 2.8 will be released is not planned yet, at least we have a couple of
> > months to make sure this feature becomes usable and not cause existing
> > behavior regressions.
> >
> > Sounds good to you?
> >
> > Wangda
> >
> >
> > On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <ka...@cloudera.com>
> > wrote:
> >
> > > +1 on merging to trunk. It would be nice to have some amount of testing
> > > done before the merge, but I understand how merging to trunk would
> likely
> > > speed up the testing efforts.
> > >
> > > Let us not merge into branch-2 until after we have done a fair bit of
> > > testing, and are comfortable including it in a release. While the code
> > > mostly appears to not mess with existing scheduling logic, I am
> concerned
> > > about regressions to existing scheduling behavior.
> > >
> > >
> > >
> > > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <ka...@cloudera.com>
> > > wrote:
> > >
> > > > By the way, for the purposes of merge vote, I believe a committer's
> > vote
> > > > is binding. So, Wangda and Zhihai's votes should be binding. :)
> > > >
> > > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com>
> wrote:
> > > >
> > > >> +1 (non-binding)
> > > >>
> > > >> thanks
> > > >> Zhihai Xu
> > > >>
> > > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xg...@hortonworks.com>
> > > >> wrote:
> > > >>
> > > >> > +1 Binding
> > > >> >
> > > >> > Thanks
> > > >> >
> > > >> > Xuan Gong
> > > >> >
> > > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com>
> > > wrote:
> > > >> > >
> > > >> > > +1. (Binding).
> > > >> > >
> > > >> > > Thanks,
> > > >> > >
> > > >> > > Junping
> > > >> > > ________________________________________
> > > >> > > From: Wangda Tan <wh...@gmail.com>
> > > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > > >> > > To: yarn-dev@hadoop.apache.org
> > > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
> > > >> > >
> > > >> > > +1 (non-binding),
> > > >> > >
> > > >> > > Thanks Jian starting this thread. This can minimize effort of
> > works
> > > >> > across branches.
> > > >> > >
> > > >> > > To clarify, this feature is end-to-end code  completed, we have
> > API,
> > > >> > rm/nm implementations patches committed, but we haven't tested it
> > > >> > end-to-end. Filed YARN-4175 to create an example program to test
> it
> > > >> > end-to-end.
> > > >> > >
> > > >> > > Regards,
> > > >> > > Wangda
> > > >> > >
> > > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com>
> > wrote:
> > > >> > >>
> > > >> > >> Hi All,
> > > >> > >>
> > > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> > > >> > >>
> > > >> > >> I would like to call a vote to merge YARN-1197 container resize
> > > into
> > > >> > trunk.
> > > >> > >>
> > > >> > >> Key idea:
> > > >> > >> This feature adds the ability for AM to change container
> resource
> > > >> size
> > > >> > at runtime.
> > > >> > >>
> > > >> > >> Details:
> > > >> > >> - This feature is tracked at
> > > >> > https://issues.apache.org/jira/browse/YARN-1197
> > > >> > >> - It’s currently developed at a separate branch:
> > > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > > >> > >> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157
> )
> > > >> > generated from YARN-1197 to run against trunk  shows all unit
> tests
> > > have
> > > >> > passed.
> > > >> > >> - This feature now can work end-to-end.
> > > >> > >> -  All the unresolved jiras under YARN-1197 will be the next
> > step.
> > > >> > >>
> > > >> > >> Thanks,
> > > >> > >> Wangda Tan & Meng Ding & Jian He
> > > >> > >
> > > >> > >
> > > >> >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
I am sorry, but merging a potentially disruptive change to branch-2 without
end-to-end tests seems too disruptive to me.

I do agree with you on the potential inconvenience of having to post
different patches for trunk and branch-2, but I would rather have that
inconvenience than the risk of merging something that hasn't been
thoroughly tested.

On Tue, Sep 22, 2015 at 6:18 PM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Karthik,
>
> Thanks for comments! However, I think only merge to trunk may not work,
> this patch involves thousands lines of code changes in scheduler side, only
> putting that to trunk could lead to trunk/branch-2 totally incompatible for
> resource manager. I think most of the code changes are new to scheduler
> instead of modifying existed logic, they're not very tricky to me. And when
> 2.8 will be released is not planned yet, at least we have a couple of
> months to make sure this feature becomes usable and not cause existing
> behavior regressions.
>
> Sounds good to you?
>
> Wangda
>
>
> On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
> > +1 on merging to trunk. It would be nice to have some amount of testing
> > done before the merge, but I understand how merging to trunk would likely
> > speed up the testing efforts.
> >
> > Let us not merge into branch-2 until after we have done a fair bit of
> > testing, and are comfortable including it in a release. While the code
> > mostly appears to not mess with existing scheduling logic, I am concerned
> > about regressions to existing scheduling behavior.
> >
> >
> >
> > On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <ka...@cloudera.com>
> > wrote:
> >
> > > By the way, for the purposes of merge vote, I believe a committer's
> vote
> > > is binding. So, Wangda and Zhihai's votes should be binding. :)
> > >
> > > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com> wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> thanks
> > >> Zhihai Xu
> > >>
> > >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xg...@hortonworks.com>
> > >> wrote:
> > >>
> > >> > +1 Binding
> > >> >
> > >> > Thanks
> > >> >
> > >> > Xuan Gong
> > >> >
> > >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com>
> > wrote:
> > >> > >
> > >> > > +1. (Binding).
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Junping
> > >> > > ________________________________________
> > >> > > From: Wangda Tan <wh...@gmail.com>
> > >> > > Sent: Thursday, September 17, 2015 3:19 AM
> > >> > > To: yarn-dev@hadoop.apache.org
> > >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
> > >> > >
> > >> > > +1 (non-binding),
> > >> > >
> > >> > > Thanks Jian starting this thread. This can minimize effort of
> works
> > >> > across branches.
> > >> > >
> > >> > > To clarify, this feature is end-to-end code  completed, we have
> API,
> > >> > rm/nm implementations patches committed, but we haven't tested it
> > >> > end-to-end. Filed YARN-4175 to create an example program to test it
> > >> > end-to-end.
> > >> > >
> > >> > > Regards,
> > >> > > Wangda
> > >> > >
> > >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com>
> wrote:
> > >> > >>
> > >> > >> Hi All,
> > >> > >>
> > >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> > >> > >>
> > >> > >> I would like to call a vote to merge YARN-1197 container resize
> > into
> > >> > trunk.
> > >> > >>
> > >> > >> Key idea:
> > >> > >> This feature adds the ability for AM to change container resource
> > >> size
> > >> > at runtime.
> > >> > >>
> > >> > >> Details:
> > >> > >> - This feature is tracked at
> > >> > https://issues.apache.org/jira/browse/YARN-1197
> > >> > >> - It’s currently developed at a separate branch:
> > >> > https://github.com/apache/hadoop/commits/YARN-1197
> > >> > >> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157)
> > >> > generated from YARN-1197 to run against trunk  shows all unit tests
> > have
> > >> > passed.
> > >> > >> - This feature now can work end-to-end.
> > >> > >> -  All the unresolved jiras under YARN-1197 will be the next
> step.
> > >> > >>
> > >> > >> Thanks,
> > >> > >> Wangda Tan & Meng Ding & Jian He
> > >> > >
> > >> > >
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
Hi Karthik,

Thanks for comments! However, I think only merge to trunk may not work,
this patch involves thousands lines of code changes in scheduler side, only
putting that to trunk could lead to trunk/branch-2 totally incompatible for
resource manager. I think most of the code changes are new to scheduler
instead of modifying existed logic, they're not very tricky to me. And when
2.8 will be released is not planned yet, at least we have a couple of
months to make sure this feature becomes usable and not cause existing
behavior regressions.

Sounds good to you?

Wangda


On Tue, Sep 22, 2015 at 3:48 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> +1 on merging to trunk. It would be nice to have some amount of testing
> done before the merge, but I understand how merging to trunk would likely
> speed up the testing efforts.
>
> Let us not merge into branch-2 until after we have done a fair bit of
> testing, and are comfortable including it in a release. While the code
> mostly appears to not mess with existing scheduling logic, I am concerned
> about regressions to existing scheduling behavior.
>
>
>
> On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
>
> > By the way, for the purposes of merge vote, I believe a committer's vote
> > is binding. So, Wangda and Zhihai's votes should be binding. :)
> >
> > On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com> wrote:
> >
> >> +1 (non-binding)
> >>
> >> thanks
> >> Zhihai Xu
> >>
> >> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xg...@hortonworks.com>
> >> wrote:
> >>
> >> > +1 Binding
> >> >
> >> > Thanks
> >> >
> >> > Xuan Gong
> >> >
> >> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com>
> wrote:
> >> > >
> >> > > +1. (Binding).
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Junping
> >> > > ________________________________________
> >> > > From: Wangda Tan <wh...@gmail.com>
> >> > > Sent: Thursday, September 17, 2015 3:19 AM
> >> > > To: yarn-dev@hadoop.apache.org
> >> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
> >> > >
> >> > > +1 (non-binding),
> >> > >
> >> > > Thanks Jian starting this thread. This can minimize effort of works
> >> > across branches.
> >> > >
> >> > > To clarify, this feature is end-to-end code  completed, we have API,
> >> > rm/nm implementations patches committed, but we haven't tested it
> >> > end-to-end. Filed YARN-4175 to create an example program to test it
> >> > end-to-end.
> >> > >
> >> > > Regards,
> >> > > Wangda
> >> > >
> >> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
> >> > >>
> >> > >> Hi All,
> >> > >>
> >> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> >> > >>
> >> > >> I would like to call a vote to merge YARN-1197 container resize
> into
> >> > trunk.
> >> > >>
> >> > >> Key idea:
> >> > >> This feature adds the ability for AM to change container resource
> >> size
> >> > at runtime.
> >> > >>
> >> > >> Details:
> >> > >> - This feature is tracked at
> >> > https://issues.apache.org/jira/browse/YARN-1197
> >> > >> - It’s currently developed at a separate branch:
> >> > https://github.com/apache/hadoop/commits/YARN-1197
> >> > >> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157)
> >> > generated from YARN-1197 to run against trunk  shows all unit tests
> have
> >> > passed.
> >> > >> - This feature now can work end-to-end.
> >> > >> -  All the unresolved jiras under YARN-1197 will be the next step.
> >> > >>
> >> > >> Thanks,
> >> > >> Wangda Tan & Meng Ding & Jian He
> >> > >
> >> > >
> >> >
> >> >
> >>
> >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
+1 on merging to trunk. It would be nice to have some amount of testing
done before the merge, but I understand how merging to trunk would likely
speed up the testing efforts.

Let us not merge into branch-2 until after we have done a fair bit of
testing, and are comfortable including it in a release. While the code
mostly appears to not mess with existing scheduling logic, I am concerned
about regressions to existing scheduling behavior.



On Tue, Sep 22, 2015 at 1:28 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> By the way, for the purposes of merge vote, I believe a committer's vote
> is binding. So, Wangda and Zhihai's votes should be binding. :)
>
> On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com> wrote:
>
>> +1 (non-binding)
>>
>> thanks
>> Zhihai Xu
>>
>> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xg...@hortonworks.com>
>> wrote:
>>
>> > +1 Binding
>> >
>> > Thanks
>> >
>> > Xuan Gong
>> >
>> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com> wrote:
>> > >
>> > > +1. (Binding).
>> > >
>> > > Thanks,
>> > >
>> > > Junping
>> > > ________________________________________
>> > > From: Wangda Tan <wh...@gmail.com>
>> > > Sent: Thursday, September 17, 2015 3:19 AM
>> > > To: yarn-dev@hadoop.apache.org
>> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
>> > >
>> > > +1 (non-binding),
>> > >
>> > > Thanks Jian starting this thread. This can minimize effort of works
>> > across branches.
>> > >
>> > > To clarify, this feature is end-to-end code  completed, we have API,
>> > rm/nm implementations patches committed, but we haven't tested it
>> > end-to-end. Filed YARN-4175 to create an example program to test it
>> > end-to-end.
>> > >
>> > > Regards,
>> > > Wangda
>> > >
>> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
>> > >>
>> > >> Hi All,
>> > >>
>> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
>> > >>
>> > >> I would like to call a vote to merge YARN-1197 container resize into
>> > trunk.
>> > >>
>> > >> Key idea:
>> > >> This feature adds the ability for AM to change container resource
>> size
>> > at runtime.
>> > >>
>> > >> Details:
>> > >> - This feature is tracked at
>> > https://issues.apache.org/jira/browse/YARN-1197
>> > >> - It’s currently developed at a separate branch:
>> > https://github.com/apache/hadoop/commits/YARN-1197
>> > >> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157)
>> > generated from YARN-1197 to run against trunk  shows all unit tests have
>> > passed.
>> > >> - This feature now can work end-to-end.
>> > >> -  All the unresolved jiras under YARN-1197 will be the next step.
>> > >>
>> > >> Thanks,
>> > >> Wangda Tan & Meng Ding & Jian He
>> > >
>> > >
>> >
>> >
>>
>
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Karthik Kambatla <ka...@cloudera.com>.
By the way, for the purposes of merge vote, I believe a committer's vote is
binding. So, Wangda and Zhihai's votes should be binding. :)

On Tue, Sep 22, 2015 at 11:38 AM, Zhihai Xu <zx...@cloudera.com> wrote:

> +1 (non-binding)
>
> thanks
> Zhihai Xu
>
> On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xg...@hortonworks.com> wrote:
>
> > +1 Binding
> >
> > Thanks
> >
> > Xuan Gong
> >
> > > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com> wrote:
> > >
> > > +1. (Binding).
> > >
> > > Thanks,
> > >
> > > Junping
> > > ________________________________________
> > > From: Wangda Tan <wh...@gmail.com>
> > > Sent: Thursday, September 17, 2015 3:19 AM
> > > To: yarn-dev@hadoop.apache.org
> > > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
> > >
> > > +1 (non-binding),
> > >
> > > Thanks Jian starting this thread. This can minimize effort of works
> > across branches.
> > >
> > > To clarify, this feature is end-to-end code  completed, we have API,
> > rm/nm implementations patches committed, but we haven't tested it
> > end-to-end. Filed YARN-4175 to create an example program to test it
> > end-to-end.
> > >
> > > Regards,
> > > Wangda
> > >
> > >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
> > >>
> > >> Hi All,
> > >>
> > >> Thanks Meng Ding and Wangda Tan for all the hard work !
> > >>
> > >> I would like to call a vote to merge YARN-1197 container resize into
> > trunk.
> > >>
> > >> Key idea:
> > >> This feature adds the ability for AM to change container resource size
> > at runtime.
> > >>
> > >> Details:
> > >> - This feature is tracked at
> > https://issues.apache.org/jira/browse/YARN-1197
> > >> - It’s currently developed at a separate branch:
> > https://github.com/apache/hadoop/commits/YARN-1197
> > >> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157)
> > generated from YARN-1197 to run against trunk  shows all unit tests have
> > passed.
> > >> - This feature now can work end-to-end.
> > >> -  All the unresolved jiras under YARN-1197 will be the next step.
> > >>
> > >> Thanks,
> > >> Wangda Tan & Meng Ding & Jian He
> > >
> > >
> >
> >
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Zhihai Xu <zx...@cloudera.com>.
+1 (non-binding)

thanks
Zhihai Xu

On Tue, Sep 22, 2015 at 12:10 AM, Xuan Gong <xg...@hortonworks.com> wrote:

> +1 Binding
>
> Thanks
>
> Xuan Gong
>
> > On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com> wrote:
> >
> > +1. (Binding).
> >
> > Thanks,
> >
> > Junping
> > ________________________________________
> > From: Wangda Tan <wh...@gmail.com>
> > Sent: Thursday, September 17, 2015 3:19 AM
> > To: yarn-dev@hadoop.apache.org
> > Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
> >
> > +1 (non-binding),
> >
> > Thanks Jian starting this thread. This can minimize effort of works
> across branches.
> >
> > To clarify, this feature is end-to-end code  completed, we have API,
> rm/nm implementations patches committed, but we haven't tested it
> end-to-end. Filed YARN-4175 to create an example program to test it
> end-to-end.
> >
> > Regards,
> > Wangda
> >
> >> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
> >>
> >> Hi All,
> >>
> >> Thanks Meng Ding and Wangda Tan for all the hard work !
> >>
> >> I would like to call a vote to merge YARN-1197 container resize into
> trunk.
> >>
> >> Key idea:
> >> This feature adds the ability for AM to change container resource size
> at runtime.
> >>
> >> Details:
> >> - This feature is tracked at
> https://issues.apache.org/jira/browse/YARN-1197
> >> - It’s currently developed at a separate branch:
> https://github.com/apache/hadoop/commits/YARN-1197
> >> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157)
> generated from YARN-1197 to run against trunk  shows all unit tests have
> passed.
> >> - This feature now can work end-to-end.
> >> -  All the unresolved jiras under YARN-1197 will be the next step.
> >>
> >> Thanks,
> >> Wangda Tan & Meng Ding & Jian He
> >
> >
>
>

Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Xuan Gong <xg...@hortonworks.com>.
+1 Binding

Thanks

Xuan Gong

> On Sep 22, 2015, at 12:03 AM, Junping Du <jd...@hortonworks.com> wrote:
> 
> +1. (Binding).
> 
> Thanks,
> 
> Junping
> ________________________________________
> From: Wangda Tan <wh...@gmail.com>
> Sent: Thursday, September 17, 2015 3:19 AM
> To: yarn-dev@hadoop.apache.org
> Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk
> 
> +1 (non-binding),
> 
> Thanks Jian starting this thread. This can minimize effort of works across branches.
> 
> To clarify, this feature is end-to-end code  completed, we have API, rm/nm implementations patches committed, but we haven't tested it end-to-end. Filed YARN-4175 to create an example program to test it end-to-end.
> 
> Regards,
> Wangda
> 
>> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
>> 
>> Hi All,
>> 
>> Thanks Meng Ding and Wangda Tan for all the hard work !
>> 
>> I would like to call a vote to merge YARN-1197 container resize into trunk.
>> 
>> Key idea:
>> This feature adds the ability for AM to change container resource size at runtime.
>> 
>> Details:
>> - This feature is tracked at https://issues.apache.org/jira/browse/YARN-1197
>> - It’s currently developed at a separate branch: https://github.com/apache/hadoop/commits/YARN-1197
>> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157) generated from YARN-1197 to run against trunk  shows all unit tests have passed.
>> - This feature now can work end-to-end.
>> -  All the unresolved jiras under YARN-1197 will be the next step.
>> 
>> Thanks,
>> Wangda Tan & Meng Ding & Jian He
> 
> 


Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Junping Du <jd...@hortonworks.com>.
+1. (Binding).

Thanks,

Junping
________________________________________
From: Wangda Tan <wh...@gmail.com>
Sent: Thursday, September 17, 2015 3:19 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: [VOTE] Merge YARN-1197 container resize into trunk

+1 (non-binding),

Thanks Jian starting this thread. This can minimize effort of works across branches.

To clarify, this feature is end-to-end code  completed, we have API, rm/nm implementations patches committed, but we haven't tested it end-to-end. Filed YARN-4175 to create an example program to test it end-to-end.

Regards,
Wangda

> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
>
> Hi All,
>
> Thanks Meng Ding and Wangda Tan for all the hard work !
>
> I would like to call a vote to merge YARN-1197 container resize into trunk.
>
> Key idea:
> This feature adds the ability for AM to change container resource size at runtime.
>
> Details:
> - This feature is tracked at https://issues.apache.org/jira/browse/YARN-1197
> - It’s currently developed at a separate branch: https://github.com/apache/hadoop/commits/YARN-1197
> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157) generated from YARN-1197 to run against trunk  shows all unit tests have passed.
> - This feature now can work end-to-end.
> -  All the unresolved jiras under YARN-1197 will be the next step.
>
> Thanks,
> Wangda Tan & Meng Ding & Jian He


Re: [VOTE] Merge YARN-1197 container resize into trunk

Posted by Wangda Tan <wh...@gmail.com>.
+1 (non-binding),

Thanks Jian starting this thread. This can minimize effort of works across branches.

To clarify, this feature is end-to-end code  completed, we have API, rm/nm implementations patches committed, but we haven't tested it end-to-end. Filed YARN-4175 to create an example program to test it end-to-end.

Regards,
Wangda

> On Sep 16, 2015, at 6:30 PM, Jian He <jh...@hortonworks.com> wrote:
> 
> Hi All,
> 
> Thanks Meng Ding and Wangda Tan for all the hard work !
> 
> I would like to call a vote to merge YARN-1197 container resize into trunk.
> 
> Key idea:
> This feature adds the ability for AM to change container resource size at runtime.
> 
> Details:
> - This feature is tracked at https://issues.apache.org/jira/browse/YARN-1197
> - It’s currently developed at a separate branch: https://github.com/apache/hadoop/commits/YARN-1197
> - A uber patch(https://issues.apache.org/jira/browse/YARN-4157) generated from YARN-1197 to run against trunk  shows all unit tests have passed.
> - This feature now can work end-to-end.
> -  All the unresolved jiras under YARN-1197 will be the next step.
> 
> Thanks,
> Wangda Tan & Meng Ding & Jian He