Posted to hdfs-dev@hadoop.apache.org by Jonathan Hung <jy...@gmail.com> on 2019/12/04 18:55:41 UTC

Re: [DISCUSS] Making 2.10 the last minor 2.x release

FYI, starting the rename process, beginning with INFRA-19521.

Jonathan Hung


On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
wrote:

> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10, please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Thanks for the detailed thoughts, everyone.
>>
>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>> releases. As for putting features into minor/patch releases, if we keep the
>> convention of putting new features only into minor releases, my assumption
>> is still that it's unlikely people will want to get them into branch-2
>> (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>> even really removed support for Java 7 in branch-2 (much less Java 8), so I
>> feel moving to Java 11 would go along with a move to branch 3. And as you
>> mentioned, if people really want to use Java 11 on branch-2, we can always
>> revive branch-2. But for now I think the convenience of not needing to port
>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> potentially needing to revive branch-2.
>>
>> Jonathan Hung
>>
>>
>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>
>>> +1 for 2.10.x as last release for 2.x version.
>>>
>>> Software becomes more compatible when more companies stress test
>>> the same software and make improvements in trunk.  Some may be extra
>>> cautious about moving up the version because of internal obligations to
>>> keep things running.  Company obligation should not be the driving force
>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>> community when every name-brand company maintains its own Hadoop 2.x
>>> version.  I think it would be healthier for the community to reduce the
>>> branch forking and spend energy on trunk to harden the software.  This will
>>> give more confidence to move up the version than trying to fix n
>>> permutations of breakage, like the Flash fixing the timeline.
>>>
>>> The Apache license states that there is no warranty of any kind for code
>>> contributions.  Fewer community release processes should improve software
>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>
>>> regards,
>>> Eric
>>>
>>>
>>>
>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> <eb...@verizonmedia.com.invalid> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Is it written anywhere what the difference is between a minor release and a
>>>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>>>> looked around and I can't find anything other than some compatibility
>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> this would help shape my opinion on whether or not to keep branch-2 alive.
>>>> My current understanding is that we can't really break compatibility in
>>>> either a minor or point release. But the only mention of the difference
>>>> between minor and point releases is how to deal with Stable, Evolving, and
>>>> Unstable tags, and how to deal with changing default configuration values.
>>>> So it seems like there really isn't a big official difference between the
>>>> two. In my mind, the functional difference between the two is that the
>>>> minor releases may have added features and rewrites, while the point
>>>> releases only have bug fixes. This might be an incorrect understanding, but
>>>> that's what I have gathered from watching the releases over the last few
>>>> years. Whether or not this is a correct understanding, I think that this
>>>> needs to be documented somewhere, even if it is just a convention.
>>>>
>>>> Given my assumed understanding of minor vs point releases, here are the
>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>> correct me for anything you feel is missing or inadequate.
>>>> Pros:
>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>   2.10.x
>>>> - It is less necessary to move to 3.x
>>>>
>>>> Cons:
>>>> - Bug fixes are less likely to be put into 2.10.x
>>>> - An extra branch to maintain
>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>     patches to if they should go all the way back to 2.10.x
>>>> - It is less necessary to move to 3.x
>>>>
>>>> So on the one hand you get added stability in fewer features being
>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>> we don't live in a perfect world and committers will make mistakes either
>>>> because of lack of knowledge or simply because they made a mistake. If we
>>>> have a branch-2, committers will forget, not know to, or choose not to (for
>>>> whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>> we don't have a branch-2, committers who want their borderline risky
>>>> feature in the 2.x line will err on the side of putting it into branch-2.10
>>>> instead of proposing the creation of a branch-2. Clearly I have made quite
>>>> a few assumptions here based on my own experiences, so I would like to hear
>>>> if others have similar or opposing views.
>>>>
>>>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>> branch-2 is due to an effort to push the community towards 3.x. This is why
>>>> I have added movement to 3.x as both a pro and a con. As a community trying
>>>> to move forward, keeping as many companies on similar branches as possible
>>>> is a good way to make sure the code is well-tested. However, from a
>>>> stability point of view, moving to 3.x is still scary and being able to
>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>> bridge release effort has been very good at making it possible for people
>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>> it is reasonable for companies to want to be extra cautious with 3.x due to
>>>> potential performance degradation at large scale.
>>>>
>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>> 2.x, surely not everyone is going to want that (at least not immediately).
>>>> The 2.10 documentation states, "The JVM requirements will not change across
>>>> point releases within the same minor release except if the JVM version
>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>> release until Java 8 becomes unsupported (though one could argue that it is
>>>> already unsupported since Oracle is no longer giving public Java 8 updates).
>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> catalyst for a branch-2 revival?
>>>>
>>>> Not sure if this really leads to any sort of answer from me on whether or
>>>> not we should keep branch-2 alive, but these are the things that I am
>>>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>> not is committers not being on the same page with where they should commit
>>>> their patches.
>>>>
>>>> Eric
>>>>
>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>
>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>> wrote:
>>>>
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>> > I worry that some committers may want to put new features into a branch 2
>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>> > don't always catch corner cases or performance problems for some time
>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>> > cluster), it may be very difficult to back out those changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>> > do have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>> > shv.hadoop@gmail.com> wrote:
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release the
>>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>> > the picture right now, and many people may object to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>> > > outside of our company who expressed interest in getting new features to
>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>> > > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>> > >
>>>> > > In any case, we can always reverse this decision if we really need to, by
>>>> > > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>>>> > >
>>>> > > Jonathan Hung
>>>> > >
>>>> > >
>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>> epayne@apache.org>
>>>> > > wrote:
>>>> > >
>>>> > > > Thanks Jonathan for opening the discussion.
>>>> > > >
>>>> > > > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > > > moving to 2.10 will take some time for the community. It seems premature to
>>>> > > > make a decision at this point that there will never be a need for a 2.11
>>>> > > > release.
>>>> > > >
>>>> > > > -Eric
>>>> > > >
>>>> > > >
>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>> > > > jyhung2357@gmail.com> wrote:
>>>> > > >
>>>> > > > Hi folks,
>>>> > > >
>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>> > > > release line in branch-2. Currently, the main issue is that there are many
>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>> > > > likely never see the light of day unless they are backported to
>>>> > > > branch-2.10.
>>>> > > >
>>>> > > > To do this, I propose we:
>>>> > > >
>>>> > > >  - Delete branch-2.10
>>>> > > >  - Rename branch-2 to branch-2.10
>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> > > >
>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> > > >
>>>> > > > Thoughts?
>>>> > > >
>>>> > > > Jonathan Hung
>>>> > > >
>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> > > >
>>>> > >
>>>> >
>>>> >
>>>>
>>>
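
Editorial sketch of the backport arithmetic discussed above ("5 vs 4 total branches"). The branch names come from the thread; the position of branch-2 in the "before" chain is an assumption inferred from that remark, and `backport_targets` is a hypothetical helper, not part of any Hadoop tooling.

```python
"""Sketch: how many branches a backport touches, before and after the rename."""

# Branch names are taken from the thread. Placing branch-2 between
# branch-3.1 and branch-2.10 is an assumption based on "5 vs 4".
CHAIN_WITH_BRANCH_2 = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2", "branch-2.10", "branch-2.9", "branch-2.8",
]
CHAIN_AFTER_RENAME = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2.10", "branch-2.9", "branch-2.8",
]


def backport_targets(chain, oldest):
    """Return the branches a fix lands on, newest first, to reach `oldest`."""
    if oldest not in chain:
        raise ValueError(f"unknown branch: {oldest}")
    return chain[: chain.index(oldest) + 1]


if __name__ == "__main__":
    for label, chain in (("with branch-2", CHAIN_WITH_BRANCH_2),
                         ("after rename", CHAIN_AFTER_RENAME)):
        targets = backport_targets(chain, "branch-2.10")
        print(f"{label}: {len(targets)} branches: {' -> '.join(targets)}")
```

Under these assumptions a fix that must reach the 2.10.x line touches five branches while branch-2 exists and four once it is folded into branch-2.10.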

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira,
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>>> > people closed it as "Won't Do". And yes, the branch is protected, so we
>>>>> > can't delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>>> > don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> >
>>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>>> from
>>>>> >
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> >
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>>> minor
>>>>> >
>>>>> > 2.x release" confused people.
>>>>> >
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> >
>>>>> > actually
>>>>> >
>>>>> > proposes to release 2.11, which I understand nobody does at the
>>>>> moment.
>>>>> >
>>>>> >
>>>>> > So if anybody objects removing branch-2.10 please make an argument.
>>>>> >
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> >
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in
>>>>> sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>>> patch
>>>>> >
>>>>> > releases. As for putting features into minor/patch releases, if we
>>>>> >
>>>>> > keep the
>>>>> >
>>>>> > convention of putting new features only into minor releases, my
>>>>> >
>>>>> > assumption
>>>>> >
>>>>> > is still that it's unlikely people will want to get them into
>>>>> branch-2
>>>>> >
>>>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>>>> >
>>>>> > haven't
>>>>> >
>>>>> > even really removed support for java 7 in branch-2 (much less java
>>>>> 8),
>>>>> >
>>>>> > so I
>>>>> >
>>>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>>>> >
>>>>> > you
>>>>> >
>>>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>>>> >
>>>>> > always
>>>>> >
>>>>> > revive branch-2. But for now I think the convenience of not needing
>>>>> to
>>>>> >
>>>>> > port
>>>>> >
>>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> >
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> >
>>>>> > the same software and making improvements in trunk.  Some may be
>>>>> extra
>>>>> >
>>>>> > caution on moving up the version because obligation internally to
>>>>> keep
>>>>> >
>>>>> > things running.  Company obligation should not be the driving force
>>>>> to
>>>>> >
>>>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>>>> >
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> >
>>>>> > version.  I think it would be more healthy for the community to
>>>>> >
>>>>> > reduce the
>>>>> >
>>>>> > branch forking and spend energy on trunk to harden the software.
>>>>> >
>>>>> > This will
>>>>> >
>>>>> > give more confidence to move up the version than trying to fix n
>>>>> >
>>>>> > permutations breakage like Flash fixing the timeline.
>>>>> >
>>>>> >
>>>>> > Apache license stated, there is no warranty of any kind for code
>>>>> >
>>>>> > contributions.  Fewer community release process should improve
>>>>> >
>>>>> > software
>>>>> >
>>>>> > quality when eyes are on trunk, and help steering toward the same end
>>>>> >
>>>>> > goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> >
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> >
>>>>> > and a
>>>>> >
>>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> >
>>>>> > have
>>>>> >
>>>>> > looked around and I can't find anything other than some compatibility
>>>>> >
>>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> >
>>>>> > think
>>>>> >
>>>>> > this would help shape my opinion on whether or not to keep branch-2
>>>>> >
>>>>> > alive.
>>>>> >
>>>>> > My current understanding is that we can't really break compatibility
>>>>> >
>>>>> > in
>>>>> >
>>>>> > either a minor or point release. But the only mention of the
>>>>> >
>>>>> > difference
>>>>> >
>>>>> > between minor and point releases is how to deal with Stable,
>>>>> >
>>>>> > Evolving,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > Unstable tags, and how to deal with changing default configuration
>>>>> >
>>>>> > values.
>>>>> >
>>>>> > So it seems like there really isn't a big official difference between
>>>>> >
>>>>> > the
>>>>> >
>>>>> > two. In my mind, the functional difference between the two is that
>>>>> >
>>>>> > the
>>>>> >
>>>>> > minor releases may have added features and rewrites, while the point
>>>>> >
>>>>> > releases only have bug fixes. This might be an incorrect
>>>>> >
>>>>> > understanding, but
>>>>> >
>>>>> > that's what I have gathered from watching the releases over the last
>>>>> >
>>>>> > few
>>>>> >
>>>>> > years. Whether or not this is a correct understanding, I think that
>>>>> >
>>>>> > this
>>>>> >
>>>>> > needs to be documented somewhere, even if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are
>>>>> >
>>>>> > the
>>>>> >
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> >
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> >
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>>>> >
>>>>> > into
>>>>> >
>>>>> > 2.10.x
>>>>> >
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > Cons:
>>>>> >
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> >
>>>>> > - An extra branch to maintain
>>>>> >
>>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >
>>>>> > patches to if they should go all the way back to 2.10.x
>>>>> >
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> >
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> >
>>>>> > being
>>>>> >
>>>>> > committed. In a perfect world, we wouldn't have to make this
>>>>> >
>>>>> > tradeoff.
>>>>> >
>>>>> > But
>>>>> >
>>>>> > we don't live in a perfect world and committers will make mistakes
>>>>> >
>>>>> > either
>>>>> >
>>>>> > because of lack of knowledge or simply because they made a mistake.
>>>>> >
>>>>> > If
>>>>> >
>>>>> > we
>>>>> >
>>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>>> >
>>>>> > to
>>>>> >
>>>>> > (for
>>>>> >
>>>>> > whatever reason) commit valid bug fixes back all the way to
>>>>> >
>>>>> > branch-2.10. If
>>>>> >
>>>>> > we don't have a branch-2, committers who want their borderline risky
>>>>> >
>>>>> > feature in the 2.x line will err on the side of putting it into
>>>>> >
>>>>> > branch-2.10
>>>>> >
>>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>>> >
>>>>> > quite
>>>>> >
>>>>> > a few assumptions here based on my own experiences, so I would like
>>>>> >
>>>>> > to
>>>>> >
>>>>> > hear
>>>>> >
>>>>> > if others have similar or opposing views.
>>>>> >
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> >
>>>>> > killing
>>>>> >
>>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>>> >
>>>>> > is why
>>>>> >
>>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>>> >
>>>>> > trying
>>>>> >
>>>>> > to move forward, keeping as many companies on similar branches as
>>>>> >
>>>>> > possible
>>>>> >
>>>>> > is a good way to make sure the code is well-tested. However, from a
>>>>> >
>>>>> > stability point of view, moving to 3.x is still scary and being able
>>>>> >
>>>>> > to
>>>>> >
>>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>>> >
>>>>> > 2.10.0
>>>>> >
>>>>> > bridge release effort has been very good at making it possible for
>>>>> >
>>>>> > people
>>>>> >
>>>>> > to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>>> >
>>>>> > that
>>>>> >
>>>>> > it is reasonable for companies to want to be extra cautious with 3.x
>>>>> >
>>>>> > due to
>>>>> >
>>>>> > potential performance degradation at large scale.
>>>>> >
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> >
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> >
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> >
>>>>> > support to
>>>>> >
>>>>> > 2.x, surely not everyone is going to want that (at least not
>>>>> >
>>>>> > immediately).
>>>>> >
>>>>> > The 2.10 documentation states, "The JVM requirements will not change
>>>>> >
>>>>> > across
>>>>> >
>>>>> > point releases within the same minor release except if the JVM
>>>>> >
>>>>> > version
>>>>> >
>>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> >
>>>>> > release until Java 8 becomes unsupported (though one could argue that
>>>>> >
>>>>> > it is
>>>>> >
>>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>>> >
>>>>> > update).
>>>>> >
>>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> >
>>>>> > catalyst for a branch-2 revival?
>>>>> >
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on
>>>>> >
>>>>> > whether
>>>>> >
>>>>> > or
>>>>> >
>>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>>> >
>>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>>> >
>>>>> > branch-2
>>>>> >
>>>>> > or
>>>>> >
>>>>> > not is committers not being on the same page with where they should
>>>>> >
>>>>> > commit
>>>>> >
>>>>> > their patches.
>>>>> >
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> > [1]
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > [2]
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>>>> >
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> >
>>>>> > the
>>>>> >
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> >
>>>>> > least.
>>>>> >
>>>>> > I worry
>>>>> >
>>>>> > that some committers may want to put new features into a branch 2
>>>>> >
>>>>> > release,
>>>>> >
>>>>> > and without a branch-2, they will go directly into 2.10. Since we
>>>>> >
>>>>> > don't
>>>>> >
>>>>> > always
>>>>> >
>>>>> > catch corner cases or performance problems for some time (usually
>>>>> >
>>>>> > not
>>>>> >
>>>>> > until
>>>>> >
>>>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>>>> >
>>>>> > may
>>>>> >
>>>>> > be
>>>>> >
>>>>> > very
>>>>> >
>>>>> > difficult to back out those changes.
>>>>> >
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>>>> >
>>>>> > idea,
>>>>> >
>>>>> > but I
>>>>> >
>>>>> > do
>>>>> >
>>>>> > have these reservations.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> >
>>>>> > <
>>>>> >
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> >
>>>>> > release the
>>>>> >
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> >
>>>>> > between
>>>>> >
>>>>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is
>>>>> >
>>>>> > not in
>>>>> >
>>>>> > the picture right now, and many people may object this idea.
>>>>> >
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> >
>>>>> > 1. eliminate confusion which branches people should commit their
>>>>> >
>>>>> > back-ports
>>>>> >
>>>>> > to
>>>>> >
>>>>> > 2. save engineering effort committing to more branches than
>>>>> >
>>>>> > necessary
>>>>> >
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide
>>>>> >
>>>>> > to
>>>>> >
>>>>> > release 2.11 we can resurrect the branch.
>>>>> >
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>>>>> >
>>>>> > jyhung2357@gmail.com
>>>>> >
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel
>>>>> >
>>>>> > the
>>>>> >
>>>>> > pros
>>>>> >
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> >
>>>>> > are
>>>>> >
>>>>> > much
>>>>> >
>>>>> > higher than a new 2.11 minor release. (There didn't seem to be
>>>>> >
>>>>> > many
>>>>> >
>>>>> > people
>>>>> >
>>>>> > outside of our company who expressed interest in getting new
>>>>> >
>>>>> > features to
>>>>> >
>>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>>> >
>>>>> > after
>>>>> >
>>>>> > 2.10.0
>>>>> >
>>>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> >
>>>>> > branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really
>>>>> >
>>>>> > need
>>>>> >
>>>>> > to, by
>>>>> >
>>>>> > recreating branch-2. But this proposal would reduce a lot of
>>>>> >
>>>>> > confusion
>>>>> >
>>>>> > IMO.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> >
>>>>> > epayne@apache.org>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently
>>>>> >
>>>>> > released,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > moving to 2.10 will take some time for the community. It seems
>>>>> >
>>>>> > premature
>>>>> >
>>>>> > to
>>>>> >
>>>>> > make a decision at this point that there will never be a need
>>>>> >
>>>>> > for a
>>>>> >
>>>>> > 2.11
>>>>> >
>>>>> > release.
>>>>> >
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> >
>>>>> > <
>>>>> >
>>>>> > jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>>> > release line in branch-2. Currently, the main issue is that there are many
>>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>>> > likely never see the light of day unless they are backported to
>>>>> > branch-2.10.
>>>>> >
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>>
>>>>
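
The three-step proposal quoted above (delete branch-2.10, rename branch-2 to branch-2.10, set the version to 2.10.1-SNAPSHOT) and its effect on the commit chain can be sketched as a toy model. This is only an illustration, not the procedure that was actually run against the repository: the helper names rename_plan and backport_targets are hypothetical, the position of branch-2 in the old chain is an assumption inferred from the "5 vs 4 total branches" remark elsewhere in the thread, and the versions are the ones mentioned in the discussion.

```python
# Toy model of the branch rename discussed in the thread.
# rename_plan and backport_targets are hypothetical helpers for illustration.

def rename_plan(branches):
    """Apply the three proposed steps to a {branch name: version} mapping."""
    result = dict(branches)
    result.pop("branch-2.10")                        # 1. delete the old branch-2.10
    result["branch-2.10"] = result.pop("branch-2")   # 2. rename branch-2 to branch-2.10
    result["branch-2.10"] = "2.10.1-SNAPSHOT"        # 3. set the version on the new branch
    return result


def backport_targets(chain, oldest):
    """Return every branch from trunk down to `oldest`, in commit order."""
    return chain[: chain.index(oldest) + 1]


# Assumed old chain: branch-2 sits between branch-3.1 and branch-2.10.
OLD_CHAIN = ["trunk", "branch-3.2", "branch-3.1", "branch-2",
             "branch-2.10", "branch-2.9", "branch-2.8"]
# New chain as stated in the proposal.
NEW_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
             "branch-2.10", "branch-2.9", "branch-2.8"]

before = {"branch-2": "2.11.0-SNAPSHOT", "branch-2.10": "2.10.1-SNAPSHOT"}
print(rename_plan(before))
print(len(backport_targets(OLD_CHAIN, "branch-2.10")))  # branches to touch with branch-2
print(len(backport_targets(NEW_CHAIN, "branch-2.10")))  # branches to touch without it
```

Under these assumptions, a fix that must reach the 2.10.x line touches one branch fewer once branch-2 is gone, which is the maintenance saving the proposal argues for.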

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes are being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanx for catching, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>>> > But the INFRA people closed as wont do and yes, the branch is
>>>>> protected,
>>>>> > we can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>>> remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 ->
>>>>> branch-3.1 ->
>>>>> >
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> please
>>>>> >
>>>>> > don't try to commit to it)
>>>>> >
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >
>>>>> >   - Delete old branch-2.10
>>>>> >
>>>>> >   - Rename branch-2 to (new) branch-2.10
>>>>> >
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> >
>>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>>> from
>>>>> >
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> >
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>>> minor
>>>>> >
>>>>> > 2.x release" confused people.
>>>>> >
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> >
>>>>> > actually
>>>>> >
>>>>> > proposes to release 2.11, which I understand nobody does at the
>>>>> moment.
>>>>> >
>>>>> >
>>>>> > So if anybody objects removing branch-2.10 please make an argument.
>>>>> >
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> >
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in
>>>>> sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>>> patch
>>>>> >
>>>>> > releases. As for putting features into minor/patch releases, if we
>>>>> >
>>>>> > keep the
>>>>> >
>>>>> > convention of putting new features only into minor releases, my
>>>>> >
>>>>> > assumption
>>>>> >
>>>>> > is still that it's unlikely people will want to get them into
>>>>> branch-2
>>>>> >
>>>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>>>> >
>>>>> > haven't
>>>>> >
>>>>> > even really removed support for java 7 in branch-2 (much less java
>>>>> 8),
>>>>> >
>>>>> > so I
>>>>> >
>>>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>>>> >
>>>>> > you
>>>>> >
>>>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>>>> >
>>>>> > always
>>>>> >
>>>>> > revive branch-2. But for now I think the convenience of not needing
>>>>> to
>>>>> >
>>>>> > port
>>>>> >
>>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> >
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> >
>>>>> > the same software and making improvements in trunk.  Some may be
>>>>> extra
>>>>> >
>>>>> > caution on moving up the version because obligation internally to
>>>>> keep
>>>>> >
>>>>> > things running.  Company obligation should not be the driving force
>>>>> to
>>>>> >
>>>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>>>> >
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> >
>>>>> > version.  I think it would be more healthy for the community to
>>>>> >
>>>>> > reduce the
>>>>> >
>>>>> > branch forking and spend energy on trunk to harden the software.
>>>>> >
>>>>> > This will
>>>>> >
>>>>> > give more confidence to move up the version than trying to fix n
>>>>> >
>>>>> > permutations breakage like Flash fixing the timeline.
>>>>> >
>>>>> >
>>>>> > Apache license stated, there is no warranty of any kind for code
>>>>> >
>>>>> > contributions.  Fewer community release process should improve
>>>>> >
>>>>> > software
>>>>> >
>>>>> > quality when eyes are on trunk, and help steering toward the same end
>>>>> >
>>>>> > goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> >
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> >
>>>>> > and a
>>>>> >
>>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> >
>>>>> > have
>>>>> >
>>>>> > looked around and I can't find anything other than some compatibility
>>>>> >
>>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> >
>>>>> > think
>>>>> >
>>>>> > this would help shape my opinion on whether or not to keep branch-2
>>>>> >
>>>>> > alive.
>>>>> >
>>>>> > My current understanding is that we can't really break compatibility
>>>>> >
>>>>> > in
>>>>> >
>>>>> > either a minor or point release. But the only mention of the
>>>>> >
>>>>> > difference
>>>>> >
>>>>> > between minor and point releases is how to deal with Stable,
>>>>> >
>>>>> > Evolving,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > Unstable tags, and how to deal with changing default configuration
>>>>> >
>>>>> > values.
>>>>> >
>>>>> > So it seems like there really isn't a big official difference between
>>>>> >
>>>>> > the
>>>>> >
>>>>> > two. In my mind, the functional difference between the two is that
>>>>> >
>>>>> > the
>>>>> >
>>>>> > minor releases may have added features and rewrites, while the point
>>>>> >
>>>>> > releases only have bug fixes. This might be an incorrect
>>>>> >
>>>>> > understanding, but
>>>>> >
>>>>> > that's what I have gathered from watching the releases over the last
>>>>> >
>>>>> > few
>>>>> >
>>>>> > years. Whether or not this is a correct understanding, I think that
>>>>> >
>>>>> > this
>>>>> >
>>>>> > needs to be documented somewhere, even if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >     patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> > But we don't live in a perfect world and committers will make mistakes
>>>>> > either because of lack of knowledge or simply because they made a
>>>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>>>> > their borderline risky feature in the 2.x line will err on the side of
>>>>> > putting it into branch-2.10 instead of proposing the creation of a
>>>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>>>> > own experiences, so I would like to hear if others have similar or
>>>>> > opposing views.
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>>> > community trying to move forward, keeping as many companies on similar
>>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>>> > extra cautious with 3.x due to potential performance degradation at
>>>>> > large scale.
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>>> > not change across point releases within the same minor release except if
>>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>> > could argue that it is already unsupported since Oracle is no longer
>>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> > or not is committers not being on the same page with where they should
>>>>> > commit their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1]
>>>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2]
>>>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> > least. I worry that some committers may want to put new features into a
>>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>>> > for some time (usually not until the release is deployed to a busy,
>>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>>> > changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>>> > I do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>>> > is not in the picture right now, and many people may object to this
>>>>> > idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people outside of our company who expressed interest in getting new
>>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>>>> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to,
>>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>>> > confusion IMO.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>>> > to make a decision at this point that there will never be a need for a
>>>>> > 2.11 release.
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> > <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>>> > branch-2 will likely never see the light of day unless they are
>>>>> > backported to branch-2.10.
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > [1]
>>>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanks for catching that. I have configured the build to run on
>>> branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira,
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>>> > people closed it as "won't do". And yes, the branch is protected, so we
>>>>> > can't delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> > please don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>>> > 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> > actually proposes to release 2.11, which I understand nobody does at the
>>>>> > moment.
>>>>> >
>>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>>> > the convention of putting new features only into minor releases, my
>>>>> > assumption is still that it's unlikely people will want to get them into
>>>>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>>>>> > we haven't even really removed support for java 7 in branch-2 (much less
>>>>> > java 8), so I feel moving to java 11 would go along with a move to
>>>>> > branch 3. And as you mentioned, if people really want to use java 11 on
>>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>> >
>>>>> > +1 for 2.10.x as the last release for the 2.x version.
>>>>> >
>>>>> > Software becomes more compatible when more companies stress test the
>>>>> > same software and make improvements in trunk. Some may be extra cautious
>>>>> > about moving up the version because of internal obligations to keep
>>>>> > things running. Company obligations should not be the driving force to
>>>>> > maintain Hadoop branches. There is no proper collaboration in the
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> > version. I think it would be healthier for the community to reduce the
>>>>> > branch forking and spend energy on trunk to harden the software. This
>>>>> > will give more confidence to move up the version than trying to fix n
>>>>> > permutations of breakage, like the Flash fixing the timeline.
>>>>> >
>>>>> > The Apache license states that there is no warranty of any kind for code
>>>>> > contributions. Fewer community release processes should improve software
>>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>>> > goals.
>>>>> >
>>>>> > regards,
>>>>> > Eric
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>>> > I have looked around and I can't find anything other than some
>>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>>> > keep branch-2 alive.
>>>>> > My current understanding is that we can't really break compatibility in
>>>>> > either a minor or point release. But the only mention of the difference
>>>>> > between minor and point releases is how to deal with Stable, Evolving,
>>>>> > and Unstable tags, and how to deal with changing default configuration
>>>>> > values.
>>>>> > So it seems like there really isn't a big official difference between
>>>>> > the two. In my mind, the functional difference between the two is that
>>>>> > the minor releases may have added features and rewrites, while the
>>>>> > point releases only have bug fixes. This might be an incorrect
>>>>> > understanding, but that's what I have gathered from watching the
>>>>> > releases over the last few years. Whether or not this is a correct
>>>>> > understanding, I think that this needs to be documented somewhere, even
>>>>> > if it is just a convention.
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >     patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> > But we don't live in a perfect world and committers will make mistakes
>>>>> > either because of lack of knowledge or simply because they made a
>>>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>>>> > their borderline risky feature in the 2.x line will err on the side of
>>>>> > putting it into branch-2.10 instead of proposing the creation of a
>>>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>>>> > own experiences, so I would like to hear if others have similar or
>>>>> > opposing views.
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>>> > community trying to move forward, keeping as many companies on similar
>>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>>> > extra cautious with 3.x due to potential performance degradation at
>>>>> > large scale.
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>>> > not change across point releases within the same minor release except if
>>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>> > could argue that it is already unsupported since Oracle is no longer
>>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> > or not is committers not being on the same page with where they should
>>>>> > commit their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1]
>>>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2]
>>>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> > least. I worry that some committers may want to put new features into a
>>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>>> > for some time (usually not until the release is deployed to a busy,
>>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>>> > changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>>> > I do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>>> > is not in the picture right now, and many people may object to this
>>>>> > idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people outside of our company who expressed interest in getting new
>>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>>>> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to,
>>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>>> > confusion IMO.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>>> > to make a decision at this point that there will never be a need for a
>>>>> > 2.11 release.
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> > <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>>> > branch-2 will likely never see the light of day unless they are
>>>>> > backported to branch-2.10.
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > [1]
>>>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.
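
[Editor's sketch, not part of the original message.] A check like the one
described above (finding commits that landed in branch-2 but are missing
from branch-2.10) could be approximated as follows. This is only a minimal
illustration, not the procedure actually used in this thread. It assumes
that commit subjects begin with a JIRA key such as HADOOP-16988, and that
the subject lines of each branch were exported beforehand, for example with
`git log --format=%s <branch>`. The function name and the sample subjects
are hypothetical.

```python
import re

# Assumption: commit subjects begin with a JIRA key, e.g. "HADOOP-16988. ...".
JIRA_KEY = re.compile(r"^(HADOOP|HDFS|YARN|MAPREDUCE)-\d+")


def missing_backports(source_subjects, target_subjects):
    """Return subjects from source whose JIRA key never appears in target.

    Both arguments are lists of commit subject lines. Subjects without a
    recognizable JIRA key are skipped rather than guessed at.
    """
    target_keys = set()
    for subject in target_subjects:
        match = JIRA_KEY.match(subject)
        if match:
            target_keys.add(match.group(0))

    missing = []
    for subject in source_subjects:
        match = JIRA_KEY.match(subject)
        if match and match.group(0) not in target_keys:
            missing.append(subject)
    return missing


if __name__ == "__main__":
    # Hypothetical sample data, not real commit history.
    branch_2 = [
        "HDFS-1111. Fix example bug A.",
        "YARN-2222. Fix example bug B.",
        "Preparing for 2.11.0 development",
    ]
    branch_2_10 = ["HDFS-1111. Fix example bug A."]
    for subject in missing_backports(branch_2, branch_2_10):
        print(subject)
```

Matching on the JIRA key rather than the commit hash is deliberate here:
a cherry-picked commit gets a new hash on the target branch, so comparing
hashes alone would report every backport as missing.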

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanks for catching that. I have configured the build to run on
>> branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira,
>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>> > people closed it as "won't do". And yes, the branch is protected, so we
>>>> > can't delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>> > it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> > please don't try to commit to it)
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hey guys,
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>> > from 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>> > minor 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody
>>>> > actually proposes to release 2.11, which I understand nobody does at
>>>> > the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>> > patch releases. As for putting features into minor/patch releases, if
>>>> > we keep the convention of putting new features only into minor
>>>> > releases, my assumption is still that it's unlikely people will want
>>>> > to get them into branch-2 (based on the 2.10.0 release process). For
>>>> > the java 11 issue, we haven't even really removed support for java 7
>>>> > in branch-2 (much less java 8), so I feel moving to java 11 would go
>>>> > along with a move to branch 3. And as you mentioned, if people really
>>>> > want to use java 11 on branch-2, we can always revive branch-2. But
>>>> > for now I think the convenience of not needing to port to both
>>>> > branch-2 and branch-2.10 (and below) outweighs the cost of potentially
>>>> > needing to revive branch-2.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> > Software becomes more compatible when more companies stress test the
>>>> > same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations
>>>> > to keep things running.  Company obligations should not be the driving
>>>> > force for maintaining Hadoop branches.  There is no proper
>>>> > collaboration in the community when every name-brand company maintains
>>>> > its own Hadoop 2.x version.  I think it would be healthier for the
>>>> > community to reduce branch forking and spend energy on trunk to harden
>>>> > the software.  This will give more confidence to move up the version
>>>> > than trying to fix n permutations of breakage, like the Flash fixing
>>>> > the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for
>>>> > code contributions.  Fewer community release processes should improve
>>>> > software quality when eyes are on trunk, and help steer toward the
>>>> > same end goals.
>>>> >
>>>> > regards,
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> > Hello all,
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release
>>>> > and a point/dot/maintenance (I'll use "point" from here on out)
>>>> > release? I have looked around and I can't find anything other than
>>>> > some compatibility documentation in 2.x that has since been removed in
>>>> > 3.x [1] [2]. I think this would help shape my opinion on whether or
>>>> > not to keep branch-2 alive. My current understanding is that we can't
>>>> > really break compatibility in either a minor or point release. But the
>>>> > only mention of the difference between minor and point releases is how
>>>> > to deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>> > changing default configuration values. So it seems like there really
>>>> > isn't a big official difference between the two. In my mind, the
>>>> > functional difference between the two is that the minor releases may
>>>> > have added features and rewrites, while the point releases only have
>>>> > bug fixes. This might be an incorrect understanding, but that's what I
>>>> > have gathered from watching the releases over the last few years.
>>>> > Whether or not this is a correct understanding, I think that this
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are
>>>> > the pros/cons that I can think of for having a branch-2. Please add on
>>>> > or correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >    patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>> > being committed. In a perfect world, we wouldn't have to make this
>>>> > tradeoff. But we don't live in a perfect world and committers will
>>>> > make mistakes either because of lack of knowledge or simply because
>>>> > they made a mistake. If we have a branch-2, committers will forget,
>>>> > not know to, or choose not to (for whatever reason) commit valid bug
>>>> > fixes back all the way to branch-2.10. If we don't have a branch-2,
>>>> > committers who want their borderline risky feature in the 2.x line
>>>> > will err on the side of putting it into branch-2.10 instead of
>>>> > proposing the creation of a branch-2. Clearly I have made quite a few
>>>> > assumptions here based on my own experiences, so I would like to hear
>>>> > if others have similar or opposing views.
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> > killing branch-2 is due to an effort to push the community towards
>>>> > 3.x. This is why I have added movement to 3.x as both a pro and a con.
>>>> > As a community trying to move forward, keeping as many companies on
>>>> > similar branches as possible is a good way to make sure the code is
>>>> > well-tested. However, from a stability point of view, moving to 3.x is
>>>> > still scary and being able to stay on 2.x until you are comfortable to
>>>> > move is very nice. The 2.10.0 bridge release effort has been very good
>>>> > at making it possible for people to move from 2.x to 3.x, but the diff
>>>> > between 2.x and 3.x is so large that it is reasonable for companies to
>>>> > want to be extra cautious with 3.x due to potential performance
>>>> > degradation at large scale.
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> > support to 2.x, surely not everyone is going to want that (at least
>>>> > not immediately). The 2.10 documentation states, "The JVM requirements
>>>> > will not change across point releases within the same minor release
>>>> > except if the JVM version under question becomes unsupported" [1], so
>>>> > this would warrant a 2.11 release until Java 8 becomes unsupported
>>>> > (though one could argue that it is already unsupported since Oracle is
>>>> > no longer giving public Java 8 updates). If we don't keep branch-2
>>>> > around now, would a Java 11 backport be the catalyst for a branch-2
>>>> > revival?
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>> > or not we should keep branch-2 alive, but these are the things that I
>>>> > am weighing in my mind. For me, the bigger problem beyond having
>>>> > branch-2 or not is committers not being on the same page with where
>>>> > they should commit their patches.
>>>> >
>>>> > Eric
>>>> >
>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> > the stability of 2.10, since we will be on it for a couple of years at
>>>> > least. I worry that some committers may want to put new features into
>>>> > a branch 2 release, and without a branch-2, they will go directly into
>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>> > for some time (usually not until the release is deployed to a busy,
>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>> > changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>> > but I do have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10
>>>> > release the last of branch-2 releases. We intended 2.10 as a bridge
>>>> > release between Hadoop 2 and 3. We may have bug-fix releases of 2.10,
>>>> > but 2.11 is not in the picture right now, and many people may object
>>>> > to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the
>>>> > pros outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>> > are much higher than a new 2.11 minor release. (There didn't seem to
>>>> > be many people outside of our company who expressed interest in
>>>> > getting new features to branch-2 prior to the 2.10.0 release.) Even
>>>> > now, a few weeks after the 2.10.0 release, there are 29 patches that
>>>> > have gone into branch-2 and 9 in branch-2.10, so it's already diverged
>>>> > quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to,
>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>> > confusion IMO.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released,
>>>> > and moving to 2.10 will take some time for the community. It seems
>>>> > premature to make a decision at this point that there will never be a
>>>> > need for a 2.11 release.
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> > Hi folks,
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>> > last minor release line in branch-2. Currently, the main issue is that
>>>> > there are many fixes going into branch-2 (the theoretical 2.11.0) that
>>>> > are not going into branch-2.10 (which will become 2.10.1), so the
>>>> > fixes in branch-2 will likely never see the light of day unless they
>>>> > are backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> > Thoughts?
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes are being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanx for catching, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>> > But the INFRA people closed as wont do and yes, the branch is
>>>> protected,
>>>> > we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> remove
>>>> > it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>>> ->
>>>> >
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> please
>>>> >
>>>> > don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >
>>>> >   - Delete old branch-2.10
>>>> >
>>>> >   - Rename branch-2 to (new) branch-2.10
>>>> >
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>> >
>>>> > shv.hadoop@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> >
>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>> from
>>>> >
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> >
>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>> minor
>>>> >
>>>> > 2.x release" confused people.
>>>> >
>>>> > It is in fact a wider matter that can be discussed when somebody
>>>> >
>>>> > actually
>>>> >
>>>> > proposes to release 2.11, which I understand nobody does at the
>>>> moment.
>>>> >
>>>> >
>>>> > So if anybody objects removing branch-2.10 please make an argument.
>>>> >
>>>> > Otherwise we should go ahead and just do it next week.
>>>> >
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in
>>>> sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>> patch
>>>> >
>>>> > releases. As for putting features into minor/patch releases, if we
>>>> >
>>>> > keep the
>>>> >
>>>> > convention of putting new features only into minor releases, my
>>>> >
>>>> > assumption
>>>> >
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> >
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>>> >
>>>> > haven't
>>>> >
>>>> > even really removed support for java 7 in branch-2 (much less java 8),
>>>> >
>>>> > so I
>>>> >
>>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>>> >
>>>> > you
>>>> >
>>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>>> >
>>>> > always
>>>> >
>>>> > revive branch-2. But for now I think the convenience of not needing to
>>>> >
>>>> > port
>>>> >
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> >
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software would become more compatible when more companies stress test
>>>> >
>>>> > the same software and making improvements in trunk.  Some may be extra
>>>> >
>>>> > caution on moving up the version because obligation internally to keep
>>>> >
>>>> > things running.  Company obligation should not be the driving force to
>>>> >
>>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>>> >
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> >
>>>> > version.  I think it would be more healthy for the community to
>>>> >
>>>> > reduce the
>>>> >
>>>> > branch forking and spend energy on trunk to harden the software.
>>>> >
>>>> > This will
>>>> >
>>>> > give more confidence to move up the version than trying to fix n
>>>> >
>>>> > permutations breakage like Flash fixing the timeline.
>>>> >
>>>> >
>>>> > Apache license stated, there is no warranty of any kind for code
>>>> >
>>>> > contributions.  Fewer community release process should improve
>>>> >
>>>> > software
>>>> >
>>>> > quality when eyes are on trunk, and help steering toward the same end
>>>> >
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> >
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release
>>>> >
>>>> > and a
>>>> >
>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> >
>>>> > have
>>>> >
>>>> > looked around and I can't find anything other than some compatibility
>>>> >
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>> >
>>>> > think
>>>> >
>>>> > this would help shape my opinion on whether or not to keep branch-2
>>>> >
>>>> > alive.
>>>> >
>>>> > My current understanding is that we can't really break compatibility
>>>> >
>>>> > in
>>>> >
>>>> > either a minor or point release. But the only mention of the
>>>> >
>>>> > difference
>>>> >
>>>> > between minor and point releases is how to deal with Stable,
>>>> >
>>>> > Evolving,
>>>> >
>>>> > and
>>>> >
>>>> > Unstable tags, and how to deal with changing default configuration
>>>> >
>>>> > values.
>>>> >
>>>> > So it seems like there really isn't a big official difference between
>>>> >
>>>> > the
>>>> >
>>>> > two. In my mind, the functional difference between the two is that
>>>> >
>>>> > the
>>>> >
>>>> > minor releases may have added features and rewrites, while the point
>>>> >
>>>> > releases only have bug fixes. This might be an incorrect
>>>> >
>>>> > understanding, but
>>>> >
>>>> > that's what I have gathered from watching the releases over the last
>>>> >
>>>> > few
>>>> >
>>>> > years. Whether or not this is a correct understanding, I think that
>>>> >
>>>> > this
>>>> >
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are
>>>> >
>>>> > the
>>>> >
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> >
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> >
>>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>>> >
>>>> > into
>>>> >
>>>> > 2.10.x
>>>> >
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > Cons:
>>>> >
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> >
>>>> > - An extra branch to maintain
>>>> >
>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >
>>>> > patches to if they should go all the way back to 2.10.x
>>>> >
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> >
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>> >
>>>> > being
>>>> >
>>>> > committed. In a perfect world, we wouldn't have to make this
>>>> >
>>>> > tradeoff.
>>>> >
>>>> > But
>>>> >
>>>> > we don't live in a perfect world and committers will make mistakes
>>>> >
>>>> > either
>>>> >
>>>> > because of lack of knowledge or simply because they made a mistake.
>>>> >
>>>> > If
>>>> >
>>>> > we
>>>> >
>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>> >
>>>> > to
>>>> >
>>>> > (for
>>>> >
>>>> > whatever reason) commit valid bug fixes back all the way to
>>>> >
>>>> > branch-2.10. If
>>>> >
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> >
>>>> > feature in the 2.x line will err on the side of putting it into
>>>> >
>>>> > branch-2.10
>>>> >
>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>> >
>>>> > quite
>>>> >
>>>> > a few assumptions here based on my own experiences, so I would like
>>>> >
>>>> > to
>>>> >
>>>> > hear
>>>> >
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> >
>>>> > killing
>>>> >
>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>> >
>>>> > is why
>>>> >
>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>> >
>>>> > trying
>>>> >
>>>> > to move forward, keeping as many companies on similar branches as
>>>> >
>>>> > possible
>>>> >
>>>> > is a good way to make sure the code is well-tested. However, from a
>>>> >
>>>> > stability point of view, moving to 3.x is still scary and being able
>>>> >
>>>> > to
>>>> >
>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>> >
>>>> > 2.10.0
>>>> >
>>>> > bridge release effort has been very good at making it possible for
>>>> >
>>>> > people
>>>> >
>>>> > to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>> >
>>>> > that
>>>> >
>>>> > it is reasonable for companies to want to be extra cautious with 3.x
>>>> >
>>>> > due to
>>>> >
>>>> > potential performance degradation at large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> >
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> >
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> >
>>>> > support to
>>>> >
>>>> > 2.x, surely not everyone is going to want that (at least not
>>>> >
>>>> > immediately).
>>>> >
>>>> > The 2.10 documentation states, "The JVM requirements will not change
>>>> >
>>>> > across
>>>> >
>>>> > point releases within the same minor release except if the JVM
>>>> >
>>>> > version
>>>> >
>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>> >
>>>> > release until Java 8 becomes unsupported (though one could argue that
>>>> >
>>>> > it is
>>>> >
>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>> >
>>>> > update).
>>>> >
>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> >
>>>> > catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on
>>>> >
>>>> > whether
>>>> >
>>>> > or
>>>> >
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> >
>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>> >
>>>> > branch-2
>>>> >
>>>> > or
>>>> >
>>>> > not is committers not being on the same page with where they should
>>>> >
>>>> > commit
>>>> >
>>>> > their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > [2]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>>> >
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> >
>>>> > the
>>>> >
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> >
>>>> > least.
>>>> >
>>>> > I worry
>>>> >
>>>> > that some committers may want to put new features into a branch 2
>>>> >
>>>> > release,
>>>> >
>>>> > and without a branch-2, they will go directly into 2.10. Since we
>>>> >
>>>> > don't
>>>> >
>>>> > always
>>>> >
>>>> > catch corner cases or performance problems for some time (usually
>>>> >
>>>> > not
>>>> >
>>>> > until
>>>> >
>>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>>> >
>>>> > may
>>>> >
>>>> > be
>>>> >
>>>> > very
>>>> >
>>>> > difficult to back out those changes.
>>>> >
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>>> >
>>>> > idea,
>>>> >
>>>> > but I
>>>> >
>>>> > do
>>>> >
>>>> > have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> >
>>>> > <
>>>> >
>>>> > shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
>>>> > not in the picture right now, and many people may object to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>> > outside of our company who expressed interest in getting new features to
>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>>> > branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to, by
>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>> > IMO.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org> wrote:
>>>> >
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>> > to make a decision at this point that there will never be a need for a
>>>> > 2.11 release.
>>>> >
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hi folks,
>>>> >
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>> > branch-2 will likely never see the light of day unless they are
>>>> > backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> >
>>>> > Thoughts?
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> >
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> >
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.
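
For anyone who needs to repeat this kind of check, here is a minimal sketch
of one way to list commits on one branch that have no equivalent patch on
another. It is not necessarily how the cherry-picks above were identified.
It assumes a local clone containing both branches and relies on
`git cherry`, which compares patches rather than commit ids; the helper
names `parse_git_cherry` and `commits_missing_from` are hypothetical.

```python
from __future__ import annotations

import subprocess


def parse_git_cherry(output: str) -> list[str]:
    """Return the commit ids that `git cherry` marks with '+'.

    `git cherry <upstream> <head>` prints one line per commit on <head>:
    '+ <sha>' when no equivalent patch exists on <upstream>,
    '- <sha>' when an equivalent patch is already there.
    """
    missing = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "+":
            missing.append(parts[1])
    return missing


def commits_missing_from(upstream: str, head: str, repo: str = ".") -> list[str]:
    """List commits on `head` that have no equivalent patch on `upstream`."""
    result = subprocess.run(
        ["git", "cherry", upstream, head],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_git_cherry(result.stdout)
```

Something like `commits_missing_from("branch-2.10", "branch-2")` would then
give the candidates to cherry-pick. Commits whose backport needed conflict
resolution can still show up as missing, since their patches differ, so the
list is a starting point for review rather than a definitive answer.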

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > It seems there was an INFRA ticket for that, INFRA-19581,
>>>> > but the INFRA people closed it as "won't do". And yes, the branch is
>>>> > protected, so we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> > remove it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> > please don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>> > convention of putting new features only into minor releases, my assumption
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software becomes more compatible when more companies stress test the
>>>> > same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running.  Company obligations should not be the driving force
>>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>>> > community when every name-brand company maintains its own Hadoop 2.x
>>>> > version.  I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software.  This
>>>> > will give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage, like Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions.  Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release and
>>>> > a point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> > have looked around and I can't find anything other than some compatibility
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>> > My current understanding is that we can't really break compatibility in
>>>> > either a minor or point release. But the only mention of the difference
>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>> > So it seems like there really isn't a big official difference between the
>>>> > two. In my mind, the functional difference between the two is that the
>>>> > minor releases may have added features and rewrites, while the point
>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>> > that's what I have gathered from watching the releases over the last few
>>>> > years. Whether or not this is a correct understanding, I think that this
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >     patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>> > we don't live in a perfect world and committers will make mistakes either
>>>> > because of lack of knowledge or simply because they made a mistake. If we
>>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>>> > a few assumptions here based on my own experiences, so I would like to hear
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>>> > to move forward, keeping as many companies on similar branches as possible
>>>> > is a good way to make sure the code is well-tested. However, from a
>>>> > stability point of view, moving to 3.x is still scary and being able to
>>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>> > bridge release effort has been very good at making it possible for people
>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>> > potential performance degradation at large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>>> > point releases within the same minor release except if the JVM version
>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>> > updates). If we don't keep branch-2 around now, would a Java 11 backport be
>>>> > the catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on whether or
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>> > not is committers not being on the same page with where they should
>>>> > commit their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org> wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>> > I worry that some committers may want to put new features into a branch 2
>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>> > don't always catch corner cases or performance problems for some time
>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>> > cluster), it may be very difficult to back out those changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>> > do have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
>>>> > not in the picture right now, and many people may object to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>> > outside of our company who expressed interest in getting new features to
>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>>> > branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to, by
>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>> > IMO.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org> wrote:
>>>> >
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>> > to make a decision at this point that there will never be a need for a
>>>> > 2.11 release.
>>>> >
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hi folks,
>>>> >
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>> > branch-2 will likely never see the light of day unless they are
>>>> > backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> >
>>>> > Thoughts?
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> >
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> >
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > It seems there was an INFRA ticket for that, INFRA-19581,
>>>> > but the INFRA people closed it as "won't do". And yes, the branch is
>>>> > protected, so we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> > remove it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> > please don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>> > convention of putting new features only into minor releases, my assumption
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software becomes more compatible when more companies stress test the
>>>> > same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running.  Company obligations should not be the driving force
>>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>>> > community when every name-brand company maintains its own Hadoop 2.x
>>>> > version.  I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software.  This
>>>> > will give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage, like Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions.  Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release and
>>>> > a point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> > have looked around and I can't find anything other than some compatibility
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>> > My current understanding is that we can't really break compatibility in
>>>> > either a minor or point release. But the only mention of the difference
>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>> > So it seems like there really isn't a big official difference between the
>>>> > two. In my mind, the functional difference between the two is that the
>>>> > minor releases may have added features and rewrites, while the point
>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>> > that's what I have gathered from watching the releases over the last few
>>>> > years. Whether or not this is a correct understanding, I think that this
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >     patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> >
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>> >
>>>> > being
>>>> >
>>>> > committed. In a perfect world, we wouldn't have to make this
>>>> >
>>>> > tradeoff.
>>>> >
>>>> > But
>>>> >
>>>> > we don't live in a perfect world and committers will make mistakes
>>>> >
>>>> > either
>>>> >
>>>> > because of lack of knowledge or simply because they made a mistake.
>>>> >
>>>> > If
>>>> >
>>>> > we
>>>> >
>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>> >
>>>> > to
>>>> >
>>>> > (for
>>>> >
>>>> > whatever reason) commit valid bug fixes back all the way to
>>>> >
>>>> > branch-2.10. If
>>>> >
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> >
>>>> > feature in the 2.x line will err on the side of putting it into
>>>> >
>>>> > branch-2.10
>>>> >
>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>> >
>>>> > quite
>>>> >
>>>> > a few assumptions here based on my own experiences, so I would like
>>>> >
>>>> > to
>>>> >
>>>> > hear
>>>> >
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> >
>>>> > killing
>>>> >
>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>> >
>>>> > is why
>>>> >
>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>> >
>>>> > trying
>>>> >
>>>> > to move forward, keeping as many companies on similar branches as
>>>> >
>>>> > possible
>>>> >
>>>> > is a good way to make sure the code is well-tested. However, from a
>>>> >
>>>> > stability point of view, moving to 3.x is still scary and being able
>>>> >
>>>> > to
>>>> >
>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>> >
>>>> > 2.10.0
>>>> >
>>>> > bridge release effort has been very good at making it possible for
>>>> >
>>>> > people
>>>> >
>>>> > to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>> >
>>>> > that
>>>> >
>>>> > it is reasonable for companies to want to be extra cautious with 3.x
>>>> >
>>>> > due to
>>>> >
>>>> > potential performance degradation at large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> >
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> >
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> >
>>>> > support to
>>>> >
>>>> > 2.x, surely not everyone is going to want that (at least not
>>>> >
>>>> > immediately).
>>>> >
>>>> > The 2.10 documentation states, "The JVM requirements will not change
>>>> >
>>>> > across
>>>> >
>>>> > point releases within the same minor release except if the JVM
>>>> >
>>>> > version
>>>> >
>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>> >
>>>> > release until Java 8 becomes unsupported (though one could argue that
>>>> >
>>>> > it is
>>>> >
>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>> >
>>>> > update).
>>>> >
>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> >
>>>> > catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on
>>>> >
>>>> > whether
>>>> >
>>>> > or
>>>> >
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> >
>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>> >
>>>> > branch-2
>>>> >
>>>> > or
>>>> >
>>>> > not is committers not being on the same page with where they should
>>>> >
>>>> > commit
>>>> >
>>>> > their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > [2]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org
>>>> > <epayne@apache.org> wrote:
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> > the stability of 2.10, since we will be on it for a couple of years at
>>>> > least. I worry that some committers may want to put new features into
>>>> > a branch 2 release, and without a branch-2, they will go directly into
>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>> > for some time (usually not until the release is deployed to a busy,
>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>> > changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>> > but I do have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10
>>>> > release the last of branch-2 releases. We intended 2.10 as a bridge
>>>> > release between Hadoop 2 and 3. We may have bug-fix releases of 2.10,
>>>> > but 2.11 is not in the picture right now, and many people may object
>>>> > to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the
>>>> > pros outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>> > are much higher than a new 2.11 minor release. (There didn't seem to
>>>> > be many people outside of our company who expressed interest in
>>>> > getting new features to branch-2 prior to the 2.10.0 release.) Even
>>>> > now, a few weeks after the 2.10.0 release, there are 29 patches that
>>>> > have gone into branch-2 and 9 in branch-2.10, so the two have already
>>>> > diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to,
>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>> > confusion IMO.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org
>>>> > <epayne@apache.org> wrote:
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released,
>>>> > and moving to 2.10 will take some time for the community. It seems
>>>> > premature to make a decision at this point that there will never be a
>>>> > need for a 2.11 release.
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> > Hi folks,
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>> > last minor release line in branch-2. Currently, the main issue is that
>>>> > there are many fixes going into branch-2 (the theoretical 2.11.0) that
>>>> > are not going into branch-2.10 (which will become 2.10.1), so the
>>>> > fixes in branch-2 will likely never see the light of day unless they
>>>> > are backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> > Thoughts?
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > [1]
>>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira
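
One way to catch such accidental pushes on the committer's side is a Git pre-push hook. What follows is only a minimal sketch and is not the change tracked in HADOOP-16988: the names `blocked_refs` and `RETIRED_REFS` are hypothetical, and it assumes branch-2 is the only retired branch. It relies on Git passing the hook one line per pushed ref on standard input.

```python
#!/usr/bin/env python3
"""Client-side pre-push hook sketch: refuse pushes to a retired branch."""
import sys

# Assumption: branch-2 is the only retired branch we want to guard.
RETIRED_REFS = {"refs/heads/branch-2"}


def blocked_refs(stdin_lines, retired=RETIRED_REFS):
    """Return the remote refs in this push that point at a retired branch.

    Git feeds a pre-push hook one line per ref being pushed:
    "<local ref> <local sha> <remote ref> <remote sha>".
    """
    blocked = []
    for line in stdin_lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # ignore blank or malformed lines
        remote_ref = fields[2]
        if remote_ref in retired:
            blocked.append(remote_ref)
    return blocked


def main():
    """Print a message per blocked ref; a non-zero result aborts the push."""
    blocked = blocked_refs(sys.stdin)
    for ref in blocked:
        print(f"refusing to push to retired {ref}; use branch-2.10",
              file=sys.stderr)
    return 1 if blocked else 0
```

Saved as an executable `.git/hooks/pre-push` with a final `sys.exit(main())` line, this would make a push to branch-2 fail locally while pushes to branch-2.10 go through. Being client-side, it only helps committers who install it.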

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching that. I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > Seems there was an INFRA ticket for that: INFRA-19581.
>>> > But the INFRA people closed it as Won't Do, and yes, the branch is
>>> > protected, so we can't delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> > remove it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>> > -> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> > please don't try to commit to it)
>>> >
>>> > Completed procedure:
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >   - Deleted old branch-2.10
>>> >   - Renamed branch-2 to (new) branch-2.10
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hey guys,
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> > which is removing branch-2.10, and changing the version of branch-2
>>> > from 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>> > minor 2.x release" confused people.
>>> > It is in fact a wider matter that can be discussed when somebody
>>> > actually proposes to release 2.11, which I understand nobody does at
>>> > the moment.
>>> >
>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep
>>> > the convention of putting new features only into minor releases, my
>>> > assumption is still that it's unlikely people will want to get them
>>> > into branch-2 (based on the 2.10.0 release process). For the java 11
>>> > issue, we haven't even really removed support for java 7 in branch-2
>>> > (much less java 8), so I feel moving to java 11 would go along with a
>>> > move to branch 3. And as you mentioned, if people really want to use
>>> > java 11 on branch-2, we can always revive branch-2. But for now I think
>>> > the convenience of not needing to port to both branch-2 and branch-2.10
>>> > (and below) outweighs the cost of potentially needing to revive
>>> > branch-2.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> > +1 for 2.10.x as the last release for the 2.x version.
>>> >
>>> > Software becomes more compatible when more companies stress test the
>>> > same software and make improvements in trunk. Some may be extra
>>> > cautious about moving up the version because of internal obligations
>>> > to keep things running. Company obligations should not be the driving
>>> > force to maintain Hadoop branches. There is no proper collaboration in
>>> > the community when every name-brand company maintains its own Hadoop
>>> > 2.x version. I think it would be healthier for the community to reduce
>>> > the branch forking and spend energy on trunk to harden the software.
>>> > This will give more confidence to move up the version than trying to
>>> > fix n permutations of breakage, like the Flash fixing the timeline.
>>> >
>>> > As the Apache license states, there is no warranty of any kind for
>>> > code contributions. Fewer community release processes should improve
>>> > software quality when eyes are on trunk, and help steer toward the
>>> > same end goals.
>>> >
>>> > regards,
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> > Hello all,
>>> >
>>> > Is it written anywhere what the difference is between a minor release
>>> > and a point/dot/maintenance (I'll use "point" from here on out)
>>> > release? I have looked around and I can't find anything other than
>>> > some compatibility documentation in 2.x that has since been removed in
>>> > 3.x [1] [2]. I think this would help shape my opinion on whether or
>>> > not to keep branch-2 alive. My current understanding is that we can't
>>> > really break compatibility in either a minor or point release. But the
>>> > only mention of the difference between minor and point releases is how
>>> > to deal with Stable, Evolving, and Unstable tags, and how to deal with
>>> > changing default configuration values. So it seems like there really
>>> > isn't a big official difference between the two. In my mind, the
>>> > functional difference between the two is that the minor releases may
>>> > have added features and rewrites, while the point releases only have
>>> > bug fixes. This might be an incorrect understanding, but that's what I
>>> > have gathered from watching the releases over the last few years.
>>> > Whether or not this is a correct understanding, I think that this
>>> > needs to be documented somewhere, even if it is just a convention.
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are
>>> > the pros/cons that I can think of for having a branch-2. Please add on
>>> > or correct me for anything you feel is missing or inadequate.
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>> > being committed. In a perfect world, we wouldn't have to make this
>>> > tradeoff. But we don't live in a perfect world and committers will
>>> > make mistakes either because of lack of knowledge or simply because
>>> > they made a mistake. If we have a branch-2, committers will forget,
>>> > not know to, or choose not to (for whatever reason) commit valid bug
>>> > fixes back all the way to branch-2.10. If we don't have a branch-2,
>>> > committers who want their borderline risky feature in the 2.x line
>>> > will err on the side of putting it into branch-2.10 instead of
>>> > proposing the creation of a branch-2. Clearly I have made quite a few
>>> > assumptions here based on my own experiences, so I would like to hear
>>> > if others have similar or opposing views.
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>> > killing branch-2 is due to an effort to push the community towards
>>> > 3.x. This is why I have added movement to 3.x as both a pro and a con.
>>> > As a community trying to move forward, keeping as many companies on
>>> > similar branches as possible is a good way to make sure the code is
>>> > well-tested. However, from a stability point of view, moving to 3.x is
>>> > still scary and being able to stay on 2.x until you are comfortable to
>>> > move is very nice. The 2.10.0 bridge release effort has been very good
>>> > at making it possible for people to move from 2.x to 3.x, but the diff
>>> > between 2.x and 3.x is so large that it is reasonable for companies to
>>> > want to be extra cautious with 3.x due to potential performance
>>> > degradation at large scale.
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>> > support to 2.x, surely not everyone is going to want that (at least
>>> > not immediately). The 2.10 documentation states, "The JVM requirements
>>> > will not change across point releases within the same minor release
>>> > except if the JVM version under question becomes unsupported" [1], so
>>> > this would warrant a 2.11 release until Java 8 becomes unsupported
>>> > (though one could argue that it is already unsupported since Oracle is
>>> > no longer giving public Java 8 updates). If we don't keep branch-2
>>> > around now, would a Java 11 backport be the catalyst for a branch-2
>>> > revival?
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether
>>> > or not we should keep branch-2 alive, but these are the things that I
>>> > am weighing in my mind. For me, the bigger problem beyond having
>>> > branch-2 or not is committers not being on the same page with where
>>> > they should commit their patches.
>>> >
>>> > Eric
>>> >
>>> > [1]
>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2]
>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org
>>> > <epayne@apache.org> wrote:
>>> >
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about
>>> > the stability of 2.10, since we will be on it for a couple of years at
>>> > least. I worry that some committers may want to put new features into
>>> > a branch 2 release, and without a branch-2, they will go directly into
>>> > 2.10. Since we don't always catch corner cases or performance problems
>>> > for some time (usually not until the release is deployed to a busy,
>>> > 4-thousand node cluster), it may be very difficult to back out those
>>> > changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>> > but I do have these reservations.
>>> >
>>> > Thanks,
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10
>>> > release the last of branch-2 releases. We intended 2.10 as a bridge
>>> > release between Hadoop 2 and 3. We may have bug-fix releases of 2.10,
>>> > but 2.11 is not in the picture right now, and many people may object
>>> > to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the
>>> > pros outweigh the cons. To me, the chances of patch releases on 2.10.x
>>> > are much higher than a new 2.11 minor release. (There didn't seem to
>>> > be many people outside of our company who expressed interest in
>>> > getting new features to branch-2 prior to the 2.10.0 release.) Even
>>> > now, a few weeks after the 2.10.0 release, there are 29 patches that
>>> > have gone into branch-2 and 9 in branch-2.10, so the two have already
>>> > diverged quite a bit.
>>> >
>>> > In any case, we can always reverse this decision if we really need to,
>>> > by recreating branch-2. But this proposal would reduce a lot of
>>> > confusion IMO.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org
>>> > <epayne@apache.org> wrote:
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released,
>>> > and moving to 2.10 will take some time for the community. It seems
>>> > premature to make a decision at this point that there will never be a
>>> > need for a 2.11 release.
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>> > last minor release line in branch-2. Currently, the main issue is that
>>> > there are many fixes going into branch-2 (the theoretical 2.11.0) that
>>> > are not going into branch-2.10 (which will become 2.10.1), so the
>>> > fixes in branch-2 will likely never see the light of day unless they
>>> > are backported to branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> > Thoughts?
>>> >
>>> > Jonathan Hung
>>> >
>>> > [1]
>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>
>>
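
The commit chain and version change discussed in the quoted thread can be sketched in a few lines. This is an illustration only: the helper names `backport_targets` and `next_patch_snapshot` are hypothetical, and it assumes that a fix meant for an older line is committed to every branch in the chain from trunk down to that line.

```python
# Commit chain quoted in the thread, newest line first.
COMMIT_CHAIN = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2.10", "branch-2.9", "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Branches a fix lands on, in order, down to oldest_branch inclusive."""
    if oldest_branch not in chain:
        raise ValueError(f"unknown or retired branch: {oldest_branch}")
    return chain[: chain.index(oldest_branch) + 1]


def next_patch_snapshot(released):
    """Development version after a release, e.g. 2.10.0 -> 2.10.1-SNAPSHOT."""
    major, minor, patch = (int(part) for part in released.split("."))
    return f"{major}.{minor}.{patch + 1}-SNAPSHOT"
```

With branch-2 gone from the chain, `backport_targets("branch-2.10")` lists four branches, and asking for `"branch-2"` raises an error instead of silently accepting a retired target.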

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes are being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanx for catching, I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>> > But the INFRA people closed as wont do and yes, the branch is
>>> protected,
>>> > we can’t delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> remove
>>> > it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>> ->
>>> >
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> please
>>> >
>>> > don't try to commit to it)
>>> >
>>> >
>>> > Completed procedure:
>>> >
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >
>>> >   - Delete old branch-2.10
>>> >
>>> >   - Rename branch-2 to (new) branch-2.10
>>> >
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>> >
>>> > shv.hadoop@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Hey guys,
>>> >
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion, which
>>> > is removing branch-2.10, and changing the version of branch-2 from
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>> > 2.x release" confused people.
>>> > It is in fact a wider matter that can be discussed when somebody actually
>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>> >
>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>> >
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep the
>>> > convention of putting new features only into minor releases, my assumption
>>> > is still that it's unlikely people will want to get them into branch-2
>>> > (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>>> > even really removed support for Java 7 in branch-2 (much less Java 8), so I
>>> > feel moving to Java 11 would go along with a move to branch 3. And as you
>>> > mentioned, if people really want to use Java 11 on branch-2, we can always
>>> > revive branch-2. But for now I think the convenience of not needing to port
>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> > potentially needing to revive branch-2.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> >
>>> > +1 for 2.10.x as last release for 2.x version.
>>> >
>>> >
>>> > Software would become more compatible when more companies stress test
>>> > the same software and make improvements in trunk. Some may be extra
>>> > cautious about moving up the version because of internal obligations to
>>> > keep things running. Company obligation should not be the driving force
>>> > to maintain Hadoop branches. There is no proper collaboration in the
>>> > community when every name-brand company maintains its own Hadoop 2.x
>>> > version. I think it would be healthier for the community to reduce the
>>> > branch forking and spend energy on trunk to harden the software. This
>>> > will give more confidence to move up the version than trying to fix n
>>> > permutations of breakage, like the Flash fixing the timeline.
>>> >
>>> > The Apache License states that there is no warranty of any kind for code
>>> > contributions. Fewer community release processes should improve software
>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>> >
>>> >
>>> > regards,
>>> >
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> >
>>> > Hello all,
>>> >
>>> >
>>> > Is it written anywhere what the difference is between a minor release and a
>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> > looked around and I can't find anything other than some compatibility
>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>> > My current understanding is that we can't really break compatibility in
>>> > either a minor or point release. But the only mention of the difference
>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>> > Unstable tags, and how to deal with changing default configuration values.
>>> > So it seems like there really isn't a big official difference between the
>>> > two. In my mind, the functional difference between the two is that the
>>> > minor releases may have added features and rewrites, while the point
>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>> > that's what I have gathered from watching the releases over the last few
>>> > years. Whether or not this is a correct understanding, I think that this
>>> > needs to be documented somewhere, even if it is just a convention.
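
To make the convention described above concrete, here is a small sketch that names which component changes between two Hadoop-style version strings. It assumes versions look like X.Y.Z with an optional -SNAPSHOT suffix; the helper names are made up for illustration, and the minor/point meaning is the informal convention from this thread, not a documented rule.

```python
def parse_version(version):
    """Split 'X.Y.Z' or 'X.Y.Z-SNAPSHOT' into a (major, minor, point) tuple."""
    core = version.split("-", 1)[0]  # drop a suffix such as -SNAPSHOT
    major, minor, point = (int(part) for part in core.split("."))
    return major, minor, point


def release_kind(old, new):
    """Name the first version component that differs between old and new."""
    names = ("major", "minor", "point")
    for name, a, b in zip(names, parse_version(old), parse_version(new)):
        if a != b:
            return name
    return "same"


print(release_kind("2.10.0", "2.10.1-SNAPSHOT"))  # point
print(release_kind("2.10.0", "2.11.0-SNAPSHOT"))  # minor
print(release_kind("2.10.0", "3.0.0"))            # major
```

Under the convention, only the "minor" and "major" cases would be expected to carry new features or rewrites.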
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are the
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
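
The "5 vs 4 total branches" count in the cons can be reproduced with a short sketch. It assumes branch-2 sat between branch-3.1 and branch-2.10 in the old commit chain, which matches the chains quoted elsewhere in this thread; the function name is hypothetical.

```python
# Commit chains as described in this thread, newest line first.
CHAIN_WITH_BRANCH_2 = [
    "trunk", "branch-3.2", "branch-3.1", "branch-2",
    "branch-2.10", "branch-2.9", "branch-2.8",
]
CHAIN_WITHOUT_BRANCH_2 = [b for b in CHAIN_WITH_BRANCH_2 if b != "branch-2"]


def branches_to_commit(chain, oldest_target):
    """Return every branch from the head of the chain down to oldest_target."""
    return chain[: chain.index(oldest_target) + 1]


print(len(branches_to_commit(CHAIN_WITH_BRANCH_2, "branch-2.10")))     # 5
print(len(branches_to_commit(CHAIN_WITHOUT_BRANCH_2, "branch-2.10")))  # 4
```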
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>> > we don't live in a perfect world and committers will make mistakes either
>>> > because of lack of knowledge or simply because they made a mistake. If we
>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>> > we don't have a branch-2, committers who want their borderline risky
>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>> > a few assumptions here based on my own experiences, so I would like to hear
>>> > if others have similar or opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>> > to move forward, keeping as many companies on similar branches as possible
>>> > is a good way to make sure the code is well-tested. However, from a
>>> > stability point of view, moving to 3.x is still scary and being able to
>>> > stay on 2.x until you are comfortable moving is very nice. The 2.10.0
>>> > bridge release effort has been very good at making it possible for people
>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>> > potential performance degradation at large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>> > point releases within the same minor release except if the JVM version
>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>> > already unsupported since Oracle is no longer giving public Java 8
>>> > updates).
>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>> > catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether or
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> > not is committers not being on the same page with where they should commit
>>> > their patches.
>>> >
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1]
>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2]
>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> >
>>> > Hi Konstantin,
>>> >
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>> > I worry that some committers may want to put new features into a branch 2
>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>> > don't always catch corner cases or performance problems for some time
>>> > (usually not until the release is deployed to a busy, 4-thousand-node
>>> > cluster), it may be very difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>> > do have these reservations.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> >
>>> > We had a long discussion on this list regarding making the 2.10 release the
>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>> > the picture right now, and many people may object to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>> > wrote:
>>> >
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>> > outside of our company who expressed interest in getting new features to
>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>> > branch-2.10, so it's already diverged quite a bit.
>>> >
>>> > In any case, we can always reverse this decision if we really need to, by
>>> > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>> > moving to 2.10 will take some time for the community. It seems premature to
>>> > make a decision at this point that there will never be a need for a 2.11
>>> > release.
>>> >
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> >
>>> > Hi folks,
>>> >
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>> > release line in branch-2. Currently, the main issue is that there are many
>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>> > likely never see the light of day unless they are backported to
>>> > branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
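
The three proposed steps map roughly onto the commands below. This is a sketch only: it builds the command lists without running them, and it assumes a remote named origin, no local branch-2.10 left over, and the Versions Maven Plugin for the version bump. The function name is hypothetical. In this thread the actual rename went through INFRA-19521, so a direct push like this may not be permitted on the real repository.

```python
def rename_plan(old_name, new_name, new_version, remote="origin"):
    """Build, without running, commands that replace new_name with old_name."""
    return [
        # Delete the existing remote branch that currently holds the target name.
        ["git", "push", remote, "--delete", new_name],
        # Rename the surviving branch locally, then publish it under the new name.
        ["git", "branch", "-m", old_name, new_name],
        ["git", "push", remote, new_name],
        # Drop the old name from the remote.
        ["git", "push", remote, "--delete", old_name],
        # Update the Maven project version on the renamed branch.
        ["mvn", "versions:set", "-DnewVersion=" + new_version],
    ]


for command in rename_plan("branch-2", "branch-2.10", "2.10.1-SNAPSHOT"):
    print(" ".join(command))
```

Keeping the plan as data makes it easy to review the exact commands before anyone runs them by hand.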
>>> >
>>> >
>>> > Thoughts?
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> > [1]
>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> >
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> >
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching that; I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > Seems there was an INFRA ticket for that: INFRA-19581.
>>> > But the INFRA people closed it as "Won't Do", and yes, the branch is
>>> > protected; we can’t delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> remove
>>> > it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>> ->
>>> >
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> please
>>> >
>>> > don't try to commit to it)
>>> >
>>> >
>>> > Completed procedure:
>>> >
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >
>>> >   - Delete old branch-2.10
>>> >
>>> >   - Rename branch-2 to (new) branch-2.10
>>> >
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>> >
>>> > shv.hadoop@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Hey guys,
>>> >
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> >
>>> > which is removing branch-2.10, and changing the version of branch-2
>>> from
>>> >
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> >
>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>> minor
>>> >
>>> > 2.x release" confused people.
>>> >
>>> > It is in fact a wider matter that can be discussed when somebody
>>> >
>>> > actually
>>> >
>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>> >
>>> >
>>> > So if anybody objects removing branch-2.10 please make an argument.
>>> >
>>> > Otherwise we should go ahead and just do it next week.
>>> >
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> >
>>> > releases. As for putting features into minor/patch releases, if we
>>> >
>>> > keep the
>>> >
>>> > convention of putting new features only into minor releases, my
>>> >
>>> > assumption
>>> >
>>> > is still that it's unlikely people will want to get them into branch-2
>>> >
>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>> >
>>> > haven't
>>> >
>>> > even really removed support for java 7 in branch-2 (much less java 8),
>>> >
>>> > so I
>>> >
>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>> >
>>> > you
>>> >
>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>> >
>>> > always
>>> >
>>> > revive branch-2. But for now I think the convenience of not needing to
>>> >
>>> > port
>>> >
>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> >
>>> > potentially needing to revive branch-2.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> >
>>> > +1 for 2.10.x as last release for 2.x version.
>>> >
>>> >
>>> > Software would become more compatible when more companies stress test
>>> >
>>> > the same software and making improvements in trunk.  Some may be extra
>>> >
>>> > caution on moving up the version because obligation internally to keep
>>> >
>>> > things running.  Company obligation should not be the driving force to
>>> >
>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>> >
>>> > community when every name brand company maintains its own Hadoop 2.x
>>> >
>>> > version.  I think it would be more healthy for the community to
>>> >
>>> > reduce the
>>> >
>>> > branch forking and spend energy on trunk to harden the software.
>>> >
>>> > This will
>>> >
>>> > give more confidence to move up the version than trying to fix n
>>> >
>>> > permutations breakage like Flash fixing the timeline.
>>> >
>>> >
>>> > Apache license stated, there is no warranty of any kind for code
>>> >
>>> > contributions.  Fewer community release process should improve
>>> >
>>> > software
>>> >
>>> > quality when eyes are on trunk, and help steering toward the same end
>>> >
>>> > goals.
>>> >
>>> >
>>> > regards,
>>> >
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> >
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> >
>>> > Hello all,
>>> >
>>> >
>>> > Is it written anywhere what the difference is between a minor release
>>> >
>>> > and a
>>> >
>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>> >
>>> > have
>>> >
>>> > looked around and I can't find anything other than some compatibility
>>> >
>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>> >
>>> > think
>>> >
>>> > this would help shape my opinion on whether or not to keep branch-2
>>> >
>>> > alive.
>>> >
>>> > My current understanding is that we can't really break compatibility
>>> >
>>> > in
>>> >
>>> > either a minor or point release. But the only mention of the
>>> >
>>> > difference
>>> >
>>> > between minor and point releases is how to deal with Stable,
>>> >
>>> > Evolving,
>>> >
>>> > and
>>> >
>>> > Unstable tags, and how to deal with changing default configuration
>>> >
>>> > values.
>>> >
>>> > So it seems like there really isn't a big official difference between
>>> >
>>> > the
>>> >
>>> > two. In my mind, the functional difference between the two is that
>>> >
>>> > the
>>> >
>>> > minor releases may have added features and rewrites, while the point
>>> >
>>> > releases only have bug fixes. This might be an incorrect
>>> >
>>> > understanding, but
>>> >
>>> > that's what I have gathered from watching the releases over the last
>>> >
>>> > few
>>> >
>>> > years. Whether or not this is a correct understanding, I think that
>>> >
>>> > this
>>> >
>>> > needs to be documented somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are
>>> >
>>> > the
>>> >
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> >
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> >
>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>> >
>>> > into
>>> >
>>> > 2.10.x
>>> >
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > Cons:
>>> >
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> >
>>> > - An extra branch to maintain
>>> >
>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >
>>> > patches to if they should go all the way back to 2.10.x
>>> >
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> >
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>> >
>>> > being
>>> >
>>> > committed. In a perfect world, we wouldn't have to make this
>>> >
>>> > tradeoff.
>>> >
>>> > But
>>> >
>>> > we don't live in a perfect world and committers will make mistakes
>>> >
>>> > either
>>> >
>>> > because of lack of knowledge or simply because they made a mistake.
>>> >
>>> > If
>>> >
>>> > we
>>> >
>>> > have a branch-2, committers will forget, not know to, or choose not
>>> >
>>> > to
>>> >
>>> > (for
>>> >
>>> > whatever reason) commit valid bug fixes back all the way to
>>> >
>>> > branch-2.10. If
>>> >
>>> > we don't have a branch-2, committers who want their borderline risky
>>> >
>>> > feature in the 2.x line will err on the side of putting it into
>>> >
>>> > branch-2.10
>>> >
>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>> >
>>> > quite
>>> >
>>> > a few assumptions here based on my own experiences, so I would like
>>> >
>>> > to
>>> >
>>> > hear
>>> >
>>> > if others have similar or opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>> >
>>> > killing
>>> >
>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>> >
>>> > is why
>>> >
>>> > I have added movement to 3.x as both a pro and a con. As a community
>>> >
>>> > trying
>>> >
>>> > to move forward, keeping as many companies on similar branches as
>>> >
>>> > possible
>>> >
>>> > is a good way to make sure the code is well-tested. However, from a
>>> >
>>> > stability point of view, moving to 3.x is still scary and being able
>>> >
>>> > to
>>> >
>>> > stay on 2.x until you are comfortable to move is very nice. The
>>> >
>>> > 2.10.0
>>> >
>>> > bridge release effort has been very good at making it possible for
>>> >
>>> > people
>>> >
>>> > to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>> >
>>> > that
>>> >
>>> > it is reasonable for companies to want to be extra cautious with 3.x
>>> >
>>> > due to
>>> >
>>> > potential performance degradation at large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> >
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> >
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>> >
>>> > support to
>>> >
>>> > 2.x, surely not everyone is going to want that (at least not
>>> >
>>> > immediately).
>>> >
>>> > The 2.10 documentation states, "The JVM requirements will not change
>>> >
>>> > across
>>> >
>>> > point releases within the same minor release except if the JVM
>>> >
>>> > version
>>> >
>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>> >
>>> > release until Java 8 becomes unsupported (though one could argue that
>>> >
>>> > it is
>>> >
>>> > already unsupported since Oracle is no longer giving public Java 8
>>> >
>>> > update).
>>> >
>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>> >
>>> > catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on
>>> >
>>> > whether
>>> >
>>> > or
>>> >
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> >
>>> > weighing in my mind. For me, the bigger problem beyond having
>>> >
>>> > branch-2
>>> >
>>> > or
>>> >
>>> > not is committers not being on the same page with where they should
>>> >
>>> > commit
>>> >
>>> > their patches.
>>> >
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1]
>>> >
>>> >
>>> >
>>> >
>>> >
>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> > [2]
>>> >
>>> >
>>> >
>>> >
>>> >
>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>> >
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Hi Konstantin,
>>> >
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about
>>> >
>>> > the
>>> >
>>> > stability of 2.10, since we will be on it for a couple of years at
>>> >
>>> > least.
>>> >
>>> > I worry
>>> >
>>> > that some committers may want to put new features into a branch 2
>>> >
>>> > release,
>>> >
>>> > and without a branch-2, they will go directly into 2.10. Since we
>>> >
>>> > don't
>>> >
>>> > always
>>> >
>>> > catch corner cases or performance problems for some time (usually
>>> >
>>> > not
>>> >
>>> > until
>>> >
>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>> >
>>> > may
>>> >
>>> > be
>>> >
>>> > very
>>> >
>>> > difficult to back out those changes.
>>> >
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>> >
>>> > idea,
>>> >
>>> > but I
>>> >
>>> > do
>>> >
>>> > have these reservations.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> >
>>> > <
>>> >
>>> > shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> >
>>> > We had a long discussion on this list regarding making the 2.10
>>> >
>>> > release the
>>> >
>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>> >
>>> > between
>>> >
>>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is
>>> >
>>> > not in
>>> >
>>> > the picture right now, and many people may object this idea.
>>> >
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> >
>>> > 1. eliminate confusion which branches people should commit their
>>> >
>>> > back-ports
>>> >
>>> > to
>>> >
>>> > 2. save engineering effort committing to more branches than
>>> >
>>> > necessary
>>> >
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide
>>> >
>>> > to
>>> >
>>> > release 2.11 we can resurrect the branch.
>>> >
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>>> >
>>> > jyhung2357@gmail.com
>>> >
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel
>>> >
>>> > the
>>> >
>>> > pros
>>> >
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>> >
>>> > are
>>> >
>>> > much
>>> >
>>> > higher than a new 2.11 minor release. (There didn't seem to be
>>> >
>>> > many
>>> >
>>> > people
>>> >
>>> > outside of our company who expressed interest in getting new
>>> >
>>> > features to
>>> >
>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>> >
>>> > after
>>> >
>>> > 2.10.0
>>> >
>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>> >
>>> > branch-2.10, so it's already diverged quite a bit.
>>> >
>>> >
>>> > In any case, we can always reverse this decision if we really
>>> >
>>> > need
>>> >
>>> > to, by
>>> >
>>> > recreating branch-2. But this proposal would reduce a lot of
>>> >
>>> > confusion
>>> >
>>> > IMO.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>> >
>>> > epayne@apache.org>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently
>>> >
>>> > released,
>>> >
>>> > and
>>> >
>>> > moving to 2.10 will take some time for the community. It seems
>>> >
>>> > premature
>>> >
>>> > to
>>> >
>>> > make a decision at this point that there will never be a need
>>> >
>>> > for a
>>> >
>>> > 2.11
>>> >
>>> > release.
>>> >
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> > minor release line in branch-2. Currently, the main issue is that there
>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are
>>> > not going into branch-2.10 (which will become 2.10.1), so the fixes in
>>> > branch-2 will likely never see the light of day unless they are
>>> > backported to branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> >
>>> > Thoughts?
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> >
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> >
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira
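
A minimal sketch of the commit chain announced in this thread, assuming
a committer wants to check a target branch before pushing. The names
COMMIT_CHAIN, RETIRED_BRANCHES and backport_targets are hypothetical and
are not part of any Hadoop tooling; only the branch names come from the
thread.

```python
# Illustrative only: every name here is an assumption, not Hadoop tooling.
# Commit chain as announced in this thread, newest release line first.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]

# Branches the thread says should no longer receive commits.
RETIRED_BRANCHES = {"branch-2"}


def backport_targets(oldest_branch):
    """Return every branch a fix lands on, from trunk down to oldest_branch."""
    if oldest_branch in RETIRED_BRANCHES:
        raise ValueError(
            f"{oldest_branch} is retired; commit to branch-2.10 instead"
        )
    if oldest_branch not in COMMIT_CHAIN:
        raise ValueError(f"unknown branch: {oldest_branch}")
    stop = COMMIT_CHAIN.index(oldest_branch)
    return COMMIT_CHAIN[: stop + 1]


if __name__ == "__main__":
    # A fix that should reach the 2.10.x line goes to these branches in order.
    print(backport_targets("branch-2.10"))
```

Asking for "branch-2" raises an error instead of returning a list, which
mirrors the request above not to commit to that branch.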

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching that; I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>> > people closed it as "won't do". And yes, the branch is protected, so we
>>> > can’t delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> > remove it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> > please don't try to commit to it)
>>> >
>>> > Completed procedure:
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >   - Deleted old branch-2.10
>>> >   - Renamed branch-2 to (new) branch-2.10
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hey guys,
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>> > 2.x release" confused people.
>>> > It is in fact a wider matter that can be discussed when somebody actually
>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>> >
>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep
>>> > the convention of putting new features only into minor releases, my
>>> > assumption is still that it's unlikely people will want to get them into
>>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>>> > we haven't even really removed support for java 7 in branch-2 (much less
>>> > java 8), so I feel moving to java 11 would go along with a move to
>>> > branch 3. And as you mentioned, if people really want to use java 11 on
>>> > branch-2, we can always revive branch-2. But for now I think the
>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> > +1 for 2.10.x as the last release for the 2.x version.
>>> >
>>> > Software becomes more compatible when more companies stress test the
>>> > same software and make improvements in trunk.  Some may be extra
>>> > cautious about moving up a version because of internal obligations to
>>> > keep things running.  Company obligations should not be the driving
>>> > force for maintaining Hadoop branches.  There is no proper collaboration
>>> > in the community when every name-brand company maintains its own Hadoop
>>> > 2.x version.  I think it would be healthier for the community to reduce
>>> > branch forking and spend energy on trunk to harden the software.  This
>>> > will give more confidence to move up a version than trying to fix n
>>> > permutations of breakage, like Flash fixing the timeline.
>>> >
>>> > As the Apache license states, there is no warranty of any kind for code
>>> > contributions.  Fewer community release processes should improve
>>> > software quality when eyes are on trunk, and help steer toward the same
>>> > end goals.
>>> >
>>> >
>>> > regards,
>>> >
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> > Hello all,
>>> >
>>> > Is it written anywhere what the difference is between a minor release
>>> > and a point/dot/maintenance (I'll use "point" from here on out)
>>> > release? I have looked around and I can't find anything other than some
>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>> > keep branch-2 alive. My current understanding is that we can't really
>>> > break compatibility in either a minor or point release. But the only
>>> > mention of the difference between minor and point releases is how to
>>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>>> > changing default configuration values. So it seems like there really
>>> > isn't a big official difference between the two. In my mind, the
>>> > functional difference between the two is that the minor releases may
>>> > have added features and rewrites, while the point releases only have
>>> > bug fixes. This might be an incorrect understanding, but that's what I
>>> > have gathered from watching the releases over the last few years.
>>> > Whether or not this is a correct understanding, I think that this needs
>>> > to be documented somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are the
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> > But we don't live in a perfect world and committers will make mistakes
>>> > either because of lack of knowledge or simply because they made a
>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>> > their borderline risky feature in the 2.x line will err on the side of
>>> > putting it into branch-2.10 instead of proposing the creation of a
>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>> > own experiences, so I would like to hear if others have similar or
>>> > opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>> > community trying to move forward, keeping as many companies on similar
>>> > branches as possible is a good way to make sure the code is well-tested.
>>> > However, from a stability point of view, moving to 3.x is still scary
>>> > and being able to stay on 2.x until you are comfortable to move is very
>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>> > and 3.x is so large that it is reasonable for companies to want to be
>>> > extra cautious with 3.x due to potential performance degradation at
>>> > large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>> > to 2.x, surely not everyone is going to want that (at least not
>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>> > not change across point releases within the same minor release except if
>>> > the JVM version under question becomes unsupported" [1], so this would
>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>> > could argue that it is already unsupported since Oracle is no longer
>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether
>>> > or not we should keep branch-2 alive, but these are the things that I am
>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>> > or not is committers not being on the same page with where they should
>>> > commit their patches.
>>> >
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at
>>> > least. I worry that some committers may want to put new features into a
>>> > branch 2 release, and without a branch-2, they will go directly into
>>> > 2.10. Since we don't always catch corner cases or performance problems
>>> > for some time (usually not until the release is deployed to a busy,
>>> > 4-thousand node cluster), it may be very difficult to back out those
>>> > changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>> > but I do have these reservations.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10 release
>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11
>>> > is not in the picture right now, and many people may object to this
>>> > idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>> > wrote:
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>> > people outside of our company who expressed interest in getting new
>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>> >
>>> > In any case, we can always reverse this decision if we really need to,
>>> > by recreating branch-2. But this proposal would reduce a lot of
>>> > confusion IMO.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released,
>>> > and moving to 2.10 will take some time for the community. It seems
>>> > premature to make a decision at this point that there will never be a
>>> > need for a 2.11 release.
>>> >
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> > minor release line in branch-2. Currently, the main issue is that there
>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are
>>> > not going into branch-2.10 (which will become 2.10.1), so the fixes in
>>> > branch-2 will likely never see the light of day unless they are
>>> > backported to branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> >
>>> > Thoughts?
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching that; I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira
>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>> > people closed it as "won't do". And yes, the branch is protected, so we
>> > can’t delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>> > please don't try to commit to it)
>> >
>> > Completed procedure:
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >   - Deleted old branch-2.10
>> >   - Renamed branch-2 to (new) branch-2.10
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> >
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> > Hey guys,
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>> > 2.x release" confused people.
>> > It is in fact a wider matter that can be discussed when somebody actually
>> > proposes to release 2.11, which I understand nobody does at the moment.
>> >
>> > So if anybody objects to removing branch-2.10 please make an argument.
>> > Otherwise we should go ahead and just do it next week.
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> >
>> > Thanks,
>> >
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> > releases. As for putting features into minor/patch releases, if we keep
>> > the convention of putting new features only into minor releases, my
>> > assumption is still that it's unlikely people will want to get them into
>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>> > we haven't even really removed support for java 7 in branch-2 (much less
>> > java 8), so I feel moving to java 11 would go along with a move to
>> > branch 3. And as you mentioned, if people really want to use java 11 on
>> > branch-2, we can always revive branch-2. But for now I think the
>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>> > below) outweighs the cost of potentially needing to revive branch-2.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> > +1 for 2.10.x as the last release for the 2.x version.
>> >
>> > Software becomes more compatible when more companies stress test the
>> > same software and make improvements in trunk.  Some may be extra
>> > cautious about moving up a version because of internal obligations to
>> > keep things running.  Company obligations should not be the driving
>> > force for maintaining Hadoop branches.  There is no proper collaboration
>> > in the community when every name-brand company maintains its own Hadoop
>> > 2.x version.  I think it would be healthier for the community to reduce
>> > branch forking and spend energy on trunk to harden the software.  This
>> > will give more confidence to move up a version than trying to fix n
>> > permutations of breakage, like Flash fixing the timeline.
>> >
>> > As the Apache license states, there is no warranty of any kind for code
>> > contributions.  Fewer community release processes should improve
>> > software quality when eyes are on trunk, and help steer toward the same
>> > end goals.
>> >
>> >
>> > regards,
>> >
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> > <eb...@verizonmedia.com.invalid> wrote:
>> >
>> > Hello all,
>> >
>> > Is it written anywhere what the difference is between a minor release
>> > and a point/dot/maintenance (I'll use "point" from here on out)
>> > release? I have looked around and I can't find anything other than some
>> > compatibility documentation in 2.x that has since been removed in 3.x
>> > [1] [2]. I think this would help shape my opinion on whether or not to
>> > keep branch-2 alive. My current understanding is that we can't really
>> > break compatibility in either a minor or point release. But the only
>> > mention of the difference between minor and point releases is how to
>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>> > changing default configuration values. So it seems like there really
>> > isn't a big official difference between the two. In my mind, the
>> > functional difference between the two is that the minor releases may
>> > have added features and rewrites, while the point releases only have
>> > bug fixes. This might be an incorrect understanding, but that's what I
>> > have gathered from watching the releases over the last few years.
>> > Whether or not this is a correct understanding, I think that this needs
>> > to be documented somewhere, even if it is just a convention.
>> >
>> >
>> > Given my assumed understanding of minor vs point releases, here are the
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> > - Features/rewrites/higher-risk patches are less likely to be put into
>> >   2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> > Cons:
>> > - Bug fixes are less likely to be put into 2.10.x
>> > - An extra branch to maintain
>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>> >     patches to if they should go all the way back to 2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > So on the one hand you get added stability in fewer features being
>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>> > But we don't live in a perfect world and committers will make mistakes
>> > either because of lack of knowledge or simply because they made a
>> > mistake. If we have a branch-2, committers will forget, not know to, or
>> > choose not to (for whatever reason) commit valid bug fixes back all the
>> > way to branch-2.10. If we don't have a branch-2, committers who want
>> > their borderline risky feature in the 2.x line will err on the side of
>> > putting it into branch-2.10 instead of proposing the creation of a
>> > branch-2. Clearly I have made quite a few assumptions here based on my
>> > own experiences, so I would like to hear if others have similar or
>> > opposing views.
>> >
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for
>> > killing branch-2 is due to an effort to push the community towards 3.x.
>> > This is why I have added movement to 3.x as both a pro and a con. As a
>> > community trying to move forward, keeping as many companies on similar
>> > branches as possible is a good way to make sure the code is well-tested.
>> > However, from a stability point of view, moving to 3.x is still scary
>> > and being able to stay on 2.x until you are comfortable to move is very
>> > nice. The 2.10.0 bridge release effort has been very good at making it
>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>> > and 3.x is so large that it is reasonable for companies to want to be
>> > extra cautious with 3.x due to potential performance degradation at
>> > large scale.
>> >
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>> > to 2.x, surely not everyone is going to want that (at least not
>> > immediately). The 2.10 documentation states, "The JVM requirements will
>> > not change across point releases within the same minor release except if
>> > the JVM version under question becomes unsupported" [1], so this would
>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>> > could argue that it is already unsupported since Oracle is no longer
>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>> >
>> >
>> > Not sure if this really leads to any sort of answer from me on whether
>> > or not we should keep branch-2 alive, but these are the things that I am
>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>> > or not is committers not being on the same page with where they should
>> > commit their patches.
>> >
>> >
>> > Eric
>> >
>> >
>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about the
>> > stability of 2.10, since we will be on it for a couple of years at
>> > least. I worry that some committers may want to put new features into a
>> > branch 2 release, and without a branch-2, they will go directly into
>> > 2.10. Since we don't always catch corner cases or performance problems
>> > for some time (usually not until the release is deployed to a busy,
>> > 4-thousand node cluster), it may be very difficult to back out those
>> > changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>> > but I do have these reservations.
>> >
>> >
>> > Thanks,
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release
>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11
>> > is not in the picture right now, and many people may object to this
>> > idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion about which branches people should commit their
>> >    back-ports to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> >
>> > Thanks,
>> >
>> > --Konstantin
>> >
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>> > wrote:
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>> > people outside of our company who expressed interest in getting new
>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>> > after 2.10.0 release, there's 29 patches that have gone into branch-2
>> > and 9 in branch-2.10, so it's already diverged quite a bit.
>> >
>> > In any case, we can always reverse this decision if we really need to,
>> > by recreating branch-2. But this proposal would reduce a lot of
>> > confusion IMO.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> > I am not in favor of this proposal. 2.10 was very recently released,
>> > and moving to 2.10 will take some time for the community. It seems
>> > premature to make a decision at this point that there will never be a
>> > need for a 2.11 release.
>> >
>> > -Eric
>> >
>> >
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> > <jyhung2357@gmail.com> wrote:
>> >
>> > Hi folks,
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a
>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>> > minor release line in branch-2. Currently, the main issue is that there
>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are
>> > not going into branch-2.10 (which will become 2.10.1), so the fixes in
>> > branch-2 will likely never see the light of day unless they are
>> > backported to branch-2.10.
>> >
>> > To do this, I propose we:
>> >
>> > - Delete branch-2.10
>> > - Rename branch-2 to branch-2.10
>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x release
>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> > Thoughts?
>> >
>> > Jonathan Hung
>> >
>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> >
>> >
>> >
>> >
>> >
>> > ---------------------------------------------------------------------
>> >
>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>> >
>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching that. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira
>> > It seems there was an INFRA ticket for that, INFRA-19581,
>> > but the INFRA people closed it as won't do. And yes, the branch is
>> > protected, so we can't delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>> > please don't try to commit to it)
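A minimal sketch of how the commit chain above determines where a fix has to land, assuming a fix is committed to trunk first and then cherry-picked down the chain until it reaches the oldest branch that needs it. The helper name `backport_targets` is hypothetical and not part of any Hadoop tooling.

```python
# Commit chain after the rename, newest to oldest, as listed in the thread.
COMMIT_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
                "branch-2.10", "branch-2.9", "branch-2.8"]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return every branch a fix must be committed to, in order,
    so that it reaches `oldest_branch` (hypothetical helper)."""
    if oldest_branch not in chain:
        # For example "branch-2", which is no longer part of the chain.
        raise ValueError(f"{oldest_branch} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


print(backport_targets("branch-2.10"))
```

With the old layout the chain had one more entry (branch-2 between branch-3.1 and branch-2.10), which is the extra commit per backport that the rename removes.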
>> >
>> >
>> > Completed procedure:
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >   - Deleted old branch-2.10
>> >   - Renamed branch-2 to (new) branch-2.10
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
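The thread does not say how the first step (checking that everything in old branch-2.10 was also in old branch-2) was carried out. One way such a check could be sketched: since cherry-picked commits get new hashes, compare the JIRA keys found in commit subject lines rather than the hashes. The function names and the sample subjects below are made up for illustration, and the sketch assumes each commit subject mentions its JIRA key.

```python
import re

# Matches JIRA keys such as HDFS-12345 or YARN-9876 anywhere in a subject line.
JIRA_KEY = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the JIRA keys mentioned in a list of commit subject lines."""
    return {m.group(0) for s in subjects for m in JIRA_KEY.finditer(s)}


def missing_from(target_subjects, source_subjects):
    """Keys present in the source branch but absent from the target branch."""
    return sorted(jira_keys(source_subjects) - jira_keys(target_subjects))


# Made-up subject lines, for illustration only.
old_branch_2_10 = ["HDFS-1. Fix A.", "YARN-2. Fix B."]
old_branch_2 = ["HDFS-1. Fix A.", "YARN-2. Fix B.", "HADOOP-3. Fix C."]

# An empty list means nothing in old branch-2.10 is missing from old branch-2.
print(missing_from(old_branch_2, old_branch_2_10))
```

In practice the subject lists would come from the commit logs of the two branches since their common ancestor; that part is left out here to keep the sketch self-contained.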
>> >
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> > Hey guys,
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>> > 2.x release" confused people.
>> > It is in fact a wider matter that can be discussed when somebody
>> > actually proposes to release 2.11, which I understand nobody does at the
>> > moment.
>> >
>> > So if anybody objects to removing branch-2.10 please make an argument.
>> > Otherwise we should go ahead and just do it next week.
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> > releases. As for putting features into minor/patch releases, if we keep
>> > the convention of putting new features only into minor releases, my
>> > assumption is still that it's unlikely people will want to get them into
>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>> > we haven't even really removed support for java 7 in branch-2 (much less
>> > java 8), so I feel moving to java 11 would go along with a move to
>> > branch 3. And as you mentioned, if people really want to use java 11 on
>> > branch-2, we can always revive branch-2. But for now I think the
>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>> > below) outweighs the cost of potentially needing to revive branch-2.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> > +1 for 2.10.x as last release for 2.x version.
>> >
>> > Software would become more compatible when more companies stress test
>> > the same software and make improvements in trunk.  Some may be extra
>> > cautious about moving up the version because of internal obligations to
>> > keep things running.  Company obligation should not be the driving force
>> > to maintain Hadoop branches.  There is no proper collaboration in the
>> > community when every name brand company maintains its own Hadoop 2.x
>> > version.  I think it would be more healthy for the community to reduce
>> > the branch forking and spend energy on trunk to harden the software.
>> > This will give more confidence to move up the version than trying to fix
>> > n permutations of breakage like Flash fixing the timeline.
>> >
>> > The Apache license states that there is no warranty of any kind for code
>> > contributions.  Fewer community release processes should improve
>> > software quality when eyes are on trunk, and help steer toward the same
>> > end goals.
>> >
>> > regards,
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> > <eb...@verizonmedia.com.invalid> wrote:
>> >
>> > Hello all,
>> >
>> > Is it written anywhere what the difference is between a minor release
>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>> > I have looked around and I can't find anything other than some
>> > compatibility documentation in 2.x that has since been removed in 3.x
>> > [1] [2]. I think this would help shape my opinion on whether or not to
>> > keep branch-2 alive.
>> > My current understanding is that we can't really break compatibility in
>> > either a minor or point release. But the only mention of the difference
>> > between minor and point releases is how to deal with Stable, Evolving,
>> > and Unstable tags, and how to deal with changing default configuration
>> > values.
>> > So it seems like there really isn't a big official difference between
>> > the two. In my mind, the functional difference between the two is that
>> > the minor releases may have added features and rewrites, while the point
>> > releases only have bug fixes. This might be an incorrect understanding,
>> > but that's what I have gathered from watching the releases over the last
>> > few years. Whether or not this is a correct understanding, I think that
>> > this needs to be documented somewhere, even if it is just a convention.
>> >
>> > Given my assumed understanding of minor vs point releases, here are the
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> > - Features/rewrites/higher-risk patches are less likely to be put into
>> >   2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> > Cons:
>> > - Bug fixes are less likely to be put into 2.10.x
>> > - An extra branch to maintain
>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>> >     patches to if they should go all the way back to 2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> > So on the one hand you get added stability in fewer features being
>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>> > But we don't live in a perfect world and committers will make mistakes
>> > either because of lack of knowledge or simply because they made a
>> > mistake. If we have a branch-2, committers will forget, not know to, or
>> > choose not to (for whatever reason) commit valid bug fixes back all the
>> > way to branch-2.10. If we don't have a branch-2, committers who want
>> > their borderline risky feature in the 2.x line will err on the side of
>> > putting it into branch-2.10 instead of proposing the creation of a
>> > branch-2. Clearly I have made quite a few assumptions here based on my
>> > own experiences, so I would like to hear if others have similar or
>> > opposing views.
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for
>> > killing branch-2 is due to an effort to push the community towards 3.x.
>> > This is why I have added movement to 3.x as both a pro and a con. As a
>> > community trying to move forward, keeping as many companies on similar
>> > branches as possible is a good way to make sure the code is well-tested.
>> > However, from a stability point of view, moving to 3.x is still scary
>> > and being able to stay on 2.x until you are comfortable to move is very
>> > nice. The 2.10.0 bridge release effort has been very good at making it
>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>> > and 3.x is so large that it is reasonable for companies to want to be
>> > extra cautious with 3.x due to potential performance degradation at
>> > large scale.
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>> > to 2.x, surely not everyone is going to want that (at least not
>> > immediately). The 2.10 documentation states, "The JVM requirements will
>> > not change across point releases within the same minor release except if
>> > the JVM version under question becomes unsupported" [1], so this would
>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>> > could argue that it is already unsupported since Oracle is no longer
>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>> >
>> > Not sure if this really leads to any sort of answer from me on whether
>> > or not we should keep branch-2 alive, but these are the things that I am
>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>> > or not is committers not being on the same page with where they should
>> > commit their patches.
>> >
>> > Eric
>> >
>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about the
>> > stability of 2.10, since we will be on it for a couple of years at
>> > least. I worry that some committers may want to put new features into a
>> > branch 2 release, and without a branch-2, they will go directly into
>> > 2.10. Since we don't always catch corner cases or performance problems
>> > for some time (usually not until the release is deployed to a busy,
>> > 4-thousand node cluster), it may be very difficult to back out those
>> > changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>> > but I do have these reservations.
>> >
>> > Thanks,
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release
>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>> > is not in the picture right now, and many people may object to this
>> > idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion about which branches people should commit their
>> >    back-ports to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>> > wrote:
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>> > people outside of our company who expressed interest in getting new
>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>> > after 2.10.0 release, there's 29 patches that have gone into branch-2
>> > and 9 in branch-2.10, so it's already diverged quite a bit.
>> >
>> > In any case, we can always reverse this decision if we really need to,
>> > by recreating branch-2. But this proposal would reduce a lot of
>> > confusion IMO.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> > I am not in favor of this proposal. 2.10 was very recently released,
>> > and moving to 2.10 will take some time for the community. It seems
>> > premature to make a decision at this point that there will never be a
>> > need for a 2.11 release.
>> >
>> > -Eric
>> >
>> >
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> > <jyhung2357@gmail.com> wrote:
>> >
>> > Hi folks,
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a
>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>> > minor release line in branch-2. Currently, the main issue is that there
>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are
>> > not going into branch-2.10 (which will become 2.10.1), so the fixes in
>> > branch-2 will likely never see the light of day unless they are
>> > backported to branch-2.10.
>> >
>> > To do this, I propose we:
>> >
>> > - Delete branch-2.10
>> > - Rename branch-2 to branch-2.10
>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x release
>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> > Thoughts?
>> >
>> > Jonathan Hung
>> >
>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching that. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira
>> > It seems there was an INFRA ticket for that, INFRA-19581,
>> > but the INFRA people closed it as won't do. And yes, the branch is
>> > protected, so we can't delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>> > please don't try to commit to it)
>> >
>> > Completed procedure:
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >   - Deleted old branch-2.10
>> >   - Renamed branch-2 to (new) branch-2.10
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> >
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> > Hey guys,
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>> > 2.x release" confused people.
>> > It is in fact a wider matter that can be discussed when somebody
>> > actually proposes to release 2.11, which I understand nobody does at the
>> > moment.
>> >
>> > So if anybody objects to removing branch-2.10 please make an argument.
>> > Otherwise we should go ahead and just do it next week.
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> > releases. As for putting features into minor/patch releases, if we keep
>> > the convention of putting new features only into minor releases, my
>> > assumption is still that it's unlikely people will want to get them into
>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>> > we haven't even really removed support for java 7 in branch-2 (much less
>> > java 8), so I feel moving to java 11 would go along with a move to
>> > branch 3. And as you mentioned, if people really want to use java 11 on
>> > branch-2, we can always revive branch-2. But for now I think the
>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>> > below) outweighs the cost of potentially needing to revive branch-2.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> > +1 for 2.10.x as last release for 2.x version.
>> >
>> > Software would become more compatible when more companies stress test
>> > the same software and make improvements in trunk.  Some may be extra
>> > cautious about moving up the version because of internal obligations to
>> > keep things running.  Company obligation should not be the driving force
>> > to maintain Hadoop branches.  There is no proper collaboration in the
>> > community when every name brand company maintains its own Hadoop 2.x
>> > version.  I think it would be more healthy for the community to reduce
>> > the branch forking and spend energy on trunk to harden the software.
>> > This will give more confidence to move up the version than trying to fix
>> > n permutations of breakage like Flash fixing the timeline.
>> >
>> > The Apache license states that there is no warranty of any kind for code
>> > contributions.  Fewer community release processes should improve
>> > software quality when eyes are on trunk, and help steer toward the same
>> > end goals.
>> >
>> > regards,
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> >
>> > <eb...@verizonmedia.com.invalid> wrote:
>> >
>> >
>> > Hello all,
>> >
>> >
>> > Is it written anywhere what the difference is between a minor release
>> >
>> > and a
>> >
>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>> >
>> > have
>> >
>> > looked around and I can't find anything other than some compatibility
>> >
>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>> >
>> > think
>> >
>> > this would help shape my opinion on whether or not to keep branch-2
>> >
>> > alive.
>> >
>> > My current understanding is that we can't really break compatibility
>> >
>> > in
>> >
>> > either a minor or point release. But the only mention of the
>> >
>> > difference
>> >
>> > between minor and point releases is how to deal with Stable,
>> >
>> > Evolving,
>> >
>> > and
>> >
>> > Unstable tags, and how to deal with changing default configuration
>> >
>> > values.
>> >
>> > So it seems like there really isn't a big official difference between
>> >
>> > the
>> >
>> > two. In my mind, the functional difference between the two is that
>> >
>> > the
>> >
>> > minor releases may have added features and rewrites, while the point
>> >
>> > releases only have bug fixes. This might be an incorrect
>> >
>> > understanding, but
>> >
>> > that's what I have gathered from watching the releases over the last
>> >
>> > few
>> >
>> > years. Whether or not this is a correct understanding, I think that
>> >
>> > this
>> >
>> > needs to be documented somewhere, even if it is just a convention.
>> >
>> >
>> > Given my assumed understanding of minor vs point releases, here are
>> >
>> > the
>> >
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> >
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> >
>> > - Features/rewrites/higher-risk patches are less likely to be put
>> >
>> > into
>> >
>> > 2.10.x
>> >
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > Cons:
>> >
>> > - Bug fixes are less likely to be put into 2.10.x
>> >
>> > - An extra branch to maintain
>> >
>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>> >
>> > patches to if they should go all the way back to 2.10.x
>> >
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > So on the one hand you get added stability in fewer features being
>> >
>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>> >
>> > being
>> >
>> > committed. In a perfect world, we wouldn't have to make this
>> >
>> > tradeoff.
>> >
>> > But
>> >
>> > we don't live in a perfect world and committers will make mistakes
>> >
>> > either
>> >
>> > because of lack of knowledge or simply because they made a mistake.
>> >
>> > If
>> >
>> > we
>> >
>> > have a branch-2, committers will forget, not know to, or choose not
>> >
>> > to
>> >
>> > (for
>> >
>> > whatever reason) commit valid bug fixes back all the way to
>> >
>> > branch-2.10. If
>> >
>> > we don't have a branch-2, committers who want their borderline risky
>> >
>> > feature in the 2.x line will err on the side of putting it into
>> >
>> > branch-2.10
>> >
>> > instead of proposing the creation of a branch-2. Clearly I have made
>> >
>> > quite
>> >
>> > a few assumptions here based on my own experiences, so I would like
>> >
>> > to
>> >
>> > hear
>> >
>> > if others have similar or opposing views.
>> >
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for
>> >
>> > killing
>> >
>> > branch-2 is due to an effort to push the community towards 3.x. This
>> >
>> > is why
>> >
>> > I have added movement to 3.x as both a pro and a con. As a community
>> >
>> > trying
>> >
>> > to move forward, keeping as many companies on similar branches as
>> >
>> > possible
>> >
>> > is a good way to make sure the code is well-tested. However, from a
>> >
>> > stability point of view, moving to 3.x is still scary and being able
>> >
>> > to
>> >
>> > stay on 2.x until you are comfortable to move is very nice. The
>> >
>> > 2.10.0
>> >
>> > bridge release effort has been very good at making it possible for
>> >
>> > people
>> >
>> > to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>> >
>> > that
>> >
>> > it is reasonable for companies to want to be extra cautious with 3.x
>> >
>> > due to
>> >
>> > potential performance degradation at large scale.
>> >
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> >
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> >
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>> >
>> > support to
>> >
>> > 2.x, surely not everyone is going to want that (at least not
>> >
>> > immediately).
>> >
>> > The 2.10 documentation states, "The JVM requirements will not change
>> >
>> > across
>> >
>> > point releases within the same minor release except if the JVM
>> >
>> > version
>> >
>> > under question becomes unsupported" [1], so this would warrant a 2.11
>> >
>> > release until Java 8 becomes unsupported (though one could argue that
>> >
>> > it is
>> >
>> > already unsupported since Oracle is no longer giving public Java 8
>> >
>> > update).
>> >
>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>> >
>> > catalyst for a branch-2 revival?
>> >
>> >
>> > Not sure if this really leads to any sort of answer from me on
>> >
>> > whether
>> >
>> > or
>> >
>> > not we should keep branch-2 alive, but these are the things that I am
>> >
>> > weighing in my mind. For me, the bigger problem beyond having
>> >
>> > branch-2
>> >
>> > or
>> >
>> > not is committers not being on the same page with where they should
>> >
>> > commit
>> >
>> > their patches.
>> >
>> >
>> > Eric
>> >
>> >
>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> >
>> > Hi Konstantin,
>> >
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about
>> >
>> > the
>> >
>> > stability of 2.10, since we will be on it for a couple of years at
>> >
>> > least.
>> >
>> > I worry
>> >
>> > that some committers may want to put new features into a branch 2
>> >
>> > release,
>> >
>> > and without a branch-2, they will go directly into 2.10. Since we
>> >
>> > don't
>> >
>> > always
>> >
>> > catch corner cases or performance problems for some time (usually
>> >
>> > not
>> >
>> > until
>> >
>> > the release is deployed to a busy, 4-thousand node cluster), it
>> >
>> > may
>> >
>> > be
>> >
>> > very
>> >
>> > difficult to back out those changes.
>> >
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the
>> >
>> > idea,
>> >
>> > but I
>> >
>> > do
>> >
>> > have these reservations.
>> >
>> >
>> > Thanks,
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> >
>> > <
>> >
>> > shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> >
>> > We had a long discussion on this list regarding making the 2.10
>> >
>> > release the
>> >
>> > last of branch-2 releases. We intended 2.10 as a bridge release
>> >
>> > between
>> >
>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>> >
>> > not in
>> >
>> > the picture right now, and many people may object to this idea.
>> >
>> >
>> > I understand Jonathan's proposal as an attempt to
>> >
>> > 1. eliminate confusion which branches people should commit their
>> >
>> > back-ports
>> >
>> > to
>> >
>> > 2. save engineering effort committing to more branches than
>> >
>> > necessary
>> >
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide
>> >
>> > to
>> >
>> > release 2.11 we can resurrect the branch.
>> >
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> >
>> > Thanks,
>> >
>> > --Konstantin
>> >
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>> > wrote:
>> >
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel
>> >
>> > the
>> >
>> > pros
>> >
>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>> >
>> > are
>> >
>> > much
>> >
>> > higher than a new 2.11 minor release. (There didn't seem to be
>> >
>> > many
>> >
>> > people
>> >
>> > outside of our company who expressed interest in getting new
>> >
>> > features to
>> >
>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>> >
>> > after
>> >
>> > 2.10.0
>> >
>> > release, there's 29 patches that have gone into branch-2 and 9 in
>> >
>> > branch-2.10, so it's already diverged quite a bit.
>> >
>> >
>> > In any case, we can always reverse this decision if we really
>> >
>> > need
>> >
>> > to, by
>> >
>> > recreating branch-2. But this proposal would reduce a lot of
>> >
>> > confusion
>> >
>> > IMO.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>> >
>> > epayne@apache.org>
>> >
>> > wrote:
>> >
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> >
>> > I am not in favor of this proposal. 2.10 was very recently
>> >
>> > released,
>> >
>> > and
>> >
>> > moving to 2.10 will take some time for the community. It seems
>> >
>> > premature
>> >
>> > to
>> >
>> > make a decision at this point that there will never be a need
>> >
>> > for a
>> >
>> > 2.11
>> >
>> > release.
>> >
>> >
>> > -Eric
>> >
>> >
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> >
>> > <
>> >
>> > jyhung2357@gmail.com> wrote:
>> >
>> >
>> > Hi folks,
>> >
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a
>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>> > last minor release line in branch-2. Currently, the main issue is
>> > that there are many fixes going into branch-2 (the theoretical
>> > 2.11.0) that are not going into branch-2.10 (which will become
>> > 2.10.1), so the fixes in branch-2 will likely never see the light of
>> > day unless they are backported to branch-2.10.
>> >
>> >
>> > To do this, I propose we:
>> >
>> >
>> > - Delete branch-2.10
>> >
>> > - Rename branch-2 to branch-2.10
>> >
>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x
>> >
>> > release
>> >
>> > line. Then the commit chain will look like: trunk -> branch-3.2
>> >
>> > ->
>> >
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> >
>> > Thoughts?
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> > [1]
>> >
>> >
>> >
>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> >
>> >
>> >
>> >
>> >
>> > ---------------------------------------------------------------------
>> >
>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>> >
>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching that. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira
>> > Seems there was an INFRA ticket for that, INFRA-19581,
>> > but the INFRA people closed it as "won't do", and yes, the branch is
>> > protected; we can’t delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> > don't try to commit to it)
>> >
>> >
>> > Completed procedure:
>> >
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >
>> >   - Delete old branch-2.10
>> >
>> >   - Rename branch-2 to (new) branch-2.10
>> >
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> >
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> >
>> > wrote:
>> >
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>> >
>> > shv.hadoop@gmail.com>
>> >
>> > wrote:
>> >
>> >
>> > Hey guys,
>> >
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> >
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> >
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> >
>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>> >
>> > 2.x release" confused people.
>> >
>> > It is in fact a wider matter that can be discussed when somebody
>> >
>> > actually
>> >
>> > proposes to release 2.11, which I understand nobody does at the moment.
>> >
>> >
>> > So if anybody objects to removing branch-2.10, please make an argument.
>> >
>> > Otherwise we should go ahead and just do it next week.
>> >
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> >
>> > Thanks,
>> >
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> >
>> > wrote:
>> >
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> >
>> > releases. As for putting features into minor/patch releases, if we
>> >
>> > keep the
>> >
>> > convention of putting new features only into minor releases, my
>> >
>> > assumption
>> >
>> > is still that it's unlikely people will want to get them into branch-2
>> >
>> > (based on the 2.10.0 release process). For the java 11 issue, we
>> >
>> > haven't
>> >
>> > even really removed support for java 7 in branch-2 (much less java 8),
>> >
>> > so I
>> >
>> > feel moving to java 11 would go along with a move to branch 3. And as
>> >
>> > you
>> >
>> > mentioned, if people really want to use java 11 on branch-2, we can
>> >
>> > always
>> >
>> > revive branch-2. But for now I think the convenience of not needing to
>> >
>> > port
>> >
>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> >
>> > potentially needing to revive branch-2.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> >
>> > +1 for 2.10.x as the last release line for 2.x.
>> >
>> > Software becomes more compatible when more companies stress test the
>> > same software and make improvements in trunk.  Some may be extra
>> > cautious about moving up a version because of internal obligations to
>> > keep things running.  Company obligations should not be the driving
>> > force for maintaining Hadoop branches.  There is no proper
>> > collaboration in the community when every name-brand company maintains
>> > its own Hadoop 2.x version.  I think it would be healthier for the
>> > community to reduce branch forking and spend energy on trunk to harden
>> > the software.  This will give more confidence to move up the version
>> > than trying to fix n permutations of breakage, like the Flash fixing
>> > the timeline.
>> >
>> > The Apache license states that there is no warranty of any kind for
>> > code contributions.  Fewer community release processes should improve
>> > software quality when eyes are on trunk, and help steer toward the
>> > same end goals.
>> >
>> >
>> > regards,
>> >
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> >
>> > <eb...@verizonmedia.com.invalid> wrote:
>> >
>> >
>> > Hello all,
>> >
>> >
>> > Is it written anywhere what the difference is between a minor release
>> >
>> > and a
>> >
>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>> >
>> > have
>> >
>> > looked around and I can't find anything other than some compatibility
>> >
>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>> >
>> > think
>> >
>> > this would help shape my opinion on whether or not to keep branch-2
>> >
>> > alive.
>> >
>> > My current understanding is that we can't really break compatibility
>> >
>> > in
>> >
>> > either a minor or point release. But the only mention of the
>> >
>> > difference
>> >
>> > between minor and point releases is how to deal with Stable,
>> >
>> > Evolving,
>> >
>> > and
>> >
>> > Unstable tags, and how to deal with changing default configuration
>> >
>> > values.
>> >
>> > So it seems like there really isn't a big official difference between
>> >
>> > the
>> >
>> > two. In my mind, the functional difference between the two is that
>> >
>> > the
>> >
>> > minor releases may have added features and rewrites, while the point
>> >
>> > releases only have bug fixes. This might be an incorrect
>> >
>> > understanding, but
>> >
>> > that's what I have gathered from watching the releases over the last
>> >
>> > few
>> >
>> > years. Whether or not this is a correct understanding, I think that
>> >
>> > this
>> >
>> > needs to be documented somewhere, even if it is just a convention.
>> >
>> >
>> > Given my assumed understanding of minor vs point releases, here are
>> >
>> > the
>> >
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> >
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> >
>> > - Features/rewrites/higher-risk patches are less likely to be put
>> >
>> > into
>> >
>> > 2.10.x
>> >
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > Cons:
>> >
>> > - Bug fixes are less likely to be put into 2.10.x
>> >
>> > - An extra branch to maintain
>> >
>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>> >
>> > patches to if they should go all the way back to 2.10.x
>> >
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > So on the one hand you get added stability in fewer features being
>> >
>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>> >
>> > being
>> >
>> > committed. In a perfect world, we wouldn't have to make this
>> >
>> > tradeoff.
>> >
>> > But
>> >
>> > we don't live in a perfect world and committers will make mistakes
>> >
>> > either
>> >
>> > because of lack of knowledge or simply because they made a mistake.
>> >
>> > If
>> >
>> > we
>> >
>> > have a branch-2, committers will forget, not know to, or choose not
>> >
>> > to
>> >
>> > (for
>> >
>> > whatever reason) commit valid bug fixes back all the way to
>> >
>> > branch-2.10. If
>> >
>> > we don't have a branch-2, committers who want their borderline risky
>> >
>> > feature in the 2.x line will err on the side of putting it into
>> >
>> > branch-2.10
>> >
>> > instead of proposing the creation of a branch-2. Clearly I have made
>> >
>> > quite
>> >
>> > a few assumptions here based on my own experiences, so I would like
>> >
>> > to
>> >
>> > hear
>> >
>> > if others have similar or opposing views.
>> >
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for
>> >
>> > killing
>> >
>> > branch-2 is due to an effort to push the community towards 3.x. This
>> >
>> > is why
>> >
>> > I have added movement to 3.x as both a pro and a con. As a community
>> >
>> > trying
>> >
>> > to move forward, keeping as many companies on similar branches as
>> >
>> > possible
>> >
>> > is a good way to make sure the code is well-tested. However, from a
>> >
>> > stability point of view, moving to 3.x is still scary and being able
>> >
>> > to
>> >
>> > stay on 2.x until you are comfortable to move is very nice. The
>> >
>> > 2.10.0
>> >
>> > bridge release effort has been very good at making it possible for
>> >
>> > people
>> >
>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>> >
>> > that
>> >
>> > it is reasonable for companies to want to be extra cautious with 3.x
>> >
>> > due to
>> >
>> > potential performance degradation at large scale.
>> >
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> >
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> >
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>> >
>> > support to
>> >
>> > 2.x, surely not everyone is going to want that (at least not
>> >
>> > immediately).
>> >
>> > The 2.10 documentation states, "The JVM requirements will not change
>> >
>> > across
>> >
>> > point releases within the same minor release except if the JVM
>> >
>> > version
>> >
>> > under question becomes unsupported" [1], so this would warrant a 2.11
>> >
>> > release until Java 8 becomes unsupported (though one could argue that
>> >
>> > it is
>> >
>> > already unsupported since Oracle is no longer giving public Java 8
>> >
>> > updates).
>> >
>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>> >
>> > catalyst for a branch-2 revival?
>> >
>> >
>> > Not sure if this really leads to any sort of answer from me on
>> >
>> > whether
>> >
>> > or
>> >
>> > not we should keep branch-2 alive, but these are the things that I am
>> >
>> > weighing in my mind. For me, the bigger problem beyond having
>> >
>> > branch-2
>> >
>> > or
>> >
>> > not is committers not being on the same page with where they should
>> >
>> > commit
>> >
>> > their patches.
>> >
>> >
>> > Eric
>> >
>> >
>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> >
>> > Hi Konstantin,
>> >
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about
>> >
>> > the
>> >
>> > stability of 2.10, since we will be on it for a couple of years at
>> >
>> > least.
>> >
>> > I worry
>> >
>> > that some committers may want to put new features into a branch 2
>> >
>> > release,
>> >
>> > and without a branch-2, they will go directly into 2.10. Since we
>> >
>> > don't
>> >
>> > always
>> >
>> > catch corner cases or performance problems for some time (usually
>> >
>> > not
>> >
>> > until
>> >
>> > the release is deployed to a busy, 4-thousand node cluster), it
>> >
>> > may
>> >
>> > be
>> >
>> > very
>> >
>> > difficult to back out those changes.
>> >
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the
>> >
>> > idea,
>> >
>> > but I
>> >
>> > do
>> >
>> > have these reservations.
>> >
>> >
>> > Thanks,
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> >
>> > <
>> >
>> > shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> >
>> > We had a long discussion on this list regarding making the 2.10
>> >
>> > release the
>> >
>> > last of branch-2 releases. We intended 2.10 as a bridge release
>> >
>> > between
>> >
>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>> >
>> > not in
>> >
>> > the picture right now, and many people may object to this idea.
>> >
>> >
>> > I understand Jonathan's proposal as an attempt to
>> >
>> > 1. eliminate confusion which branches people should commit their
>> >
>> > back-ports
>> >
>> > to
>> >
>> > 2. save engineering effort committing to more branches than
>> >
>> > necessary
>> >
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide
>> >
>> > to
>> >
>> > release 2.11 we can resurrect the branch.
>> >
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> >
>> > Thanks,
>> >
>> > --Konstantin
>> >
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>> > wrote:
>> >
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel
>> >
>> > the
>> >
>> > pros
>> >
>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>> >
>> > are
>> >
>> > much
>> >
>> > higher than a new 2.11 minor release. (There didn't seem to be
>> >
>> > many
>> >
>> > people
>> >
>> > outside of our company who expressed interest in getting new
>> >
>> > features to
>> >
>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>> >
>> > after
>> >
>> > 2.10.0
>> >
>> > release, there's 29 patches that have gone into branch-2 and 9 in
>> >
>> > branch-2.10, so it's already diverged quite a bit.
>> >
>> >
>> > In any case, we can always reverse this decision if we really
>> >
>> > need
>> >
>> > to, by
>> >
>> > recreating branch-2. But this proposal would reduce a lot of
>> >
>> > confusion
>> >
>> > IMO.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>> >
>> > epayne@apache.org>
>> >
>> > wrote:
>> >
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> >
>> > I am not in favor of this proposal. 2.10 was very recently
>> >
>> > released,
>> >
>> > and
>> >
>> > moving to 2.10 will take some time for the community. It seems
>> >
>> > premature
>> >
>> > to
>> >
>> > make a decision at this point that there will never be a need
>> >
>> > for a
>> >
>> > 2.11
>> >
>> > release.
>> >
>> >
>> > -Eric
>> >
>> >
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> >
>> > <
>> >
>> > jyhung2357@gmail.com> wrote:
>> >
>> >
>> > Hi folks,
>> >
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a
>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>> > last minor release line in branch-2. Currently, the main issue is
>> > that there are many fixes going into branch-2 (the theoretical
>> > 2.11.0) that are not going into branch-2.10 (which will become
>> > 2.10.1), so the fixes in branch-2 will likely never see the light of
>> > day unless they are backported to branch-2.10.
>> >
>> >
>> > To do this, I propose we:
>> >
>> >
>> > - Delete branch-2.10
>> >
>> > - Rename branch-2 to branch-2.10
>> >
>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x
>> >
>> > release
>> >
>> > line. Then the commit chain will look like: trunk -> branch-3.2
>> >
>> > ->
>> >
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> >
>> > Thoughts?
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> > [1]
>> >
>> >
>> >
>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>
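
The "Completed procedure" quoted above (verify, delete old branch-2.10,
rename branch-2, set the version) could be sketched roughly as follows. This
is only an illustration in a throwaway local repository, not the commands
that were actually run: the thread does not say how the verification or the
version change was done, so the use of `git cherry` and the commented-out
`mvn versions:set` line are assumptions, and the file names and commit
messages are made up. On the real repository the branches are protected, so
the delete and rename went through INFRA rather than a local `git branch`.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the real one (hypothetical setup).
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Committer"
git config user.email "committer@example.invalid"
git symbolic-ref HEAD refs/heads/branch-2

# Tiny history: branch-2.10 is cut at the release point, branch-2 moves on,
# and one fix is backported (cherry-picked) to branch-2.10.
echo base > base.txt; git add base.txt; git commit -q -m "2.10.0 release point"
git branch branch-2.10
echo b > b.txt; git add b.txt; git commit -q -m "fix B (branch-2 only)"
echo a > a.txt; git add a.txt; git commit -q -m "fix A"
git checkout -q branch-2.10
git cherry-pick branch-2 >/dev/null
git checkout -q branch-2

# Step 1: verify everything in old branch-2.10 is in old branch-2.
# `git cherry` marks commits with no equivalent patch upstream with "+".
missing=$(git cherry branch-2 branch-2.10 | grep -c '^+' || true)
[ "$missing" -eq 0 ] || { echo "branch-2.10 has unported commits" >&2; exit 1; }

# Step 2: delete old branch-2.10.
git branch -q -D branch-2.10

# Step 3: rename branch-2 to (new) branch-2.10.
git branch -m branch-2 branch-2.10

# Step 4 (assumption, not run here): set the version in the new branch, e.g.
#   mvn versions:set -DnewVersion=2.10.1-SNAPSHOT

branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "$branches"
```

After the rename only branch-2.10 remains, and it carries the fixes that had
previously landed only in branch-2, which is the outcome the proposal was
after.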

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira
> > > Seems there was an INFRA ticket for that, INFRA-19581,
> > > but the INFRA people closed it as "won't do", and yes, the branch is
> > > protected; we can’t delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> > > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> > > don't try to commit to it)
> >
> >
> > Completed procedure:
> >
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >
> >   - Delete old branch-2.10
> >
> >   - Rename branch-2 to (new) branch-2.10
> >
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
> >
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> >
> > shv.hadoop@gmail.com>
> >
> > wrote:
> >
> >
> > Hey guys,
> >
> >
> > I think we diverged a bit from the initial topic of this discussion,
> >
> > which is removing branch-2.10, and changing the version of branch-2 from
> >
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> >
> > 2.x release" confused people.
> >
> > It is in fact a wider matter that can be discussed when somebody
> >
> > actually
> >
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> >
> > > So if anybody objects to removing branch-2.10, please make an argument.
> >
> > Otherwise we should go ahead and just do it next week.
> >
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > Thanks for the detailed thoughts, everyone.
> >
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> >
> > releases. As for putting features into minor/patch releases, if we
> >
> > keep the
> >
> > convention of putting new features only into minor releases, my
> >
> > assumption
> >
> > is still that it's unlikely people will want to get them into branch-2
> >
> > (based on the 2.10.0 release process). For the java 11 issue, we
> >
> > haven't
> >
> > even really removed support for java 7 in branch-2 (much less java 8),
> >
> > so I
> >
> > feel moving to java 11 would go along with a move to branch 3. And as
> >
> > you
> >
> > mentioned, if people really want to use java 11 on branch-2, we can
> >
> > always
> >
> > revive branch-2. But for now I think the convenience of not needing to
> >
> > port
> >
> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >
> > potentially needing to revive branch-2.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> >
> > > +1 for 2.10.x as the last release line for 2.x.
> > >
> > > Software becomes more compatible when more companies stress test the
> > > same software and make improvements in trunk.  Some may be extra
> > > cautious about moving up a version because of internal obligations to
> > > keep things running.  Company obligations should not be the driving
> > > force for maintaining Hadoop branches.  There is no proper
> > > collaboration in the community when every name-brand company maintains
> > > its own Hadoop 2.x version.  I think it would be healthier for the
> > > community to reduce branch forking and spend energy on trunk to harden
> > > the software.  This will give more confidence to move up the version
> > > than trying to fix n permutations of breakage, like the Flash fixing
> > > the timeline.
> > >
> > > The Apache license states that there is no warranty of any kind for
> > > code contributions.  Fewer community release processes should improve
> > > software quality when eyes are on trunk, and help steer toward the
> > > same end goals.
> >
> >
> > regards,
> >
> > Eric
> >
> >
> >
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> >
> > Hello all,
> >
> >
> > Is it written anywhere what the difference is between a minor release
> >
> > and a
> >
> > point/dot/maintenance (I'll use "point" from here on out) release? I
> >
> > have
> >
> > looked around and I can't find anything other than some compatibility
> >
> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
> >
> > think
> >
> > this would help shape my opinion on whether or not to keep branch-2
> >
> > alive.
> >
> > My current understanding is that we can't really break compatibility
> >
> > in
> >
> > either a minor or point release. But the only mention of the
> >
> > difference
> >
> > between minor and point releases is how to deal with Stable,
> >
> > Evolving,
> >
> > and
> >
> > Unstable tags, and how to deal with changing default configuration
> >
> > values.
> >
> > So it seems like there really isn't a big official difference between
> >
> > the
> >
> > two. In my mind, the functional difference between the two is that
> >
> > the
> >
> > minor releases may have added features and rewrites, while the point
> >
> > releases only have bug fixes. This might be an incorrect
> >
> > understanding, but
> >
> > that's what I have gathered from watching the releases over the last
> >
> > few
> >
> > years. Whether or not this is a correct understanding, I think that
> >
> > this
> >
> > needs to be documented somewhere, even if it is just a convention.
> >
> >
> > Given my assumed understanding of minor vs point releases, here are
> >
> > the
> >
> > pros/cons that I can think of for having a branch-2. Please add on or
> >
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> >
> > - Features/rewrites/higher-risk patches are less likely to be put
> >
> > into
> >
> > 2.10.x
> >
> > - It is less necessary to move to 3.x
> >
> >
> > Cons:
> >
> > - Bug fixes are less likely to be put into 2.10.x
> >
> > - An extra branch to maintain
> >
> >  - Committers have an extra branch (5 vs 4 total branches) to commit
> >
> > patches to if they should go all the way back to 2.10.x
> >
> > - It is less necessary to move to 3.x
> >
> >
> > So on the one hand you get added stability in fewer features being
> >
> > committed to 2.10.x, but then on the other you get fewer bug fixes
> >
> > being
> >
> > committed. In a perfect world, we wouldn't have to make this
> >
> > tradeoff.
> >
> > But
> >
> > we don't live in a perfect world and committers will make mistakes
> >
> > either
> >
> > because of lack of knowledge or simply because they made a mistake.
> >
> > If
> >
> > we
> >
> > have a branch-2, committers will forget, not know to, or choose not
> >
> > to
> >
> > (for
> >
> > whatever reason) commit valid bug fixes back all the way to
> >
> > branch-2.10. If
> >
> > we don't have a branch-2, committers who want their borderline risky
> >
> > feature in the 2.x line will err on the side of putting it into
> >
> > branch-2.10
> >
> > instead of proposing the creation of a branch-2. Clearly I have made
> >
> > quite
> >
> > a few assumptions here based on my own experiences, so I would like
> >
> > to
> >
> > hear
> >
> > if others have similar or opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for
> >
> > killing
> >
> > branch-2 is due to an effort to push the community towards 3.x. This
> >
> > is why
> >
> > I have added movement to 3.x as both a pro and a con. As a community
> >
> > trying
> >
> > to move forward, keeping as many companies on similar branches as
> >
> > possible
> >
> > is a good way to make sure the code is well-tested. However, from a
> >
> > stability point of view, moving to 3.x is still scary and being able
> >
> > to
> >
> > stay on 2.x until you are comfortable to move is very nice. The
> >
> > 2.10.0
> >
> > bridge release effort has been very good at making it possible for
> >
> > people
> >
> > > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
> >
> > that
> >
> > it is reasonable for companies to want to be extra cautious with 3.x
> >
> > due to
> >
> > potential performance degradation at large scale.
> >
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> >
> > someone is still on 2.x? If they want to backport HADOOP-15338
> >
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >
> > support to
> >
> > 2.x, surely not everyone is going to want that (at least not
> >
> > immediately).
> >
> > The 2.10 documentation states, "The JVM requirements will not change
> >
> > across
> >
> > point releases within the same minor release except if the JVM
> >
> > version
> >
> > under question becomes unsupported" [1], so this would warrant a 2.11
> >
> > release until Java 8 becomes unsupported (though one could argue that
> >
> > it is
> >
> > already unsupported since Oracle is no longer giving public Java 8
> >
> > update).
> >
> > If we don't keep branch-2 around now, would a Java 11 backport be the
> >
> > catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on whether or
> > not we should keep branch-2 alive, but these are the things that I am
> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
> > not is committers not being on the same page with where they should
> > commit their patches.
> >
> > Eric
> >
> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> > Hi Konstantin,
> >
> > Sure, I understand those concerns. On the other hand, I worry about the
> > stability of 2.10, since we will be on it for a couple of years at least.
> > I worry that some committers may want to put new features into a branch 2
> > release, and without a branch-2, they will go directly into 2.10. Since
> > we don't always catch corner cases or performance problems for some time
> > (usually not until the release is deployed to a busy, 4-thousand node
> > cluster), it may be very difficult to back out those changes.
> >
> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
> > I do have these reservations.
> >
> >
> > Thanks,
> >
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> > <shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > We had a long discussion on this list regarding making the 2.10 release
> > the last of branch-2 releases. We intended 2.10 as a bridge release
> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
> > not in the picture right now, and many people may object to this idea.
> >
> > I understand Jonathan's proposal as an attempt to
> > 1. eliminate confusion about which branches people should commit their
> >    back-ports to
> > 2. save engineering effort committing to more branches than necessary
> >
> > "Branches are cheap" as our founder used to say. If we ever decide to
> > release 2.11 we can resurrect the branch.
> > Until then I am in favor of Jonathan's proposal +1.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> > wrote:
> >
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
> > much higher than a new 2.11 minor release. (There didn't seem to be many
> > people outside of our company who expressed interest in getting new
> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> > after the 2.10.0 release, there are 29 patches that have gone into
> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to,
> > by recreating branch-2. But this proposal would reduce a lot of confusion
> > IMO.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature
> > to make a decision at this point that there will never be a need for a
> > 2.11 release.
> >
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> > <jyhung2357@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a
> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > minor release line in branch-2. Currently, the main issue is that there
> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> > going into branch-2.10 (which will become 2.10.1), so the fixes in
> > branch-2 will likely never see the light of day unless they are
> > backported to branch-2.10.
> >
> > To do this, I propose we:
> >
> > - Delete branch-2.10
> > - Rename branch-2 to branch-2.10
> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>
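
The commit chain in the quoted proposal (trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> ...) determines how many branches a fix has to land on. The following is only a minimal Python sketch of that idea and is not taken from the thread: the function name `backport_targets` is hypothetical, and `OLD_CHAIN` (with branch-2 sitting between branch-3.1 and branch-2.10) is an assumption inferred from the "5 vs 4 total branches" remark.

```python
# Commit chains, newest line first. NEW_CHAIN is the chain quoted in the
# proposal. OLD_CHAIN is an assumption: it places branch-2 between
# branch-3.1 and branch-2.10, inferred from "5 vs 4 total branches".
NEW_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
             "branch-2.10", "branch-2.9", "branch-2.8"]
OLD_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
             "branch-2", "branch-2.10", "branch-2.9", "branch-2.8"]


def backport_targets(oldest_branch, chain):
    """Return every branch a fix lands on, from trunk down to oldest_branch."""
    if oldest_branch not in chain:
        raise ValueError(f"{oldest_branch!r} is not in the commit chain")
    # Everything from the head of the chain up to and including the target.
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    # With branch-2 removed, a fix for 2.10.x touches four branches.
    print(backport_targets("branch-2.10", NEW_CHAIN))
    # With branch-2 still present, the same fix touches five.
    print(len(backport_targets("branch-2.10", OLD_CHAIN)))
```

Under these assumptions, a fix destined for 2.10.x needs four commits with the new chain and five with the old one, which is the extra effort the thread is weighing.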

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira,
> > It seems there was an INFRA ticket for that: INFRA-19581.
> > But the INFRA people closed it as "Won't Do", and yes, the branch is
> > protected; we can’t delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> > don't try to commit to it)
> >
> >
> > Completed procedure:
> >
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >   - Deleted old branch-2.10
> >   - Renamed branch-2 to (new) branch-2.10
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
> >
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
> > <shv.hadoop@gmail.com> wrote:
> >
> > Hey guys,
> >
> > I think we diverged a bit from the initial topic of this discussion,
> > which is removing branch-2.10, and changing the version of branch-2 from
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> > 2.x release" confused people.
> > It is in fact a wider matter that can be discussed when somebody actually
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> > So if anybody objects to removing branch-2.10 please make an argument.
> > Otherwise we should go ahead and just do it next week.
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
> >
> > Thanks for the detailed thoughts, everyone.
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> > releases. As for putting features into minor/patch releases, if we keep
> > the convention of putting new features only into minor releases, my
> > assumption is still that it's unlikely people will want to get them into
> > branch-2 (based on the 2.10.0 release process). For the Java 11 issue, we
> > haven't even really removed support for Java 7 in branch-2 (much less
> > Java 8), so I feel moving to Java 11 would go along with a move to
> > branch 3. And as you mentioned, if people really want to use Java 11 on
> > branch-2, we can always revive branch-2. But for now I think the
> > convenience of not needing to port to both branch-2 and branch-2.10 (and
> > below) outweighs the cost of potentially needing to revive branch-2.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> >
> > +1 for 2.10.x as the last release for the 2.x version.
> >
> > Software would become more compatible when more companies stress test
> > the same software and make improvements in trunk.  Some may be extra
> > cautious about moving up the version because of internal obligations to
> > keep things running.  Company obligations should not be the driving force
> > to maintain Hadoop branches.  There is no proper collaboration in the
> > community when every name-brand company maintains its own Hadoop 2.x
> > version.  I think it would be healthier for the community to reduce the
> > branch forking and spend energy on trunk to harden the software.  This
> > will give more confidence to move up the version than trying to fix n
> > permutations of breakage like the Flash fixing the timeline.
> >
> > The Apache license states that there is no warranty of any kind for code
> > contributions.  Fewer community release processes should improve software
> > quality when eyes are on trunk, and help steer toward the same end goals.
> >
> >
> > regards,
> >
> > Eric
> >
> >
> >
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> > Hello all,
> >
> > Is it written anywhere what the difference is between a minor release and
> > a point/dot/maintenance (I'll use "point" from here on out) release? I
> > have looked around and I can't find anything other than some
> > compatibility documentation in 2.x that has since been removed in 3.x
> > [1] [2]. I think this would help shape my opinion on whether or not to
> > keep branch-2 alive. My current understanding is that we can't really
> > break compatibility in either a minor or point release. But the only
> > mention of the difference between minor and point releases is how to deal
> > with Stable, Evolving, and Unstable tags, and how to deal with changing
> > default configuration values. So it seems like there really isn't a big
> > official difference between the two. In my mind, the functional
> > difference between the two is that the minor releases may have added
> > features and rewrites, while the point releases only have bug fixes. This
> > might be an incorrect understanding, but that's what I have gathered from
> > watching the releases over the last few years. Whether or not this is a
> > correct understanding, I think that this needs to be documented
> > somewhere, even if it is just a convention.
> >
> >
> > Given my assumed understanding of minor vs point releases, here are the
> > pros/cons that I can think of for having a branch-2. Please add on or
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> > - Features/rewrites/higher-risk patches are less likely to be put into
> >   2.10.x
> > - It is less necessary to move to 3.x
> >
> > Cons:
> > - Bug fixes are less likely to be put into 2.10.x
> > - An extra branch to maintain
> >  - Committers have an extra branch (5 vs 4 total branches) to commit
> >    patches to if they should go all the way back to 2.10.x
> > - It is less necessary to move to 3.x
> >
> >
> > So on the one hand you get added stability in fewer features being
> > committed to 2.10.x, but then on the other you get fewer bug fixes being
> > committed. In a perfect world, we wouldn't have to make this tradeoff.
> > But we don't live in a perfect world and committers will make mistakes
> > either because of lack of knowledge or simply because they made a
> > mistake. If we have a branch-2, committers will forget, not know to, or
> > choose not to (for whatever reason) commit valid bug fixes back all the
> > way to branch-2.10. If we don't have a branch-2, committers who want
> > their borderline risky feature in the 2.x line will err on the side of
> > putting it into branch-2.10 instead of proposing the creation of a
> > branch-2. Clearly I have made quite a few assumptions here based on my
> > own experiences, so I would like to hear if others have similar or
> > opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for killing
> > branch-2 is due to an effort to push the community towards 3.x. This is
> > why I have added movement to 3.x as both a pro and a con. As a community
> > trying to move forward, keeping as many companies on similar branches as
> > possible is a good way to make sure the code is well-tested. However,
> > from a stability point of view, moving to 3.x is still scary and being
> > able to stay on 2.x until you are comfortable to move is very nice. The
> > 2.10.0 bridge release effort has been very good at making it possible
> > for people to move from 2.x to 3.x, but the diff between 2.x and 3.x is
> > so large that it is reasonable for companies to want to be extra
> > cautious with 3.x due to potential performance degradation at large
> > scale.
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> > someone is still on 2.x? If they want to backport HADOOP-15338
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
> > to 2.x, surely not everyone is going to want that (at least not
> > immediately). The 2.10 documentation states, "The JVM requirements will
> > not change across point releases within the same minor release except if
> > the JVM version under question becomes unsupported" [1], so this would
> > warrant a 2.11 release until Java 8 becomes unsupported (though one could
> > argue that it is already unsupported since Oracle is no longer giving
> > public Java 8 updates). If we don't keep branch-2 around now, would a
> > Java 11 backport be the catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on whether or
> > not we should keep branch-2 alive, but these are the things that I am
> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
> > not is committers not being on the same page with where they should
> > commit their patches.
> >
> > Eric
> >
> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> > Hi Konstantin,
> >
> > Sure, I understand those concerns. On the other hand, I worry about the
> > stability of 2.10, since we will be on it for a couple of years at least.
> > I worry that some committers may want to put new features into a branch 2
> > release, and without a branch-2, they will go directly into 2.10. Since
> > we don't always catch corner cases or performance problems for some time
> > (usually not until the release is deployed to a busy, 4-thousand node
> > cluster), it may be very difficult to back out those changes.
> >
> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
> > I do have these reservations.
> >
> >
> > Thanks,
> >
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> > <shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > We had a long discussion on this list regarding making the 2.10 release
> > the last of branch-2 releases. We intended 2.10 as a bridge release
> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
> > not in the picture right now, and many people may object to this idea.
> >
> > I understand Jonathan's proposal as an attempt to
> > 1. eliminate confusion about which branches people should commit their
> >    back-ports to
> > 2. save engineering effort committing to more branches than necessary
> >
> > "Branches are cheap" as our founder used to say. If we ever decide to
> > release 2.11 we can resurrect the branch.
> > Until then I am in favor of Jonathan's proposal +1.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> > wrote:
> >
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
> > much higher than a new 2.11 minor release. (There didn't seem to be many
> > people outside of our company who expressed interest in getting new
> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> > after the 2.10.0 release, there are 29 patches that have gone into
> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to,
> > by recreating branch-2. But this proposal would reduce a lot of confusion
> > IMO.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature
> > to make a decision at this point that there will never be a need for a
> > 2.11 release.
> >
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> > <jyhung2357@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a
> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > minor release line in branch-2. Currently, the main issue is that there
> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> > going into branch-2.10 (which will become 2.10.1), so the fixes in
> > branch-2 will likely never see the light of day unless they are
> > backported to branch-2.10.
> >
> > To do this, I propose we:
> >
> > - Delete branch-2.10
> > - Rename branch-2 to branch-2.10
> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>
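
The first step of the completed procedure quoted above, "Verified everything in old branch-2.10 was in old branch-2", could be checked roughly as follows. This is only a sketch under stated assumptions, not the method actually used in the thread: it assumes each commit subject carries a JIRA key for one of the four projects named in the procedure (HADOOP/YARN/HDFS/MAPREDUCE), that the subject lines were collected beforehand (for example with `git log --format=%s <branch>`), and the names `jira_keys` and `missing_from` are hypothetical.

```python
import re

# Assumption: commit subjects contain a JIRA key such as "HADOOP-15338".
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the first JIRA key found in each commit subject line."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.search(subject)
        if match:
            keys.add(match.group(0))
    return keys


def missing_from(target_subjects, source_subjects):
    """Return JIRA keys committed to the source branch but not the target."""
    return sorted(jira_keys(source_subjects) - jira_keys(target_subjects))


if __name__ == "__main__":
    # Hypothetical subject lines, for illustration only.
    old_branch_2 = ["HADOOP-1. Fix A", "YARN-2. Fix B", "HDFS-3. Fix C"]
    old_branch_2_10 = ["HADOOP-1. Fix A", "YARN-2. Fix B"]
    # An empty list means nothing would be lost by deleting old branch-2.10.
    print(missing_from(old_branch_2, old_branch_2_10))
```

Comparing JIRA keys rather than commit hashes is a deliberate choice in this sketch, since a cherry-picked backport gets a new hash on each branch.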

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira
> > Seems there was an INFRA ticket for that. INFRA-19581,
> > But the INFRA people closed as wont do and yes, the branch is protected,
> > we can’t delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> >
> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
> please
> >
> > don't try to commit to it)
> >
> >
> > Completed procedure:
> >
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >
> >   - Delete old branch-2.10
> >
> >   - Rename branch-2 to (new) branch-2.10
> >
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
> >
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> >
> > shv.hadoop@gmail.com>
> >
> > wrote:
> >
> >
> > Hey guys,
> >
> >
> > I think we diverged a bit from the initial topic of this discussion,
> >
> > which is removing branch-2.10, and changing the version of branch-2 from
> >
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> >
> > 2.x release" confused people.
> >
> > It is in fact a wider matter that can be discussed when somebody
> >
> > actually
> >
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> >
> > So if anybody objects removing branch-2.10 please make an argument.
> >
> > Otherwise we should go ahead and just do it next week.
> >
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > Thanks for the detailed thoughts, everyone.
> >
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> >
> > releases. As for putting features into minor/patch releases, if we
> >
> > keep the
> >
> > convention of putting new features only into minor releases, my
> >
> > assumption
> >
> > is still that it's unlikely people will want to get them into branch-2
> >
> > (based on the 2.10.0 release process). For the java 11 issue, we
> >
> > haven't
> >
> > even really removed support for java 7 in branch-2 (much less java 8),
> >
> > so I
> >
> > feel moving to java 11 would go along with a move to branch 3. And as
> >
> > you
> >
> > mentioned, if people really want to use java 11 on branch-2, we can
> >
> > always
> >
> > revive branch-2. But for now I think the convenience of not needing to
> >
> > port
> >
> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >
> > potentially needing to revive branch-2.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> >
> > +1 for 2.10.x as last release for 2.x version.
> >
> >
> > Software would become more compatible when more companies stress test
> >
> > the same software and making improvements in trunk.  Some may be extra
> >
> > caution on moving up the version because obligation internally to keep
> >
> > things running.  Company obligation should not be the driving force to
> >
> > maintain Hadoop branches.  There is no proper collaboration in the
> >
> > community when every name brand company maintains its own Hadoop 2.x
> >
> > version.  I think it would be more healthy for the community to
> >
> > reduce the
> >
> > branch forking and spend energy on trunk to harden the software.
> >
> > This will
> >
> > give more confidence to move up the version than trying to fix n
> >
> > permutations breakage like Flash fixing the timeline.
> >
> >
> > Apache license stated, there is no warranty of any kind for code
> >
> > contributions.  Fewer community release process should improve
> >
> > software
> >
> > quality when eyes are on trunk, and help steering toward the same end
> >
> > goals.
> >
> >
> > regards,
> >
> > Eric
> >
> >
> >
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> >
> > Hello all,
> >
> >
> > Is it written anywhere what the difference is between a minor release
> >
> > and a
> >
> > point/dot/maintenance (I'll use "point" from here on out) release? I
> >
> > have
> >
> > looked around and I can't find anything other than some compatibility
> >
> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
> >
> > think
> >
> > this would help shape my opinion on whether or not to keep branch-2
> >
> > alive.
> >
> > My current understanding is that we can't really break compatibility
> >
> > in
> >
> > either a minor or point release. But the only mention of the
> >
> > difference
> >
> > between minor and point releases is how to deal with Stable,
> >
> > Evolving,
> >
> > and
> >
> > Unstable tags, and how to deal with changing default configuration
> >
> > values.
> >
> > So it seems like there really isn't a big official difference between
> >
> > the
> >
> > two. In my mind, the functional difference between the two is that
> >
> > the
> >
> > minor releases may have added features and rewrites, while the point
> >
> > releases only have bug fixes. This might be an incorrect
> >
> > understanding, but
> >
> > that's what I have gathered from watching the releases over the last
> >
> > few
> >
> > years. Whether or not this is a correct understanding, I think that
> >
> > this
> >
> > needs to be documented somewhere, even if it is just a convention.
> >
> >
> > Given my assumed understanding of minor vs point releases, here are
> >
> > the
> >
> > pros/cons that I can think of for having a branch-2. Please add on or
> >
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> >
> > - Features/rewrites/higher-risk patches are less likely to be put
> >
> > into
> >
> > 2.10.x
> >
> > - It is less necessary to move to 3.x
> >
> >
> > Cons:
> >
> > - Bug fixes are less likely to be put into 2.10.x
> >
> > - An extra branch to maintain
> >
> >  - Committers have an extra branch (5 vs 4 total branches) to commit
> >
> > patches to if they should go all the way back to 2.10.x
> >
> > - It is less necessary to move to 3.x
> >
> >
> > So on the one hand you get added stability in fewer features being
> >
> > committed to 2.10.x, but then on the other you get fewer bug fixes
> >
> > being
> >
> > committed. In a perfect world, we wouldn't have to make this
> >
> > tradeoff.
> >
> > But
> >
> > we don't live in a perfect world and committers will make mistakes
> >
> > either
> >
> > because of lack of knowledge or simply because they made a mistake.
> >
> > If
> >
> > we
> >
> > have a branch-2, committers will forget, not know to, or choose not
> >
> > to
> >
> > (for
> >
> > whatever reason) commit valid bug fixes back all the way to
> >
> > branch-2.10. If
> >
> > we don't have a branch-2, committers who want their borderline risky
> >
> > feature in the 2.x line will err on the side of putting it into
> >
> > branch-2.10
> >
> > instead of proposing the creation of a branch-2. Clearly I have made
> >
> > quite
> >
> > a few assumptions here based on my own experiences, so I would like
> >
> > to
> >
> > hear
> >
> > if others have similar or opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for
> >
> > killing
> >
> > branch-2 is due to an effort to push the community towards 3.x. This
> >
> > is why
> >
> > I have added movement to 3.x as both a pro and a con. As a community
> >
> > trying
> >
> > to move forward, keeping as many companies on similar branches as
> >
> > possible
> >
> > is a good way to make sure the code is well-tested. However, from a
> >
> > stability point of view, moving to 3.x is still scary and being able
> >
> > to
> >
> > stay on 2.x until you are comfortable to move is very nice. The
> >
> > 2.10.0
> >
> > bridge release effort has been very good at making it possible for
> >
> > people
> >
> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
> >
> > that
> >
> > it is reasonable for companies to want to be extra cautious with 3.x
> >
> > due to
> >
> > potential performance degradation at large scale.
> >
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> >
> > someone is still on 2.x? If they want to backport HADOOP-15338
> >
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >
> > support to
> >
> > 2.x, surely not everyone is going to want that (at least not
> >
> > immediately).
> >
> > The 2.10 documentation states, "The JVM requirements will not change
> >
> > across
> >
> > point releases within the same minor release except if the JVM
> >
> > version
> >
> > under question becomes unsupported" [1], so this would warrant a 2.11
> >
> > release until Java 8 becomes unsupported (though one could argue that
> >
> > it is
> >
> > already unsupported since Oracle is no longer giving public Java 8
> >
> > updates).
> >
> > If we don't keep branch-2 around now, would a Java 11 backport be the
> >
> > catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on
> >
> > whether
> >
> > or
> >
> > not we should keep branch-2 alive, but these are the things that I am
> >
> > weighing in my mind. For me, the bigger problem beyond having
> >
> > branch-2
> >
> > or
> >
> > not is committers not being on the same page with where they should
> >
> > commit
> >
> > their patches.
> >
> >
> > Eric
> >
> >
> > [1]
> >
> >
> >
> >
> >
> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> > [2]
> >
> >
> >
> >
> >
> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> >
> >
> > wrote:
> >
> >
> > Hi Konstantin,
> >
> >
> > Sure, I understand those concerns. On the other hand, I worry about
> >
> > the
> >
> > stability of 2.10, since we will be on it for a couple of years at
> >
> > least.
> >
> > I worry
> >
> > that some committers may want to put new features into a branch 2
> >
> > release,
> >
> > and without a branch-2, they will go directly into 2.10. Since we
> >
> > don't
> >
> > always
> >
> > catch corner cases or performance problems for some time (usually
> >
> > not
> >
> > until
> >
> > the release is deployed to a busy, 4-thousand node cluster), it
> >
> > may
> >
> > be
> >
> > very
> >
> > difficult to back out those changes.
> >
> >
> > It sounds like I'm in the minority here, so I'm not nixing the
> >
> > idea,
> >
> > but I
> >
> > do
> >
> > have these reservations.
> >
> >
> > Thanks,
> >
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> >
> > <
> >
> > shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> >
> > We had a long discussion on this list regarding making the 2.10
> >
> > release the
> >
> > last of branch-2 releases. We intended 2.10 as a bridge release
> >
> > between
> >
> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
> >
> > not in
> >
> > the picture right now, and many people may object to this idea.
> >
> >
> > I understand Jonathan's proposal as an attempt to
> >
> > 1. eliminate confusion about which branches people should commit their
> >
> > back-ports
> >
> > to
> >
> > 2. save engineering effort committing to more branches than
> >
> > necessary
> >
> >
> > "Branches are cheap" as our founder used to say. If we ever decide
> >
> > to
> >
> > release 2.11 we can resurrect the branch.
> >
> > Until then I am in favor of Jonathan's proposal +1.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> >
> > jyhung2357@gmail.com>
> >
> >
> > wrote:
> >
> >
> > Thanks Eric for the comments - regarding your concerns, I feel
> >
> > the
> >
> > pros
> >
> > outweigh the cons. To me, the chances of patch releases on 2.10.x
> >
> > are
> >
> > much
> >
> > higher than a new 2.11 minor release. (There didn't seem to be
> >
> > many
> >
> > people
> >
> > outside of our company who expressed interest in getting new
> >
> > features to
> >
> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> >
> > after
> >
> > 2.10.0
> >
> > release, there's 29 patches that have gone into branch-2 and 9 in
> >
> > branch-2.10, so it's already diverged quite a bit.
> >
> >
> > In any case, we can always reverse this decision if we really
> >
> > need
> >
> > to, by
> >
> > recreating branch-2. But this proposal would reduce a lot of
> >
> > confusion
> >
> > IMO.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >
> > epayne@apache.org>
> >
> > wrote:
> >
> >
> > Thanks Jonathan for opening the discussion.
> >
> >
> > I am not in favor of this proposal. 2.10 was very recently
> >
> > released,
> >
> > and
> >
> > moving to 2.10 will take some time for the community. It seems
> >
> > premature
> >
> > to
> >
> > make a decision at this point that there will never be a need
> >
> > for a
> >
> > 2.11
> >
> > release.
> >
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> >
> > <
> >
> > jyhung2357@gmail.com> wrote:
> >
> >
> > Hi folks,
> >
> >
> > Given the release of 2.10.0, and the fact that it's intended to
> >
> > be a
> >
> > bridge
> >
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
> >
> > last
> >
> > minor
> >
> > release line in branch-2. Currently, the main issue is that
> >
> > there's
> >
> > many
> >
> > fixes going into branch-2 (the theoretical 2.11.0) that's not
> >
> > going
> >
> > into
> >
> > branch-2.10 (which will become 2.10.1), so the fixes in
> >
> > branch-2
> >
> > will
> >
> > likely never see the light of day unless they are backported to
> >
> > branch-2.10.
> >
> >
> > To do this, I propose we:
> >
> >
> > - Delete branch-2.10
> >
> > - Rename branch-2 to branch-2.10
> >
> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> >
> > This way we get all the current branch-2 fixes into the 2.10.x
> >
> > release
> >
> > line. Then the commit chain will look like: trunk -> branch-3.2
> >
> > ->
> >
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> >
> > Thoughts?
> >
> >
> > Jonathan Hung
> >
> >
> > [1]
> >
> >
> >
> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira
> > Seems there was an INFRA ticket for that. INFRA-19581,
> > But the INFRA people closed it as "won't do" and yes, the branch is protected,
> > we can’t delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> >
> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> >
> > don't try to commit to it)
> >
> >
> > Completed procedure:
> >
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >
> >   - Delete old branch-2.10
> >
> >   - Rename branch-2 to (new) branch-2.10
> >
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
> >
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> >
> > shv.hadoop@gmail.com>
> >
> > wrote:
> >
> >
> > Hey guys,
> >
> >
> > I think we diverged a bit from the initial topic of this discussion,
> >
> > which is removing branch-2.10, and changing the version of branch-2 from
> >
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> >
> > 2.x release" confused people.
> >
> > It is in fact a wider matter that can be discussed when somebody
> >
> > actually
> >
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> >
> > So if anybody objects to removing branch-2.10, please make an argument.
> >
> > Otherwise we should go ahead and just do it next week.
> >
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > Thanks for the detailed thoughts, everyone.
> >
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> >
> > releases. As for putting features into minor/patch releases, if we
> >
> > keep the
> >
> > convention of putting new features only into minor releases, my
> >
> > assumption
> >
> > is still that it's unlikely people will want to get them into branch-2
> >
> > (based on the 2.10.0 release process). For the java 11 issue, we
> >
> > haven't
> >
> > even really removed support for java 7 in branch-2 (much less java 8),
> >
> > so I
> >
> > feel moving to java 11 would go along with a move to branch 3. And as
> >
> > you
> >
> > mentioned, if people really want to use java 11 on branch-2, we can
> >
> > always
> >
> > revive branch-2. But for now I think the convenience of not needing to
> >
> > port
> >
> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >
> > potentially needing to revive branch-2.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> >
> > +1 for 2.10.x as last release for 2.x version.
> >
> >
> > Software would become more compatible when more companies stress test
> >
> > the same software and make improvements in trunk.  Some may be extra
> >
> > cautious about moving up the version because of internal obligations to keep
> >
> > things running.  Company obligation should not be the driving force to
> >
> > maintain Hadoop branches.  There is no proper collaboration in the
> >
> > community when every name brand company maintains its own Hadoop 2.x
> >
> > version.  I think it would be more healthy for the community to
> >
> > reduce the
> >
> > branch forking and spend energy on trunk to harden the software.
> >
> > This will
> >
> > give more confidence to move up the version than trying to fix n
> >
> > permutations breakage like Flash fixing the timeline.
> >
> >
> > The Apache license states that there is no warranty of any kind for code
> >
> > contributions.  Fewer community release processes should improve
> >
> > software
> >
> > quality when eyes are on trunk, and help steer toward the same end
> >
> > goals.
> >
> >
> > regards,
> >
> > Eric
> >
> >
> >
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> >
> > Hello all,
> >
> >
> > Is it written anywhere what the difference is between a minor release
> >
> > and a
> >
> > point/dot/maintenance (I'll use "point" from here on out) release? I
> >
> > have
> >
> > looked around and I can't find anything other than some compatibility
> >
> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
> >
> > think
> >
> > this would help shape my opinion on whether or not to keep branch-2
> >
> > alive.
> >
> > My current understanding is that we can't really break compatibility
> >
> > in
> >
> > either a minor or point release. But the only mention of the
> >
> > difference
> >
> > between minor and point releases is how to deal with Stable,
> >
> > Evolving,
> >
> > and
> >
> > Unstable tags, and how to deal with changing default configuration
> >
> > values.
> >
> > So it seems like there really isn't a big official difference between
> >
> > the
> >
> > two. In my mind, the functional difference between the two is that
> >
> > the
> >
> > minor releases may have added features and rewrites, while the point
> >
> > releases only have bug fixes. This might be an incorrect
> >
> > understanding, but
> >
> > that's what I have gathered from watching the releases over the last
> >
> > few
> >
> > years. Whether or not this is a correct understanding, I think that
> >
> > this
> >
> > needs to be documented somewhere, even if it is just a convention.
> >
> >
> > Given my assumed understanding of minor vs point releases, here are
> >
> > the
> >
> > pros/cons that I can think of for having a branch-2. Please add on or
> >
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> >
> > - Features/rewrites/higher-risk patches are less likely to be put
> >
> > into
> >
> > 2.10.x
> >
> > - It is less necessary to move to 3.x
> >
> >
> > Cons:
> >
> > - Bug fixes are less likely to be put into 2.10.x
> >
> > - An extra branch to maintain
> >
> >  - Committers have an extra branch (5 vs 4 total branches) to commit
> >
> > patches to if they should go all the way back to 2.10.x
> >
> > - It is less necessary to move to 3.x
> >
> >
> > So on the one hand you get added stability in fewer features being
> >
> > committed to 2.10.x, but then on the other you get fewer bug fixes
> >
> > being
> >
> > committed. In a perfect world, we wouldn't have to make this
> >
> > tradeoff.
> >
> > But
> >
> > we don't live in a perfect world and committers will make mistakes
> >
> > either
> >
> > because of lack of knowledge or simply because they made a mistake.
> >
> > If
> >
> > we
> >
> > have a branch-2, committers will forget, not know to, or choose not
> >
> > to
> >
> > (for
> >
> > whatever reason) commit valid bug fixes back all the way to
> >
> > branch-2.10. If
> >
> > we don't have a branch-2, committers who want their borderline risky
> >
> > feature in the 2.x line will err on the side of putting it into
> >
> > branch-2.10
> >
> > instead of proposing the creation of a branch-2. Clearly I have made
> >
> > quite
> >
> > a few assumptions here based on my own experiences, so I would like
> >
> > to
> >
> > hear
> >
> > if others have similar or opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for
> >
> > killing
> >
> > branch-2 is due to an effort to push the community towards 3.x. This
> >
> > is why
> >
> > I have added movement to 3.x as both a pro and a con. As a community
> >
> > trying
> >
> > to move forward, keeping as many companies on similar branches as
> >
> > possible
> >
> > is a good way to make sure the code is well-tested. However, from a
> >
> > stability point of view, moving to 3.x is still scary and being able
> >
> > to
> >
> > stay on 2.x until you are comfortable to move is very nice. The
> >
> > 2.10.0
> >
> > bridge release effort has been very good at making it possible for
> >
> > people
> >
> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
> >
> > that
> >
> > it is reasonable for companies to want to be extra cautious with 3.x
> >
> > due to
> >
> > potential performance degradation at large scale.
> >
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> >
> > someone is still on 2.x? If they want to backport HADOOP-15338
> >
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >
> > support to
> >
> > 2.x, surely not everyone is going to want that (at least not
> >
> > immediately).
> >
> > The 2.10 documentation states, "The JVM requirements will not change
> >
> > across
> >
> > point releases within the same minor release except if the JVM
> >
> > version
> >
> > under question becomes unsupported" [1], so this would warrant a 2.11
> >
> > release until Java 8 becomes unsupported (though one could argue that
> >
> > it is
> >
> > already unsupported since Oracle is no longer giving public Java 8
> >
> > updates).
> >
> > If we don't keep branch-2 around now, would a Java 11 backport be the
> >
> > catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on
> >
> > whether
> >
> > or
> >
> > not we should keep branch-2 alive, but these are the things that I am
> >
> > weighing in my mind. For me, the bigger problem beyond having
> >
> > branch-2
> >
> > or
> >
> > not is committers not being on the same page with where they should
> >
> > commit
> >
> > their patches.
> >
> >
> > Eric
> >
> >
> > [1]
> >
> >
> >
> >
> >
> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> > [2]
> >
> >
> >
> >
> >
> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> >
> >
> > wrote:
> >
> >
> > Hi Konstantin,
> >
> >
> > Sure, I understand those concerns. On the other hand, I worry about
> >
> > the
> >
> > stability of 2.10, since we will be on it for a couple of years at
> >
> > least.
> >
> > I worry
> >
> > that some committers may want to put new features into a branch 2
> >
> > release,
> >
> > and without a branch-2, they will go directly into 2.10. Since we
> >
> > don't
> >
> > always
> >
> > catch corner cases or performance problems for some time (usually
> >
> > not
> >
> > until
> >
> > the release is deployed to a busy, 4-thousand node cluster), it
> >
> > may
> >
> > be
> >
> > very
> >
> > difficult to back out those changes.
> >
> >
> > It sounds like I'm in the minority here, so I'm not nixing the
> >
> > idea,
> >
> > but I
> >
> > do
> >
> > have these reservations.
> >
> >
> > Thanks,
> >
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> >
> > <
> >
> > shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> >
> > We had a long discussion on this list regarding making the 2.10
> >
> > release the
> >
> > last of branch-2 releases. We intended 2.10 as a bridge release
> >
> > between
> >
> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
> >
> > not in
> >
> > the picture right now, and many people may object to this idea.
> >
> >
> > I understand Jonathan's proposal as an attempt to
> >
> > 1. eliminate confusion about which branches people should commit their
> >
> > back-ports
> >
> > to
> >
> > 2. save engineering effort committing to more branches than
> >
> > necessary
> >
> >
> > "Branches are cheap" as our founder used to say. If we ever decide
> >
> > to
> >
> > release 2.11 we can resurrect the branch.
> >
> > Until then I am in favor of Jonathan's proposal +1.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> >
> > jyhung2357@gmail.com>
> >
> >
> > wrote:
> >
> >
> > Thanks Eric for the comments - regarding your concerns, I feel
> >
> > the
> >
> > pros
> >
> > outweigh the cons. To me, the chances of patch releases on 2.10.x
> >
> > are
> >
> > much
> >
> > higher than a new 2.11 minor release. (There didn't seem to be
> >
> > many
> >
> > people
> >
> > outside of our company who expressed interest in getting new
> >
> > features to
> >
> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> >
> > after
> >
> > 2.10.0
> >
> > release, there's 29 patches that have gone into branch-2 and 9 in
> >
> > branch-2.10, so it's already diverged quite a bit.
> >
> >
> > In any case, we can always reverse this decision if we really
> >
> > need
> >
> > to, by
> >
> > recreating branch-2. But this proposal would reduce a lot of
> >
> > confusion
> >
> > IMO.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >
> > epayne@apache.org>
> >
> > wrote:
> >
> >
> > Thanks Jonathan for opening the discussion.
> >
> >
> > I am not in favor of this proposal. 2.10 was very recently
> >
> > released,
> >
> > and
> >
> > moving to 2.10 will take some time for the community. It seems
> >
> > premature
> >
> > to
> >
> > make a decision at this point that there will never be a need
> >
> > for a
> >
> > 2.11
> >
> > release.
> >
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> >
> > <
> >
> > jyhung2357@gmail.com> wrote:
> >
> >
> > Hi folks,
> >
> >
> > Given the release of 2.10.0, and the fact that it's intended to
> >
> > be a
> >
> > bridge
> >
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
> >
> > last
> >
> > minor
> >
> > release line in branch-2. Currently, the main issue is that
> >
> > there's
> >
> > many
> >
> > fixes going into branch-2 (the theoretical 2.11.0) that's not
> >
> > going
> >
> > into
> >
> > branch-2.10 (which will become 2.10.1), so the fixes in
> >
> > branch-2
> >
> > will
> >
> > likely never see the light of day unless they are backported to
> >
> > branch-2.10.
> >
> >
> > To do this, I propose we:
> >
> >
> > - Delete branch-2.10
> >
> > - Rename branch-2 to branch-2.10
> >
> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> >
> > This way we get all the current branch-2 fixes into the 2.10.x
> >
> > release
> >
> > line. Then the commit chain will look like: trunk -> branch-3.2
> >
> > ->
> >
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> >
> > Thoughts?
> >
> >
> > Jonathan Hung
> >
> >
> > [1]
> >
> >
> >
> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>
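
The "5 vs 4 total branches" point raised in the quoted discussion can be
illustrated with a short sketch. It is not part of the original messages.
The post-rename chain is copied from the thread; the function name
backport_targets and the position of branch-2 in the pre-rename chain are
assumptions made only for illustration.

```python
# Commit chain announced in the thread after the rename (newest first).
CHAIN_AFTER_RENAME = ["trunk", "branch-3.2", "branch-3.1",
                      "branch-2.10", "branch-2.9", "branch-2.8"]

# Assumed chain while branch-2 still existed alongside branch-2.10.
CHAIN_WITH_BRANCH_2 = ["trunk", "branch-3.2", "branch-3.1", "branch-2",
                       "branch-2.10", "branch-2.9", "branch-2.8"]


def backport_targets(chain, oldest):
    """Return every branch a patch is committed to, from the head of the
    chain down to and including `oldest`."""
    if oldest not in chain:
        raise ValueError(f"unknown branch: {oldest}")
    return chain[: chain.index(oldest) + 1]


if __name__ == "__main__":
    for label, chain in (("with branch-2", CHAIN_WITH_BRANCH_2),
                         ("after rename", CHAIN_AFTER_RENAME)):
        targets = backport_targets(chain, "branch-2.10")
        print(f"{label}: {len(targets)} branches: {' -> '.join(targets)}")
```

Under these assumptions, a fix that must reach the 2.10.x line touches five
branches while branch-2 exists and four once it is folded into branch-2.10.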

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira
> Seems there was an INFRA ticket for that. INFRA-19581,
> But the INFRA people closed it as "won't do" and yes, the branch is protected,
> we can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>
> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>
> don't try to commit to it)
>
>
> Completed procedure:
>
>
>   - Verified everything in old branch-2.10 was in old branch-2
>
>   - Delete old branch-2.10
>
>   - Rename branch-2 to (new) branch-2.10
>
>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>
>   - Renamed fix versions from 2.11.0 to 2.10.1
>
>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
>
> Jonathan Hung
>
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>
> wrote:
>
>
> FYI, starting the rename process, beginning with INFRA-19521.
>
>
> Jonathan Hung
>
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>
> shv.hadoop@gmail.com>
>
> wrote:
>
>
> Hey guys,
>
>
> I think we diverged a bit from the initial topic of this discussion,
>
> which is removing branch-2.10, and changing the version of branch-2 from
>
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>
> Sounds like the subject line for this thread "Making 2.10 the last minor
>
> 2.x release" confused people.
>
> It is in fact a wider matter that can be discussed when somebody
>
> actually
>
> proposes to release 2.11, which I understand nobody does at the moment.
>
>
> So if anybody objects to removing branch-2.10, please make an argument.
>
> Otherwise we should go ahead and just do it next week.
>
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
>
> Thanks,
>
> --Konstantin
>
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>
> wrote:
>
>
> Thanks for the detailed thoughts, everyone.
>
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
>
> releases. As for putting features into minor/patch releases, if we
>
> keep the
>
> convention of putting new features only into minor releases, my
>
> assumption
>
> is still that it's unlikely people will want to get them into branch-2
>
> (based on the 2.10.0 release process). For the java 11 issue, we
>
> haven't
>
> even really removed support for java 7 in branch-2 (much less java 8),
>
> so I
>
> feel moving to java 11 would go along with a move to branch 3. And as
>
> you
>
> mentioned, if people really want to use java 11 on branch-2, we can
>
> always
>
> revive branch-2. But for now I think the convenience of not needing to
>
> port
>
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>
> potentially needing to revive branch-2.
>
>
> Jonathan Hung
>
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
>
> +1 for 2.10.x as last release for 2.x version.
>
>
> Software would become more compatible when more companies stress test
>
> the same software and make improvements in trunk.  Some may be extra
>
> cautious about moving up the version because of internal obligations to keep
>
> things running.  Company obligation should not be the driving force to
>
> maintain Hadoop branches.  There is no proper collaboration in the
>
> community when every name brand company maintains its own Hadoop 2.x
>
> version.  I think it would be more healthy for the community to
>
> reduce the
>
> branch forking and spend energy on trunk to harden the software.
>
> This will
>
> give more confidence to move up the version than trying to fix n
>
> permutations breakage like Flash fixing the timeline.
>
>
> The Apache license states that there is no warranty of any kind for code
>
> contributions.  Fewer community release processes should improve
>
> software
>
> quality when eyes are on trunk, and help steer toward the same end
>
> goals.
>
>
> regards,
>
> Eric
>
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>
> <eb...@verizonmedia.com.invalid> wrote:
>
>
> Hello all,
>
>
> Is it written anywhere what the difference is between a minor release
>
> and a
>
> point/dot/maintenance (I'll use "point" from here on out) release? I
>
> have
>
> looked around and I can't find anything other than some compatibility
>
> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>
> think
>
> this would help shape my opinion on whether or not to keep branch-2
>
> alive.
>
> My current understanding is that we can't really break compatibility
>
> in
>
> either a minor or point release. But the only mention of the
>
> difference
>
> between minor and point releases is how to deal with Stable,
>
> Evolving,
>
> and
>
> Unstable tags, and how to deal with changing default configuration
>
> values.
>
> So it seems like there really isn't a big official difference between
>
> the
>
> two. In my mind, the functional difference between the two is that
>
> the
>
> minor releases may have added features and rewrites, while the point
>
> releases only have bug fixes. This might be an incorrect
>
> understanding, but
>
> that's what I have gathered from watching the releases over the last
>
> few
>
> years. Whether or not this is a correct understanding, I think that
>
> this
>
> needs to be documented somewhere, even if it is just a convention.
>
>
> Given my assumed understanding of minor vs point releases, here are
>
> the
>
> pros/cons that I can think of for having a branch-2. Please add on or
>
> correct me for anything you feel is missing or inadequate.
>
> Pros:
>
> - Features/rewrites/higher-risk patches are less likely to be put
>
> into
>
> 2.10.x
>
> - It is less necessary to move to 3.x
>
>
> Cons:
>
> - Bug fixes are less likely to be put into 2.10.x
>
> - An extra branch to maintain
>
>  - Committers have an extra branch (5 vs 4 total branches) to commit
>
> patches to if they should go all the way back to 2.10.x
>
> - It is less necessary to move to 3.x
>
>
> So on the one hand you get added stability in fewer features being
>
> committed to 2.10.x, but then on the other you get fewer bug fixes
>
> being
>
> committed. In a perfect world, we wouldn't have to make this
>
> tradeoff.
>
> But
>
> we don't live in a perfect world and committers will make mistakes
>
> either
>
> because of lack of knowledge or simply because they made a mistake.
>
> If
>
> we
>
> have a branch-2, committers will forget, not know to, or choose not
>
> to
>
> (for
>
> whatever reason) commit valid bug fixes back all the way to
>
> branch-2.10. If
>
> we don't have a branch-2, committers who want their borderline risky
>
> feature in the 2.x line will err on the side of putting it into
>
> branch-2.10
>
> instead of proposing the creation of a branch-2. Clearly I have made
>
> quite
>
> a few assumptions here based on my own experiences, so I would like
>
> to
>
> hear
>
> if others have similar or opposing views.
>
>
> As far as 3.x goes, to me it seems like some of the reasoning for
>
> killing
>
> branch-2 is due to an effort to push the community towards 3.x. This
>
> is why
>
> I have added movement to 3.x as both a pro and a con. As a community
>
> trying
>
> to move forward, keeping as many companies on similar branches as
>
> possible
>
> is a good way to make sure the code is well-tested. However, from a
>
> stability point of view, moving to 3.x is still scary and being able
>
> to
>
> stay on 2.x until you are comfortable to move is very nice. The
>
> 2.10.0
>
> bridge release effort has been very good at making it possible for
>
> people
>
> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>
> that
>
> it is reasonable for companies to want to be extra cautious with 3.x
>
> due to
>
> potential performance degradation at large scale.
>
>
> A question I'm pondering is what happens when we move to Java 11 and
>
> someone is still on 2.x? If they want to backport HADOOP-15338
>
> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>
> support to
>
> 2.x, surely not everyone is going to want that (at least not
>
> immediately).
>
> The 2.10 documentation states, "The JVM requirements will not change
>
> across
>
> point releases within the same minor release except if the JVM
>
> version
>
> under question becomes unsupported" [1], so this would warrant a 2.11
>
> release until Java 8 becomes unsupported (though one could argue that
>
> it is
>
> already unsupported since Oracle is no longer giving public Java 8
>
> updates).
>
> If we don't keep branch-2 around now, would a Java 11 backport be the
>
> catalyst for a branch-2 revival?
>
>
> Not sure if this really leads to any sort of answer from me on whether or
> not we should keep branch-2 alive, but these are the things that I am
> weighing in my mind. For me, the bigger problem beyond having branch-2 or
> not is committers not being on the same page with where they should commit
> their patches.
>
> Eric
>
> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry that some committers may want to put new features into a branch 2
> release, and without a branch-2, they will go directly into 2.10. Since we
> don't always catch corner cases or performance problems for some time
> (usually not until the release is deployed to a busy, 4-thousand node
> cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do have these reservations.
>
> Thanks,
> -Eric
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <shv.hadoop@gmail.com> wrote:
>
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> the picture right now, and many people may object to this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion about which branches people should commit their
>    back-ports to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> wrote:
>
> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion
> IMO.
>
> Jonathan Hung
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
> - Delete branch-2.10
> - Rename branch-2 to branch-2.10
> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira,
> It seems there was an INFRA ticket for that: INFRA-19581.
> But the INFRA people closed it as "Won't Do", and yes, the branch is
> protected, so we can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>
> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>   - Verified everything in old branch-2.10 was in old branch-2
>   - Deleted old branch-2.10
>   - Renamed branch-2 to (new) branch-2.10
>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>   - Renamed fix versions from 2.11.0 to 2.10.1
>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
> Jonathan Hung
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
> wrote:
>
> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test
> the same software and make improvements in trunk. Some may be extra
> cautious about moving up the version because of internal obligations to
> keep things running. Company obligations should not be the driving force to
> maintain Hadoop branches. There is no proper collaboration in the
> community when every name-brand company maintains its own Hadoop 2.x
> version. I think it would be healthier for the community to reduce the
> branch forking and spend energy on trunk to harden the software. This will
> give more confidence to move up the version than trying to fix n
> permutations of breakage, like Flash fixing the timeline.
>
> The Apache license states that there is no warranty of any kind for code
> contributions. Fewer community release processes should improve software
> quality when eyes are on trunk, and help steer toward the same end
> goals.
>
> regards,
> Eric
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
> Hello all,
>
> Is it written anywhere what the difference is between a minor release and a
> point/dot/maintenance (I'll use "point" from here on out) release? I have
> looked around and I can't find anything other than some compatibility
> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
> this would help shape my opinion on whether or not to keep branch-2 alive.
> My current understanding is that we can't really break compatibility in
> either a minor or point release. But the only mention of the difference
> between minor and point releases is how to deal with Stable, Evolving, and
> Unstable tags, and how to deal with changing default configuration values.
> So it seems like there really isn't a big official difference between the
> two. In my mind, the functional difference between the two is that the
> minor releases may have added features and rewrites, while the point
> releases only have bug fixes. This might be an incorrect understanding, but
> that's what I have gathered from watching the releases over the last few
> years. Whether or not this is a correct understanding, I think that this
> needs to be documented somewhere, even if it is just a convention.
>
> Given my assumed understanding of minor vs point releases, here are the
> pros/cons that I can think of for having a branch-2. Please add on or
> correct me for anything you feel is missing or inadequate.
>
> Pros:
> - Features/rewrites/higher-risk patches are less likely to be put into
>   2.10.x
> - It is less necessary to move to 3.x
>
> Cons:
> - Bug fixes are less likely to be put into 2.10.x
> - An extra branch to maintain
>   - Committers have an extra branch (5 vs 4 total branches) to commit
>     patches to if they should go all the way back to 2.10.x
> - It is less necessary to move to 3.x
>
> So on the one hand you get added stability in fewer features being
> committed to 2.10.x, but then on the other you get fewer bug fixes being
> committed. In a perfect world, we wouldn't have to make this tradeoff. But
> we don't live in a perfect world and committers will make mistakes either
> because of lack of knowledge or simply because they made a mistake. If we
> have a branch-2, committers will forget, not know to, or choose not to (for
> whatever reason) commit valid bug fixes back all the way to branch-2.10. If
> we don't have a branch-2, committers who want their borderline risky
> feature in the 2.x line will err on the side of putting it into branch-2.10
> instead of proposing the creation of a branch-2. Clearly I have made quite
> a few assumptions here based on my own experiences, so I would like to hear
> if others have similar or opposing views.
>
> As far as 3.x goes, to me it seems like some of the reasoning for killing
> branch-2 is due to an effort to push the community towards 3.x. This is why
> I have added movement to 3.x as both a pro and a con. As a community trying
> to move forward, keeping as many companies on similar branches as possible
> is a good way to make sure the code is well-tested. However, from a
> stability point of view, moving to 3.x is still scary and being able to
> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
> bridge release effort has been very good at making it possible for people
> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> it is reasonable for companies to want to be extra cautious with 3.x due to
> potential performance degradation at large scale.
>
> A question I'm pondering is what happens when we move to Java 11 and
> someone is still on 2.x? If they want to backport HADOOP-15338
> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
> 2.x, surely not everyone is going to want that (at least not immediately).
> The 2.10 documentation states, "The JVM requirements will not change across
> point releases within the same minor release except if the JVM version
> under question becomes unsupported" [1], so this would warrant a 2.11
> release until Java 8 becomes unsupported (though one could argue that it is
> already unsupported since Oracle is no longer giving public Java 8 updates).
> If we don't keep branch-2 around now, would a Java 11 backport be the
> catalyst for a branch-2 revival?
>
> Not sure if this really leads to any sort of answer from me on whether or
> not we should keep branch-2 alive, but these are the things that I am
> weighing in my mind. For me, the bigger problem beyond having branch-2 or
> not is committers not being on the same page with where they should commit
> their patches.
>
> Eric
>
> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry that some committers may want to put new features into a branch 2
> release, and without a branch-2, they will go directly into 2.10. Since we
> don't always catch corner cases or performance problems for some time
> (usually not until the release is deployed to a busy, 4-thousand node
> cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do have these reservations.
>
> Thanks,
> -Eric
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <shv.hadoop@gmail.com> wrote:
>
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> the picture right now, and many people may object to this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion about which branches people should commit their
>    back-ports to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> wrote:
>
> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion
> IMO.
>
> Jonathan Hung
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
> - Delete branch-2.10
> - Rename branch-2 to branch-2.10
> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira
> It seems there was an INFRA ticket for that, INFRA-19581,
> but the INFRA team closed it as "won't do". And yes, the branch is protected,
> so we can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Akira
It seems there was an INFRA ticket for that, INFRA-19581,
but the INFRA team closed it as "won't do". And yes, the branch is protected, so we can’t delete it directly.

Ref: https://issues.apache.org/jira/browse/INFRA-19581

-Ayush

> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> 
> Thank you for your work, Jonathan.
> 
> I found branch-2 has been unintentionally pushed again. Would you remove it?
> I think the branch should be protected if possible.
> 
> -Akira
> 
>> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>> 
>> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> don't try to commit to it)
>> 
>> Completed procedure:
>> 
>>   - Verified everything in old branch-2.10 was in old branch-2
>>   - Delete old branch-2.10
>>   - Rename branch-2 to (new) branch-2.10
>>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>   - Renamed fix versions from 2.11.0 to 2.10.1
>>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> 
>> 
>> Jonathan Hung
>> 
>> 
>> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> wrote:
>> 
>>> FYI, starting the rename process, beginning with INFRA-19521.
>>> 
>>> Jonathan Hung
>>> 
>>> 
>>> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
>>> wrote:
>>> 
>>>> Hey guys,
>>>> 
>>>> I think we diverged a bit from the initial topic of this discussion,
>>>> which is removing branch-2.10, and changing the version of branch-2 from
>>>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> 2.x release" confused people.
>>>> It is in fact a wider matter that can be discussed when somebody actually
>>>> proposes to release 2.11, which I understand nobody does at the moment.
>>>> 
>>>> So if anybody objects removing branch-2.10 please make an argument.
>>>> Otherwise we should go ahead and just do it next week.
>>>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> 
>>>> Thanks,
>>>> --Konstantin
>>>> 
>>>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for the detailed thoughts, everyone.
>>>>> 
>>>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> releases. As for putting features into minor/patch releases, if we keep the
>>>>> convention of putting new features only into minor releases, my assumption
>>>>> is still that it's unlikely people will want to get them into branch-2
>>>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>>>> revive branch-2. But for now I think the convenience of not needing to port
>>>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> potentially needing to revive branch-2.
>>>>> 
>>>>> Jonathan Hung
>>>>> 
>>>>> 
>>>>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>>> 
>>>>>> +1 for 2.10.x as last release for 2.x version.
>>>>>> 
>>>>>> Software would become more compatible when more companies stress test
>>>>>> the same software and make improvements in trunk.  Some may be extra
>>>>>> cautious about moving up the version because of internal obligations to
>>>>>> keep things running.  Company obligation should not be the driving force to
>>>>>> maintain Hadoop branches.  There is no proper collaboration in the
>>>>>> community when every name brand company maintains its own Hadoop 2.x
>>>>>> version.  I think it would be more healthy for the community to reduce the
>>>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>>>> give more confidence to move up the version than trying to fix n
>>>>>> permutations breakage like Flash fixing the timeline.
>>>>>> 
>>>>>> The Apache license states that there is no warranty of any kind for code
>>>>>> contributions.  Fewer community release processes should improve software
>>>>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>>>> 
>>>>>> regards,
>>>>>> Eric
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> Is it written anywhere what the difference is between a minor release
>>>>>>> and a point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>>>> have looked around and I can't find anything other than some compatibility
>>>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>>>>> this would help shape my opinion on whether or not to keep branch-2 alive.
>>>>>>> My current understanding is that we can't really break compatibility in
>>>>>>> either a minor or point release. But the only mention of the difference
>>>>>>> between minor and point releases is how to deal with Stable, Evolving, and
>>>>>>> Unstable tags, and how to deal with changing default configuration values.
>>>>>>> So it seems like there really isn't a big official difference between the
>>>>>>> two. In my mind, the functional difference between the two is that the
>>>>>>> minor releases may have added features and rewrites, while the point
>>>>>>> releases only have bug fixes. This might be an incorrect understanding, but
>>>>>>> that's what I have gathered from watching the releases over the last few
>>>>>>> years. Whether or not this is a correct understanding, I think that this
>>>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>>> 
>>>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>>>> correct me for anything you feel is missing or inadequate.
>>>>>>> Pros:
>>>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>>>>   2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> Cons:
>>>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>>>> - An extra branch to maintain
>>>>>>>  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>>>    patches to if they should go all the way back to 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> So on the one hand you get added stability in fewer features being
>>>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>>>>> we don't live in a perfect world and committers will make mistakes either
>>>>>>> because of lack of knowledge or simply because they made a mistake. If we
>>>>>>> have a branch-2, committers will forget, not know to, or choose not to (for
>>>>>>> whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>>>> feature in the 2.x line will err on the side of putting it into branch-2.10
>>>>>>> instead of proposing the creation of a branch-2. Clearly I have made quite
>>>>>>> a few assumptions here based on my own experiences, so I would like to hear
>>>>>>> if others have similar or opposing views.
>>>>>>> 
>>>>>>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>>>>> branch-2 is due to an effort to push the community towards 3.x. This is why
>>>>>>> I have added movement to 3.x as both a pro and a con. As a community trying
>>>>>>> to move forward, keeping as many companies on similar branches as possible
>>>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>>>> bridge release effort has been very good at making it possible for people
>>>>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>>>>> it is reasonable for companies to want to be extra cautious with 3.x due to
>>>>>>> potential performance degradation at large scale.
>>>>>>> 
>>>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>>>>> 2.x, surely not everyone is going to want that (at least not immediately).
>>>>>>> The 2.10 documentation states, "The JVM requirements will not change across
>>>>>>> point releases within the same minor release except if the JVM version
>>>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>>>> release until Java 8 becomes unsupported (though one could argue that it is
>>>>>>> already unsupported since Oracle is no longer giving public Java 8 updates).
>>>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>>>> catalyst for a branch-2 revival?
>>>>>>> 
>>>>>>> Not sure if this really leads to any sort of answer from me on whether or
>>>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>>>>> not is committers not being on the same page with where they should commit
>>>>>>> their patches.
>>>>>>> 
>>>>>>> Eric
>>>>>>> 
>>>>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> 
>>>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Konstantin,
>>>>>>>> 
>>>>>>>> Sure, I understand those concerns. On the other hand, I worry about the
>>>>>>>> stability of 2.10, since we will be on it for a couple of years at least.
>>>>>>>> I worry that some committers may want to put new features into a branch 2
>>>>>>>> release, and without a branch-2, they will go directly into 2.10. Since we
>>>>>>>> don't always catch corner cases or performance problems for some time
>>>>>>>> (usually not until the release is deployed to a busy, 4-thousand node
>>>>>>>> cluster), it may be very difficult to back out those changes.
>>>>>>>> 
>>>>>>>> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>>>>>> do have these reservations.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Eric
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>>>>> <shv.hadoop@gmail.com> wrote:
>>>>>>>> Hi Eric,
>>>>>>>> 
>>>>>>>> We had a long discussion on this list regarding making the 2.10 release the
>>>>>>>> last of branch-2 releases. We intended 2.10 as a bridge release between
>>>>>>>> Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in
>>>>>>>> the picture right now, and many people may object to this idea.
>>>>>>>> 
>>>>>>>> I understand Jonathan's proposal as an attempt to
>>>>>>>> 1. eliminate confusion about which branches people should commit their
>>>>>>>>    back-ports to
>>>>>>>> 2. save engineering effort committing to more branches than necessary
>>>>>>>> 
>>>>>>>> "Branches are cheap" as our founder used to say. If we ever decide to
>>>>>>>> release 2.11 we can resurrect the branch.
>>>>>>>> Until then I am in favor of Jonathan's proposal +1.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> --Konstantin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>>>>>> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>>>>>>> higher than a new 2.11 minor release. (There didn't seem to be many people
>>>>>>>>> outside of our company who expressed interest in getting new features to
>>>>>>>>> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
>>>>>>>>> release, there's 29 patches that have gone into branch-2 and 9 in
>>>>>>>>> branch-2.10, so it's already diverged quite a bit.
>>>>>>>>> 
>>>>>>>>> In any case, we can always reverse this decision if we really need to, by
>>>>>>>>> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>>>>>>>>> 
>>>>>>>>> Jonathan Hung
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Jonathan for opening the discussion.
>>>>>>>>>> 
>>>>>>>>>> I am not in favor of this proposal. 2.10 was very recently released, and
>>>>>>>>>> moving to 2.10 will take some time for the community. It seems premature
>>>>>>>>>> to make a decision at this point that there will never be a need for a
>>>>>>>>>> 2.11 release.
>>>>>>>>>> 
>>>>>>>>>> -Eric
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>>>>>>> <jyhung2357@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> Given the release of 2.10.0, and the fact that it's intended to be a
>>>>>>>>>> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>>>>>>> minor release line in branch-2. Currently, the main issue is that there's
>>>>>>>>>> many fixes going into branch-2 (the theoretical 2.11.0) that's not going
>>>>>>>>>> into branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>>>>>>>> will likely never see the light of day unless they are backported to
>>>>>>>>>> branch-2.10.
>>>>>>>>>> 
>>>>>>>>>> To do this, I propose we:
>>>>>>>>>> 
>>>>>>>>>> - Delete branch-2.10
>>>>>>>>>> - Rename branch-2 to branch-2.10
>>>>>>>>>> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>>>>>>> 
>>>>>>>>>> This way we get all the current branch-2 fixes into the 2.10.x release
>>>>>>>>>> line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>>>>>>> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>>>>>>> 
>>>>>>>>>> Thoughts?
>>>>>>>>>> 
>>>>>>>>>> Jonathan Hung
>>>>>>>>>> 
>>>>>>>>>> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>> 

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Akira
It seems there was an INFRA ticket for that, INFRA-19581,
but the INFRA team closed it as "won't do". And yes, the branch is protected, so we can’t delete it directly.

Ref: https://issues.apache.org/jira/browse/INFRA-19581

-Ayush

> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> 
> Thank you for your work, Jonathan.
> 
> I found branch-2 has been unintentionally pushed again. Would you remove it?
> I think the branch should be protected if possible.
> 
> -Akira
> 
>> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>> 
>> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> don't try to commit to it)
>> 
>> Completed procedure:
>> 
>>   - Verified everything in old branch-2.10 was in old branch-2
>>   - Delete old branch-2.10
>>   - Rename branch-2 to (new) branch-2.10
>>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>   - Renamed fix versions from 2.11.0 to 2.10.1
>>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> 
>> 
>> Jonathan Hung
>> 
>> 
>> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> wrote:
>> 
>>> FYI, starting the rename process, beginning with INFRA-19521.
>>> 
>>> Jonathan Hung
>>> 
>>> 
>>> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>> shv.hadoop@gmail.com>
>>> wrote:
>>> 
>>>> Hey guys,
>>>> 
>>>> I think we diverged a bit from the initial topic of this discussion,
>>>> which is removing branch-2.10, and changing the version of branch-2 from
>>>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> 2.x release" confused people.
>>>> It is in fact a wider matter that can be discussed when somebody
>> actually
>>>> proposes to release 2.11, which I understand nobody does at the moment.
>>>> 
>>>> So if anybody objects to removing branch-2.10, please make an argument.
>>>> Otherwise we should go ahead and just do it next week.
>>>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> 
>>>> Thanks,
>>>> --Konstantin
>>>> 
>>>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for the detailed thoughts, everyone.
>>>>> 
>>>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> releases. As for putting features into minor/patch releases, if we
>> keep the
>>>>> convention of putting new features only into minor releases, my
>> assumption
>>>>> is still that it's unlikely people will want to get them into branch-2
>>>>> (based on the 2.10.0 release process). For the java 11 issue, we
>> haven't
>>>>> even really removed support for java 7 in branch-2 (much less java 8),
>> so I
>>>>> feel moving to java 11 would go along with a move to branch 3. And as
>> you
>>>>> mentioned, if people really want to use java 11 on branch-2, we can
>> always
>>>>> revive branch-2. But for now I think the convenience of not needing to
>> port
>>>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> potentially needing to revive branch-2.
>>>>> 
>>>>> Jonathan Hung
>>>>> 
>>>>> 
>>>>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>>> 
>>>>>> +1 for 2.10.x as the last release line for the 2.x version.
>>>>>> 
>>>>>> Software becomes more compatible when more companies stress-test
>>>>>> the same software and make improvements in trunk.  Some may be extra
>>>>>> cautious about moving up the version because of internal obligations to
>>>>>> keep things running.  Company obligations should not be the driving force
>>>>>> for maintaining Hadoop branches.  There is no proper collaboration in the
>>>>>> community when every name-brand company maintains its own Hadoop 2.x
>>>>>> version.  I think it would be healthier for the community to reduce the
>>>>>> branch forking and spend energy on trunk to harden the software.  This
>>>>>> will give more confidence to move up the version than trying to fix n
>>>>>> permutations of breakage, like the Flash fixing the timeline.
>>>>>> 
>>>>>> The Apache License states that there is no warranty of any kind for code
>>>>>> contributions.  Fewer community release processes should improve software
>>>>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>>>> 
>>>>>> regards,
>>>>>> Eric
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> Is it written anywhere what the difference is between a minor release
>>>>>>> and a
>>>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>>>> have
>>>>>>> looked around and I can't find anything other than some compatibility
>>>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>>>> think
>>>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>>>> alive.
>>>>>>> My current understanding is that we can't really break compatibility
>> in
>>>>>>> either a minor or point release. But the only mention of the
>> difference
>>>>>>> between minor and point releases is how to deal with Stable,
>> Evolving,
>>>>>>> and
>>>>>>> Unstable tags, and how to deal with changing default configuration
>>>>>>> values.
>>>>>>> So it seems like there really isn't a big official difference between
>>>>>>> the
>>>>>>> two. In my mind, the functional difference between the two is that
>> the
>>>>>>> minor releases may have added features and rewrites, while the point
>>>>>>> releases only have bug fixes. This might be an incorrect
>>>>>>> understanding, but
>>>>>>> that's what I have gathered from watching the releases over the last
>>>>>>> few
>>>>>>> years. Whether or not this is a correct understanding, I think that
>>>>>>> this
>>>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>>> 
>>>>>>> Given my assumed understanding of minor vs point releases, here are
>> the
>>>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>>>> correct me for anything you feel is missing or inadequate.
>>>>>>> Pros:
>>>>>>> - Features/rewrites/higher-risk patches are less likely to be put
>> into
>>>>>>> 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> Cons:
>>>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>>>> - An extra branch to maintain
>>>>>>>  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>>> patches to if they should go all the way back to 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> So on the one hand you get added stability in fewer features being
>>>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>>>> being
>>>>>>> committed. In a perfect world, we wouldn't have to make this
>> tradeoff.
>>>>>>> But
>>>>>>> we don't live in a perfect world and committers will make mistakes
>>>>>>> either
>>>>>>> because of lack of knowledge or simply because they made a mistake.
>> If
>>>>>>> we
>>>>>>> have a branch-2, committers will forget, not know to, or choose not
>> to
>>>>>>> (for
>>>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>>>> branch-2.10. If
>>>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>>>> feature in the 2.x line will err on the side of putting it into
>>>>>>> branch-2.10
>>>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>>>> quite
>>>>>>> a few assumptions here based on my own experiences, so I would like
>> to
>>>>>>> hear
>>>>>>> if others have similar or opposing views.
>>>>>>> 
>>>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>>>> killing
>>>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>>>> is why
>>>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>>>> trying
>>>>>>> to move forward, keeping as many companies on similar branches as
>>>>>>> possible
>>>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>>>> stability point of view, moving to 3.x is still scary and being able
>> to
>>>>>>> stay on 2.x until you are comfortable to move is very nice. The
>> 2.10.0
>>>>>>> bridge release effort has been very good at making it possible for
>>>>>>> people
>>>>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>>>>> that
>>>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>>>> due to
>>>>>>> potential performance degradation at large scale.
>>>>>>> 
>>>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>>>> support to
>>>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>>>> immediately).
>>>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>>>> across
>>>>>>> point releases within the same minor release except if the JVM
>> version
>>>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>>>> it is
>>>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>>>> updates).
>>>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>>>> catalyst for a branch-2 revival?
>>>>>>> 
>>>>>>> Not sure if this really leads to any sort of answer from me on
>> whether
>>>>>>> or
>>>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>>>> weighing in my mind. For me, the bigger problem beyond having
>> branch-2
>>>>>>> or
>>>>>>> not is committers not being on the same page with where they should
>>>>>>> commit
>>>>>>> their patches.
>>>>>>> 
>>>>>>> Eric
>>>>>>> 
>>>>>>> [1]
>>>>>>> 
>>>>>>> 
>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> [2]
>>>>>>> 
>>>>>>> 
>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> 
>>>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Konstantin,
>>>>>>>> 
>>>>>>>> Sure, I understand those concerns. On the other hand, I worry about
>>>>>>> the
>>>>>>>> stability of 2.10, since we will be on it for a couple of years at
>>>>>>> least.
>>>>>>>> I worry
>>>>>>>> that some committers may want to put new features into a branch 2
>>>>>>> release,
>>>>>>>> and without a branch-2, they will go directly into 2.10. Since we
>>>>>>> don't
>>>>>>>> always
>>>>>>>> catch corner cases or performance problems for some time (usually
>>>>>>> not
>>>>>>>> until
>>>>>>>> the release is deployed to a busy, 4-thousand node cluster), it
>> may
>>>>>>> be
>>>>>>>> very
>>>>>>>> difficult to back out those changes.
>>>>>>>> 
>>>>>>>> It sounds like I'm in the minority here, so I'm not nixing the
>> idea,
>>>>>>> but I
>>>>>>>> do
>>>>>>>> have these reservations.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Eric
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> <
>>>>>>>> shv.hadoop@gmail.com> wrote:
>>>>>>>> Hi Eric,
>>>>>>>> 
>>>>>>>> We had a long discussion on this list regarding making the 2.10
>>>>>>> release the
>>>>>>>> last of branch-2 releases. We intended 2.10 as a bridge release
>>>>>>> between
>>>>>>>> Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>>>>>> the picture right now, and many people may object to this idea.
>>>>>>>> 
>>>>>>>> I understand Jonathan's proposal as an attempt to
>>>>>>>> 1. eliminate confusion about which branches people should commit their
>>>>>>>> back-ports to
>>>>>>>> 2. save the engineering effort of committing to more branches than
>>>>>>>> necessary
>>>>>>>> 
>>>>>>>> "Branches are cheap" as our founder used to say. If we ever decide
>> to
>>>>>>>> release 2.11 we can resurrect the branch.
>>>>>>>> Until then I am in favor of Jonathan's proposal +1.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> --Konstantin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>> jyhung2357@gmail.com
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Eric for the comments - regarding your concerns, I feel
>> the
>>>>>>> pros
>>>>>>>>> outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>>>> are
>>>>>>>> much
>>>>>>>>> higher than a new 2.11 minor release. (There didn't seem to be
>> many
>>>>>>>> people
>>>>>>>>> outside of our company who expressed interest in getting new
>>>>>>> features to
>>>>>>>>> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>>>>>>> 2.10.0 release, there are 29 patches that have gone into branch-2 and 9
>>>>>>>>> in branch-2.10, so it has already diverged quite a bit.
>>>>>>>>> 
>>>>>>>>> In any case, we can always reverse this decision if we really
>> need
>>>>>>> to, by
>>>>>>>>> recreating branch-2. But this proposal would reduce a lot of
>>>>>>> confusion
>>>>>>>> IMO.
>>>>>>>>> 
>>>>>>>>> Jonathan Hung
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>>>> epayne@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Jonathan for opening the discussion.
>>>>>>>>>> 
>>>>>>>>>> I am not in favor of this proposal. 2.10 was very recently
>>>>>>> released,
>>>>>>>> and
>>>>>>>>>> moving to 2.10 will take some time for the community. It seems
>>>>>>>> premature
>>>>>>>>> to
>>>>>>>>>> make a decision at this point that there will never be a need
>>>>>>> for a
>>>>>>>> 2.11
>>>>>>>>>> release.
>>>>>>>>>> 
>>>>>>>>>> -Eric
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> <
>>>>>>>>>> jyhung2357@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> Given the release of 2.10.0, and the fact that it's intended to
>>>>>>> be a
>>>>>>>>> bridge
>>>>>>>>>> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>> last
>>>>>>> minor
>>>>>>>>>> release line in branch-2. Currently, the main issue is that there are
>>>>>>>>>> many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>>>>>>> going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>>>>>>>> branch-2 will likely never see the light of day unless they are
>>>>>>>>>> backported to branch-2.10.
>>>>>>>>>> 
>>>>>>>>>> To do this, I propose we:
>>>>>>>>>> 
>>>>>>>>>> - Delete branch-2.10
>>>>>>>>>> - Rename branch-2 to branch-2.10
>>>>>>>>>> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>>>>>>> 
>>>>>>>>>> This way we get all the current branch-2 fixes into the 2.10.x
>>>>>>> release
>>>>>>>>>> line. Then the commit chain will look like: trunk -> branch-3.2
>>>>>>> ->
>>>>>>>>>> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>>>>>>> 
>>>>>>>>>> Thoughts?
>>>>>>>>>> 
>>>>>>>>>> Jonathan Hung
>>>>>>>>>> 
>>>>>>>>>> [1]
>>>>>>>>> 
>>>>>>> 
>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>> 
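
The first step of the completed procedure quoted in this message ("Verified everything in old branch-2.10 was in old branch-2") can be sketched as a set comparison on JIRA issue keys. This is a rough illustration under stated assumptions, not the check that was actually run: it assumes each commit subject starts with its JIRA key, the function names `issue_keys` and `missing_from` are hypothetical, and the subjects in the usage example are made up.

```python
import re

# Assumption: each commit subject starts with its JIRA key, e.g. "HDFS-1. Some fix".
ISSUE_KEY = re.compile(r"^(HADOOP|HDFS|YARN|MAPREDUCE)-\d+")


def issue_keys(subjects):
    """Collect the leading JIRA keys found in a list of commit subject lines."""
    keys = set()
    for subject in subjects:
        match = ISSUE_KEY.match(subject.strip())
        if match:
            keys.add(match.group(0))
    return keys


def missing_from(superset_subjects, subset_subjects):
    """Return keys present in subset_subjects but absent from superset_subjects."""
    return sorted(issue_keys(subset_subjects) - issue_keys(superset_subjects))


if __name__ == "__main__":
    # Made-up subjects standing in for the logs of the two old branches.
    old_branch_2 = ["HADOOP-1. Example fix", "HDFS-2. Another example"]
    old_branch_2_10 = ["HDFS-2. Another example"]
    # An empty result means nothing would be lost by deleting old branch-2.10.
    print(missing_from(old_branch_2, old_branch_2_10))
```

In practice the subject lists would come from the commit logs of the two branches since they diverged; comparing keys rather than commit hashes is a design choice here, since cherry-picked backports get new hashes.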

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Akira
It seems there was an INFRA ticket for that: INFRA-19581.
But the INFRA people closed it as "Won't Do", and yes, the branch is protected, so we can't delete it directly.

Ref: https://issues.apache.org/jira/browse/INFRA-19581

-Ayush

> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> 
> Thank you for your work, Jonathan.
> 
> I found branch-2 has been unintentionally pushed again. Would you remove it?
> I think the branch should be protected if possible.
> 
> -Akira
> 
>> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>> 
>> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> don't try to commit to it)
>> 
>> Completed procedure:
>> 
>>   - Verified everything in old branch-2.10 was in old branch-2
>>   - Deleted old branch-2.10
>>   - Renamed branch-2 to (new) branch-2.10
>>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>   - Renamed fix versions from 2.11.0 to 2.10.1
>>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> 
>> 
>> Jonathan Hung
>> 
>> 
>> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> wrote:
>> 
>>> FYI, starting the rename process, beginning with INFRA-19521.
>>> 
>>> Jonathan Hung
>>> 
>>> 
>>> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>> shv.hadoop@gmail.com>
>>> wrote:
>>> 
>>>> Hey guys,
>>>> 
>>>> I think we diverged a bit from the initial topic of this discussion,
>>>> which is removing branch-2.10, and changing the version of branch-2 from
>>>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> 2.x release" confused people.
>>>> It is in fact a wider matter that can be discussed when somebody
>> actually
>>>> proposes to release 2.11, which I understand nobody does at the moment.
>>>> 
>>>> So if anybody objects removing branch-2.10 please make an argument.
>>>> Otherwise we should go ahead and just do it next week.
>>>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> 
>>>> Thanks,
>>>> --Konstantin
>>>> 
>>>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for the detailed thoughts, everyone.
>>>>> 
>>>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> releases. As for putting features into minor/patch releases, if we keep
>>>>> the convention of putting new features only into minor releases, my
>>>>> assumption is still that it's unlikely people will want to get them into
>>>>> branch-2 (based on the 2.10.0 release process). For the java 11 issue, we
>>>>> haven't even really removed support for java 7 in branch-2 (much less
>>>>> java 8), so I feel moving to java 11 would go along with a move to
>>>>> branch 3. And as you mentioned, if people really want to use java 11 on
>>>>> branch-2, we can always revive branch-2. But for now I think the
>>>>> convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>>> below) outweighs the cost of potentially needing to revive branch-2.
>>>>> 
>>>>> Jonathan Hung
>>>>> 
>>>>> 
>>>>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>>> 
>>>>>> +1 for 2.10.x as last release for 2.x version.
>>>>>> 
>>>>>> Software becomes more compatible when more companies stress test the
>>>>>> same software and make improvements in trunk.  Some may be extra
>>>>>> cautious about moving up the version because of internal obligations to
>>>>>> keep things running.  Company obligations should not be the driving
>>>>>> force to maintain Hadoop branches.  There is no proper collaboration in
>>>>>> the community when every name brand company maintains its own Hadoop 2.x
>>>>>> version.  I think it would be healthier for the community to reduce the
>>>>>> branch forking and spend energy on trunk to harden the software.  This
>>>>>> will give more confidence to move up the version than trying to fix n
>>>>>> permutations of breakage, like the Flash fixing the timeline.
>>>>>> 
>>>>>> The Apache license states that there is no warranty of any kind for code
>>>>>> contributions.  Fewer community release processes should improve
>>>>>> software quality when eyes are on trunk, and help steer toward the same
>>>>>> end goals.
>>>>>> 
>>>>>> regards,
>>>>>> Eric
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> Is it written anywhere what the difference is between a minor release
>>>>>>> and a point/dot/maintenance (I'll use "point" from here on out)
>>>>>>> release? I have looked around and I can't find anything other than some
>>>>>>> compatibility documentation in 2.x that has since been removed in 3.x
>>>>>>> [1] [2]. I think this would help shape my opinion on whether or not to
>>>>>>> keep branch-2 alive. My current understanding is that we can't really
>>>>>>> break compatibility in either a minor or point release. But the only
>>>>>>> mention of the difference between minor and point releases is how to
>>>>>>> deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>>>>> changing default configuration values. So it seems like there really
>>>>>>> isn't a big official difference between the two. In my mind, the
>>>>>>> functional difference between the two is that the minor releases may
>>>>>>> have added features and rewrites, while the point releases only have
>>>>>>> bug fixes. This might be an incorrect understanding, but that's what I
>>>>>>> have gathered from watching the releases over the last few years.
>>>>>>> Whether or not this is a correct understanding, I think that this needs
>>>>>>> to be documented somewhere, even if it is just a convention.
>>>>>>> 
>>>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>>>> correct me for anything you feel is missing or inadequate.
>>>>>>> 
>>>>>>> Pros:
>>>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>>>>   2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> Cons:
>>>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>>>> - An extra branch to maintain
>>>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>>>     patches to if they should go all the way back to 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> So on the one hand you get added stability in fewer features being
>>>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>>>> But we don't live in a perfect world and committers will make mistakes
>>>>>>> either because of lack of knowledge or simply because they made a
>>>>>>> mistake. If we have a branch-2, committers will forget, not know to, or
>>>>>>> choose not to (for whatever reason) commit valid bug fixes back all the
>>>>>>> way to branch-2.10. If we don't have a branch-2, committers who want
>>>>>>> their borderline risky feature in the 2.x line will err on the side of
>>>>>>> putting it into branch-2.10 instead of proposing the creation of a
>>>>>>> branch-2. Clearly I have made quite a few assumptions here based on my
>>>>>>> own experiences, so I would like to hear if others have similar or
>>>>>>> opposing views.
>>>>>>> 
>>>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>>>> killing branch-2 is due to an effort to push the community towards 3.x.
>>>>>>> This is why I have added movement to 3.x as both a pro and a con. As a
>>>>>>> community trying to move forward, keeping as many companies on similar
>>>>>>> branches as possible is a good way to make sure the code is well-tested.
>>>>>>> However, from a stability point of view, moving to 3.x is still scary
>>>>>>> and being able to stay on 2.x until you are comfortable to move is very
>>>>>>> nice. The 2.10.0 bridge release effort has been very good at making it
>>>>>>> possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>>>> and 3.x is so large that it is reasonable for companies to want to be
>>>>>>> extra cautious with 3.x due to potential performance degradation at
>>>>>>> large scale.
>>>>>>> 
>>>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>>>> to 2.x, surely not everyone is going to want that (at least not
>>>>>>> immediately). The 2.10 documentation states, "The JVM requirements will
>>>>>>> not change across point releases within the same minor release except if
>>>>>>> the JVM version under question becomes unsupported" [1], so this would
>>>>>>> warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>>>> could argue that it is already unsupported since Oracle is no longer
>>>>>>> giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>>>> would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>>>> 
>>>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>>>> or not we should keep branch-2 alive, but these are the things that I am
>>>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>>>> or not is committers not being on the same page with where they should
>>>>>>> commit their patches.
>>>>>>> 
>>>>>>> Eric
>>>>>>> 
>>>>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> 
>>>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org
>>>>>>> <epayne@apache.org> wrote:
>>>>>>> 
>>>>>>>> Hi Konstantin,
>>>>>>>> 
>>>>>>>> Sure, I understand those concerns. On the other hand, I worry about the
>>>>>>>> stability of 2.10, since we will be on it for a couple of years at
>>>>>>>> least. I worry that some committers may want to put new features into a
>>>>>>>> branch 2 release, and without a branch-2, they will go directly into
>>>>>>>> 2.10. Since we don't always catch corner cases or performance problems
>>>>>>>> for some time (usually not until the release is deployed to a busy,
>>>>>>>> 4-thousand node cluster), it may be very difficult to back out those
>>>>>>>> changes.
>>>>>>>> 
>>>>>>>> It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>>>>> but I do have these reservations.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Eric
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>>>>> <shv.hadoop@gmail.com> wrote:
>>>>>>>> Hi Eric,
>>>>>>>> 
>>>>>>>> We had a long discussion on this list regarding making the 2.10 release
>>>>>>>> the last of branch-2 releases. We intended 2.10 as a bridge release
>>>>>>>> between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>>>>>> is not in the picture right now, and many people may object to this
>>>>>>>> idea.
>>>>>>>> 
>>>>>>>> I understand Jonathan's proposal as an attempt to
>>>>>>>> 1. eliminate confusion about which branches people should commit their
>>>>>>>>    back-ports to
>>>>>>>> 2. save the engineering effort of committing to more branches than
>>>>>>>>    necessary
>>>>>>>> 
>>>>>>>> "Branches are cheap" as our founder used to say. If we ever decide to
>>>>>>>> release 2.11 we can resurrect the branch.
>>>>>>>> Until then I am in favor of Jonathan's proposal +1.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> --Konstantin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung
>>>>>>>> <jyhung2357@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>>>>>> outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>>>>>> much higher than a new 2.11 minor release. (There didn't seem to be
>>>>>>>>> many people outside of our company who expressed interest in getting
>>>>>>>>> new features to branch-2 prior to the 2.10.0 release.) Even now, a few
>>>>>>>>> weeks after 2.10.0 release, there's 29 patches that have gone into
>>>>>>>>> branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>>>>>>>> 
>>>>>>>>> In any case, we can always reverse this decision if we really need to,
>>>>>>>>> by recreating branch-2. But this proposal would reduce a lot of
>>>>>>>>> confusion IMO.
>>>>>>>>> 
>>>>>>>>> Jonathan Hung
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org
>>>>>>>>> <epayne@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Jonathan for opening the discussion.
>>>>>>>>>> 
>>>>>>>>>> I am not in favor of this proposal. 2.10 was very recently released,
>>>>>>>>>> and moving to 2.10 will take some time for the community. It seems
>>>>>>>>>> premature to make a decision at this point that there will never be a
>>>>>>>>>> need for a 2.11 release.
>>>>>>>>>> 
>>>>>>>>>> -Eric
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>>>>>>> <jyhung2357@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> Given the release of 2.10.0, and the fact that it's intended to be a
>>>>>>>>>> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>>>>>>>> last minor release line in branch-2. Currently, the main issue is that
>>>>>>>>>> there are many fixes going into branch-2 (the theoretical 2.11.0) that
>>>>>>>>>> are not going into branch-2.10 (which will become 2.10.1), so the
>>>>>>>>>> fixes in branch-2 will likely never see the light of day unless they
>>>>>>>>>> are backported to branch-2.10.
>>>>>>>>>> 
>>>>>>>>>> To do this, I propose we:
>>>>>>>>>> 
>>>>>>>>>> - Delete branch-2.10
>>>>>>>>>> - Rename branch-2 to branch-2.10
>>>>>>>>>> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>>>>>>> 
>>>>>>>>>> This way we get all the current branch-2 fixes into the 2.10.x release
>>>>>>>>>> line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>>>>>>> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>>>>>>> 
>>>>>>>>>> Thoughts?
>>>>>>>>>> 
>>>>>>>>>> Jonathan Hung
>>>>>>>>>> 
>>>>>>>>>> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>> 

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you for your work, Jonathan.

I found branch-2 has been unintentionally pushed again. Would you remove it?
I think the branch should be protected if possible.

-Akira

On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:

> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>    - Verified everything in old branch-2.10 was in old branch-2
>    - Deleted old branch-2.10
>    - Renamed branch-2 to (new) branch-2.10
>    - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>    - Renamed fix versions from 2.11.0 to 2.10.1
>    - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
> Jonathan Hung
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> shv.hadoop@gmail.com>
> > wrote:
> >
> >> Hey guys,
> >>
> >> I think we diverged a bit from the initial topic of this discussion,
> >> which is removing branch-2.10, and changing the version of branch-2 from
> >> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >> Sounds like the subject line for this thread "Making 2.10 the last minor
> >> 2.x release" confused people.
> >> It is in fact a wider matter that can be discussed when somebody
> actually
> >> proposes to release 2.11, which I understand nobody does at the moment.
> >>
> >> So if anybody objects removing branch-2.10 please make an argument.
> >> Otherwise we should go ahead and just do it next week.
> >> I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the detailed thoughts, everyone.
> >>>
> >>> Eric (Badger), my understanding is the same as yours re. minor vs patch
> >>> releases. As for putting features into minor/patch releases, if we
> keep the
> >>> convention of putting new features only into minor releases, my
> assumption
> >>> is still that it's unlikely people will want to get them into branch-2
> >>> (based on the 2.10.0 release process). For the java 11 issue, we
> haven't
> >>> even really removed support for java 7 in branch-2 (much less java 8),
> so I
> >>> feel moving to java 11 would go along with a move to branch 3. And as
> you
> >>> mentioned, if people really want to use java 11 on branch-2, we can
> always
> >>> revive branch-2. But for now I think the convenience of not needing to
> port
> >>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >>> potentially needing to revive branch-2.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >>>
> >>>> +1 for 2.10.x as last release for 2.x version.
> >>>>
> >>>> Software would become more compatible when more companies stress test
> >>>> the same software and making improvements in trunk.  Some may be extra
> >>>> caution on moving up the version because obligation internally to keep
> >>>> things running.  Company obligation should not be the driving force to
> >>>> maintain Hadoop branches.  There is no proper collaboration in the
> >>>> community when every name brand company maintains its own Hadoop 2.x
> >>>> version.  I think it would be more healthy for the community to
> reduce the
> >>>> branch forking and spend energy on trunk to harden the software.
> This will
> >>>> give more confidence to move up the version than trying to fix n
> >>>> permutations breakage like Flash fixing the timeline.
> >>>>
> >>>> Apache license stated, there is no warranty of any kind for code
> >>>> contributions.  Fewer community release process should improve
> software
> >>>> quality when eyes are on trunk, and help steering toward the same end
> goals.
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >>>> <eb...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> Is it written anywhere what the difference is between a minor release
> >>>>> and a
> >>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
> >>>>> have
> >>>>> looked around and I can't find anything other than some compatibility
> >>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
> >>>>> think
> >>>>> this would help shape my opinion on whether or not to keep branch-2
> >>>>> alive.
> >>>>> My current understanding is that we can't really break compatibility
> in
> >>>>> either a minor or point release. But the only mention of the
> difference
> >>>>> between minor and point releases is how to deal with Stable,
> Evolving,
> >>>>> and
> >>>>> Unstable tags, and how to deal with changing default configuration
> >>>>> values.
> >>>>> So it seems like there really isn't a big official difference between
> >>>>> the
> >>>>> two. In my mind, the functional difference between the two is that
> the
> >>>>> minor releases may have added features and rewrites, while the point
> >>>>> releases only have bug fixes. This might be an incorrect
> >>>>> understanding, but
> >>>>> that's what I have gathered from watching the releases over the last
> >>>>> few
> >>>>> years. Whether or not this is a correct understanding, I think that
> >>>>> this
> >>>>> needs to be documented somewhere, even if it is just a convention.
> >>>>>
> >>>>> Given my assumed understanding of minor vs point releases, here are
> the
> >>>>> pros/cons that I can think of for having a branch-2. Please add on or
> >>>>> correct me for anything you feel is missing or inadequate.
> >>>>> Pros:
> >>>>> - Features/rewrites/higher-risk patches are less likely to be put
> into
> >>>>> 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> Cons:
> >>>>> - Bug fixes are less likely to be put into 2.10.x
> >>>>> - An extra branch to maintain
> >>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
> >>>>> patches to if they should go all the way back to 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> So on the one hand you get added stability in fewer features being
> >>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
> >>>>> being
> >>>>> committed. In a perfect world, we wouldn't have to make this
> tradeoff.
> >>>>> But
> >>>>> we don't live in a perfect world and committers will make mistakes
> >>>>> either
> >>>>> because of lack of knowledge or simply because they made a mistake.
> If
> >>>>> we
> >>>>> have a branch-2, committers will forget, not know to, or choose not
> to
> >>>>> (for
> >>>>> whatever reason) commit valid bug fixes back all the way to
> >>>>> branch-2.10. If
> >>>>> we don't have a branch-2, committers who want their borderline risky
> >>>>> feature in the 2.x line will err on the side of putting it into
> >>>>> branch-2.10
> >>>>> instead of proposing the creation of a branch-2. Clearly I have made
> >>>>> quite
> >>>>> a few assumptions here based on my own experiences, so I would like
> to
> >>>>> hear
> >>>>> if others have similar or opposing views.
> >>>>>
> >>>>> As far as 3.x goes, to me it seems like some of the reasoning for
> >>>>> killing
> >>>>> branch-2 is due to an effort to push the community towards 3.x. This
> >>>>> is why
> >>>>> I have added movement to 3.x as both a pro and a con. As a community
> >>>>> trying
> >>>>> to move forward, keeping as many companies on similar branches as
> >>>>> possible
> >>>>> is a good way to make sure the code is well-tested. However, from a
> >>>>> stability point of view, moving to 3.x is still scary and being able
> to
> >>>>> stay on 2.x until you are comfortable to move is very nice. The
> 2.10.0
> >>>>> bridge release effort has been very good at making it possible for
> >>>>> people
> >>>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
> >>>>> that
> >>>>> it is reasonable for companies to want to be extra cautious with 3.x
> >>>>> due to
> >>>>> potential performance degradation at large scale.
> >>>>>
> >>>>> A question I'm pondering is what happens when we move to Java 11 and
> >>>>> someone is still on 2.x? If they want to backport HADOOP-15338
> >>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >>>>> support to
> >>>>> 2.x, surely not everyone is going to want that (at least not
> >>>>> immediately).
> >>>>> The 2.10 documentation states, "The JVM requirements will not change
> >>>>> across
> >>>>> point releases within the same minor release except if the JVM
> version
> >>>>> under question becomes unsupported" [1], so this would warrant a 2.11
> >>>>> release until Java 8 becomes unsupported (though one could argue that
> >>>>> it is
> >>>>> already unsupported since Oracle is no longer giving public Java 8
> >>>>> update).
> >>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
> >>>>> catalyst for a branch-2 revival?
> >>>>>
> >>>>> Not sure if this really leads to any sort of answer from me on
> whether
> >>>>> or
> >>>>> not we should keep branch-2 alive, but these are the things that I am
> >>>>> weighing in my mind. For me, the bigger problem beyond having
> branch-2
> >>>>> or
> >>>>> not is committers not being on the same page with where they should
> >>>>> commit
> >>>>> their patches.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>>
> >>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
> >
> >>>>> wrote:
> >>>>>
> >>>>> > Hi Konstantin,
> >>>>> >
> >>>>> > Sure, I understand those concerns. On the other hand, I worry about
> >>>>> the
> >>>>> > stability of 2.10, since we will be on it for a couple of years at
> >>>>> least.
> >>>>> > I worry
> >>>>> >  that some committers may want to put new features into a branch 2
> >>>>> release,
> >>>>> >  and without a branch-2, they will go directly into 2.10. Since we
> >>>>> don't
> >>>>> > always
> >>>>> >  catch corner cases or performance problems for some time (usually
> >>>>> not
> >>>>> > until
> >>>>> >  the release is deployed to a busy, 4-thousand node cluster), it
> may
> >>>>> be
> >>>>> > very
> >>>>> >  difficult to back out those changes.
> >>>>> >
> >>>>> > It sounds like I'm in the minority here, so I'm not nixing the
> idea,
> >>>>> but I
> >>>>> > do
> >>>>> >  have these reservations.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > -Eric
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <
> >>>>> > shv.hadoop@gmail.com> wrote:
> >>>>> > Hi Eric,
> >>>>> >
> >>>>> > We had a long discussion on this list regarding making the 2.10
> >>>>> release the
> >>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
> >>>>> between
> >>>>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is
> >>>>> not in
> >>>>> > the picture right now, and many people may object this idea.
> >>>>> >
> >>>>> > I understand Jonathan's proposal as an attempt to
> >>>>> > 1. eliminate confusion which branches people should commit their
> >>>>> back-ports
> >>>>> > to
> >>>>> > 2. save engineering effort committing to more branches than
> necessary
> >>>>> >
> >>>>> > "Branches are cheap" as our founder used to say. If we ever decide
> to
> >>>>> > release 2.11 we can resurrect the branch.
> >>>>> > Until then I am in favor of Jonathan's proposal +1.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > --Konstantin
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> jyhung2357@gmail.com
> >>>>> >
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks Eric for the comments - regarding your concerns, I feel
> the
> >>>>> pros
> >>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
> >>>>> are
> >>>>> > much
> >>>>> > > higher than a new 2.11 minor release. (There didn't seem to be
> many
> >>>>> > people
> >>>>> > > outside of our company who expressed interest in getting new
> >>>>> features to
> >>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> after
> >>>>> 2.10.0
> >>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
> >>>>> > > branch-2.10, so it's already diverged quite a bit.
> >>>>> > >
> >>>>> > > In any case, we can always reverse this decision if we really
> need
> >>>>> to, by
> >>>>> > > recreating branch-2. But this proposal would reduce a lot of
> >>>>> confusion
> >>>>> > IMO.
> >>>>> > >
> >>>>> > > Jonathan Hung
> >>>>> > >
> >>>>> > >
> >>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >>>>> epayne@apache.org>
> >>>>> > > wrote:
> >>>>> > >
> >>>>> > > > Thanks Jonathan for opening the discussion.
> >>>>> > > >
> >>>>> > > > I am not in favor of this proposal. 2.10 was very recently
> >>>>> released,
> >>>>> > and
> >>>>> > > > moving to 2.10 will take some time for the community. It seems
> >>>>> > premature
> >>>>> > > to
> >>>>> > > > make a decision at this point that there will never be a need
> >>>>> for a
> >>>>> > 2.11
> >>>>> > > > release.
> >>>>> > > >
> >>>>> > > > -Eric
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <
> >>>>> > > > jyhung2357@gmail.com> wrote:
> >>>>> > > >
> >>>>> > > > Hi folks,
> >>>>> > > >
> >>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
> >>>>> be a
> >>>>> > > bridge
> >>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
> last
> >>>>> minor
> >>>>> > > > release line in branch-2. Currently, the main issue is that
> >>>>> there's
> >>>>> > many
> >>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not
> >>>>> going
> >>>>> > into
> >>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in
> branch-2
> >>>>> will
> >>>>> > > > likely never see the light of day unless they are backported to
> >>>>> > > > branch-2.10.
> >>>>> > > >
> >>>>> > > > To do this, I propose we:
> >>>>> > > >
> >>>>> > > >  - Delete branch-2.10
> >>>>> > > >  - Rename branch-2 to branch-2.10
> >>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >>>>> > > >
> >>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
> >>>>> release
> >>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
> >>>>> ->
> >>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >>>>> > > >
> >>>>> > > > Thoughts?
> >>>>> > > >
> >>>>> > > > Jonathan Hung
> >>>>> > > >
> >>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>> >
> ---------------------------------------------------------------------
> >>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you for your work, Jonathan.

I found branch-2 has been unintentionally pushed again. Would you remove it?
I think the branch should be protected if possible.

-Akira

On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:

> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>    - Verified everything in old branch-2.10 was in old branch-2
>    - Deleted old branch-2.10
>    - Renamed branch-2 to (new) branch-2.10
>    - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>    - Renamed fix versions from 2.11.0 to 2.10.1
>    - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
> Jonathan Hung
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> shv.hadoop@gmail.com>
> > wrote:
> >
> >> Hey guys,
> >>
> >> I think we diverged a bit from the initial topic of this discussion,
> >> which is removing branch-2.10, and changing the version of branch-2 from
> >> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >> Sounds like the subject line for this thread "Making 2.10 the last minor
> >> 2.x release" confused people.
> >> It is in fact a wider matter that can be discussed when somebody
> actually
> >> proposes to release 2.11, which I understand nobody does at the moment.
> >>
> >> So if anybody objects removing branch-2.10 please make an argument.
> >> Otherwise we should go ahead and just do it next week.
> >> I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the detailed thoughts, everyone.
> >>>
> >>> Eric (Badger), my understanding is the same as yours re. minor vs patch
> >>> releases. As for putting features into minor/patch releases, if we
> keep the
> >>> convention of putting new features only into minor releases, my
> assumption
> >>> is still that it's unlikely people will want to get them into branch-2
> >>> (based on the 2.10.0 release process). For the java 11 issue, we
> haven't
> >>> even really removed support for java 7 in branch-2 (much less java 8),
> so I
> >>> feel moving to java 11 would go along with a move to branch 3. And as
> you
> >>> mentioned, if people really want to use java 11 on branch-2, we can
> always
> >>> revive branch-2. But for now I think the convenience of not needing to
> port
> >>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >>> potentially needing to revive branch-2.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >>>
> >>>> +1 for 2.10.x as the last release for the 2.x version.
> >>>>
> >>>> Software becomes more compatible when more companies stress test the
> >>>> same software and make improvements in trunk.  Some may be extra
> >>>> cautious about moving up the version because of internal obligations to
> >>>> keep things running.  Company obligations should not be the driving force
> >>>> to maintain Hadoop branches.  There is no proper collaboration in the
> >>>> community when every name-brand company maintains its own Hadoop 2.x
> >>>> version.  I think it would be healthier for the community to reduce the
> >>>> branch forking and spend energy on trunk to harden the software.  This
> >>>> will give more confidence to move up the version than trying to fix n
> >>>> permutations of breakage, like Flash fixing the timeline.
> >>>>
> >>>> The Apache license states that there is no warranty of any kind for code
> >>>> contributions.  Fewer community release processes should improve software
> >>>> quality when eyes are on trunk, and help steer toward the same end goals.
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >>>> <eb...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> Is it written anywhere what the difference is between a minor release
> >>>>> and a point/dot/maintenance (I'll use "point" from here on out)
> >>>>> release? I have looked around and I can't find anything other than some
> >>>>> compatibility documentation in 2.x that has since been removed in 3.x
> >>>>> [1] [2]. I think this would help shape my opinion on whether or not to
> >>>>> keep branch-2 alive. My current understanding is that we can't really
> >>>>> break compatibility in either a minor or point release. But the only
> >>>>> mention of the difference between minor and point releases is how to
> >>>>> deal with Stable, Evolving, and Unstable tags, and how to deal with
> >>>>> changing default configuration values. So it seems like there really
> >>>>> isn't a big official difference between the two. In my mind, the
> >>>>> functional difference between the two is that the minor releases may
> >>>>> have added features and rewrites, while the point releases only have bug
> >>>>> fixes. This might be an incorrect understanding, but that's what I have
> >>>>> gathered from watching the releases over the last few years. Whether or
> >>>>> not this is a correct understanding, I think that this needs to be
> >>>>> documented somewhere, even if it is just a convention.
> >>>>>
> >>>>> Given my assumed understanding of minor vs point releases, here are the
> >>>>> pros/cons that I can think of for having a branch-2. Please add on or
> >>>>> correct me for anything you feel is missing or inadequate.
> >>>>> Pros:
> >>>>> - Features/rewrites/higher-risk patches are less likely to be put into
> >>>>>   2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> Cons:
> >>>>> - Bug fixes are less likely to be put into 2.10.x
> >>>>> - An extra branch to maintain
> >>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
> >>>>>     patches to if they should go all the way back to 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> So on the one hand you get added stability in fewer features being
> >>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
> >>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
> >>>>> But we don't live in a perfect world and committers will make mistakes
> >>>>> either because of lack of knowledge or simply because they made a
> >>>>> mistake. If we have a branch-2, committers will forget, not know to, or
> >>>>> choose not to (for whatever reason) commit valid bug fixes back all the
> >>>>> way to branch-2.10. If we don't have a branch-2, committers who want
> >>>>> their borderline risky feature in the 2.x line will err on the side of
> >>>>> putting it into branch-2.10 instead of proposing the creation of a
> >>>>> branch-2. Clearly I have made quite a few assumptions here based on my
> >>>>> own experiences, so I would like to hear if others have similar or
> >>>>> opposing views.
> >>>>>
> >>>>> As far as 3.x goes, to me it seems like some of the reasoning for
> >>>>> killing branch-2 is due to an effort to push the community towards 3.x.
> >>>>> This is why I have added movement to 3.x as both a pro and a con. As a
> >>>>> community trying to move forward, keeping as many companies on similar
> >>>>> branches as possible is a good way to make sure the code is well-tested.
> >>>>> However, from a stability point of view, moving to 3.x is still scary
> >>>>> and being able to stay on 2.x until you are comfortable to move is very
> >>>>> nice. The 2.10.0 bridge release effort has been very good at making it
> >>>>> possible for people to move from 2.x to 3.x, but the diff between 2.x
> >>>>> and 3.x is so large that it is reasonable for companies to want to be
> >>>>> extra cautious with 3.x due to potential performance degradation at
> >>>>> large scale.
> >>>>>
> >>>>> A question I'm pondering is what happens when we move to Java 11 and
> >>>>> someone is still on 2.x? If they want to backport HADOOP-15338
> >>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
> >>>>> to 2.x, surely not everyone is going to want that (at least not
> >>>>> immediately). The 2.10 documentation states, "The JVM requirements will
> >>>>> not change across point releases within the same minor release except if
> >>>>> the JVM version under question becomes unsupported" [1], so this would
> >>>>> warrant a 2.11 release until Java 8 becomes unsupported (though one
> >>>>> could argue that it is already unsupported since Oracle is no longer
> >>>>> giving public Java 8 updates). If we don't keep branch-2 around now,
> >>>>> would a Java 11 backport be the catalyst for a branch-2 revival?
> >>>>>
> >>>>> Not sure if this really leads to any sort of answer from me on whether
> >>>>> or not we should keep branch-2 alive, but these are the things that I am
> >>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
> >>>>> or not is committers not being on the same page with where they should
> >>>>> commit their patches.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> [1]
> >>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>> [2]
> >>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>>
> >>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>> > Hi Konstantin,
> >>>>> >
> >>>>> > Sure, I understand those concerns. On the other hand, I worry about the
> >>>>> > stability of 2.10, since we will be on it for a couple of years at
> >>>>> > least. I worry that some committers may want to put new features into a
> >>>>> > branch 2 release, and without a branch-2, they will go directly into
> >>>>> > 2.10. Since we don't always catch corner cases or performance problems
> >>>>> > for some time (usually not until the release is deployed to a busy,
> >>>>> > 4-thousand node cluster), it may be very difficult to back out those
> >>>>> > changes.
> >>>>> >
> >>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
> >>>>> > I do have these reservations.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > -Eric
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> >>>>> > <shv.hadoop@gmail.com> wrote:
> >>>>> > Hi Eric,
> >>>>> >
> >>>>> > We had a long discussion on this list regarding making the 2.10 release
> >>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
> >>>>> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11
> >>>>> > is not in the picture right now, and many people may object to this
> >>>>> > idea.
> >>>>> >
> >>>>> > I understand Jonathan's proposal as an attempt to
> >>>>> > 1. eliminate confusion about which branches people should commit their
> >>>>> >    back-ports to
> >>>>> > 2. save engineering effort committing to more branches than necessary
> >>>>> >
> >>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
> >>>>> > release 2.11 we can resurrect the branch.
> >>>>> > Until then I am in favor of Jonathan's proposal +1.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > --Konstantin
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the pros
> >>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
> >>>>> > > much higher than a new 2.11 minor release. (There didn't seem to be many
> >>>>> > > people outside of our company who expressed interest in getting new
> >>>>> > > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> >>>>> > > after 2.10.0 release, there's 29 patches that have gone into branch-2
> >>>>> > > and 9 in branch-2.10, so it's already diverged quite a bit.
> >>>>> > >
> >>>>> > > In any case, we can always reverse this decision if we really need to,
> >>>>> > > by recreating branch-2. But this proposal would reduce a lot of
> >>>>> > > confusion IMO.
> >>>>> > >
> >>>>> > > Jonathan Hung
> >>>>> > >
> >>>>> > >
> >>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> >>>>> > > wrote:
> >>>>> > >
> >>>>> > > > Thanks Jonathan for opening the discussion.
> >>>>> > > >
> >>>>> > > > I am not in favor of this proposal. 2.10 was very recently released, and
> >>>>> > > > moving to 2.10 will take some time for the community. It seems premature
> >>>>> > > > to make a decision at this point that there will never be a need for a
> >>>>> > > > 2.11 release.
> >>>>> > > >
> >>>>> > > > -Eric
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> >>>>> > > > <jyhung2357@gmail.com> wrote:
> >>>>> > > >
> >>>>> > > > Hi folks,
> >>>>> > > >
> >>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
> >>>>> > > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> >>>>> > > > minor release line in branch-2. Currently, the main issue is that there
> >>>>> > > > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> >>>>> > > > going into branch-2.10 (which will become 2.10.1), so the fixes in
> >>>>> > > > branch-2 will likely never see the light of day unless they are
> >>>>> > > > backported to branch-2.10.
> >>>>> > > >
> >>>>> > > > To do this, I propose we:
> >>>>> > > >
> >>>>> > > >  - Delete branch-2.10
> >>>>> > > >  - Rename branch-2 to branch-2.10
> >>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >>>>> > > >
> >>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x release
> >>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> >>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >>>>> > > >
> >>>>> > > > Thoughts?
> >>>>> > > >
> >>>>> > > > Jonathan Hung
> >>>>> > > >
> >>>>> > > > [1]
> >>>>> > > > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
>
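
As a rough illustration of the three-step proposal quoted above (delete branch-2.10, rename branch-2 to branch-2.10, set the version to 2.10.1-SNAPSHOT), the following Python sketch models the commit chain as a list of (branch, version) pairs. The function name and data layout are hypothetical and exist only for illustration; they are not part of any Hadoop tooling, and versions the thread does not state are left as None.

```python
# Hypothetical model of the proposal in this thread; not Hadoop tooling.
# Only branch names and versions mentioned in the thread are filled in.

def fold_branch_2_into_2_10(chain):
    """Apply the three proposed steps to a list of (branch, version) pairs."""
    new_chain = []
    for name, version in chain:
        if name == "branch-2.10":
            # Step 1: delete the old branch-2.10.
            continue
        if name == "branch-2":
            # Step 2: rename branch-2 to branch-2.10.
            # Step 3: set its version to 2.10.1-SNAPSHOT.
            name, version = "branch-2.10", "2.10.1-SNAPSHOT"
        new_chain.append((name, version))
    return new_chain


# Commit chain before the change, newest line first.
before = [
    ("trunk", None),
    ("branch-3.2", None),
    ("branch-3.1", None),
    ("branch-2", "2.11.0-SNAPSHOT"),
    ("branch-2.10", None),  # old branch-2.10, removed in step 1
    ("branch-2.9", None),
    ("branch-2.8", None),
]

after = fold_branch_2_into_2_10(before)
print(" -> ".join(name for name, _ in after))
# prints: trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
```

The resulting order matches the commit chain stated in the proposal, with one fewer branch for committers to port patches to.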

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you for your work, Jonathan.

I found branch-2 has been unintentionally pushed again. Would you remove it?
I think the branch should be protected if possible.

-Akira

On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:

> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>    - Verified everything in old branch-2.10 was in old branch-2
>    - Deleted old branch-2.10
>    - Renamed branch-2 to (new) branch-2.10
>    - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>    - Renamed fix versions from 2.11.0 to 2.10.1
>    - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
> Jonathan Hung
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
> > wrote:
> >
> >> Hey guys,
> >>
> >> I think we diverged a bit from the initial topic of this discussion,
> >> which is removing branch-2.10, and changing the version of branch-2 from
> >> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >> Sounds like the subject line for this thread "Making 2.10 the last minor
> >> 2.x release" confused people.
> >> It is in fact a wider matter that can be discussed when somebody actually
> >> proposes to release 2.11, which I understand nobody does at the moment.
> >>
> >> So if anybody objects to removing branch-2.10, please make an argument.
> >> Otherwise we should go ahead and just do it next week.
> >> I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the detailed thoughts, everyone.
> >>>
> >>> Eric (Badger), my understanding is the same as yours re. minor vs patch
> >>> releases. As for putting features into minor/patch releases, if we keep the
> >>> convention of putting new features only into minor releases, my assumption
> >>> is still that it's unlikely people will want to get them into branch-2
> >>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> >>> even really removed support for java 7 in branch-2 (much less java 8), so I
> >>> feel moving to java 11 would go along with a move to branch 3. And as you
> >>> mentioned, if people really want to use java 11 on branch-2, we can always
> >>> revive branch-2. But for now I think the convenience of not needing to port
> >>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >>> potentially needing to revive branch-2.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >>>
> >>>> +1 for 2.10.x as the last release for the 2.x version.
> >>>>
> >>>> Software becomes more compatible when more companies stress test the
> >>>> same software and make improvements in trunk.  Some may be extra
> >>>> cautious about moving up the version because of internal obligations to
> >>>> keep things running.  Company obligations should not be the driving force
> >>>> to maintain Hadoop branches.  There is no proper collaboration in the
> >>>> community when every name-brand company maintains its own Hadoop 2.x
> >>>> version.  I think it would be healthier for the community to reduce the
> >>>> branch forking and spend energy on trunk to harden the software.  This
> >>>> will give more confidence to move up the version than trying to fix n
> >>>> permutations of breakage, like Flash fixing the timeline.
> >>>>
> >>>> The Apache license states that there is no warranty of any kind for code
> >>>> contributions.  Fewer community release processes should improve software
> >>>> quality when eyes are on trunk, and help steer toward the same end goals.
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >>>> <eb...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> Is it written anywhere what the difference is between a minor release
> >>>>> and a point/dot/maintenance (I'll use "point" from here on out)
> >>>>> release? I have looked around and I can't find anything other than some
> >>>>> compatibility documentation in 2.x that has since been removed in 3.x
> >>>>> [1] [2]. I think this would help shape my opinion on whether or not to
> >>>>> keep branch-2 alive. My current understanding is that we can't really
> >>>>> break compatibility in either a minor or point release. But the only
> >>>>> mention of the difference between minor and point releases is how to
> >>>>> deal with Stable, Evolving, and Unstable tags, and how to deal with
> >>>>> changing default configuration values. So it seems like there really
> >>>>> isn't a big official difference between the two. In my mind, the
> >>>>> functional difference between the two is that the minor releases may
> >>>>> have added features and rewrites, while the point releases only have bug
> >>>>> fixes. This might be an incorrect understanding, but that's what I have
> >>>>> gathered from watching the releases over the last few years. Whether or
> >>>>> not this is a correct understanding, I think that this needs to be
> >>>>> documented somewhere, even if it is just a convention.
> >>>>>
> >>>>> Given my assumed understanding of minor vs point releases, here are
> the
> >>>>> pros/cons that I can think of for having a branch-2. Please add on or
> >>>>> correct me for anything you feel is missing or inadequate.
> >>>>> Pros:
> >>>>> - Features/rewrites/higher-risk patches are less likely to be put
> into
> >>>>> 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> Cons:
> >>>>> - Bug fixes are less likely to be put into 2.10.x
> >>>>> - An extra branch to maintain
> >>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
> >>>>> patches to if they should go all the way back to 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> So on the one hand you get added stability in fewer features being
> >>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
> >>>>> being
> >>>>> committed. In a perfect world, we wouldn't have to make this
> tradeoff.
> >>>>> But
> >>>>> we don't live in a perfect world and committers will make mistakes
> >>>>> either
> >>>>> because of lack of knowledge or simply because they made a mistake.
> If
> >>>>> we
> >>>>> have a branch-2, committers will forget, not know to, or choose not
> to
> >>>>> (for
> >>>>> whatever reason) commit valid bug fixes back all the way to
> >>>>> branch-2.10. If
> >>>>> we don't have a branch-2, committers who want their borderline risky
> >>>>> feature in the 2.x line will err on the side of putting it into
> >>>>> branch-2.10
> >>>>> instead of proposing the creation of a branch-2. Clearly I have made
> >>>>> quite
> >>>>> a few assumptions here based on my own experiences, so I would like
> to
> >>>>> hear
> >>>>> if others have similar or opposing views.
> >>>>>
> >>>>> As far as 3.x goes, to me it seems like some of the reasoning for
> >>>>> killing
> >>>>> branch-2 is due to an effort to push the community towards 3.x. This
> >>>>> is why
> >>>>> I have added movement to 3.x as both a pro and a con. As a community
> >>>>> trying
> >>>>> to move forward, keeping as many companies on similar branches as
> >>>>> possible
> >>>>> is a good way to make sure the code is well-tested. However, from a
> >>>>> stability point of view, moving to 3.x is still scary and being able to
> >>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
> >>>>> bridge release effort has been very good at making it possible for people
> >>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> >>>>> it is reasonable for companies to want to be extra cautious with 3.x due to
> >>>>> potential performance degradation at large scale.
> >>>>>
> >>>>> A question I'm pondering is what happens when we move to Java 11 and
> >>>>> someone is still on 2.x? If they want to backport HADOOP-15338
> >>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >>>>> support to
> >>>>> 2.x, surely not everyone is going to want that (at least not
> >>>>> immediately).
> >>>>> The 2.10 documentation states, "The JVM requirements will not change
> >>>>> across
> >>>>> point releases within the same minor release except if the JVM
> version
> >>>>> under question becomes unsupported" [1], so this would warrant a 2.11
> >>>>> release until Java 8 becomes unsupported (though one could argue that
> >>>>> it is
> >>>>> already unsupported since Oracle is no longer giving public Java 8
> >>>>> update).
> >>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
> >>>>> catalyst for a branch-2 revival?
> >>>>>
> >>>>> Not sure if this really leads to any sort of answer from me on
> whether
> >>>>> or
> >>>>> not we should keep branch-2 alive, but these are the things that I am
> >>>>> weighing in my mind. For me, the bigger problem beyond having
> branch-2
> >>>>> or
> >>>>> not is committers not being on the same page with where they should
> >>>>> commit
> >>>>> their patches.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>>
> >>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
> >
> >>>>> wrote:
> >>>>>
> >>>>> > Hi Konstantin,
> >>>>> >
> >>>>> > Sure, I understand those concerns. On the other hand, I worry about
> >>>>> the
> >>>>> > stability of 2.10, since we will be on it for a couple of years at
> >>>>> least.
> >>>>> > I worry
> >>>>> >  that some committers may want to put new features into a branch 2
> >>>>> release,
> >>>>> >  and without a branch-2, they will go directly into 2.10. Since we
> >>>>> don't
> >>>>> > always
> >>>>> >  catch corner cases or performance problems for some time (usually
> >>>>> not
> >>>>> > until
> >>>>> >  the release is deployed to a busy, 4-thousand node cluster), it
> may
> >>>>> be
> >>>>> > very
> >>>>> >  difficult to back out those changes.
> >>>>> >
> >>>>> > It sounds like I'm in the minority here, so I'm not nixing the
> idea,
> >>>>> but I
> >>>>> > do
> >>>>> >  have these reservations.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > -Eric
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <
> >>>>> > shv.hadoop@gmail.com> wrote:
> >>>>> > Hi Eric,
> >>>>> >
> >>>>> > We had a long discussion on this list regarding making the 2.10 release
> >>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
> >>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
> >>>>> > not in the picture right now, and many people may object to this idea.
> >>>>> >
> >>>>> > I understand Jonathan's proposal as an attempt to
> >>>>> > 1. eliminate confusion which branches people should commit their
> >>>>> back-ports
> >>>>> > to
> >>>>> > 2. save engineering effort committing to more branches than
> necessary
> >>>>> >
> >>>>> > "Branches are cheap" as our founder used to say. If we ever decide
> to
> >>>>> > release 2.11 we can resurrect the branch.
> >>>>> > Until then I am in favor of Jonathan's proposal +1.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > --Konstantin
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> jyhung2357@gmail.com
> >>>>> >
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks Eric for the comments - regarding your concerns, I feel
> the
> >>>>> pros
> >>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
> >>>>> are
> >>>>> > much
> >>>>> > > higher than a new 2.11 minor release. (There didn't seem to be
> many
> >>>>> > people
> >>>>> > > outside of our company who expressed interest in getting new
> >>>>> features to
> >>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> after
> >>>>> 2.10.0
> >>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
> >>>>> > > branch-2.10, so it's already diverged quite a bit.
> >>>>> > >
> >>>>> > > In any case, we can always reverse this decision if we really
> need
> >>>>> to, by
> >>>>> > > recreating branch-2. But this proposal would reduce a lot of
> >>>>> confusion
> >>>>> > IMO.
> >>>>> > >
> >>>>> > > Jonathan Hung
> >>>>> > >
> >>>>> > >
> >>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >>>>> epayne@apache.org>
> >>>>> > > wrote:
> >>>>> > >
> >>>>> > > > Thanks Jonathan for opening the discussion.
> >>>>> > > >
> >>>>> > > > I am not in favor of this proposal. 2.10 was very recently
> >>>>> released,
> >>>>> > and
> >>>>> > > > moving to 2.10 will take some time for the community. It seems
> >>>>> > premature
> >>>>> > > to
> >>>>> > > > make a decision at this point that there will never be a need
> >>>>> for a
> >>>>> > 2.11
> >>>>> > > > release.
> >>>>> > > >
> >>>>> > > > -Eric
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <
> >>>>> > > > jyhung2357@gmail.com> wrote:
> >>>>> > > >
> >>>>> > > > Hi folks,
> >>>>> > > >
> >>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
> >>>>> > > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> >>>>> > > > minor release line in branch-2. Currently, the main issue is that there
> >>>>> > > > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> >>>>> > > > going into branch-2.10 (which will become 2.10.1), so the fixes in
> >>>>> > > > branch-2 will likely never see the light of day unless they are
> >>>>> > > > backported to branch-2.10.
> >>>>> > > >
> >>>>> > > > To do this, I propose we:
> >>>>> > > >
> >>>>> > > >  - Delete branch-2.10
> >>>>> > > >  - Rename branch-2 to branch-2.10
> >>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >>>>> > > >
> >>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x release
> >>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> >>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >>>>> > > >
> >>>>> > > > Thoughts?
> >>>>> > > >
> >>>>> > > > Jonathan Hung
> >>>>> > > >
> >>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>> > ---------------------------------------------------------------------
> >>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
don't try to commit to it)
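
To make the new chain concrete, here is a minimal Python sketch of which
branches a fix has to land on to reach a given release line. It is not project
tooling: COMMIT_CHAIN and backport_targets are names made up for illustration.

```python
# Hypothetical helper, for illustration only: lists the branches a patch
# must be committed to, newest first, to reach a given release line.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return every branch from trunk down to oldest_branch, inclusive."""
    if oldest_branch not in chain:
        raise ValueError(f"{oldest_branch} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(" -> ".join(backport_targets("branch-2.10")))
```

A fix meant for the 2.10.x line now goes to four branches instead of five, and
asking for branch-2 raises an error because that branch is gone.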

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Deleted old branch-2.10
   - Renamed branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
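
For anyone repeating this on a fork or mirror, the git and Maven part of the
procedure could be sketched roughly as below. This is an assumption-laden
outline, not a record of the commands actually run: the remote name "origin",
the helper name rename_plan, and the exact invocations are illustrative, and
the JIRA fix-version cleanup is not covered. The function only builds the
command list and does not execute anything.

```python
# Illustrative only: builds the command list for the steps above without
# running anything. The remote name and the exact git/mvn invocations are
# assumptions, not a record of the commands that were actually used.
def rename_plan(old="branch-2", new="branch-2.10",
                version="2.10.1-SNAPSHOT", remote="origin"):
    return [
        # 1. List commits on the old release branch that have no
        #    patch-equivalent on `old`; empty output means nothing is lost.
        ["git", "log", "--oneline", "--cherry-pick", "--right-only",
         "--no-merges", f"{remote}/{old}...{remote}/{new}"],
        # 2. Delete the old release branch on the remote.
        ["git", "push", remote, "--delete", new],
        # 3. "Rename": publish `old` under the new name, then drop `old`.
        ["git", "push", remote, f"{remote}/{old}:refs/heads/{new}"],
        ["git", "push", remote, "--delete", old],
        # 4. Set the Maven version on the new branch (versions-maven-plugin).
        ["mvn", "versions:set", f"-DnewVersion={version}"],
    ]


if __name__ == "__main__":
    for cmd in rename_plan():
        print(" ".join(cmd))
```

On the Apache repository the branch deletion presumably needs INFRA's help
because of branch protection, which would explain starting with INFRA-19521.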


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects to removing branch-2.10, please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software becomes more compatible when more companies stress test
>>>> the same software and make improvements in trunk.  Some may be extra
>>>> cautious about moving up the version because of internal obligations to
>>>> keep things running.  Company obligations should not be the driving force
>>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be healthier for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations of breakage like Flash fixing the timeline.
>>>>
>>>> The Apache license states that there is no warranty of any kind for code
>>>> contributions.  Fewer community release processes should improve software
>>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability in fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But
>>>>> we don't live in a perfect world and committers will make mistakes
>>>>> either
>>>>> because of lack of knowledge or simply because they made a mistake. If
>>>>> we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for
>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If
>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>> feature in the 2.x line will err on the side of putting it into
>>>>> branch-2.10
>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>> quite
>>>>> a few assumptions here based on my own experiences, so I would like to
>>>>> hear
>>>>> if others have similar or opposing views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing
>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>> is why
>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>> trying
>>>>> to move forward, keeping as many companies on similar branches as
>>>>> possible
>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> bridge release effort has been very good at making it possible for people
>>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>>> it is reasonable for companies to want to be extra cautious with 3.x due to
>>>>> potential performance degradation at large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> support to
>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>> immediately).
>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>> across
>>>>> point releases within the same minor release except if the JVM version
>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>> it is
>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>> updates).
>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or
>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or
>>>>> not is committers not being on the same page with where they should
>>>>> commit
>>>>> their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> least.
>>>>> > I worry
>>>>> >  that some committers may want to put new features into a branch 2
>>>>> release,
>>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>>> don't
>>>>> > always
>>>>> >  catch corner cases or performance problems for some time (usually
>>>>> not
>>>>> > until
>>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>>> be
>>>>> > very
>>>>> >  difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> but I
>>>>> > do
>>>>> >  have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>>>>> > not in the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion which branches people should commit their
>>>>> back-ports
>>>>> > to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>>> pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> are
>>>>> > much
>>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people
>>>>> > > outside of our company who expressed interest in getting new
>>>>> features to
>>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>>> 2.10.0
>>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need
>>>>> to, by
>>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>>> confusion
>>>>> > IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>>> released,
>>>>> > and
>>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>>> > premature
>>>>> > > to
>>>>> > > > make a decision at this point that there will never be a need
>>>>> for a
>>>>> > 2.11
>>>>> > > > release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> > > > minor release line in branch-2. Currently, the main issue is that there
>>>>> > > > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> > > > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>>> > > > branch-2 will likely never see the light of day unless they are
>>>>> > > > backported to branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
don't try to commit to it)

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Delete old branch-2.10
   - Rename branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software would become more compatible when more companies stress test
>>>> the same software and making improvements in trunk.  Some may be extra
>>>> caution on moving up the version because obligation internally to keep
>>>> things running.  Company obligation should not be the driving force to
>>>> maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be more healthy for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations breakage like Flash fixing the timeline.
>>>>
>>>> Apache license stated, there is no warranty of any kind for code
>>>> contributions.  Fewer community release process should improve software
>>>> quality when eyes are on trunk, and help steering toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability in fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But
>>>>> we don't live in a perfect world and committers will make mistakes
>>>>> either
>>>>> because of lack of knowledge or simply because they made a mistake. If
>>>>> we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for
>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If
>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>> feature in the 2.x line will err on the side of putting it into
>>>>> branch-2.10
>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>> quite
>>>>> a few assumptions here based on my own experiences, so I would like to
>>>>> hear
>>>>> if others have similar or opposing views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing
>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>> is why
>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>> trying
>>>>> to move forward, keeping as many companies on similar branches as
>>>>> possible
>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> bridge release effort has been very good at making it possible for
>>>>> people
>>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>>> that
>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>> due to
>>>>> potential performance degradation at large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> support to
>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>> immediately).
>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>> across
>>>>> point releases within the same minor release except if the JVM version
>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>> it is
>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>> updates).
>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or
>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or
>>>>> not is committers not being on the same page with where they should
>>>>> commit
>>>>> their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1]
>>>>>
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2]
>>>>>
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> least.
>>>>> > I worry
>>>>> >  that some committers may want to put new features into a branch 2
>>>>> release,
>>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>>> don't
>>>>> > always
>>>>> >  catch corner cases or performance problems for some time (usually
>>>>> not
>>>>> > until
>>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>>> be
>>>>> > very
>>>>> >  difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> but I
>>>>> > do
>>>>> >  have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>>>>> not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion which branches people should commit their
>>>>> back-ports
>>>>> > to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>>> pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> are
>>>>> > much
>>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people
>>>>> > > outside of our company who expressed interest in getting new
>>>>> features to
>>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>>> 2.10.0
>>>>> > > release, there are 29 patches that have gone into branch-2 and 9 in
>>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need
>>>>> to, by
>>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>>> confusion
>>>>> > IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>>> released,
>>>>> > and
>>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>>> > premature
>>>>> > > to
>>>>> > > > make a decision at this point that there will never be a need
>>>>> for a
>>>>> > 2.11
>>>>> > > > release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>>> be a
>>>>> > > bridge
>>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> minor
>>>>> > > > release line in branch-2. Currently, the main issue is that
>>>>> there are
>>>>> > many
>>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> going
>>>>> > into
>>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>>> will
>>>>> > > > likely never see the light of day unless they are backported to
>>>>> > > > branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1]
>>>>> > >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists; please
don't try to commit to it).

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Deleted old branch-2.10
   - Renamed branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
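
For illustration only, and not part of the procedure above: a minimal Python
sketch of what the new commit chain means for backports. The new chain is
taken from this message. The old chain, with branch-2 sitting between
branch-3.1 and branch-2.10, is an assumption inferred from the thread, and
branches_to_commit is a hypothetical helper that assumes a fix lands on trunk
first and is then cherry-picked down the chain without skipping a branch.

```python
# Commit chain announced in this message, newest line first.
NEW_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
             "branch-2.10", "branch-2.9", "branch-2.8"]

# Assumed chain before the rename, when branch-2 (2.11.0-SNAPSHOT) still
# sat between branch-3.1 and branch-2.10.
OLD_CHAIN = ["trunk", "branch-3.2", "branch-3.1", "branch-2",
             "branch-2.10", "branch-2.9", "branch-2.8"]


def branches_to_commit(chain, oldest_target):
    """Return every branch a fix must land on to reach oldest_target.

    Hypothetical helper: assumes the fix goes to trunk first and is then
    cherry-picked down the chain without skipping any branch.
    """
    if oldest_target not in chain:
        raise ValueError(f"{oldest_target} is not in the commit chain")
    return chain[: chain.index(oldest_target) + 1]


if __name__ == "__main__":
    for name, chain in (("old", OLD_CHAIN), ("new", NEW_CHAIN)):
        path = branches_to_commit(chain, "branch-2.10")
        print(f"{name}: {len(path)} branches: {' -> '.join(path)}")
```

Under these assumptions a fix targeting 2.10.x touches five branches with the
old chain and four with the new one, which matches the "5 vs 4 total
branches" point raised earlier in the thread.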


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects to removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software would become more compatible when more companies stress test
>>>> the same software and make improvements in trunk.  Some may be extra
>>>> cautious about moving up the version because of internal obligations to
>>>> keep things running.  Company obligation should not be the driving force
>>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be healthier for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations of breakage, like Flash fixing the timeline.
>>>>
>>>> The Apache license states that there is no warranty of any kind for code
>>>> contributions.  Fewer community release processes should improve software
>>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability in fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But
>>>>> we don't live in a perfect world and committers will make mistakes
>>>>> either
>>>>> because of lack of knowledge or simply because they made a mistake. If
>>>>> we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for
>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If
>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>> feature in the 2.x line will err on the side of putting it into
>>>>> branch-2.10
>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>> quite
>>>>> a few assumptions here based on my own experiences, so I would like to
>>>>> hear
>>>>> if others have similar or opposing views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing
>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>> is why
>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>> trying
>>>>> to move forward, keeping as many companies on similar branches as
>>>>> possible
>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> bridge release effort has been very good at making it possible for
>>>>> people
>>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>>> that
>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>> due to
>>>>> potential performance degradation at large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> support to
>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>> immediately).
>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>> across
>>>>> point releases within the same minor release except if the JVM version
>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>> it is
>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>> updates).
>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or
>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or
>>>>> not is committers not being on the same page with where they should
>>>>> commit
>>>>> their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1]
>>>>>
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2]
>>>>>
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> least.
>>>>> > I worry
>>>>> >  that some committers may want to put new features into a branch 2
>>>>> release,
>>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>>> don't
>>>>> > always
>>>>> >  catch corner cases or performance problems for some time (usually
>>>>> not
>>>>> > until
>>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>>> be
>>>>> > very
>>>>> >  difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> but I
>>>>> > do
>>>>> >  have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>>>>> not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion which branches people should commit their
>>>>> back-ports
>>>>> > to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>>> pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> are
>>>>> > much
>>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people
>>>>> > > outside of our company who expressed interest in getting new
>>>>> features to
>>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>>> 2.10.0
>>>>> > > release, there are 29 patches that have gone into branch-2 and 9 in
>>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need
>>>>> to, by
>>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>>> confusion
>>>>> > IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>>> released,
>>>>> > and
>>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>>> > premature
>>>>> > > to
>>>>> > > > make a decision at this point that there will never be a need
>>>>> for a
>>>>> > 2.11
>>>>> > > > release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>>> be a
>>>>> > > bridge
>>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> minor
>>>>> > > > release line in branch-2. Currently, the main issue is that
>>>>> there are
>>>>> > many
>>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> going
>>>>> > into
>>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>>> will
>>>>> > > > likely never see the light of day unless they are backported to
>>>>> > > > branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1]
>>>>> > >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
don't try to commit to it)

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Delete old branch-2.10
   - Rename branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software would become more compatible when more companies stress test
>>>> the same software and making improvements in trunk.  Some may be extra
>>>> caution on moving up the version because obligation internally to keep
>>>> things running.  Company obligation should not be the driving force to
>>>> maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be more healthy for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations breakage like Flash fixing the timeline.
>>>>
>>>> Apache license stated, there is no warranty of any kind for code
>>>> contributions.  Fewer community release process should improve software
>>>> quality when eyes are on trunk, and help steering toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability in fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But
>>>>> we don't live in a perfect world and committers will make mistakes
>>>>> either
>>>>> because of lack of knowledge or simply because they made a mistake. If
>>>>> we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for
>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If
>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>> feature in the 2.x line will err on the side of putting it into
>>>>> branch-2.10
>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>> quite
>>>>> a few assumptions here based on my own experiences, so I would like to
>>>>> hear
>>>>> if others have similar or opposing views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing
>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>> is why
>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>> trying
>>>>> to move forward, keeping as many companies on similar branches as
>>>>> possible
>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> bridge release effort has been very good at making it possible for
>>>>> people
>>>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>>> that
>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>> due to
>>>>> potential performance degradation at large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> support to
>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>> immediately).
>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>> across
>>>>> point releases within the same minor release except if the JVM version
>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>> it is
>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>> update).
>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or
>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or
>>>>> not is committers not being on the same page with where they should
>>>>> commit
>>>>> their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1]
>>>>>
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2]
>>>>>
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> least.
>>>>> > I worry
>>>>> >  that some committers may want to put new features into a branch 2
>>>>> release,
>>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>>> don't
>>>>> > always
>>>>> >  catch corner cases or performance problems for some time (usually
>>>>> not
>>>>> > until
>>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>>> be
>>>>> > very
>>>>> >  difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> but I
>>>>> > do
>>>>> >  have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
>>>>> not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> back-ports
>>>>> > to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>>> pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> are
>>>>> > much
>>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people
>>>>> > > outside of our company who expressed interest in getting new
>>>>> features to
>>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>>> 2.10.0
>>>>> > > release, there are 29 patches that have gone into branch-2 and 9 in
>>>>> > > branch-2.10, so they've already diverged quite a bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need
>>>>> to, by
>>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>>> confusion
>>>>> > IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>>> released,
>>>>> > and
>>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>>> > premature
>>>>> > > to
>>>>> > > > make a decision at this point that there will never be a need
>>>>> for a
>>>>> > 2.11
>>>>> > > > release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>>> be a
>>>>> > > bridge
>>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> minor
>>>>> > > > release line in branch-2. Currently, the main issue is that
>>>>> there are
>>>>> > many
>>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> going
>>>>> > into
>>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>>> will
>>>>> > > > likely never see the light of day unless they are backported to
>>>>> > > > branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1]
>>>>> > >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>
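For anyone following along, here is a rough sketch of what the rename in Jonathan's original proposal does to the backport chain. The branch names are taken from the thread. The "before" ordering (branch-2 sitting directly above branch-2.10) is an assumption inferred from the discussion, and the helper names `apply_rename` and `backport_path` are made up purely for illustration. This only models the ordering of branches; it does not touch git.

```python
# Assumed chain before the rename: branch-2 (the theoretical 2.11.0)
# sits between branch-3.1 and branch-2.10. This ordering is inferred
# from the thread, not stated in it verbatim.
CHAIN_BEFORE = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2", "branch-2.10", "branch-2.9", "branch-2.8",
]


def apply_rename(chain):
    """Model the proposal: delete branch-2.10, then rename branch-2 to branch-2.10."""
    # Step 1: drop the old branch-2.10.
    without_old = [b for b in chain if b != "branch-2.10"]
    # Step 2: the former branch-2 takes over the branch-2.10 name.
    return ["branch-2.10" if b == "branch-2" else b for b in without_old]


def backport_path(chain, target):
    """Branches a fix is committed to on its way from trunk down to `target`."""
    return chain[: chain.index(target) + 1]


CHAIN_AFTER = apply_rename(CHAIN_BEFORE)

if __name__ == "__main__":
    print(" -> ".join(CHAIN_AFTER))
    print(len(backport_path(CHAIN_BEFORE, "branch-2.10")), "commits before the rename")
    print(len(backport_path(CHAIN_AFTER, "branch-2.10")), "commits after the rename")
```

Under these assumptions, a fix meant for the 2.10.x line needs one fewer cherry-pick after the rename, which is the engineering saving Konstantin mentions.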