Posted to yarn-dev@hadoop.apache.org by Akira Ajisaka <aa...@apache.org> on 2020/04/15 09:25:45 UTC

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Hi folks,

I am still seeing changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching this. I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira,
>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>> > team closed it as "Won't Do". And yes, the branch is protected, so we
>>> > can’t delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> > remove it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>> > don't try to commit to it).
>>> >
>>> > Completed procedure:
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >   - Deleted old branch-2.10
>>> >   - Renamed branch-2 to (new) branch-2.10
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hey guys,
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > Sounds like the subject line for this thread, "Making 2.10 the last minor
>>> > 2.x release", confused people.
>>> > It is in fact a wider matter that can be discussed when somebody actually
>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>> >
>>> > So if anybody objects to removing branch-2.10, please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep the
>>> > convention of putting new features only into minor releases, my assumption
>>> > is still that it's unlikely people will want to get them into branch-2
>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>> > revive branch-2. But for now I think the convenience of not needing to port
>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> > potentially needing to revive branch-2.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> > +1 for 2.10.x as the last release for the 2.x version.
>>> >
>>> > Software becomes more compatible when more companies stress test the same
>>> > software and make improvements in trunk.  Some may be extra cautious about
>>> > moving up the version because of internal obligations to keep things
>>> > running.  Company obligations should not be the driving force for
>>> > maintaining Hadoop branches.  There is no proper collaboration in the
>>> > community when every name-brand company maintains its own Hadoop 2.x
>>> > version.  I think it would be healthier for the community to reduce
>>> > branch forking and spend energy on trunk to harden the software.  This will
>>> > give more confidence to move up the version than trying to fix n
>>> > permutations of breakage, like the Flash fixing the timeline.
>>> >
>>> > The Apache license states that there is no warranty of any kind for code
>>> > contributions.  Fewer community release processes should improve software
>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>> >
>>> > regards,
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> > Hello all,
>>> >
>>> > Is it written anywhere what the difference is between a minor release and a
>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> > looked around and I can't find anything other than some compatibility
>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>> > My current understanding is that we can't really break compatibility in
>>> > either a minor or point release. But the only mention of the difference
>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>> > Unstable tags, and how to deal with changing default configuration values.
>>> > So it seems like there really isn't a big official difference between the
>>> > two. In my mind, the functional difference between the two is that the
>>> > minor releases may have added features and rewrites, while the point
>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>> > that's what I have gathered from watching the releases over the last few
>>> > years. Whether or not this is a correct understanding, I think that this
>>> > needs to be documented somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are the
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>> > we don't live in a perfect world and committers will make mistakes either
>>> > because of lack of knowledge or simply because they made a mistake. If we
>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>> > we don't have a branch-2, committers who want their borderline risky
>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>> > a few assumptions here based on my own experiences, so I would like to hear
>>> > if others have similar or opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>> > to move forward, keeping as many companies on similar branches as possible
>>> > is a good way to make sure the code is well-tested. However, from a
>>> > stability point of view, moving to 3.x is still scary and being able to
>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>> > bridge release effort has been very good at making it possible for people
>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>> > potential performance degradation at large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>> > point releases within the same minor release except if the JVM version
>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>> > already unsupported since Oracle is no longer giving public Java 8
>>> > updates).
>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>> > catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether or
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> > not is committers not being on the same page with where they should commit
>>> > their patches.
>>> >
>>> > Eric
>>> >
>>> > [1]
>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2]
>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>> > I worry that some committers may want to put new features into a branch 2
>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>> > don't always catch corner cases or performance problems for some time
>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>> > cluster), it may be very difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>> > do have these reservations.
>>> >
>>> > Thanks,
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10 release the
>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>> > the picture right now, and many people may object to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>> > wrote:
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>> > outside of our company who expressed interest in getting new features to
>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>> > branch-2.10, so it's already diverged quite a bit.
>>> >
>>> > In any case, we can always reverse this decision if we really need to, by
>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>> > IMO.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>> > moving to 2.10 will take some time for the community. It seems premature to
>>> > make a decision at this point that there will never be a need for a 2.11
>>> > release.
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>> > release line in branch-2. Currently, the main issue is that there are many
>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>> > likely never see the light of day unless they are backported to
>>> > branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> > Thoughts?
>>> >
>>> > Jonathan Hung
>>> >
>>> > [1]
>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>
>>
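
As a rough illustration of the commit chains discussed in this thread, the
following Python sketch lists the branches a fix would have to be committed
to in order to reach a given release branch. The helper name
backport_targets and the two chain constants are hypothetical and not part
of any Hadoop tooling. The "before" chain, with branch-2 sitting between
branch-3.1 and branch-2.10, is an assumption inferred from the "5 vs 4
total branches" remark in the thread.

```python
# Illustrative sketch only: these names are hypothetical, not Hadoop tooling.

# Assumed chain while branch-2 still existed (inferred from "5 vs 4 branches").
CHAIN_BEFORE = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2", "branch-2.10", "branch-2.9", "branch-2.8",
]

# Chain reported in the thread after branch-2 was renamed to branch-2.10.
CHAIN_AFTER = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2.10", "branch-2.9", "branch-2.8",
]


def backport_targets(chain, oldest_branch):
    """Return the branches, newest first, that a fix passes through
    on its way from the head of the chain down to oldest_branch."""
    if oldest_branch not in chain:
        raise ValueError(f"{oldest_branch!r} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    for label, chain in (("before", CHAIN_BEFORE), ("after", CHAIN_AFTER)):
        targets = backport_targets(chain, "branch-2.10")
        print(label, len(targets), " -> ".join(targets))
```

Under these assumptions, a fix destined for 2.10.x touches five branches
with branch-2 present and four without it, which is the saving the
proposal is after.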

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes are being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanx for catching, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>>> > But the INFRA people closed as wont do and yes, the branch is
>>>>> protected,
>>>>> > we can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>>> remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 ->
>>>>> branch-3.1 ->
>>>>> >
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> please
>>>>> >
>>>>> > don't try to commit to it)
>>>>> >
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >
>>>>> >   - Delete old branch-2.10
>>>>> >
>>>>> >   - Rename branch-2 to (new) branch-2.10
>>>>> >
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> >
>>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>>> from
>>>>> >
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> >
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>>> minor
>>>>> >
>>>>> > 2.x release" confused people.
>>>>> >
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> >
>>>>> > actually
>>>>> >
>>>>> > proposes to release 2.11, which I understand nobody does at the
>>>>> moment.
>>>>> >
>>>>> >
>>>>> > So if anybody objects removing branch-2.10 please make an argument.
>>>>> >
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> >
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in
>>>>> sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>>> patch
>>>>> >
>>>>> > releases. As for putting features into minor/patch releases, if we
>>>>> >
>>>>> > keep the
>>>>> >
>>>>> > convention of putting new features only into minor releases, my
>>>>> >
>>>>> > assumption
>>>>> >
>>>>> > is still that it's unlikely people will want to get them into
>>>>> branch-2
>>>>> >
>>>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>>>> >
>>>>> > haven't
>>>>> >
>>>>> > even really removed support for java 7 in branch-2 (much less java
>>>>> 8),
>>>>> >
>>>>> > so I
>>>>> >
>>>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>>>> >
>>>>> > you
>>>>> >
>>>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>>>> >
>>>>> > always
>>>>> >
>>>>> > revive branch-2. But for now I think the convenience of not needing
>>>>> to
>>>>> >
>>>>> > port
>>>>> >
>>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> >
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> >
>>>>> > the same software and making improvements in trunk.  Some may be
>>>>> extra
>>>>> >
>>>>> > caution on moving up the version because obligation internally to
>>>>> keep
>>>>> >
>>>>> > things running.  Company obligation should not be the driving force
>>>>> to
>>>>> >
>>>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>>>> >
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> >
>>>>> > version.  I think it would be more healthy for the community to
>>>>> >
>>>>> > reduce the
>>>>> >
>>>>> > branch forking and spend energy on trunk to harden the software.
>>>>> >
>>>>> > This will
>>>>> >
>>>>> > give more confidence to move up the version than trying to fix n
>>>>> >
>>>>> > permutations breakage like Flash fixing the timeline.
>>>>> >
>>>>> >
>>>>> > Apache license stated, there is no warranty of any kind for code
>>>>> >
>>>>> > contributions.  Fewer community release process should improve
>>>>> >
>>>>> > software
>>>>> >
>>>>> > quality when eyes are on trunk, and help steering toward the same end
>>>>> >
>>>>> > goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> >
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> >
>>>>> > and a
>>>>> >
>>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> >
>>>>> > have
>>>>> >
>>>>> > looked around and I can't find anything other than some compatibility
>>>>> >
>>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> >
>>>>> > think
>>>>> >
>>>>> > this would help shape my opinion on whether or not to keep branch-2
>>>>> >
>>>>> > alive.
>>>>> >
>>>>> > My current understanding is that we can't really break compatibility
>>>>> >
>>>>> > in
>>>>> >
>>>>> > either a minor or point release. But the only mention of the
>>>>> >
>>>>> > difference
>>>>> >
>>>>> > between minor and point releases is how to deal with Stable,
>>>>> >
>>>>> > Evolving,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > Unstable tags, and how to deal with changing default configuration
>>>>> >
>>>>> > values.
>>>>> >
>>>>> > So it seems like there really isn't a big official difference between
>>>>> >
>>>>> > the
>>>>> >
>>>>> > two. In my mind, the functional difference between the two is that
>>>>> >
>>>>> > the
>>>>> >
>>>>> > minor releases may have added features and rewrites, while the point
>>>>> >
>>>>> > releases only have bug fixes. This might be an incorrect
>>>>> >
>>>>> > understanding, but
>>>>> >
>>>>> > that's what I have gathered from watching the releases over the last
>>>>> >
>>>>> > few
>>>>> >
>>>>> > years. Whether or not this is a correct understanding, I think that
>>>>> >
>>>>> > this
>>>>> >
>>>>> > needs to be documented somewhere, even if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are
>>>>> >
>>>>> > the
>>>>> >
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> >
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> >
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>>>> >
>>>>> > into
>>>>> >
>>>>> > 2.10.x
>>>>> >
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > Cons:
>>>>> >
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> >
>>>>> > - An extra branch to maintain
>>>>> >
>>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >
>>>>> > patches to if they should go all the way back to 2.10.x
>>>>> >
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> >
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> >
>>>>> > being
>>>>> >
>>>>> > committed. In a perfect world, we wouldn't have to make this
>>>>> >
>>>>> > tradeoff.
>>>>> >
>>>>> > But
>>>>> >
>>>>> > we don't live in a perfect world and committers will make mistakes
>>>>> >
>>>>> > either
>>>>> >
>>>>> > because of lack of knowledge or simply because they made a mistake.
>>>>> >
>>>>> > If
>>>>> >
>>>>> > we
>>>>> >
>>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>>> >
>>>>> > to
>>>>> >
>>>>> > (for
>>>>> >
>>>>> > whatever reason) commit valid bug fixes back all the way to
>>>>> >
>>>>> > branch-2.10. If
>>>>> >
>>>>> > we don't have a branch-2, committers who want their borderline risky
>>>>> >
>>>>> > feature in the 2.x line will err on the side of putting it into
>>>>> >
>>>>> > branch-2.10
>>>>> >
>>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>>> >
>>>>> > quite
>>>>> >
>>>>> > a few assumptions here based on my own experiences, so I would like
>>>>> >
>>>>> > to
>>>>> >
>>>>> > hear
>>>>> >
>>>>> > if others have similar or opposing views.
>>>>> >
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> >
>>>>> > killing
>>>>> >
>>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>>> >
>>>>> > is why
>>>>> >
>>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>>> >
>>>>> > trying
>>>>> >
>>>>> > to move forward, keeping as many companies on similar branches as
>>>>> >
>>>>> > possible
>>>>> >
>>>>> > is a good way to make sure the code is well-tested. However, from a
>>>>> >
>>>>> > stability point of view, moving to 3.x is still scary and being able
>>>>> >
>>>>> > to
>>>>> >
>>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>>> >
>>>>> > 2.10.0
>>>>> >
>>>>> > bridge release effort has been very good at making it possible for
>>>>> >
>>>>> > people to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>>> > that it is reasonable for companies to want to be extra cautious with 3.x
>>>>> > due to potential performance degradation at large scale.
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>>>> > point releases within the same minor release except if the JVM version
>>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>>> > updates).
>>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> > catalyst for a branch-2 revival?
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether or
>>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>>> > not is committers not being on the same page with where they should commit
>>>>> > their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>>> > I worry that some committers may want to put new features into a branch 2
>>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>>> > don't always catch corner cases or performance problems for some time
>>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>>> > cluster), it may be very difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>>> > do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>>> > outside of our company who expressed interest in getting new features to
>>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
>>>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to, by
>>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>>> > IMO.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>>> > moving to 2.10 will take some time for the community. It seems premature to
>>>>> > make a decision at this point that there will never be a need for a 2.11
>>>>> > release.
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> > <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>>> > release line in branch-2. Currently, the main issue is that there are many
>>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>>> > likely never see the light of day unless they are backported to
>>>>> > branch-2.10.
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>>
>>>>
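
The commit chain in the proposal quoted above can be read as an ordered
list: a fix lands on trunk first and is then cherry-picked down the chain
until it reaches the oldest release line that needs it. The following is a
minimal sketch under that assumption. COMMIT_CHAIN and backport_targets are
hypothetical names used only for illustration; they are not part of any
Hadoop tooling.

```python
# Hypothetical sketch: models the commit chain from the proposal as a list.
# The names below are illustrative and not part of any Hadoop tooling.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return every branch a fix must be committed to, newest first,
    so that it reaches `oldest_branch` without skipping a release line."""
    if oldest_branch not in chain:
        raise ValueError(f"{oldest_branch!r} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    # Branches touched by a fix that should go back as far as branch-2.10.
    print(backport_targets("branch-2.10"))
```

With the old layout, where branch-2 sat between branch-3.1 and branch-2.10,
the same call returns five branches instead of four, which is the extra
commit target discussed earlier in the thread.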

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes are being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanx for catching, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>>> > But the INFRA people closed as wont do and yes, the branch is protected,
>>>>> > we can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>>> > don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>>> > 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>>> >
>>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>>> > convention of putting new features only into minor releases, my assumption
>>>>> > is still that it's unlikely people will want to get them into branch-2
>>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> > the same software and make improvements in trunk. Some may be extra
>>>>> > cautious about moving up the version because of internal obligations to
>>>>> > keep things running. Company obligation should not be the driving force to
>>>>> > maintain Hadoop branches. There is no proper collaboration in the
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> > version. I think it would be more healthy for the community to reduce the
>>>>> > branch forking and spend energy on trunk to harden the software. This will
>>>>> > give more confidence to move up the version than trying to fix n
>>>>> > permutations of breakage like Flash fixing the timeline.
>>>>> >
>>>>> > The Apache license states that there is no warranty of any kind for code
>>>>> > contributions. Fewer community release processes should improve software
>>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>>> > goals.
>>>>> >
>>>>> > regards,
>>>>> > Eric
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release and a
>>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>>>> > looked around and I can't find anything other than some compatibility
>>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>>> > My current understanding is that we can't really break compatibility in
>>>>> > either a minor or point release. But the only mention of the difference
>>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>>> > So it seems like there really isn't a big official difference between the
>>>>> > two. In my mind, the functional difference between the two is that the
>>>>> > minor releases may have added features and rewrites, while the point
>>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>>> > that's what I have gathered from watching the releases over the last few
>>>>> > years. Whether or not this is a correct understanding, I think that this
>>>>> > needs to be documented somewhere, even if it is just a convention.
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >     patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>>> > we don't live in a perfect world and committers will make mistakes either
>>>>> > because of lack of knowledge or simply because they made a mistake. If we
>>>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>>> > we don't have a branch-2, committers who want their borderline risky
>>>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>>>> > a few assumptions here based on my own experiences, so I would like to hear
>>>>> > if others have similar or opposing views.
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>>>> > to move forward, keeping as many companies on similar branches as possible
>>>>> > is a good way to make sure the code is well-tested. However, from a
>>>>> > stability point of view, moving to 3.x is still scary and being able to
>>>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> > bridge release effort has been very good at making it possible for people
>>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>>> > potential performance degradation at large scale.
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>>>> > point releases within the same minor release except if the JVM version
>>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>>> > updates).
>>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> > catalyst for a branch-2 revival?
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether or
>>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>>> > not is committers not being on the same page with where they should commit
>>>>> > their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>>> > I worry that some committers may want to put new features into a branch 2
>>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>>> > don't always catch corner cases or performance problems for some time
>>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>>> > cluster), it may be very difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>>> > do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>>> > outside of our company who expressed interest in getting new features to
>>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
>>>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to, by
>>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>>> > IMO.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>>> > moving to 2.10 will take some time for the community. It seems premature to
>>>>> > make a decision at this point that there will never be a need for a 2.11
>>>>> > release.
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> > <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>>> > release line in branch-2. Currently, the main issue is that there are many
>>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>>> > likely never see the light of day unless they are backported to
>>>>> > branch-2.10.
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>>
>>>>
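
The first step of the completed procedure quoted above (verifying that
everything in the old branch-2.10 was also in the old branch-2) and the later
cherry-pick of commits missed in branch-2.10 both come down to comparing the
commits of two branches. Below is a minimal sketch of such a comparison,
assuming that commit subjects contain a JIRA issue ID and that the subject
lists were exported beforehand, for example with `git log --format=%s
<branch>`. issue_ids and missing_from are hypothetical helper names, and the
sample subjects are placeholders rather than real commits.

```python
import re

# Assumption: each commit subject contains a JIRA issue ID such as "HADOOP-16988".
JIRA_ID = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def issue_ids(subjects):
    """Collect the first JIRA issue ID found in each commit subject."""
    ids = set()
    for subject in subjects:
        match = JIRA_ID.search(subject)
        if match:
            ids.add(match.group(0))
    return ids


def missing_from(source_subjects, target_subjects):
    """Return issue IDs present in the source branch but absent from the target."""
    return sorted(issue_ids(source_subjects) - issue_ids(target_subjects))


if __name__ == "__main__":
    # Placeholder subjects for illustration only, not real commits.
    old_branch_2_10 = ["HADOOP-1. Example fix.", "YARN-2. Another example fix."]
    old_branch_2 = old_branch_2_10 + ["HDFS-3. Fix present only in branch-2."]

    # An empty list means every branch-2.10 issue is also in branch-2.
    print(missing_from(old_branch_2_10, old_branch_2))
    # Issues that would still need a cherry-pick into branch-2.10.
    print(missing_from(old_branch_2, old_branch_2_10))
```

Matching by issue ID is only a first pass: it does not account for reverts,
addendum commits, or commits without an issue ID, so a manual review of the
result would still be needed.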

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes are being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanx for catching, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>>> > But the INFRA people closed as wont do and yes, the branch is
>>>>> protected,
>>>>> > we can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>>> remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 ->
>>>>> branch-3.1 ->
>>>>> >
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> please
>>>>> >
>>>>> > don't try to commit to it)
>>>>> >
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >
>>>>> >   - Delete old branch-2.10
>>>>> >
>>>>> >   - Rename branch-2 to (new) branch-2.10
>>>>> >
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> >
>>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>>> from
>>>>> >
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> >
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>>> minor
>>>>> >
>>>>> > 2.x release" confused people.
>>>>> >
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> >
>>>>> > actually
>>>>> >
>>>>> > proposes to release 2.11, which I understand nobody does at the
>>>>> moment.
>>>>> >
>>>>> >
>>>>> > So if anybody objects removing branch-2.10 please make an argument.
>>>>> >
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> >
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in
>>>>> sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>>> > convention of putting new features only into minor releases, my assumption
>>>>> > is still that it's unlikely people will want to get them into branch-2
>>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> > the same software and make improvements in trunk.  Some may be extra
>>>>> > cautious about moving up the version because of internal obligations to
>>>>> > keep things running.  Company obligation should not be the driving force
>>>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> > version.  I think it would be more healthy for the community to reduce the
>>>>> > branch forking and spend energy on trunk to harden the software.  This will
>>>>> > give more confidence to move up the version than trying to fix n
>>>>> > permutations of breakage, like Flash fixing the timeline.
>>>>> >
>>>>> > The Apache license states that there is no warranty of any kind for code
>>>>> > contributions.  Fewer community release processes should improve software
>>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>>> > goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release and a
>>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>>>> > looked around and I can't find anything other than some compatibility
>>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>>> > My current understanding is that we can't really break compatibility in
>>>>> > either a minor or point release. But the only mention of the difference
>>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>>> > So it seems like there really isn't a big official difference between the
>>>>> > two. In my mind, the functional difference between the two is that the
>>>>> > minor releases may have added features and rewrites, while the point
>>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>>> > that's what I have gathered from watching the releases over the last few
>>>>> > years. Whether or not this is a correct understanding, I think that this
>>>>> > needs to be documented somewhere, even if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >     patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>>> > we don't live in a perfect world and committers will make mistakes either
>>>>> > because of lack of knowledge or simply because they made a mistake. If we
>>>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>>> > we don't have a branch-2, committers who want their borderline risky
>>>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>>>> > a few assumptions here based on my own experiences, so I would like to hear
>>>>> > if others have similar or opposing views.
>>>>> >
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>>>> > to move forward, keeping as many companies on similar branches as possible
>>>>> > is a good way to make sure the code is well-tested. However, from a
>>>>> > stability point of view, moving to 3.x is still scary and being able to
>>>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> > bridge release effort has been very good at making it possible for people
>>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>>> > potential performance degradation at large scale.
>>>>> >
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>>>> > point releases within the same minor release except if the JVM version
>>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>>> > updates).
>>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> > catalyst for a branch-2 revival?
>>>>> >
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether or
>>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>>> > not is committers not being on the same page with where they should commit
>>>>> > their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org> wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>>> > I worry that some committers may want to put new features into a branch 2
>>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>>> > don't always catch corner cases or performance problems for some time
>>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>>> > cluster), it may be very difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>>> > do have these reservations.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>>> > outside of our company who expressed interest in getting new features to
>>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
>>>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to, by
>>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>>> > IMO.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org> wrote:
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>>> > moving to 2.10 will take some time for the community. It seems premature to
>>>>> > make a decision at this point that there will never be a need for a 2.11
>>>>> > release.
>>>>> >
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>>> > release line in branch-2. Currently, the main issue is that there are many
>>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>>> > likely never see the light of day unless they are backported to
>>>>> > branch-2.10.
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>>
>>>>
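
For readers following the proposal quoted above, the commit chain (trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8) can be read as an ordered list: a fix aimed at an older line is committed to each newer branch first. The Python sketch below illustrates that reading only. The names COMMIT_CHAIN and backport_targets are hypothetical and not part of any Hadoop tooling, and the assumption that every intermediate branch receives the commit is inferred from the discussion rather than stated as policy.

```python
# Commit chain as quoted in the thread, newest branch first.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return the branches a fix passes through, from trunk down to oldest_branch.

    Assumption (not confirmed by the thread): a fix is committed to every
    branch in the chain between trunk and the oldest target, inclusive.
    """
    if oldest_branch not in chain:
        # For example "branch-2", which no longer exists after the rename.
        raise ValueError(f"{oldest_branch!r} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(" -> ".join(backport_targets("branch-2.10")))
```

Under these assumptions, backport_targets("branch-2.10") yields four branches, which is consistent with the "5 vs 4 total branches" count mentioned in the thread once branch-2 is out of the chain.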

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung
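
The reply quoted below mentions cherry-picking the commits in branch-2 that were missed in branch-2.10. One way such a comparison could be approximated is to match commits by the JIRA key in their subject lines. This is a minimal sketch, assuming subjects carry a key from the HADOOP, YARN, HDFS, or MAPREDUCE projects named in this thread; the function names and the sample subjects are hypothetical, and the thread does not say which method was actually used.

```python
import re

# Assumption: each commit subject contains a JIRA key such as "HADOOP-16988".
JIRA_KEY = re.compile(r"\b(?:HADOOP|YARN|HDFS|MAPREDUCE)-\d+\b")


def jira_key(subject):
    """Return the first JIRA key found in a commit subject, or None."""
    match = JIRA_KEY.search(subject)
    return match.group(0) if match else None


def missed_commits(source_subjects, target_subjects):
    """List source subjects whose JIRA key does not appear in the target branch.

    Subjects without a recognizable key are skipped, since they cannot be
    matched this way and would need a manual check.
    """
    present = {jira_key(s) for s in target_subjects}
    present.discard(None)
    missed = []
    for subject in source_subjects:
        key = jira_key(subject)
        if key is not None and key not in present:
            missed.append(subject)
    return missed


if __name__ == "__main__":
    # Hypothetical subjects, made up for illustration only.
    branch_2 = ["HADOOP-90001. Example fix A.", "YARN-90002. Example fix B."]
    branch_2_10 = ["HADOOP-90001. Example fix A."]
    for subject in missed_commits(branch_2, branch_2_10):
        print(subject)
```

In practice the two subject lists could come from something like `git log --format=%s` run on each branch. Matching by key is only a heuristic: it will not notice a commit whose subject was edited during the backport.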


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes are being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanx for catching, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > Seems there was an INFRA ticket for that, INFRA-19581,
>>>>> > but the INFRA people closed it as Won't Do, and yes, the branch is
>>>>> > protected; we can't delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>>> > it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>>> > don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>>> > 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>>> >
>>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>>> > convention of putting new features only into minor releases, my assumption
>>>>> > is still that it's unlikely people will want to get them into branch-2
>>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> > the same software and make improvements in trunk.  Some may be extra
>>>>> > cautious about moving up the version because of internal obligations to
>>>>> > keep things running.  Company obligation should not be the driving force
>>>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> > version.  I think it would be more healthy for the community to reduce the
>>>>> > branch forking and spend energy on trunk to harden the software.  This will
>>>>> > give more confidence to move up the version than trying to fix n
>>>>> > permutations of breakage, like Flash fixing the timeline.
>>>>> >
>>>>> > The Apache license states that there is no warranty of any kind for code
>>>>> > contributions.  Fewer community release processes should improve software
>>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>>> > goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release and a
>>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>>>> > looked around and I can't find anything other than some compatibility
>>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>>> > My current understanding is that we can't really break compatibility in
>>>>> > either a minor or point release. But the only mention of the difference
>>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>>> > So it seems like there really isn't a big official difference between the
>>>>> > two. In my mind, the functional difference between the two is that the
>>>>> > minor releases may have added features and rewrites, while the point
>>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>>> > that's what I have gathered from watching the releases over the last few
>>>>> > years. Whether or not this is a correct understanding, I think that this
>>>>> > needs to be documented somewhere, even if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >     patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>>> > we don't live in a perfect world and committers will make mistakes either
>>>>> > because of lack of knowledge or simply because they made a mistake. If we
>>>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>>> > we don't have a branch-2, committers who want their borderline risky
>>>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>>>> > a few assumptions here based on my own experiences, so I would like to hear
>>>>> > if others have similar or opposing views.
>>>>> >
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>>>> > to move forward, keeping as many companies on similar branches as possible
>>>>> > is a good way to make sure the code is well-tested. However, from a
>>>>> > stability point of view, moving to 3.x is still scary and being able to
>>>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> > bridge release effort has been very good at making it possible for people
>>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>>> > potential performance degradation at large scale.
>>>>> >
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>>>> > point releases within the same minor release except if the JVM version
>>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>>> > updates).
>>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> > catalyst for a branch-2 revival?
>>>>> >
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether or
>>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>>> > not is committers not being on the same page with where they should commit
>>>>> > their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org> wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>>> > I worry that some committers may want to put new features into a branch 2
>>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>>> > don't always catch corner cases or performance problems for some time
>>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>>> > cluster), it may be very difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>>> > do have these reservations.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>>> > outside of our company who expressed interest in getting new features to
>>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
>>>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to, by
>>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>>> > IMO.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>>> > moving to 2.10 will take some time for the community. It seems premature to
>>>>> > make a decision at this point that there will never be a need for a 2.11
>>>>> > release.
>>>>> >
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> > <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>>> > release line in branch-2. Currently, the main issue is that there are many
>>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>>> > likely never see the light of day unless they are backported to
>>>>> > branch-2.10.
>>>>> >
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> > - Rename branch-2 to branch-2.10
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> >
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> >
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.
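
(For readers of the archive: a minimal sketch of how commits missing from
branch-2.10 might be identified. This is not necessarily how it was done
here. It assumes each commit subject carries its JIRA key, e.g.
HADOOP-16988, and that the subjects of each branch were exported with
something like `git log --format=%s <branch>`. The function names
`jira_key` and `missing_backports` are hypothetical.)

```python
import re

# JIRA projects named in this thread; the key format is an assumption
# about the commit-subject convention, not something git enforces.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_key(subject):
    """Return the first JIRA key found in a commit subject, or None."""
    match = JIRA_KEY.search(subject)
    return match.group(0) if match else None


def missing_backports(source_subjects, target_subjects):
    """List JIRA keys present in the source branch but absent from the target.

    Cherry-picked commits get new hashes, so the comparison is by JIRA key
    rather than by commit id. Order follows the source list; duplicates
    (e.g. addendum commits) are reported once.
    """
    in_target = {jira_key(s) for s in target_subjects}
    missing = []
    for subject in source_subjects:
        key = jira_key(subject)
        if key and key not in in_target and key not in missing:
            missing.append(key)
    return missing


if __name__ == "__main__":
    branch_2 = [
        "HADOOP-1. Fix a",
        "YARN-2. Fix b",
        "HDFS-3. Fix c",
        "HADOOP-1. Addendum",
    ]
    branch_2_10 = ["YARN-2. Fix b"]
    print(missing_backports(branch_2, branch_2_10))  # ['HADOOP-1', 'HDFS-3']
```

Matching by key ignores reverts and partial backports, so the resulting
list is only a starting point for manual review before cherry-picking.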

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes are being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanks for catching that, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that, INFRA-19581,
>>>> > but the INFRA people closed it as won't do and yes, the branch is
>>>> > protected, so we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> > wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>> > it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>> > don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>> > convention of putting new features only into minor releases, my assumption
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software would become more compatible when more companies stress test
>>>> > the same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running.  Company obligations should not be the driving force
>>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> > version.  I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software.  This will
>>>> > give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage, like the Flash fixing the timeline.
>>>> >
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions.  Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release and a
>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>>> > looked around and I can't find anything other than some compatibility
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>> > My current understanding is that we can't really break compatibility in
>>>> > either a minor or point release. But the only mention of the difference
>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>> > So it seems like there really isn't a big official difference between the
>>>> > two. In my mind, the functional difference between the two is that the
>>>> > minor releases may have added features and rewrites, while the point
>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>> > that's what I have gathered from watching the releases over the last few
>>>> > years. Whether or not this is a correct understanding, I think that this
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >    patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>> > we don't live in a perfect world and committers will make mistakes either
>>>> > because of lack of knowledge or simply because they made a mistake. If we
>>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>>> > a few assumptions here based on my own experiences, so I would like to hear
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>>> > to move forward, keeping as many companies on similar branches as possible
>>>> > is a good way to make sure the code is well-tested. However, from a
>>>> > stability point of view, moving to 3.x is still scary and being able to
>>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>> > bridge release effort has been very good at making it possible for people
>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>> > potential performance degradation at large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>>> > point releases within the same minor release except if the JVM version
>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>>> > already unsupported since Oracle is no longer giving public Java 8 updates).
>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> > catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on whether or
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>>> > not is committers not being on the same page with where they should commit
>>>> > their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1]
>>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> > [2]
>>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>>> > I worry that some committers may want to put new features into a branch 2
>>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>>> > don't always catch corner cases or performance problems for some time
>>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>>> > cluster), it may be very difficult to back out those changes.
>>>> >
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>>> > do have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release the
>>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in
>>>> > the picture right now, and many people may object to this idea.
>>>> >
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>>> > outside of our company who expressed interest in getting new features to
>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>>> > branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to, by
>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>> > IMO.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature to
>>>> > make a decision at this point that there will never be a need for a 2.11
>>>> > release.
>>>> >
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hi folks,
>>>> >
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>>> > release line in branch-2. Currently, the main issue is that there are many
>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
>>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>> > likely never see the light of day unless they are backported to
>>>> > branch-2.10.
>>>> >
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> >
>>>> > Thoughts?
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> >
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> >
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes are being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanx for catching, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that, INFRA-19581,
>>>> > but the INFRA people closed it as won't do and yes, the branch is
>>>> > protected, so we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> > wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>> > it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>> > don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>> >
>>>> > shv.hadoop@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep the
>>>> > convention of putting new features only into minor releases, my assumption
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>>> > even really removed support for java 7 in branch-2 (much less java 8), so I
>>>> > feel moving to java 11 would go along with a move to branch 3. And as you
>>>> > mentioned, if people really want to use java 11 on branch-2, we can always
>>>> > revive branch-2. But for now I think the convenience of not needing to port
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software would become more compatible when more companies stress test
>>>> > the same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running.  Company obligations should not be the driving force
>>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> > version.  I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software.  This will
>>>> > give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage, like the Flash fixing the timeline.
>>>> >
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions.  Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> >
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release and a
>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>>> > looked around and I can't find anything other than some compatibility
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>>> > My current understanding is that we can't really break compatibility in
>>>> > either a minor or point release. But the only mention of the difference
>>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>>> > Unstable tags, and how to deal with changing default configuration values.
>>>> > So it seems like there really isn't a big official difference between the
>>>> > two. In my mind, the functional difference between the two is that the
>>>> > minor releases may have added features and rewrites, while the point
>>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>>> > that's what I have gathered from watching the releases over the last few
>>>> > years. Whether or not this is a correct understanding, I think that this
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >    patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>>> > we don't live in a perfect world and committers will make mistakes either
>>>> > because of lack of knowledge or simply because they made a mistake. If we
>>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>>> > a few assumptions here based on my own experiences, so I would like to hear
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>> > community trying to move forward, keeping as many companies on similar
>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>> > extra cautious with 3.x due to potential performance degradation at
>>>> > large scale.
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>> > not change across point releases within the same minor release except if
>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>> > could argue that it is already unsupported since Oracle is no longer
>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>> > or not is committers not being on the same page with where they should
>>>> > commit their patches.
>>>> >
>>>> > Eric
>>>> >
>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> > least. I worry that some committers may want to put new features into a
>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>> > for some time (usually not until the release is deployed to a busy,
>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>> > changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>> > I do have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>> > is not in the picture right now, and many people may object to this
>>>> > idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people outside of our company who expressed interest in getting new
>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>> > after 2.10.0 release, there's 29 patches that have gone into branch-2
>>>> > and 9 in branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to,
>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>> > confusion IMO.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>> > to make a decision at this point that there will never be a need for a
>>>> > 2.11 release.
>>>> >
>>>> > -Eric
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> > Hi folks,
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that
>>>> > there's many fixes going into branch-2 (the theoretical 2.11.0) that's
>>>> > not going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>> > branch-2 will likely never see the light of day unless they are
>>>> > backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> > Thoughts?
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung
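
As an illustration only (it is not part of anyone's email in this thread), the
commit chain quoted in this message, trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8, can be modelled as an ordered list.
The sketch below assumes a fix lands on trunk first and is then cherry-picked
down the chain without skipping a branch; the name backport_targets is
hypothetical and exists only for this example.

```python
# Commit chain as stated in this thread after the rename (newest to oldest).
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return every branch a fix must land on to reach `oldest_branch`.

    Hypothetical helper: assumes a fix is committed to trunk first and
    then cherry-picked down the chain without skipping a branch.
    """
    if oldest_branch not in chain:
        # branch-2 is no longer in the chain, so it is rejected here.
        raise ValueError(f"{oldest_branch!r} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(" -> ".join(backport_targets("branch-2.10")))
```

Asking for "branch-2" raises an error in this sketch, which mirrors the request
in the thread not to commit to that branch any more.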


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes are being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanx for catching, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>> > people closed it as won't-do and yes, the branch is protected, so we
>>>> > can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>> > it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> > please don't try to commit to it)
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Delete old branch-2.10
>>>> >   - Rename branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hey guys,
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody
>>>> > actually proposes to release 2.11, which I understand nobody does at the
>>>> > moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>> > the convention of putting new features only into minor releases, my
>>>> > assumption is still that it's unlikely people will want to get them into
>>>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>>>> > we haven't even really removed support for java 7 in branch-2 (much less
>>>> > java 8), so I feel moving to java 11 would go along with a move to
>>>> > branch 3. And as you mentioned, if people really want to use java 11 on
>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> > Software would become more compatible when more companies stress test
>>>> > the same software and make improvements in trunk. Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running. Company obligation should not be the driving force
>>>> > to maintain Hadoop branches. There is no proper collaboration in the
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> > version. I think it would be more healthy for the community to reduce
>>>> > the branch forking and spend energy on trunk to harden the software.
>>>> > This will give more confidence to move up the version than trying to fix
>>>> > n permutations of breakage like Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions. Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> > regards,
>>>> > Eric
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> > Hello all,
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release
>>>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>> > I have looked around and I can't find anything other than some
>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>> > keep branch-2 alive. My current understanding is that we can't really
>>>> > break compatibility in either a minor or point release. But the only
>>>> > mention of the difference between minor and point releases is how to
>>>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>> > changing default configuration values. So it seems like there really
>>>> > isn't a big official difference between the two. In my mind, the
>>>> > functional difference between the two is that the minor releases may
>>>> > have added features and rewrites, while the point releases only have bug
>>>> > fixes. This might be an incorrect understanding, but that's what I have
>>>> > gathered from watching the releases over the last few years. Whether or
>>>> > not this is a correct understanding, I think that this needs to be
>>>> > documented somewhere, even if it is just a convention.
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >     patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>> > But we don't live in a perfect world and committers will make mistakes
>>>> > either because of lack of knowledge or simply because they made a
>>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>>> > their borderline risky feature in the 2.x line will err on the side of
>>>> > putting it into branch-2.10 instead of proposing the creation of a
>>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>>> > own experiences, so I would like to hear if others have similar or
>>>> > opposing views.
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>> > community trying to move forward, keeping as many companies on similar
>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>> > extra cautious with 3.x due to potential performance degradation at
>>>> > large scale.
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>> > not change across point releases within the same minor release except if
>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>> > could argue that it is already unsupported since Oracle is no longer
>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>> > or not is committers not being on the same page with where they should
>>>> > commit their patches.
>>>> >
>>>> > Eric
>>>> >
>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> > least. I worry that some committers may want to put new features into a
>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>> > for some time (usually not until the release is deployed to a busy,
>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>> > changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>> > I do have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>> > is not in the picture right now, and many people may object to this
>>>> > idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people outside of our company who expressed interest in getting new
>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>> > after 2.10.0 release, there's 29 patches that have gone into branch-2
>>>> > and 9 in branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to,
>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>> > confusion IMO.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>> > to make a decision at this point that there will never be a need for a
>>>> > 2.11 release.
>>>> >
>>>> > -Eric
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> > Hi folks,
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that
>>>> > there's many fixes going into branch-2 (the theoretical 2.11.0) that's
>>>> > not going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>> > branch-2 will likely never see the light of day unless they are
>>>> > backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> > Thoughts?
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes are being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanx for catching, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>>> > But the INFRA people closed as wont do and yes, the branch is
>>>> protected,
>>>> > we can’t delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> remove
>>>> > it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>>> ->
>>>> >
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> please
>>>> >
>>>> > don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >
>>>> >   - Delete old branch-2.10
>>>> >
>>>> >   - Rename branch-2 to (new) branch-2.10
>>>> >
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>> >
>>>> > shv.hadoop@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10 and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread, "Making 2.10 the last minor
>>>> > 2.x release", confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10, please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>> > the convention of putting new features only into minor releases, my
>>>> > assumption is still that it's unlikely people will want to get them into
>>>> > branch-2 (based on the 2.10.0 release process). For the Java 11 issue, we
>>>> > haven't even really removed support for Java 7 in branch-2 (much less
>>>> > Java 8), so I feel moving to Java 11 would go along with a move to
>>>> > branch 3. And as you mentioned, if people really want to use Java 11 on
>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software becomes more compatible when more companies stress-test the
>>>> > same software and make improvements in trunk. Some may be extra cautious
>>>> > about moving up a version because of internal obligations to keep things
>>>> > running. Company obligations should not be the driving force for
>>>> > maintaining Hadoop branches. There is no proper collaboration in the
>>>> > community when every name-brand company maintains its own Hadoop 2.x
>>>> > version. I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software. This
>>>> > will give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage, like Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions. Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> >
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release
>>>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>> > I have looked around and I can't find anything other than some
>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>> > keep branch-2 alive. My current understanding is that we can't really
>>>> > break compatibility in either a minor or point release. But the only
>>>> > mention of the difference between minor and point releases is how to
>>>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>> > changing default configuration values. So it seems like there really
>>>> > isn't a big official difference between the two. In my mind, the
>>>> > functional difference between the two is that the minor releases may
>>>> > have added features and rewrites, while the point releases only have bug
>>>> > fixes. This might be an incorrect understanding, but that's what I have
>>>> > gathered from watching the releases over the last few years. Whether or
>>>> > not this is a correct understanding, I think that this needs to be
>>>> > documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> >
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > Cons:
>>>> >
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >     patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>> > But we don't live in a perfect world and committers will make mistakes,
>>>> > either because of lack of knowledge or simply through oversight. If we
>>>> > have a branch-2, committers will forget, not know to, or choose not to
>>>> > (for whatever reason) commit valid bug fixes back all the way to
>>>> > branch-2.10. If we don't have a branch-2, committers who want their
>>>> > borderline risky feature in the 2.x line will err on the side of putting
>>>> > it into branch-2.10 instead of proposing the creation of a branch-2.
>>>> > Clearly I have made quite a few assumptions here based on my own
>>>> > experiences, so I would like to hear if others have similar or opposing
>>>> > views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>> > community trying to move forward, keeping as many companies on similar
>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>> > extra cautious with 3.x due to potential performance degradation at
>>>> > large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>> > not change across point releases within the same minor release except if
>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>> > could argue that it is already unsupported since Oracle is no longer
>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>> > or not is committers not being on the same page with where they should
>>>> > commit their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org> wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> > least. I worry that some committers may want to put new features into a
>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>> > for some time (usually not until the release is deployed to a busy,
>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>> > changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>> > I do have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>> > is not in the picture right now, and many people may object to this
>>>> > idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people outside of our company who expressed interest in getting new
>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>>> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to,
>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>> > confusion IMO.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org> wrote:
>>>> >
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>> > to make a decision at this point that there will never be a need for a
>>>> > 2.11 release.
>>>> >
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hi folks,
>>>> >
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>> > branch-2 will likely never see the light of day unless they are
>>>> > backported to branch-2.10.
>>>> >
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> >
>>>> > - Delete branch-2.10
>>>> >
>>>> > - Rename branch-2 to branch-2.10
>>>> >
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
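
To make the commit chain concrete, here is a minimal sketch of how a committer might work out which branches a back-port has to land on. It assumes the usual reading of the chain, namely that a fix reaching an older branch must also be on every newer branch before it. The names `COMMIT_CHAIN` and `branches_to_commit` are hypothetical and not part of any Hadoop tooling.

```python
from typing import List

# Newest to oldest, in the order given in the proposal.
COMMIT_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
                "branch-2.10", "branch-2.9", "branch-2.8"]


def branches_to_commit(oldest_target: str,
                       chain: List[str] = COMMIT_CHAIN) -> List[str]:
    """Return every branch a fix must land on to reach `oldest_target`
    without skipping a newer release line."""
    if oldest_target not in chain:
        # branch-2 is deliberately absent once it has been renamed.
        raise ValueError(f"{oldest_target} is not in the commit chain")
    return chain[: chain.index(oldest_target) + 1]


print(branches_to_commit("branch-2.10"))
# -> ['trunk', 'branch-3.2', 'branch-3.1', 'branch-2.10']
```

With branch-2 removed from the chain, a fix aimed at the 2.x line needs four commits rather than five, which is the saving the proposal is after.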
>>>> >
>>>> >
>>>> > Thoughts?
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>