Posted to yarn-dev@hadoop.apache.org by Jonathan Hung <jy...@gmail.com> on 2019/11/15 02:51:40 UTC

[DISCUSS] Making 2.10 the last minor 2.x release

Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge
release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
release line in branch-2. Currently, the main issue is that there are many
fixes going into branch-2 (the theoretical 2.11.0) that are not going into
branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

   - Delete branch-2.10
   - Rename branch-2 to branch-2.10
   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release
line. Then the commit chain will look like: trunk -> branch-3.2 ->
branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
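
As a rough illustration of what the proposed commit chain means for a
committer, here is a minimal Python sketch. The branch names are taken from
the message above; the helper name `branches_to_commit` is hypothetical, and
the sketch assumes a fix always lands on trunk first and is then applied to
each older branch in order.

```python
# Branch order as proposed in the thread (newest first).
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def branches_to_commit(oldest_target, chain=COMMIT_CHAIN):
    """Return every branch from trunk down to oldest_target, in commit order.

    Hypothetical helper: it only models the ordering, it does not run git.
    """
    if oldest_target not in chain:
        raise ValueError(f"unknown branch: {oldest_target}")
    return chain[: chain.index(oldest_target) + 1]


# A fix that should reach the 2.10.x line touches four branches.
print(branches_to_commit("branch-2.10"))
```

With branch-2 removed from the chain, asking for "branch-2" raises a
ValueError, which matches the intent that nobody commits there any more.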

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Erik Krogen <xk...@apache.org>.
I'm in support of this. The current scheme is confusing, and as you
mentioned, is making the backport strategy less clear. It reminds me of the
branch-2.8 vs. branch-2 (destined for 2.9) days when various fixes would
make it into one or the other.

One other action item would be to do a quick verification that the new
branch-2.10 (current branch-2) contains every fix that was put into the
current branch-2.10. It should be a superset of the changes that went into
the two branches.
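
One way such a superset check could be sketched in Python is shown below. It
assumes commit subjects carry a JIRA key such as "HDFS-1234" and that the
subject lines for each branch have already been collected (for example with
`git log --format=%s <branch>`). The function names are hypothetical, and
matching on JIRA keys is only an approximation: it would not notice reverts
or commits without a key.

```python
import re

# Assumption: every relevant commit subject mentions its JIRA key.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the set of JIRA keys mentioned in a list of commit subjects."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def missing_fixes(old_branch_subjects, new_branch_subjects):
    """Return JIRA keys present in the old branch but absent from the new one."""
    missing = jira_keys(old_branch_subjects) - jira_keys(new_branch_subjects)
    return sorted(missing)
```

An empty result would suggest the new branch-2.10 is a superset; any keys
returned are candidates for a cherry-pick before the old branch is deleted.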

Thanks for the proposal, Jonathan!

Erik

On Thu, Nov 14, 2019 at 11:26 PM Jonathan Hung <jy...@gmail.com> wrote:

> Some other additional items we would need:
>
>    - Mark all fix-versions in YARN/HDFS/MAPREDUCE/HADOOP from 2.11.0 to
>    2.10.1
>    - Remove 2.11.0 as a version in these projects
>
>
> Jonathan Hung
>
>
> On Thu, Nov 14, 2019 at 6:51 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a
> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > minor release line in branch-2. Currently, the main issue is that there's
> > many fixes going into branch-2 (the theoretical 2.11.0) that's not going
> > into branch-2.10 (which will become 2.10.1), so the fixes in branch-2
> will
> > likely never see the light of day unless they are backported to
> branch-2.10.
> >
> > To do this, I propose we:
> >
> >    - Delete branch-2.10
> >    - Rename branch-2 to branch-2.10
> >    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1]
> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Some other additional items we would need:

   - Mark all fix-versions in YARN/HDFS/MAPREDUCE/HADOOP from 2.11.0 to
   2.10.1
   - Remove 2.11.0 as a version in these projects
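
The fix-version change can be pictured with a small Python sketch. It does
not talk to JIRA; it only models an issue's fix versions as a list of strings,
and the helper name `remap_fix_versions` is hypothetical.

```python
def remap_fix_versions(fix_versions, old="2.11.0", new="2.10.1"):
    """Replace `old` with `new`, dropping duplicates while keeping order."""
    result = []
    for version in fix_versions:
        version = new if version == old else version
        if version not in result:
            result.append(version)
    return result
```

The duplicate check matters for an issue that was already committed to both
branches, since it would otherwise end up listing 2.10.1 twice.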


Jonathan Hung


On Thu, Nov 14, 2019 at 6:51 PM Jonathan Hung <jy...@gmail.com> wrote:

> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a
> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> minor release line in branch-2. Currently, the main issue is that there's
> many fixes going into branch-2 (the theoretical 2.11.0) that's not going
> into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
>    - Delete branch-2.10
>    - Rename branch-2 to branch-2.10
>    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Wangda Tan <wh...@gmail.com>.
+1, thanks Jonathan for bringing this up!

On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
wrote:

> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
>
>  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there's many
> fixes going into branch-2 (the theoretical 2.11.0) that's not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
>   - Delete branch-2.10
>   - Rename branch-2 to branch-2.10
>   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Steve Loughran <st...@cloudera.com.INVALID>.
On Thu, Nov 21, 2019 at 11:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>


I don't think we should ever support java 11 in branch-2.

"If you want to use a recent version of java - use a recent version of
hadoop"

I've not been backporting my stuff for a long, long time, and my general
stance with branch-2 bug reports on bits of the code that I work on is
"what does it do on hadoop 3.2?"

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanks for catching this, I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the
>>>>> > INFRA people closed it as "won't do". And yes, the branch is
>>>>> > protected, so we can't delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>>> > remove it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> > please don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Delete old branch-2.10
>>>>> >   - Rename branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>>> > from 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>>> > minor 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> > actually proposes to release 2.11, which I understand nobody does at
>>>>> > the moment.
>>>>> >
>>>>> > So if anybody objects removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>>> > the convention of putting new features only into minor releases, my
>>>>> > assumption is still that it's unlikely people will want to get them into
>>>>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue,
>>>>> > we haven't even really removed support for java 7 in branch-2 (much less
>>>>> > java 8), so I feel moving to java 11 would go along with a move to
>>>>> > branch 3. And as you mentioned, if people really want to use java 11 on
>>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> > the same software and make improvements in trunk.  Some may be extra
>>>>> > cautious about moving up the version because of an internal obligation
>>>>> > to keep things running.  Company obligation should not be the driving
>>>>> > force to maintain Hadoop branches.  There is no proper collaboration in
>>>>> > the community when every name brand company maintains its own Hadoop 2.x
>>>>> > version.  I think it would be more healthy for the community to reduce
>>>>> > the branch forking and spend energy on trunk to harden the software.
>>>>> > This will give more confidence to move up the version than trying to fix
>>>>> > n permutations of breakage like Flash fixing the timeline.
>>>>> >
>>>>> > The Apache license states that there is no warranty of any kind for code
>>>>> > contributions.  Fewer community release processes should improve
>>>>> > software quality when eyes are on trunk, and help steer toward the same
>>>>> > end goals.
>>>>> >
>>>>> > regards,
>>>>> > Eric
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>>> > I have looked around and I can't find anything other than some
>>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>>> > keep branch-2 alive. My current understanding is that we can't really
>>>>> > break compatibility in either a minor or point release. But the only
>>>>> > mention of the difference between minor and point releases is how to
>>>>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>>> > changing default configuration values. So it seems like there really
>>>>> > isn't a big official difference between the two. In my mind, the
>>>>> > functional difference between the two is that the minor releases may
>>>>> > have added features and rewrites, while the point releases only have bug
>>>>> > fixes. This might be an incorrect understanding, but that's what I have
>>>>> > gathered from watching the releases over the last few years. Whether or
>>>>> > not this is a correct understanding, I think that this needs to be
>>>>> > documented somewhere, even if it is just a convention.
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >    patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> > But we don't live in a perfect world and committers will make mistakes
>>>>> > either because of lack of knowledge or simply because they made a
>>>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>>>> > their borderline risky feature in the 2.x line will err on the side of
>>>>> > putting it into branch-2.10 instead of proposing the creation of a
>>>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>>>> > own experiences, so I would like to hear if others have similar or
>>>>> > opposing views.
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>>> > community trying to move forward, keeping as many companies on similar
>>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>>> > extra cautious with 3.x due to potential performance degradation at
>>>>> > large scale.
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>>> > not change across point releases within the same minor release except if
>>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>> > could argue that it is already unsupported since Oracle is no longer
>>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> > or not is committers not being on the same page with where they should
>>>>> > commit their patches.
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> > [1]
>>>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> > [2]
>>>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> > least. I worry that some committers may want to put new features into a
>>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>>> > for some time (usually not until the release is deployed to a busy,
>>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>>> > changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>>> > I do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>>> > is not in the picture right now, and many people may object to this
>>>>> > idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people outside of our company who expressed interest in getting new
>>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>>> > after 2.10.0 release, there's 29 patches that have gone into branch-2
>>>>> > and 9 in branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really need to,
>>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>>> > confusion IMO.
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> >
>>>>> > epayne@apache.org>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently
>>>>> >
>>>>> > released,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > moving to 2.10 will take some time for the community. It seems
>>>>> >
>>>>> > premature
>>>>> >
>>>>> > to
>>>>> >
>>>>> > make a decision at this point that there will never be a need
>>>>> >
>>>>> > for a
>>>>> >
>>>>> > 2.11
>>>>> >
>>>>> > release.
>>>>> >
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> >
>>>>> > <
>>>>> >
>>>>> > jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>>> > branch-2 will likely never see the light of day unless they are
>>>>> > backported to branch-2.10.
>>>>> >
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> >
>>>>> > - Rename branch-2 to branch-2.10
>>>>> >
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> >
>>>>> > release
>>>>> >
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> >
>>>>> > ->
>>>>> >
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> > [1]
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> >
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> >
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira,
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>>> > team closed it as "Won't Do". And yes, the branch is protected, so we
>>>>> > can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>>> > it? I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> > please don't try to commit to it)
>>>>> >
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>>> > 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>>> >
>>>>> >
>>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>>> > the convention of putting new features only into minor releases, my
>>>>> > assumption is still that it's unlikely people will want to get them into
>>>>> > branch-2 (based on the 2.10.0 release process). For the Java 11 issue, we
>>>>> > haven't even really removed support for Java 7 in branch-2 (much less
>>>>> > Java 8), so I feel moving to Java 11 would go along with a move to
>>>>> > branch 3. And as you mentioned, if people really want to use Java 11 on
>>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> > the same software and make improvements in trunk. Some may be extra
>>>>> > cautious about moving up the version because of internal obligations to
>>>>> > keep things running. Company obligation should not be the driving force
>>>>> > to maintain Hadoop branches. There is no proper collaboration in the
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> > version. I think it would be more healthy for the community to reduce
>>>>> > the branch forking and spend energy on trunk to harden the software.
>>>>> > This will give more confidence to move up the version than trying to fix
>>>>> > breakage across n permutations, like Flash fixing the timeline.
>>>>> >
>>>>> >
>>>>> > As the Apache license states, there is no warranty of any kind for code
>>>>> > contributions. Fewer community release processes should improve software
>>>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> >
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>>> > I have looked around and I can't find anything other than some
>>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>>> > keep branch-2 alive.
>>>>> > My current understanding is that we can't really break compatibility in
>>>>> > either a minor or point release. But the only mention of the difference
>>>>> > between minor and point releases is how to deal with Stable, Evolving,
>>>>> > and Unstable tags, and how to deal with changing default configuration
>>>>> > values. So it seems like there really isn't a big official difference
>>>>> > between the two. In my mind, the functional difference between the two
>>>>> > is that the minor releases may have added features and rewrites, while
>>>>> > the point releases only have bug fixes. This might be an incorrect
>>>>> > understanding, but that's what I have gathered from watching the
>>>>> > releases over the last few years. Whether or not this is a correct
>>>>> > understanding, I think that this needs to be documented somewhere, even
>>>>> > if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> >   2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> > Cons:
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> > - An extra branch to maintain
>>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >     patches to if they should go all the way back to 2.10.x
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> > But we don't live in a perfect world and committers will make mistakes
>>>>> > either because of lack of knowledge or simply because they made a
>>>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>>>> > their borderline risky feature in the 2.x line will err on the side of
>>>>> > putting it into branch-2.10 instead of proposing the creation of a
>>>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>>>> > own experiences, so I would like to hear if others have similar or
>>>>> > opposing views.
>>>>> >
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>>> > community trying to move forward, keeping as many companies on similar
>>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>>> > extra cautious with 3.x due to potential performance degradation at
>>>>> > large scale.
>>>>> >
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>>> > not change across point releases within the same minor release except if
>>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>> > could argue that it is already unsupported since Oracle is no longer
>>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>> >
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on whether
>>>>> > or not we should keep branch-2 alive, but these are the things that I am
>>>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> > or not is committers not being on the same page with where they should
>>>>> > commit their patches.
>>>>> >
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> > [1]
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > [2]
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>>>> >
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> > least. I worry that some committers may want to put new features into a
>>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>>> > for some time (usually not until the release is deployed to a busy,
>>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>>> > changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>>>> > I do have these reservations.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> >
>>>>> > <
>>>>> >
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11
>>>>> > is not in the picture right now, and many people may object to this idea.
>>>>> >
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> >
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save the engineering effort of committing to more branches than
>>>>> >    necessary
>>>>> >
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide
>>>>> >
>>>>> > to
>>>>> >
>>>>> > release 2.11 we can resurrect the branch.
>>>>> >
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>>>>> >
>>>>> > jyhung2357@gmail.com
>>>>> >
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>> > much higher than those of a new 2.11 minor release. (There didn't seem to
>>>>> > be many people outside of our company who expressed interest in getting
>>>>> > new features to branch-2 prior to the 2.10.0 release.) Even now, a few
>>>>> > weeks after the 2.10.0 release, there are 29 patches that have gone into
>>>>> > branch-2 and 9 in branch-2.10, so the two have already diverged quite a
>>>>> > bit.
>>>>> >
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really
>>>>> >
>>>>> > need
>>>>> >
>>>>> > to, by
>>>>> >
>>>>> > recreating branch-2. But this proposal would reduce a lot of
>>>>> >
>>>>> > confusion
>>>>> >
>>>>> > IMO.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> >
>>>>> > epayne@apache.org>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently
>>>>> >
>>>>> > released,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > moving to 2.10 will take some time for the community. It seems
>>>>> >
>>>>> > premature
>>>>> >
>>>>> > to
>>>>> >
>>>>> > make a decision at this point that there will never be a need
>>>>> >
>>>>> > for a
>>>>> >
>>>>> > 2.11
>>>>> >
>>>>> > release.
>>>>> >
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> >
>>>>> > <
>>>>> >
>>>>> > jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>>> > branch-2 will likely never see the light of day unless they are
>>>>> > backported to branch-2.10.
>>>>> >
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> >
>>>>> > - Rename branch-2 to branch-2.10
>>>>> >
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> >
>>>>> > release
>>>>> >
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> >
>>>>> > ->
>>>>> >
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> > [1]
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>
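
As an illustration of the check discussed in this thread (confirming that
everything in the old branch-2.10 was also in branch-2, and later finding
commits on branch-2 that were missed in branch-2.10), here is a minimal
Python sketch. It is not the procedure the committers actually ran. It
assumes a local clone that still has both branches, and it relies on
`git cherry -v`, which prefixes a commit with "+" when no equivalent patch
exists upstream and with "-" when one does. The function names are
hypothetical.

```python
import subprocess


def parse_git_cherry(output):
    """Return (sha, subject) pairs for the '+' lines of `git cherry -v` output.

    A '+' line means no equivalent patch was found on the upstream branch;
    a '-' line means an equivalent patch is already there and is skipped.
    """
    missing = []
    for line in output.splitlines():
        parts = line.split(" ", 2)
        if len(parts) >= 2 and parts[0] == "+":
            subject = parts[2] if len(parts) == 3 else ""
            missing.append((parts[1], subject))
    return missing


def commits_missing_from(upstream, head, repo="."):
    """List commits on `head` that have no equivalent patch on `upstream`."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_git_cherry(result.stdout)


# Assumed usage, run inside a clone where both branches still exist:
#   commits_missing_from("branch-2.10", "branch-2")
# lists the branch-2 commits that would still need a cherry-pick.
```

Because `git cherry` compares patch contents rather than commit hashes, a
fix that was already cherry-picked under a different hash is not reported
again. A backport that needed conflict resolution may still show up and
would have to be checked by hand.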

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira,
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>>> > team closed it as "Won't Do". And yes, the branch is protected, so we
>>>>> > can’t delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>>> > it? I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>>> > please don't try to commit to it)
>>>>> >
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>>> > 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>>> >
>>>>> >
>>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>>> > the convention of putting new features only into minor releases, my
>>>>> > assumption is still that it's unlikely people will want to get them into
>>>>> > branch-2 (based on the 2.10.0 release process). For the Java 11 issue, we
>>>>> > haven't even really removed support for Java 7 in branch-2 (much less
>>>>> > Java 8), so I feel moving to Java 11 would go along with a move to
>>>>> > branch 3. And as you mentioned, if people really want to use Java 11 on
>>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > +1 for 2.10.x as last release for 2.x version.
>>>>> >
>>>>> >
>>>>> > Software would become more compatible when more companies stress test
>>>>> > the same software and make improvements in trunk. Some may be extra
>>>>> > cautious about moving up the version because of internal obligations to
>>>>> > keep things running. Company obligation should not be the driving force
>>>>> > to maintain Hadoop branches. There is no proper collaboration in the
>>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>>> > version. I think it would be more healthy for the community to reduce
>>>>> > the branch forking and spend energy on trunk to harden the software.
>>>>> > This will give more confidence to move up the version than trying to fix
>>>>> > breakage across n permutations, like Flash fixing the timeline.
>>>>> >
>>>>> >
>>>>> > As the Apache license states, there is no warranty of any kind for code
>>>>> > contributions. Fewer community release processes should improve software
>>>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>> >
>>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>>> >
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> >
>>>>> > Is it written anywhere what the difference is between a minor release
>>>>> > and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>>> > I have looked around and I can't find anything other than some
>>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>>> > keep branch-2 alive.
>>>>> > My current understanding is that we can't really break compatibility in
>>>>> > either a minor or point release. But the only mention of the difference
>>>>> > between minor and point releases is how to deal with Stable, Evolving,
>>>>> > and Unstable tags, and how to deal with changing default configuration
>>>>> > values. So it seems like there really isn't a big official difference
>>>>> > between the two. In my mind, the functional difference between the two
>>>>> > is that the minor releases may have added features and rewrites, while
>>>>> > the point releases only have bug fixes. This might be an incorrect
>>>>> > understanding, but that's what I have gathered from watching the
>>>>> > releases over the last few years. Whether or not this is a correct
>>>>> > understanding, I think that this needs to be documented somewhere, even
>>>>> > if it is just a convention.
>>>>> >
>>>>> >
>>>>> > Given my assumed understanding of minor vs point releases, here are
>>>>> >
>>>>> > the
>>>>> >
>>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>>> >
>>>>> > correct me for anything you feel is missing or inadequate.
>>>>> >
>>>>> > Pros:
>>>>> >
>>>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>>>> >
>>>>> > into
>>>>> >
>>>>> > 2.10.x
>>>>> >
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > Cons:
>>>>> >
>>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>>> >
>>>>> > - An extra branch to maintain
>>>>> >
>>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> >
>>>>> > patches to if they should go all the way back to 2.10.x
>>>>> >
>>>>> > - It is less necessary to move to 3.x
>>>>> >
>>>>> >
>>>>> > So on the one hand you get added stability in fewer features being
>>>>> >
>>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> >
>>>>> > being
>>>>> >
>>>>> > committed. In a perfect world, we wouldn't have to make this
>>>>> >
>>>>> > tradeoff.
>>>>> >
>>>>> > But
>>>>> >
>>>>> > we don't live in a perfect world and committers will make mistakes
>>>>> >
>>>>> > either
>>>>> >
>>>>> > because of lack of knowledge or simply because they made a mistake.
>>>>> >
>>>>> > If
>>>>> >
>>>>> > we
>>>>> >
>>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>>> >
>>>>> > to
>>>>> >
>>>>> > (for
>>>>> >
>>>>> > whatever reason) commit valid bug fixes back all the way to
>>>>> >
>>>>> > branch-2.10. If
>>>>> >
>>>>> > we don't have a branch-2, committers who want their borderline risky
>>>>> >
>>>>> > feature in the 2.x line will err on the side of putting it into
>>>>> >
>>>>> > branch-2.10
>>>>> >
>>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>>> >
>>>>> > quite
>>>>> >
>>>>> > a few assumptions here based on my own experiences, so I would like
>>>>> >
>>>>> > to
>>>>> >
>>>>> > hear
>>>>> >
>>>>> > if others have similar or opposing views.
>>>>> >
>>>>> >
>>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> >
>>>>> > killing
>>>>> >
>>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>>> >
>>>>> > is why
>>>>> >
>>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>>> >
>>>>> > trying
>>>>> >
>>>>> > to move forward, keeping as many companies on similar branches as
>>>>> >
>>>>> > possible
>>>>> >
>>>>> > is a good way to make sure the code is well-tested. However, from a
>>>>> >
>>>>> > stability point of view, moving to 3.x is still scary and being able
>>>>> >
>>>>> > to
>>>>> >
>>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>>> >
>>>>> > 2.10.0
>>>>> >
>>>>> > bridge release effort has been very good at making it possible for
>>>>> >
>>>>> > people
>>>>> >
>>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>>> >
>>>>> > that
>>>>> >
>>>>> > it is reasonable for companies to want to be extra cautious with 3.x
>>>>> >
>>>>> > due to
>>>>> >
>>>>> > potential performance degradation at large scale.
>>>>> >
>>>>> >
>>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>>> >
>>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> >
>>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> >
>>>>> > support to
>>>>> >
>>>>> > 2.x, surely not everyone is going to want that (at least not
>>>>> >
>>>>> > immediately).
>>>>> >
>>>>> > The 2.10 documentation states, "The JVM requirements will not change
>>>>> >
>>>>> > across
>>>>> >
>>>>> > point releases within the same minor release except if the JVM
>>>>> >
>>>>> > version
>>>>> >
>>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> >
>>>>> > release until Java 8 becomes unsupported (though one could argue that
>>>>> >
>>>>> > it is
>>>>> >
>>>>> > already unsupported since Oracle is no longer giving public Java 8
>>>>> >
>>>>> > updates).
>>>>> >
>>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> >
>>>>> > catalyst for a branch-2 revival?
>>>>> >
>>>>> >
>>>>> > Not sure if this really leads to any sort of answer from me on
>>>>> >
>>>>> > whether
>>>>> >
>>>>> > or
>>>>> >
>>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>>> >
>>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>>> >
>>>>> > branch-2
>>>>> >
>>>>> > or
>>>>> >
>>>>> > not is committers not being on the same page with where they should
>>>>> >
>>>>> > commit
>>>>> >
>>>>> > their patches.
>>>>> >
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> >
>>>>> >
>>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hi Konstantin,
>>>>> >
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> >
>>>>> > the
>>>>> >
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> >
>>>>> > least.
>>>>> >
>>>>> > I worry
>>>>> >
>>>>> > that some committers may want to put new features into a branch 2
>>>>> >
>>>>> > release,
>>>>> >
>>>>> > and without a branch-2, they will go directly into 2.10. Since we
>>>>> >
>>>>> > don't
>>>>> >
>>>>> > always
>>>>> >
>>>>> > catch corner cases or performance problems for some time (usually
>>>>> >
>>>>> > not
>>>>> >
>>>>> > until
>>>>> >
>>>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>>>> >
>>>>> > may
>>>>> >
>>>>> > be
>>>>> >
>>>>> > very
>>>>> >
>>>>> > difficult to back out those changes.
>>>>> >
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>>>> >
>>>>> > idea,
>>>>> >
>>>>> > but I
>>>>> >
>>>>> > do
>>>>> >
>>>>> > have these reservations.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>>> > <shv.hadoop@gmail.com> wrote:
>>>>> >
>>>>> > Hi Eric,
>>>>> >
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> >
>>>>> > release the
>>>>> >
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> >
>>>>> > between
>>>>> >
>>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
>>>>> >
>>>>> > not in
>>>>> >
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> >
>>>>> > 1. eliminate confusion about which branches people should commit
>>>>> >    their back-ports to
>>>>> > 2. save the engineering effort of committing to more branches than
>>>>> >    necessary
>>>>> >
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide
>>>>> >
>>>>> > to
>>>>> >
>>>>> > release 2.11 we can resurrect the branch.
>>>>> >
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Eric for the comments - regarding your concerns, I feel
>>>>> >
>>>>> > the
>>>>> >
>>>>> > pros
>>>>> >
>>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> >
>>>>> > are
>>>>> >
>>>>> > much
>>>>> >
>>>>> > higher than a new 2.11 minor release. (There didn't seem to be
>>>>> >
>>>>> > many
>>>>> >
>>>>> > people
>>>>> >
>>>>> > outside of our company who expressed interest in getting new
>>>>> >
>>>>> > features to
>>>>> >
>>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>>> >
>>>>> > after
>>>>> >
>>>>> > 2.10.0
>>>>> >
>>>>> > release, there are 29 patches that have gone into branch-2 and 9 in
>>>>> >
>>>>> > branch-2.10, so it's already diverged quite a bit.
>>>>> >
>>>>> >
>>>>> > In any case, we can always reverse this decision if we really
>>>>> >
>>>>> > need
>>>>> >
>>>>> > to, by
>>>>> >
>>>>> > recreating branch-2. But this proposal would reduce a lot of
>>>>> >
>>>>> > confusion
>>>>> >
>>>>> > IMO.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks Jonathan for opening the discussion.
>>>>> >
>>>>> >
>>>>> > I am not in favor of this proposal. 2.10 was very recently
>>>>> >
>>>>> > released,
>>>>> >
>>>>> > and
>>>>> >
>>>>> > moving to 2.10 will take some time for the community. It seems
>>>>> >
>>>>> > premature
>>>>> >
>>>>> > to
>>>>> >
>>>>> > make a decision at this point that there will never be a need
>>>>> >
>>>>> > for a
>>>>> >
>>>>> > 2.11
>>>>> >
>>>>> > release.
>>>>> >
>>>>> >
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>>> > <jyhung2357@gmail.com> wrote:
>>>>> >
>>>>> >
>>>>> > Hi folks,
>>>>> >
>>>>> >
>>>>> > Given the release of 2.10.0, and the fact that it's intended to
>>>>> >
>>>>> > be a
>>>>> >
>>>>> > bridge
>>>>> >
>>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>>> >
>>>>> > last
>>>>> >
>>>>> > minor
>>>>> >
>>>>> > release line in branch-2. Currently, the main issue is that
>>>>> >
>>>>> > there are many
>>>>> >
>>>>> > fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>>> >
>>>>> > going
>>>>> >
>>>>> > into
>>>>> >
>>>>> > branch-2.10 (which will become 2.10.1), so the fixes in
>>>>> >
>>>>> > branch-2
>>>>> >
>>>>> > will
>>>>> >
>>>>> > likely never see the light of day unless they are backported to
>>>>> >
>>>>> > branch-2.10.
>>>>> >
>>>>> >
>>>>> > To do this, I propose we:
>>>>> >
>>>>> >
>>>>> > - Delete branch-2.10
>>>>> >
>>>>> > - Rename branch-2 to branch-2.10
>>>>> >
>>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >
>>>>> >
>>>>> > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> >
>>>>> > release
>>>>> >
>>>>> > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> >
>>>>> > ->
>>>>> >
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> >
>>>>> >
>>>>> > Thoughts?
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Source code has been deleted from branch-2. Thanks Akira for taking this up!

Jonathan Hung


On Thu, Apr 16, 2020 at 11:40 AM Jonathan Hung <jy...@gmail.com> wrote:

> Makes sense. I've cherry-picked the commits in branch-2 that were missed
> in branch-2.10.
>
> Jonathan Hung
>
>
> On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Hi folks,
>>
>> I am still seeing some changes being committed to branch-2.
>> I'd like to delete the source code from branch-2 to avoid mistakes.
>> https://issues.apache.org/jira/browse/HADOOP-16988
>>
>> -Akira
>>
>> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>>
>>> Hi Jim,
>>> Thanks for catching this; I have configured the build to run on branch-2.10.
>>>
>>> -Ayush
>>>
>>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <
>>> james.brennan@verizonmedia.com> wrote:
>>>
>>>> It looks like QBT tests are still being run on branch-2 (
>>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>>> and they are not very helpful at this point.
>>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>>
>>>> Jim
>>>>
>>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you, Ayush.
>>>>>
>>>>> I understand we should keep branch-2 as is, as well as master.
>>>>>
>>>>> -Akira
>>>>>
>>>>>
>>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Hi Akira
>>>>> > It seems there was an INFRA ticket for that, INFRA-19581, but the
>>>>> > INFRA people closed it as "won't do". And yes, the branch is
>>>>> > protected, so we can't delete it directly.
>>>>> >
>>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>>> >
>>>>> > -Ayush
>>>>> >
>>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Thank you for your work, Jonathan.
>>>>> >
>>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>>> > remove it?
>>>>> > I think the branch should be protected if possible.
>>>>> >
>>>>> > -Akira
>>>>> >
>>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > It's done. The new commit chain is: trunk -> branch-3.2 ->
>>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no
>>>>> > longer exists, please don't try to commit to it)
>>>>> >
>>>>> > Completed procedure:
>>>>> >
>>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>>> >   - Deleted old branch-2.10
>>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>>> >
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>>> >
>>>>> > shv.hadoop@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Hey guys,
>>>>> >
>>>>> >
>>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>>> > which is removing branch-2.10, and changing the version of branch-2
>>>>> > from 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>>>> > minor 2.x release" confused people.
>>>>> > It is in fact a wider matter that can be discussed when somebody
>>>>> > actually proposes to release 2.11, which I understand nobody does at
>>>>> > the moment.
>>>>> >
>>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>>> > Otherwise we should go ahead and just do it next week.
>>>>> > I see people still struggling to keep branch-2 and branch-2.10 in
>>>>> > sync.
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> >
>>>>> > Thanks for the detailed thoughts, everyone.
>>>>> >
>>>>> >
>>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>>> > patch releases. As for putting features into minor/patch releases, if
>>>>> > we keep the convention of putting new features only into minor
>>>>> > releases, my assumption is still that it's unlikely people will want
>>>>> > to get them into branch-2 (based on the 2.10.0 release process). For
>>>>> > the Java 11 issue, we haven't even really removed support for Java 7
>>>>> > in branch-2 (much less Java 8), so I feel moving to Java 11 would go
>>>>> > along with a move to branch 3. And as you mentioned, if people really
>>>>> > want to use Java 11 on branch-2, we can always revive branch-2. But
>>>>> > for now I think the convenience of not needing to port to both
>>>>> > branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> > potentially needing to revive branch-2.
>>>>> >
>>>>> >
>>>>> > Jonathan Hung
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > +1 for 2.10.x as the last release for the 2.x version.
>>>>> >
>>>>> >
>>>>> > Software becomes more compatible when more companies stress-test
>>>>> > the same software and make improvements in trunk. Some may be extra
>>>>> > cautious about moving up the version because of internal obligations
>>>>> > to keep things running. Company obligations should not be the driving
>>>>> > force for maintaining Hadoop branches. There is no proper
>>>>> > collaboration in the community when every name-brand company
>>>>> > maintains its own Hadoop 2.x version. I think it would be healthier
>>>>> > for the community to reduce branch forking and spend energy on trunk
>>>>> > to harden the software. This will give more confidence to move up the
>>>>> > version than trying to fix n permutations of breakage, like Flash
>>>>> > fixing the timeline.
>>>>> >
>>>>> >
>>>>> > The Apache license states that there is no warranty of any kind for
>>>>> > code contributions. Fewer community release processes should improve
>>>>> > software quality when eyes are on trunk, and help steer toward the
>>>>> > same end goals.
>>>>> >
>>>>> >
>>>>> > regards,
>>>>> >
>>>>> > Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> >

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.
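
One way to find commits that landed in one branch but were missed in another is to compare the JIRA keys in each branch's commit subjects. The following is only a minimal sketch of that idea, not the procedure actually used here. It assumes commit subjects begin with a JIRA key such as "HADOOP-16988.", and that the subject lists would come from something like `git log --format=%s` on each branch. The function names and the sample subjects are hypothetical.

```python
import re

# Assumption for this sketch: commit subjects carry a JIRA key from one of
# the four Hadoop projects, e.g. "HDFS-12345. Fix something."
JIRA_KEY = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Return the set of JIRA keys found in an iterable of commit subjects."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.search(subject)
        if match:
            keys.add(match.group(0))
    return keys


def missing_backports(source_subjects, target_subjects):
    """List JIRA keys committed to the source branch but absent from the target."""
    return sorted(jira_keys(source_subjects) - jira_keys(target_subjects))


if __name__ == "__main__":
    # Hypothetical subjects standing in for real `git log` output.
    branch_2 = [
        "HADOOP-11111. Example fix A.",
        "YARN-22222. Example fix B.",
        "HDFS-33333. Example fix C.",
    ]
    branch_2_10 = [
        "HADOOP-11111. Example fix A.",
    ]
    for key in missing_backports(branch_2, branch_2_10):
        print(key)
```

Matching on JIRA keys rather than commit hashes is deliberate, since a cherry-picked commit gets a new hash on the target branch. The sketch does not handle reverts or addendum commits that reuse a key, so the result is a list of candidates to review by hand.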

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > It seems there was an INFRA ticket for that, INFRA-19581,
>>>> > but the INFRA people closed it as Won't Do. And yes, the branch is
>>>> > protected, so we can't delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> > remove it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>> > don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>> >
>>>> > shv.hadoop@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>> patch
>>>> >
>>>> > releases. As for putting features into minor/patch releases, if we
>>>> >
>>>> > keep the
>>>> >
>>>> > convention of putting new features only into minor releases, my
>>>> >
>>>> > assumption
>>>> >
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> >
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>>> >
>>>> > haven't
>>>> >
>>>> > even really removed support for java 7 in branch-2 (much less java 8),
>>>> >
>>>> > so I
>>>> >
>>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>>> >
>>>> > you
>>>> >
>>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>>> >
>>>> > always
>>>> >
>>>> > revive branch-2. But for now I think the convenience of not needing to
>>>> >
>>>> > port
>>>> >
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> >
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software would become more compatible when more companies stress test
>>>> > the same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running.  Company obligation should not be the driving force to
>>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> > version.  I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software.  This will
>>>> > give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage like Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions.  Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> >
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release
>>>> >
>>>> > and a
>>>> >
>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> >
>>>> > have
>>>> >
>>>> > looked around and I can't find anything other than some compatibility
>>>> >
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>> >
>>>> > think
>>>> >
>>>> > this would help shape my opinion on whether or not to keep branch-2
>>>> >
>>>> > alive.
>>>> >
>>>> > My current understanding is that we can't really break compatibility
>>>> >
>>>> > in
>>>> >
>>>> > either a minor or point release. But the only mention of the
>>>> >
>>>> > difference
>>>> >
>>>> > between minor and point releases is how to deal with Stable,
>>>> >
>>>> > Evolving,
>>>> >
>>>> > and
>>>> >
>>>> > Unstable tags, and how to deal with changing default configuration
>>>> >
>>>> > values.
>>>> >
>>>> > So it seems like there really isn't a big official difference between
>>>> >
>>>> > the
>>>> >
>>>> > two. In my mind, the functional difference between the two is that
>>>> >
>>>> > the
>>>> >
>>>> > minor releases may have added features and rewrites, while the point
>>>> >
>>>> > releases only have bug fixes. This might be an incorrect
>>>> >
>>>> > understanding, but
>>>> >
>>>> > that's what I have gathered from watching the releases over the last
>>>> >
>>>> > few
>>>> >
>>>> > years. Whether or not this is a correct understanding, I think that
>>>> >
>>>> > this
>>>> >
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are
>>>> >
>>>> > the
>>>> >
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> >
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> >
>>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>>> >
>>>> > into
>>>> >
>>>> > 2.10.x
>>>> >
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > Cons:
>>>> >
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> >
>>>> > - An extra branch to maintain
>>>> >
>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >
>>>> > patches to if they should go all the way back to 2.10.x
>>>> >
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> >
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>> >
>>>> > being
>>>> >
>>>> > committed. In a perfect world, we wouldn't have to make this
>>>> >
>>>> > tradeoff.
>>>> >
>>>> > But
>>>> >
>>>> > we don't live in a perfect world and committers will make mistakes
>>>> >
>>>> > either
>>>> >
>>>> > because of lack of knowledge or simply because they made a mistake.
>>>> >
>>>> > If
>>>> >
>>>> > we
>>>> >
>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>> >
>>>> > to
>>>> >
>>>> > (for
>>>> >
>>>> > whatever reason) commit valid bug fixes back all the way to
>>>> >
>>>> > branch-2.10. If
>>>> >
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> >
>>>> > feature in the 2.x line will err on the side of putting it into
>>>> >
>>>> > branch-2.10
>>>> >
>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>> >
>>>> > quite
>>>> >
>>>> > a few assumptions here based on my own experiences, so I would like
>>>> >
>>>> > to
>>>> >
>>>> > hear
>>>> >
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> >
>>>> > killing
>>>> >
>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>> >
>>>> > is why
>>>> >
>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>> >
>>>> > trying
>>>> >
>>>> > to move forward, keeping as many companies on similar branches as
>>>> >
>>>> > possible
>>>> >
>>>> > is a good way to make sure the code is well-tested. However, from a
>>>> >
>>>> > stability point of view, moving to 3.x is still scary and being able
>>>> >
>>>> > to
>>>> >
>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>> >
>>>> > 2.10.0
>>>> >
>>>> > bridge release effort has been very good at making it possible for people
>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>> > potential performance degradation at large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> >
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> >
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> >
>>>> > support to
>>>> >
>>>> > 2.x, surely not everyone is going to want that (at least not
>>>> >
>>>> > immediately).
>>>> >
>>>> > The 2.10 documentation states, "The JVM requirements will not change
>>>> >
>>>> > across
>>>> >
>>>> > point releases within the same minor release except if the JVM
>>>> >
>>>> > version
>>>> >
>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>> >
>>>> > release until Java 8 becomes unsupported (though one could argue that
>>>> >
>>>> > it is
>>>> >
>>>> > already unsupported since Oracle is no longer giving public Java 8 updates).
>>>> >
>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> >
>>>> > catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on
>>>> >
>>>> > whether
>>>> >
>>>> > or
>>>> >
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> >
>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>> >
>>>> > branch-2
>>>> >
>>>> > or
>>>> >
>>>> > not is committers not being on the same page with where they should
>>>> >
>>>> > commit
>>>> >
>>>> > their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > [2]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> >
>>>> > the
>>>> >
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> >
>>>> > least.
>>>> >
>>>> > I worry
>>>> >
>>>> > that some committers may want to put new features into a branch 2
>>>> >
>>>> > release,
>>>> >
>>>> > and without a branch-2, they will go directly into 2.10. Since we
>>>> >
>>>> > don't
>>>> >
>>>> > always
>>>> >
>>>> > catch corner cases or performance problems for some time (usually
>>>> >
>>>> > not
>>>> >
>>>> > until
>>>> >
>>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>>> >
>>>> > may
>>>> >
>>>> > be
>>>> >
>>>> > very
>>>> >
>>>> > difficult to back out those changes.
>>>> >
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>>> >
>>>> > idea,
>>>> >
>>>> > but I
>>>> >
>>>> > do
>>>> >
>>>> > have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> >
>>>> > <
>>>> >
>>>> > shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10
>>>> >
>>>> > release the
>>>> >
>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>> >
>>>> > between
>>>> >
>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>>>> > the picture right now, and many people may object to this idea.
>>>> >
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save the engineering effort of committing to more branches than necessary
>>>> >
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide
>>>> >
>>>> > to
>>>> >
>>>> > release 2.11 we can resurrect the branch.
>>>> >
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel
>>>> >
>>>> > the
>>>> >
>>>> > pros
>>>> >
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>> >
>>>> > are
>>>> >
>>>> > much
>>>> >
>>>> > higher than a new 2.11 minor release. (There didn't seem to be
>>>> >
>>>> > many
>>>> >
>>>> > people
>>>> >
>>>> > outside of our company who expressed interest in getting new
>>>> >
>>>> > features to
>>>> >
>>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>>> > branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> >
>>>> > In any case, we can always reverse this decision if we really
>>>> >
>>>> > need
>>>> >
>>>> > to, by
>>>> >
>>>> > recreating branch-2. But this proposal would reduce a lot of
>>>> >
>>>> > confusion
>>>> >
>>>> > IMO.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>> >
>>>> > epayne@apache.org>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently
>>>> >
>>>> > released,
>>>> >
>>>> > and
>>>> >
>>>> > moving to 2.10 will take some time for the community. It seems
>>>> >
>>>> > premature
>>>> >
>>>> > to
>>>> >
>>>> > make a decision at this point that there will never be a need
>>>> >
>>>> > for a
>>>> >
>>>> > 2.11
>>>> >
>>>> > release.
>>>> >
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> >
>>>> > <
>>>> >
>>>> > jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hi folks,
>>>> >
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to
>>>> >
>>>> > be a
>>>> >
>>>> > bridge
>>>> >
>>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>> >
>>>> > last
>>>> >
>>>> > minor
>>>> >
>>>> > release line in branch-2. Currently, the main issue is that
>>>> >
>>>> > there's
>>>> >
>>>> > many
>>>> >
>>>> > fixes going into branch-2 (the theoretical 2.11.0) that's not
>>>> >
>>>> > going
>>>> >
>>>> > into
>>>> >
>>>> > branch-2.10 (which will become 2.10.1), so the fixes in
>>>> >
>>>> > branch-2
>>>> >
>>>> > will
>>>> >
>>>> > likely never see the light of day unless they are backported to
>>>> >
>>>> > branch-2.10.
>>>> >
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> >
>>>> > - Delete branch-2.10
>>>> >
>>>> > - Rename branch-2 to branch-2.10
>>>> >
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x
>>>> >
>>>> > release
>>>> >
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2
>>>> >
>>>> > ->
>>>> >
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> >
>>>> > Thoughts?
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanks for catching that. I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > It seems there was an INFRA ticket for that, INFRA-19581,
>>>> > but the INFRA people closed it as Won't Do. And yes, the branch is
>>>> > protected, so we can't delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org>
>>>> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you
>>>> > remove it?
>>>> > I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>>>> > don't try to commit to it)
>>>> >
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>>> >
>>>> > shv.hadoop@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hey guys,
>>>> >
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs
>>>> patch
>>>> >
>>>> > releases. As for putting features into minor/patch releases, if we
>>>> >
>>>> > keep the
>>>> >
>>>> > convention of putting new features only into minor releases, my
>>>> >
>>>> > assumption
>>>> >
>>>> > is still that it's unlikely people will want to get them into branch-2
>>>> >
>>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>>> >
>>>> > haven't
>>>> >
>>>> > even really removed support for java 7 in branch-2 (much less java 8),
>>>> >
>>>> > so I
>>>> >
>>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>>> >
>>>> > you
>>>> >
>>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>>> >
>>>> > always
>>>> >
>>>> > revive branch-2. But for now I think the convenience of not needing to
>>>> >
>>>> > port
>>>> >
>>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>> >
>>>> > potentially needing to revive branch-2.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com>
>>>> wrote:
>>>> >
>>>> >
>>>> > +1 for 2.10.x as last release for 2.x version.
>>>> >
>>>> >
>>>> > Software would become more compatible when more companies stress test
>>>> > the same software and make improvements in trunk.  Some may be extra
>>>> > cautious about moving up the version because of internal obligations to
>>>> > keep things running.  Company obligation should not be the driving force to
>>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> > version.  I think it would be healthier for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software.  This will
>>>> > give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage like Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions.  Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end
>>>> > goals.
>>>> >
>>>> >
>>>> > regards,
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> >
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> >
>>>> > Hello all,
>>>> >
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release
>>>> >
>>>> > and a
>>>> >
>>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> >
>>>> > have
>>>> >
>>>> > looked around and I can't find anything other than some compatibility
>>>> >
>>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>> >
>>>> > think
>>>> >
>>>> > this would help shape my opinion on whether or not to keep branch-2
>>>> >
>>>> > alive.
>>>> >
>>>> > My current understanding is that we can't really break compatibility
>>>> >
>>>> > in
>>>> >
>>>> > either a minor or point release. But the only mention of the
>>>> >
>>>> > difference
>>>> >
>>>> > between minor and point releases is how to deal with Stable,
>>>> >
>>>> > Evolving,
>>>> >
>>>> > and
>>>> >
>>>> > Unstable tags, and how to deal with changing default configuration
>>>> >
>>>> > values.
>>>> >
>>>> > So it seems like there really isn't a big official difference between
>>>> >
>>>> > the
>>>> >
>>>> > two. In my mind, the functional difference between the two is that
>>>> >
>>>> > the
>>>> >
>>>> > minor releases may have added features and rewrites, while the point
>>>> >
>>>> > releases only have bug fixes. This might be an incorrect
>>>> >
>>>> > understanding, but
>>>> >
>>>> > that's what I have gathered from watching the releases over the last
>>>> >
>>>> > few
>>>> >
>>>> > years. Whether or not this is a correct understanding, I think that
>>>> >
>>>> > this
>>>> >
>>>> > needs to be documented somewhere, even if it is just a convention.
>>>> >
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are
>>>> >
>>>> > the
>>>> >
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> >
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> >
>>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>>> >
>>>> > into
>>>> >
>>>> > 2.10.x
>>>> >
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > Cons:
>>>> >
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> >
>>>> > - An extra branch to maintain
>>>> >
>>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >
>>>> > patches to if they should go all the way back to 2.10.x
>>>> >
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> >
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>> >
>>>> > being
>>>> >
>>>> > committed. In a perfect world, we wouldn't have to make this
>>>> >
>>>> > tradeoff.
>>>> >
>>>> > But
>>>> >
>>>> > we don't live in a perfect world and committers will make mistakes
>>>> >
>>>> > either
>>>> >
>>>> > because of lack of knowledge or simply because they made a mistake.
>>>> >
>>>> > If
>>>> >
>>>> > we
>>>> >
>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>> >
>>>> > to
>>>> >
>>>> > (for
>>>> >
>>>> > whatever reason) commit valid bug fixes back all the way to
>>>> >
>>>> > branch-2.10. If
>>>> >
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> >
>>>> > feature in the 2.x line will err on the side of putting it into
>>>> >
>>>> > branch-2.10
>>>> >
>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>> >
>>>> > quite
>>>> >
>>>> > a few assumptions here based on my own experiences, so I would like
>>>> >
>>>> > to
>>>> >
>>>> > hear
>>>> >
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> >
>>>> > killing
>>>> >
>>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>>> >
>>>> > is why
>>>> >
>>>> > I have added movement to 3.x as both a pro and a con. As a community
>>>> >
>>>> > trying
>>>> >
>>>> > to move forward, keeping as many companies on similar branches as
>>>> >
>>>> > possible
>>>> >
>>>> > is a good way to make sure the code is well-tested. However, from a
>>>> >
>>>> > stability point of view, moving to 3.x is still scary and being able
>>>> >
>>>> > to
>>>> >
>>>> > stay on 2.x until you are comfortable to move is very nice. The
>>>> >
>>>> > 2.10.0
>>>> >
>>>> > bridge release effort has been very good at making it possible for people
>>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>>> > potential performance degradation at large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> >
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> >
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> >
>>>> > support to
>>>> >
>>>> > 2.x, surely not everyone is going to want that (at least not
>>>> >
>>>> > immediately).
>>>> >
>>>> > The 2.10 documentation states, "The JVM requirements will not change
>>>> >
>>>> > across
>>>> >
>>>> > point releases within the same minor release except if the JVM
>>>> >
>>>> > version
>>>> >
>>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>>> >
>>>> > release until Java 8 becomes unsupported (though one could argue that
>>>> >
>>>> > it is
>>>> >
>>>> > already unsupported since Oracle is no longer giving public Java 8 updates).
>>>> >
>>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> >
>>>> > catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on
>>>> >
>>>> > whether
>>>> >
>>>> > or
>>>> >
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> >
>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>> >
>>>> > branch-2
>>>> >
>>>> > or
>>>> >
>>>> > not is committers not being on the same page with where they should
>>>> >
>>>> > commit
>>>> >
>>>> > their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > [2]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> >
>>>> > the
>>>> >
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> >
>>>> > least.
>>>> >
>>>> > I worry
>>>> >
>>>> > that some committers may want to put new features into a branch 2
>>>> >
>>>> > release,
>>>> >
>>>> > and without a branch-2, they will go directly into 2.10. Since we
>>>> >
>>>> > don't
>>>> >
>>>> > always
>>>> >
>>>> > catch corner cases or performance problems for some time (usually
>>>> >
>>>> > not
>>>> >
>>>> > until
>>>> >
>>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>>> >
>>>> > may
>>>> >
>>>> > be
>>>> >
>>>> > very
>>>> >
>>>> > difficult to back out those changes.
>>>> >
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>>> >
>>>> > idea,
>>>> >
>>>> > but I
>>>> >
>>>> > do
>>>> >
>>>> > have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>>>> > not in the picture right now, and many people may object to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>> > wrote:
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people outside of our company who expressed interest in getting new
>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>> > after 2.10.0 release, there's 29 patches that have gone into branch-2 and
>>>> > 9 in branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to, by
>>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>>> > IMO.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>> > wrote:
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>>> > moving to 2.10 will take some time for the community. It seems premature
>>>> > to make a decision at this point that there will never be a need for a
>>>> > 2.11 release.
>>>> >
>>>> > -Eric
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> > <jyhung2357@gmail.com> wrote:
>>>> >
>>>> > Hi folks,
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that there's
>>>> > many fixes going into branch-2 (the theoretical 2.11.0) that's not going
>>>> > into branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>> > will likely never see the light of day unless they are backported to
>>>> > branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> >   - Delete branch-2.10
>>>> >   - Rename branch-2 to branch-2.10
>>>> >   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> > Thoughts?
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>>
>>>
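
As a rough illustration of the commit chain quoted above (trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8): a fix aimed at an older release line is normally committed to every newer branch in the chain first. The Python sketch below only models that ordering. The names COMMIT_CHAIN and backport_targets are hypothetical and do not come from any Hadoop tooling.

```python
# Commit chain as stated in the thread, newest branch first.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return the branches a fix must land on, in commit order,
    so that it reaches `oldest_branch` without skipping a newer line."""
    if oldest_branch not in chain:
        # For example "branch-2", which is no longer part of the chain.
        raise ValueError("not in commit chain: " + oldest_branch)
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(backport_targets("branch-2.10"))
```

Under this model a fix destined for branch-2.10 touches four branches, and asking for branch-2 is rejected because it is not part of the chain.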

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Makes sense. I've cherry-picked the commits in branch-2 that were missed in
branch-2.10.

Jonathan Hung
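
One possible way to spot commits that landed in branch-2 but were missed in branch-2.10 is to compare the JIRA keys found in each branch's commit subjects. The sketch below assumes every subject carries a key such as HADOOP-N, HDFS-N, YARN-N or MAPREDUCE-N. The function names and sample subjects are made up for illustration, and this is not necessarily how the cherry-picks mentioned above were identified.

```python
import re

# Matches issue keys of the four projects named in the thread.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect every JIRA key mentioned in a list of commit subjects."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def missing_from(source_subjects, target_subjects):
    """Keys present in the source branch but absent from the target branch."""
    return sorted(jira_keys(source_subjects) - jira_keys(target_subjects))


if __name__ == "__main__":
    # Placeholder subjects, not real commits.
    branch_2 = ["HDFS-1. Example fix A", "YARN-2. Example fix B"]
    branch_2_10 = ["YARN-2. Example fix B"]
    print(missing_from(branch_2, branch_2_10))
```

In practice the subject lists could be taken from `git log --format=%s <branch>` for each branch. Comparing by JIRA key rather than by commit hash matters here because a cherry-picked commit gets a new hash.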


On Wed, Apr 15, 2020 at 2:25 AM Akira Ajisaka <aa...@apache.org> wrote:

> Hi folks,
>
> I am still seeing some changes being committed to branch-2.
> I'd like to delete the source code from branch-2 to avoid mistakes.
> https://issues.apache.org/jira/browse/HADOOP-16988
>
> -Akira
>
> On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:
>
>> Hi Jim,
>> Thanx for catching, I have configured the build to run on branch-2.10.
>>
>> -Ayush
>>
>> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
>> wrote:
>>
>>> It looks like QBT tests are still being run on branch-2 (
>>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>>> and they are not very helpful at this point.
>>> Can we change the QBT tests to run against branch-2.10 instead?
>>>
>>> Jim
>>>
>>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>>> wrote:
>>>
>>>> Thank you, Ayush.
>>>>
>>>> I understand we should keep branch-2 as is, as well as master.
>>>>
>>>> -Akira
>>>>
>>>>
>>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi Akira
>>>> > Seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
>>>> > people closed it as "won't do". And yes, the branch is protected, so we
>>>> > can't delete it directly.
>>>> >
>>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>>> >
>>>> > -Ayush
>>>> >
>>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>>> >
>>>> > Thank you for your work, Jonathan.
>>>> >
>>>> > I found branch-2 has been unintentionally pushed again. Would you remove
>>>> > it? I think the branch should be protected if possible.
>>>> >
>>>> > -Akira
>>>> >
>>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>>>> >
>>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>>> > please don't try to commit to it)
>>>> >
>>>> > Completed procedure:
>>>> >
>>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>>> >   - Deleted old branch-2.10
>>>> >   - Renamed branch-2 to (new) branch-2.10
>>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>>>> >
>>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>>> > <shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hey guys,
>>>> >
>>>> > I think we diverged a bit from the initial topic of this discussion,
>>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> > 2.x release" confused people.
>>>> > It is in fact a wider matter that can be discussed when somebody actually
>>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>>> >
>>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>>> > Otherwise we should go ahead and just do it next week.
>>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>>> >
>>>> > Thanks for the detailed thoughts, everyone.
>>>> >
>>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>> > releases. As for putting features into minor/patch releases, if we keep
>>>> > the convention of putting new features only into minor releases, my
>>>> > assumption is still that it's unlikely people will want to get them into
>>>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue, we
>>>> > haven't even really removed support for java 7 in branch-2 (much less
>>>> > java 8), so I feel moving to java 11 would go along with a move to
>>>> > branch 3. And as you mentioned, if people really want to use java 11 on
>>>> > branch-2, we can always revive branch-2. But for now I think the
>>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>> >
>>>> > +1 for 2.10.x as the last release for the 2.x version.
>>>> >
>>>> > Software would become more compatible when more companies stress test the
>>>> > same software and make improvements in trunk. Some may be extra cautious
>>>> > about moving up the version because of internal obligations to keep
>>>> > things running. Company obligation should not be the driving force to
>>>> > maintain Hadoop branches. There is no proper collaboration in the
>>>> > community when every name brand company maintains its own Hadoop 2.x
>>>> > version. I think it would be more healthy for the community to reduce the
>>>> > branch forking and spend energy on trunk to harden the software. This
>>>> > will give more confidence to move up the version than trying to fix n
>>>> > permutations of breakage, like the Flash fixing the timeline.
>>>> >
>>>> > The Apache license states that there is no warranty of any kind for code
>>>> > contributions. Fewer community release processes should improve software
>>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>>> >
>>>> > regards,
>>>> > Eric
>>>> >
>>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> > <eb...@verizonmedia.com.invalid> wrote:
>>>> >
>>>> > Hello all,
>>>> >
>>>> > Is it written anywhere what the difference is between a minor release and
>>>> > a point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> > have looked around and I can't find anything other than some
>>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>>> > keep branch-2 alive. My current understanding is that we can't really
>>>> > break compatibility in either a minor or point release. But the only
>>>> > mention of the difference between minor and point releases is how to
>>>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>> > changing default configuration values. So it seems like there really
>>>> > isn't a big official difference between the two. In my mind, the
>>>> > functional difference between the two is that the minor releases may
>>>> > have added features and rewrites, while the point releases only have bug
>>>> > fixes. This might be an incorrect understanding, but that's what I have
>>>> > gathered from watching the releases over the last few years. Whether or
>>>> > not this is a correct understanding, I think that this needs to be
>>>> > documented somewhere, even if it is just a convention.
>>>> >
>>>> > Given my assumed understanding of minor vs point releases, here are the
>>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>>> > correct me for anything you feel is missing or inadequate.
>>>> >
>>>> > Pros:
>>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>>> >   2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > Cons:
>>>> > - Bug fixes are less likely to be put into 2.10.x
>>>> > - An extra branch to maintain
>>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> >     patches to if they should go all the way back to 2.10.x
>>>> > - It is less necessary to move to 3.x
>>>> >
>>>> > So on the one hand you get added stability in fewer features being
>>>> >
>>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>>> >
>>>> > being
>>>> >
>>>> > committed. In a perfect world, we wouldn't have to make this
>>>> >
>>>> > tradeoff.
>>>> >
>>>> > But
>>>> >
>>>> > we don't live in a perfect world and committers will make mistakes
>>>> >
>>>> > either
>>>> >
>>>> > because of lack of knowledge or simply because they made a mistake.
>>>> >
>>>> > If
>>>> >
>>>> > we
>>>> >
>>>> > have a branch-2, committers will forget, not know to, or choose not
>>>> >
>>>> > to
>>>> >
>>>> > (for
>>>> >
>>>> > whatever reason) commit valid bug fixes back all the way to
>>>> >
>>>> > branch-2.10. If
>>>> >
>>>> > we don't have a branch-2, committers who want their borderline risky
>>>> >
>>>> > feature in the 2.x line will err on the side of putting it into
>>>> >
>>>> > branch-2.10
>>>> >
>>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>>> >
>>>> > quite
>>>> >
>>>> > a few assumptions here based on my own experiences, so I would like
>>>> >
>>>> > to
>>>> >
>>>> > hear
>>>> >
>>>> > if others have similar or opposing views.
>>>> >
>>>> >
>>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>>> > killing branch-2 is due to an effort to push the community towards 3.x.
>>>> > This is why I have added movement to 3.x as both a pro and a con. As a
>>>> > community trying to move forward, keeping as many companies on similar
>>>> > branches as possible is a good way to make sure the code is well-tested.
>>>> > However, from a stability point of view, moving to 3.x is still scary
>>>> > and being able to stay on 2.x until you are comfortable to move is very
>>>> > nice. The 2.10.0 bridge release effort has been very good at making it
>>>> > possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>> > and 3.x is so large that it is reasonable for companies to want to be
>>>> > extra cautious with 3.x due to potential performance degradation at
>>>> > large scale.
>>>> >
>>>> >
>>>> > A question I'm pondering is what happens when we move to Java 11 and
>>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>> > to 2.x, surely not everyone is going to want that (at least not
>>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>>> > not change across point releases within the same minor release except if
>>>> > the JVM version under question becomes unsupported" [1], so this would
>>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>> > could argue that it is already unsupported since Oracle is no longer
>>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>>> >
>>>> >
>>>> > Not sure if this really leads to any sort of answer from me on
>>>> >
>>>> > whether
>>>> >
>>>> > or
>>>> >
>>>> > not we should keep branch-2 alive, but these are the things that I am
>>>> >
>>>> > weighing in my mind. For me, the bigger problem beyond having
>>>> >
>>>> > branch-2
>>>> >
>>>> > or
>>>> >
>>>> > not is committers not being on the same page with where they should
>>>> >
>>>> > commit
>>>> >
>>>> > their patches.
>>>> >
>>>> >
>>>> > Eric
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> > [2]
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> >
>>>> >
>>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>>> >
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Hi Konstantin,
>>>> >
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> >
>>>> > the
>>>> >
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> >
>>>> > least.
>>>> >
>>>> > I worry
>>>> >
>>>> > that some committers may want to put new features into a branch 2
>>>> >
>>>> > release,
>>>> >
>>>> > and without a branch-2, they will go directly into 2.10. Since we
>>>> >
>>>> > don't
>>>> >
>>>> > always
>>>> >
>>>> > catch corner cases or performance problems for some time (usually
>>>> >
>>>> > not
>>>> >
>>>> > until
>>>> >
>>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>>> >
>>>> > may
>>>> >
>>>> > be
>>>> >
>>>> > very
>>>> >
>>>> > difficult to back out those changes.
>>>> >
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>>> >
>>>> > idea,
>>>> >
>>>> > but I
>>>> >
>>>> > do
>>>> >
>>>> > have these reservations.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>>> >
>>>> > <
>>>> >
>>>> > shv.hadoop@gmail.com> wrote:
>>>> >
>>>> > Hi Eric,
>>>> >
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>> > is not in the picture right now, and many people may object to this
>>>> > idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> >    back-ports to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> >
>>>> > Thanks,
>>>> >
>>>> > --Konstantin
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>>>> >
>>>> > jyhung2357@gmail.com
>>>> >
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people outside of our company who expressed interest in getting new
>>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>>> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>>> >
>>>> > In any case, we can always reverse this decision if we really need to,
>>>> > by recreating branch-2. But this proposal would reduce a lot of
>>>> > confusion IMO.
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>> >
>>>> > epayne@apache.org>
>>>> >
>>>> > wrote:
>>>> >
>>>> >
>>>> > Thanks Jonathan for opening the discussion.
>>>> >
>>>> >
>>>> > I am not in favor of this proposal. 2.10 was very recently
>>>> >
>>>> > released,
>>>> >
>>>> > and
>>>> >
>>>> > moving to 2.10 will take some time for the community. It seems
>>>> >
>>>> > premature
>>>> >
>>>> > to
>>>> >
>>>> > make a decision at this point that there will never be a need
>>>> >
>>>> > for a
>>>> >
>>>> > 2.11
>>>> >
>>>> > release.
>>>> >
>>>> >
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>>> >
>>>> > <
>>>> >
>>>> > jyhung2357@gmail.com> wrote:
>>>> >
>>>> >
>>>> > Hi folks,
>>>> >
>>>> >
>>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> > minor release line in branch-2. Currently, the main issue is that there
>>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>>> > branch-2 will likely never see the light of day unless they are
>>>> > backported to branch-2.10.
>>>> >
>>>> > To do this, I propose we:
>>>> >
>>>> > - Delete branch-2.10
>>>> > - Rename branch-2 to branch-2.10
>>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> >
>>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> >
>>>> >
>>>> > Thoughts?
>>>> >
>>>> >
>>>> > Jonathan Hung
>>>> >
>>>> >
>>>> > [1]
>>>> >
>>>> >
>>>> >
>>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> >
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> >
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira
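
For anyone who wants to check which changes landed on branch-2 by mistake,
here is a minimal sketch, not an official Hadoop tool. It assumes commit
subjects carry a JIRA key (HADOOP/HDFS/YARN/MAPREDUCE-NNNN) and that the
subject lines were exported beforehand, for example with
`git log --format=%s <branch>`. The helper names and the sample subjects are
hypothetical. Keys are compared instead of hashes because a cherry-picked
commit gets a new hash on each branch.

```python
import re

# JIRA keys for the four Hadoop projects named in this thread.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect every JIRA key mentioned in a list of commit subject lines."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def stray_keys(branch2_subjects, branch210_subjects):
    """Return, sorted, the keys seen on branch-2 but absent from branch-2.10."""
    return sorted(jira_keys(branch2_subjects) - jira_keys(branch210_subjects))


# Hypothetical subject lines standing in for real `git log` output.
branch2 = [
    "HADOOP-11111. Example fix one.",
    "YARN-22222. Example fix two.",
]
branch210 = [
    "HADOOP-11111. Example fix one.",
]
print(stray_keys(branch2, branch210))  # ['YARN-22222']
```

A key-based comparison is only a first pass: reverts and commits without a
JIRA key in the subject would still need a manual look.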

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching that. I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > It seems there was an INFRA ticket for that, INFRA-19581,
>>> > but the INFRA people closed it as "won't do". And yes, the branch is
>>> > protected, so we can't delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> > remove it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> > please don't try to commit to it)
>>> >
>>> >
>>> > Completed procedure:
>>> >
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >   - Deleted old branch-2.10
>>> >   - Renamed branch-2 to (new) branch-2.10
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>> >
>>> > shv.hadoop@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Hey guys,
>>> >
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> > which is removing branch-2.10 and changing the version of branch-2 from
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > It sounds like the subject line for this thread, "Making 2.10 the last
>>> > minor 2.x release", confused people.
>>> > That is in fact a wider matter that can be discussed when somebody
>>> > actually proposes to release 2.11, which I understand nobody does at the
>>> > moment.
>>> >
>>> > So if anybody objects to removing branch-2.10, please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep
>>> > the convention of putting new features only into minor releases, my
>>> > assumption is still that it's unlikely people will want to get them into
>>> > branch-2 (based on the 2.10.0 release process). For the Java 11 issue, we
>>> > haven't even really removed support for Java 7 in branch-2 (much less
>>> > Java 8), so I feel moving to Java 11 would go along with a move to
>>> > branch 3. And as you mentioned, if people really want to use Java 11 on
>>> > branch-2, we can always revive branch-2. But for now I think the
>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> >
>>> > +1 for 2.10.x as the last release for the 2.x version.
>>> >
>>> > Software becomes more compatible when more companies stress test
>>> > the same software and make improvements in trunk.  Some may be extra
>>> > cautious about moving up the version because of internal obligations to
>>> > keep things running.  Company obligations should not be the driving force
>>> > to maintain Hadoop branches.  There is no proper collaboration in the
>>> > community when every name-brand company maintains its own Hadoop 2.x
>>> > version.  I think it would be healthier for the community to reduce the
>>> > branch forking and spend energy on trunk to harden the software.  This
>>> > will give more confidence to move up the version than trying to fix n
>>> > permutations of breakage like Flash fixing the timeline.
>>> >
>>> > The Apache License states that there is no warranty of any kind for code
>>> > contributions.  Fewer community release processes should improve software
>>> > quality when eyes are on trunk, and help steer toward the same end
>>> > goals.
>>> >
>>> >
>>> > regards,
>>> >
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> >
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> >
>>> > Hello all,
>>> >
>>> >
>>> > Is it written anywhere what the difference is between a minor release and
>>> > a point/dot/maintenance (I'll use "point" from here on out) release? I
>>> > have looked around and I can't find anything other than some
>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>> > keep branch-2 alive. My current understanding is that we can't really
>>> > break compatibility in either a minor or point release. But the only
>>> > mention of the difference between minor and point releases is how to deal
>>> > with Stable, Evolving, and Unstable tags, and how to deal with changing
>>> > default configuration values. So it seems like there really isn't a big
>>> > official difference between the two. In my mind, the functional
>>> > difference between the two is that the minor releases may have added
>>> > features and rewrites, while the point releases only have bug fixes. This
>>> > might be an incorrect understanding, but that's what I have gathered from
>>> > watching the releases over the last few years. Whether or not this is a
>>> > correct understanding, I think that this needs to be documented
>>> > somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are the
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> > But we don't live in a perfect world and committers will make mistakes
>>> > either because of lack of knowledge or simply because they made a
>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>> > their borderline risky feature in the 2.x line will err on the side of
>>> > putting it into branch-2.10 instead of proposing the creation of a
>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>> > own experiences, so I would like to hear if others have similar or
>>> > opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> > branch-2 is due to an effort to push the community towards 3.x. This is
>>> > why I have added movement to 3.x as both a pro and a con. As a community
>>> > trying to move forward, keeping as many companies on similar branches as
>>> > possible is a good way to make sure the code is well-tested. However,
>>> > from a stability point of view, moving to 3.x is still scary and being
>>> > able to stay on 2.x until you are comfortable to move is very nice. The
>>> > 2.10.0 bridge release effort has been very good at making it possible for
>>> > people to move from 2.x to 3.x, but the diff between 2.x and 3.x is so
>>> > large that it is reasonable for companies to want to be extra cautious
>>> > with 3.x due to potential performance degradation at large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>> > to 2.x, surely not everyone is going to want that (at least not
>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>> > not change across point releases within the same minor release except if
>>> > the JVM version under question becomes unsupported" [1], so this would
>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one could
>>> > argue that it is already unsupported since Oracle is no longer giving
>>> > public Java 8 updates). If we don't keep branch-2 around now, would a
>>> > Java 11 backport be the catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether or
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> > not is committers not being on the same page with where they should
>>> > commit their patches.
>>> >
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1]
>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2]
>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> >
>>> > Hi Konstantin,
>>> >
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>> > I worry that some committers may want to put new features into a branch 2
>>> > release, and without a branch-2, they will go directly into 2.10. Since
>>> > we don't always catch corner cases or performance problems for some time
>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>> > cluster), it may be very difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>> > I do have these reservations.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> >
>>> > <
>>> >
>>> > shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> >
>>> > We had a long discussion on this list regarding making the 2.10 release
>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>> > is not in the picture right now, and many people may object to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>> > wrote:
>>> >
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>> > much higher than a new 2.11 minor release. (There didn't seem to be many
>>> > people outside of our company who expressed interest in getting new
>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
>>> >
>>> > In any case, we can always reverse this decision if we really need to, by
>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>> > IMO.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>> >
>>> > epayne@apache.org>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>> > moving to 2.10 will take some time for the community. It seems premature
>>> > to make a decision at this point that there will never be a need for a
>>> > 2.11 release.
>>> >
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> >
>>> > <
>>> >
>>> > jyhung2357@gmail.com> wrote:
>>> >
>>> >
>>> > Hi folks,
>>> >
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> > minor release line in branch-2. Currently, the main issue is that there
>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>> > branch-2 will likely never see the light of day unless they are
>>> > backported to branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> >
>>> > Thoughts?
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> > [1]
>>> >
>>> >
>>> >
>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> >
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> >
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching that. I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > It seems there was an INFRA ticket for that, INFRA-19581,
>>> > but the INFRA people closed it as "won't do". And yes, the branch is
>>> > protected, so we can't delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> > remove it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> > please don't try to commit to it)
>>> >
>>> >
>>> > Completed procedure:
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >   - Deleted old branch-2.10
>>> >   - Renamed branch-2 to (new) branch-2.10
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
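
The first step of the procedure above (verifying that everything in old
branch-2.10 was also in old branch-2) can be sketched as follows. This is only
an illustration, not the check that was actually run: it assumes commit
subjects begin with a JIRA key such as "HDFS-1234.", and the names jira_keys,
missing_from and subjects_since are hypothetical. Because backports are
separate commits with different hashes, the sketch compares JIRA keys rather
than commit IDs.

```python
import re
import subprocess

# Assumption: every commit subject starts with a JIRA key, e.g. "HDFS-1234. Fix ...".
JIRA_KEY = re.compile(r"^\s*((?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+)")


def jira_keys(subjects):
    """Return the set of JIRA keys found at the start of the given commit subjects."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.match(subject)
        if match:
            keys.add(match.group(1))
    return keys


def missing_from(superset_subjects, subset_subjects):
    """JIRA keys present in the 'subset' branch but absent from the 'superset' branch."""
    return sorted(jira_keys(subset_subjects) - jira_keys(superset_subjects))


def subjects_since(base, branch, repo="."):
    """Commit subjects on `branch` that are not reachable from `base` (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%s", f"{base}..{branch}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

An empty list from missing_from(subjects_since(base, "branch-2"),
subjects_since(base, "branch-2.10")) would suggest that branch-2 is a superset,
where base is whatever common ancestor the two branches share in your
repository. Commits whose subjects do not start with a JIRA key (reverts, for
example) are ignored by this sketch and would need a manual look.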
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
>>> > wrote:
>>> >
>>> > Hey guys,
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion, which
>>> > is removing branch-2.10 and changing the version of branch-2 from
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > It sounds like the subject line for this thread, "Making 2.10 the last minor
>>> > 2.x release", confused people.
>>> > It is in fact a wider matter that can be discussed when somebody actually
>>> > proposes to release 2.11, which I understand nobody is doing at the moment.
>>> >
>>> > So if anybody objects to removing branch-2.10, please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep the
>>> > convention of putting new features only into minor releases, my assumption
>>> > is still that it's unlikely people will want to get them into branch-2
>>> > (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>>> > even really removed support for Java 7 in branch-2 (much less Java 8), so I
>>> > feel moving to Java 11 would go along with a move to branch 3. And as you
>>> > mentioned, if people really want to use Java 11 on branch-2, we can always
>>> > revive branch-2. But for now I think the convenience of not needing to port
>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> > potentially needing to revive branch-2.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> > +1 for 2.10.x as the last release for the 2.x version.
>>> >
>>> > Software becomes more compatible when more companies stress test the same
>>> > software and make improvements in trunk. Some may be extra cautious about
>>> > moving up the version because of internal obligations to keep things
>>> > running. Company obligations should not be the driving force for
>>> > maintaining Hadoop branches. There is no proper collaboration in the
>>> > community when every name-brand company maintains its own Hadoop 2.x
>>> > version. I think it would be healthier for the community to reduce the
>>> > branch forking and spend energy on trunk to harden the software. This will
>>> > give more confidence to move up the version than trying to fix n
>>> > permutations of breakage, like the Flash fixing the timeline.
>>> >
>>> > The Apache license states that there is no warranty of any kind for code
>>> > contributions. Fewer community release processes should improve software
>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>> >
>>> > regards,
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> > Hello all,
>>> >
>>> > Is it written anywhere what the difference is between a minor release and a
>>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> > looked around and I can't find anything other than some compatibility
>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>>> > My current understanding is that we can't really break compatibility in
>>> > either a minor or point release. But the only mention of the difference
>>> > between minor and point releases is how to deal with Stable, Evolving, and
>>> > Unstable tags, and how to deal with changing default configuration values.
>>> > So it seems like there really isn't a big official difference between the
>>> > two. In my mind, the functional difference between the two is that the
>>> > minor releases may have added features and rewrites, while the point
>>> > releases only have bug fixes. This might be an incorrect understanding, but
>>> > that's what I have gathered from watching the releases over the last few
>>> > years. Whether or not this is a correct understanding, I think that this
>>> > needs to be documented somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are the
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
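
The "5 vs 4 total branches" count in the cons above can be reproduced with a
small sketch of the commit chain. The branch names come from this thread; the
position of branch-2 (between branch-3.1 and branch-2.10) is a reading of the
discussion, and the helper backport_targets is hypothetical.

```python
# Commit chains, newest branch first. Branch names are taken from the thread.
CHAIN_WITHOUT_BRANCH_2 = [
    "trunk", "branch-3.2", "branch-3.1", "branch-2.10", "branch-2.9", "branch-2.8",
]
# Assumption: branch-2 (the theoretical 2.11.0) sits between branch-3.1 and branch-2.10.
CHAIN_WITH_BRANCH_2 = [
    "trunk", "branch-3.2", "branch-3.1", "branch-2", "branch-2.10", "branch-2.9",
    "branch-2.8",
]


def backport_targets(chain, oldest):
    """Branches a fix must be committed to, newest first, so that it reaches `oldest`."""
    if oldest not in chain:
        raise ValueError(f"{oldest!r} is not in the commit chain")
    return chain[: chain.index(oldest) + 1]


if __name__ == "__main__":
    for chain in (CHAIN_WITH_BRANCH_2, CHAIN_WITHOUT_BRANCH_2):
        targets = backport_targets(chain, "branch-2.10")
        print(len(targets), " -> ".join(targets))
```

Every branch skipped on the way down is a place where a fix can be lost, which
is the failure mode both sides of the discussion are weighing.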
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>>> > we don't live in a perfect world and committers will make mistakes either
>>> > because of lack of knowledge or simply because they made a mistake. If we
>>> > have a branch-2, committers will forget, not know to, or choose not to (for
>>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>>> > we don't have a branch-2, committers who want their borderline risky
>>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>>> > a few assumptions here based on my own experiences, so I would like to hear
>>> > if others have similar or opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>>> > I have added movement to 3.x as both a pro and a con. As a community trying
>>> > to move forward, keeping as many companies on similar branches as possible
>>> > is a good way to make sure the code is well-tested. However, from a
>>> > stability point of view, moving to 3.x is still scary and being able to
>>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>> > bridge release effort has been very good at making it possible for people
>>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>>> > potential performance degradation at large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>>> > 2.x, surely not everyone is going to want that (at least not immediately).
>>> > The 2.10 documentation states, "The JVM requirements will not change across
>>> > point releases within the same minor release except if the JVM version
>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>> > release until Java 8 becomes unsupported (though one could argue that it is
>>> > already unsupported since Oracle is no longer giving public Java 8
>>> > updates).
>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>> > catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether or
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> > not is committers not being on the same page with where they should commit
>>> > their patches.
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1]
>>> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2]
>>> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>> > I worry that some committers may want to put new features into a branch 2
>>> > release, and without a branch-2, they will go directly into 2.10. Since we
>>> > don't always catch corner cases or performance problems for some time
>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>> > cluster), it may be very difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>>> > do have these reservations.
>>> >
>>> > Thanks,
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10 release the
>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in
>>> > the picture right now, and many people may object to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>> > wrote:
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>>> > outside of our company who expressed interest in getting new features to
>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>>> > branch-2.10, so it's already diverged quite a bit.
>>> >
>>> > In any case, we can always reverse this decision if we really need to, by
>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>> > IMO.
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>> > moving to 2.10 will take some time for the community. It seems premature to
>>> > make a decision at this point that there will never be a need for a 2.11
>>> > release.
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> > Hi folks,
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>>> > release line in branch-2. Currently, the main issue is that there's many
>>> > fixes going into branch-2 (the theoretical 2.11.0) that's not going into
>>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>> > likely never see the light of day unless they are backported to branch-2.10.
>>> >
>>> > To do this, I propose we:
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> > Thoughts?
>>> >
>>> > Jonathan Hung
>>> >
>>> > [1]
>>> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing some changes are being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanx for catching, I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > Seems there was an INFRA ticket for that. INFRA-19581,
>>> > But the INFRA people closed as wont do and yes, the branch is
>>> protected,
>>> > we can’t delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> remove
>>> > it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1
>>> ->
>>> >
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> please
>>> >
>>> > don't try to commit to it)
>>> >
>>> >
>>> > Completed procedure:
>>> >
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >
>>> >   - Delete old branch-2.10
>>> >
>>> >   - Rename branch-2 to (new) branch-2.10
>>> >
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>>> >
>>> > shv.hadoop@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Hey guys,
>>> >
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> >
>>> > which is removing branch-2.10, and changing the version of branch-2
>>> from
>>> >
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> >
>>> > Sounds like the subject line for this thread "Making 2.10 the last
>>> minor
>>> >
>>> > 2.x release" confused people.
>>> >
>>> > It is in fact a wider matter that can be discussed when somebody
>>> >
>>> > actually
>>> >
>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>> >
>>> >
>>> > So if anybody objects removing branch-2.10 please make an argument.
>>> >
>>> > Otherwise we should go ahead and just do it next week.
>>> >
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> >
>>> > releases. As for putting features into minor/patch releases, if we
>>> >
>>> > keep the
>>> >
>>> > convention of putting new features only into minor releases, my
>>> >
>>> > assumption
>>> >
>>> > is still that it's unlikely people will want to get them into branch-2
>>> >
>>> > (based on the 2.10.0 release process). For the java 11 issue, we
>>> >
>>> > haven't
>>> >
>>> > even really removed support for java 7 in branch-2 (much less java 8),
>>> >
>>> > so I
>>> >
>>> > feel moving to java 11 would go along with a move to branch 3. And as
>>> >
>>> > you
>>> >
>>> > mentioned, if people really want to use java 11 on branch-2, we can
>>> >
>>> > always
>>> >
>>> > revive branch-2. But for now I think the convenience of not needing to
>>> >
>>> > port
>>> >
>>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> >
>>> > potentially needing to revive branch-2.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> >
>>> > +1 for 2.10.x as last release for 2.x version.
>>> >
>>> >
>>> > Software would become more compatible when more companies stress test
>>> >
>>> > the same software and making improvements in trunk.  Some may be extra
>>> >
>>> > caution on moving up the version because obligation internally to keep
>>> >
>>> > things running.  Company obligation should not be the driving force to
>>> >
>>> > maintain Hadoop branches.  There is no proper collaboration in the
>>> >
>>> > community when every name brand company maintains its own Hadoop 2.x
>>> >
>>> > version.  I think it would be more healthy for the community to
>>> >
>>> > reduce the
>>> >
>>> > branch forking and spend energy on trunk to harden the software.
>>> >
>>> > This will
>>> >
>>> > give more confidence to move up the version than trying to fix n
>>> >
>>> > permutations breakage like Flash fixing the timeline.
>>> >
>>> >
>>> > Apache license stated, there is no warranty of any kind for code
>>> >
>>> > contributions.  Fewer community release process should improve
>>> >
>>> > software
>>> >
>>> > quality when eyes are on trunk, and help steering toward the same end
>>> >
>>> > goals.
>>> >
>>> >
>>> > regards,
>>> >
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> >
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> >
>>> > Hello all,
>>> >
>>> >
>>> > Is it written anywhere what the difference is between a minor release
>>> >
>>> > and a
>>> >
>>> > point/dot/maintenance (I'll use "point" from here on out) release? I
>>> >
>>> > have
>>> >
>>> > looked around and I can't find anything other than some compatibility
>>> >
>>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>> >
>>> > think
>>> >
>>> > this would help shape my opinion on whether or not to keep branch-2
>>> >
>>> > alive.
>>> >
>>> > My current understanding is that we can't really break compatibility
>>> >
>>> > in
>>> >
>>> > either a minor or point release. But the only mention of the
>>> >
>>> > difference
>>> >
>>> > between minor and point releases is how to deal with Stable,
>>> >
>>> > Evolving,
>>> >
>>> > and
>>> >
>>> > Unstable tags, and how to deal with changing default configuration
>>> >
>>> > values.
>>> >
>>> > So it seems like there really isn't a big official difference between
>>> >
>>> > the
>>> >
>>> > two. In my mind, the functional difference between the two is that
>>> >
>>> > the
>>> >
>>> > minor releases may have added features and rewrites, while the point
>>> >
>>> > releases only have bug fixes. This might be an incorrect
>>> >
>>> > understanding, but
>>> >
>>> > that's what I have gathered from watching the releases over the last
>>> >
>>> > few
>>> >
>>> > years. Whether or not this is a correct understanding, I think that
>>> >
>>> > this
>>> >
>>> > needs to be documented somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are
>>> >
>>> > the
>>> >
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> >
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> >
>>> > - Features/rewrites/higher-risk patches are less likely to be put
>>> >
>>> > into
>>> >
>>> > 2.10.x
>>> >
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > Cons:
>>> >
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> >
>>> > - An extra branch to maintain
>>> >
>>> >  - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >
>>> > patches to if they should go all the way back to 2.10.x
>>> >
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> >
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes
>>> >
>>> > being
>>> >
>>> > committed. In a perfect world, we wouldn't have to make this
>>> >
>>> > tradeoff.
>>> >
>>> > But
>>> >
>>> > we don't live in a perfect world and committers will make mistakes
>>> >
>>> > either
>>> >
>>> > because of lack of knowledge or simply because they made a mistake.
>>> >
>>> > If
>>> >
>>> > we
>>> >
>>> > have a branch-2, committers will forget, not know to, or choose not
>>> >
>>> > to
>>> >
>>> > (for
>>> >
>>> > whatever reason) commit valid bug fixes back all the way to
>>> >
>>> > branch-2.10. If
>>> >
>>> > we don't have a branch-2, committers who want their borderline risky
>>> >
>>> > feature in the 2.x line will err on the side of putting it into
>>> >
>>> > branch-2.10
>>> >
>>> > instead of proposing the creation of a branch-2. Clearly I have made
>>> >
>>> > quite
>>> >
>>> > a few assumptions here based on my own experiences, so I would like
>>> >
>>> > to
>>> >
>>> > hear
>>> >
>>> > if others have similar or opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for
>>> >
>>> > killing
>>> >
>>> > branch-2 is due to an effort to push the community towards 3.x. This
>>> >
>>> > is why
>>> >
>>> > I have added movement to 3.x as both a pro and a con. As a community
>>> >
>>> > trying
>>> >
>>> > to move forward, keeping as many companies on similar branches as
>>> >
>>> > possible
>>> >
>>> > is a good way to make sure the code is well-tested. However, from a
>>> >
>>> > stability point of view, moving to 3.x is still scary and being able
>>> >
>>> > to
>>> >
>>> > stay on 2.x until you are comfortable to move is very nice. The
>>> >
>>> > 2.10.0
>>> >
>>> > bridge release effort has been very good at making it possible for
>>> >
>>> > people
>>> >
>>> > to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>> >
>>> > that
>>> >
>>> > it is reasonable for companies to want to be extra cautious with 3.x
>>> >
>>> > due to
>>> >
>>> > potential performance degradation at large scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> >
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> >
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>> >
>>> > support to
>>> >
>>> > 2.x, surely not everyone is going to want that (at least not
>>> >
>>> > immediately).
>>> >
>>> > The 2.10 documentation states, "The JVM requirements will not change
>>> >
>>> > across
>>> >
>>> > point releases within the same minor release except if the JVM
>>> >
>>> > version
>>> >
>>> > under question becomes unsupported" [1], so this would warrant a 2.11
>>> >
>>> > release until Java 8 becomes unsupported (though one could argue that
>>> >
>>> > it is
>>> >
>>> > already unsupported since Oracle is no longer giving public Java 8
>>> >
>>> > update).
>>> >
>>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>>> >
>>> > catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on
>>> >
>>> > whether
>>> >
>>> > or
>>> >
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> >
>>> > weighing in my mind. For me, the bigger problem beyond having
>>> >
>>> > branch-2
>>> >
>>> > or
>>> >
>>> > not is committers not being on the same page with where they should
>>> >
>>> > commit
>>> >
>>> > their patches.
>>> >
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1]
>>> >
>>> >
>>> >
>>> >
>>> >
>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> > [2]
>>> >
>>> >
>>> >
>>> >
>>> >
>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>> >
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Hi Konstantin,
>>> >
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about
>>> >
>>> > the
>>> >
>>> > stability of 2.10, since we will be on it for a couple of years at
>>> >
>>> > least.
>>> >
>>> > I worry
>>> >
>>> > that some committers may want to put new features into a branch 2
>>> >
>>> > release,
>>> >
>>> > and without a branch-2, they will go directly into 2.10. Since we
>>> >
>>> > don't
>>> >
>>> > always
>>> >
>>> > catch corner cases or performance problems for some time (usually
>>> >
>>> > not
>>> >
>>> > until
>>> >
>>> > the release is deployed to a busy, 4-thousand node cluster), it
>>> >
>>> > may
>>> >
>>> > be
>>> >
>>> > very
>>> >
>>> > difficult to back out those changes.
>>> >
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the
>>> >
>>> > idea,
>>> >
>>> > but I
>>> >
>>> > do
>>> >
>>> > have these reservations.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> >
>>> > <
>>> >
>>> > shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> >
>>> > We had a long discussion on this list regarding making the 2.10
>>> >
>>> > release the
>>> >
>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>> >
>>> > between
>>> >
>>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is
>>> >
>>> > not in
>>> >
>>> > the picture right now, and many people may object this idea.
>>> >
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> >
>>> > 1. eliminate confusion which branches people should commit their
>>> >
>>> > back-ports
>>> >
>>> > to
>>> >
>>> > 2. save engineering effort committing to more branches than
>>> >
>>> > necessary
>>> >
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide
>>> >
>>> > to
>>> >
>>> > release 2.11 we can resurrect the branch.
>>> >
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>>> >
>>> > jyhung2357@gmail.com
>>> >
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel
>>> >
>>> > the
>>> >
>>> > pros
>>> >
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>> >
>>> > are
>>> >
>>> > much
>>> >
>>> > higher than a new 2.11 minor release. (There didn't seem to be
>>> >
>>> > many
>>> >
>>> > people
>>> >
>>> > outside of our company who expressed interest in getting new
>>> >
>>> > features to
>>> >
>>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>> >
>>> > after
>>> >
>>> > 2.10.0
>>> >
>>> > release, there's 29 patches that have gone into branch-2 and 9 in
>>> >
>>> > branch-2.10, so it's already diverged quite a bit.
>>> >
>>> >
>>> > In any case, we can always reverse this decision if we really
>>> >
>>> > need
>>> >
>>> > to, by
>>> >
>>> > recreating branch-2. But this proposal would reduce a lot of
>>> >
>>> > confusion
>>> >
>>> > IMO.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>> >
>>> > epayne@apache.org>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>> > moving to 2.10 will take some time for the community. It seems premature
>>> > to make a decision at this point that there will never be a need for a
>>> > 2.11 release.
>>> >
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> >
>>> > Hi folks,
>>> >
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> > minor release line in branch-2. Currently, the main issue is that there
>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>> > branch-2 will likely never see the light of day unless they are
>>> > backported to branch-2.10.
>>> >
>>> >
>>> > To do this, I propose we:
>>> >
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> >
>>> > Thoughts?
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> >
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> >
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Hi folks,

I am still seeing changes being committed to branch-2.
I'd like to delete the source code from branch-2 to avoid mistakes.
https://issues.apache.org/jira/browse/HADOOP-16988

-Akira
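
One way to spot commits that landed on branch-2 by mistake is to compare the
JIRA keys found in commit subjects on branch-2 against those on branch-2.10.
The following is only a minimal Python sketch and is not taken from this
thread: it assumes every commit subject starts with a JIRA key, and the
function names and sample subjects are hypothetical.

```python
import re

# Assumption: commit subjects start with a JIRA key, for example
# "HADOOP-16988. Remove source code from branch-2."
JIRA_KEY = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Return the set of JIRA keys found in a list of commit subjects."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.search(subject)
        if match:
            keys.add(match.group(0))
    return keys


def only_on(stray_subjects, target_subjects):
    """Return JIRA keys present on the stray branch but not on the target."""
    return sorted(jira_keys(stray_subjects) - jira_keys(target_subjects))


# Hypothetical subjects, e.g. collected with `git log --format=%s <branch>`.
branch_2 = [
    "HDFS-10001. Example fix A.",
    "YARN-20002. Example fix B.",
]
branch_2_10 = [
    "HDFS-10001. Example fix A.",
]

# Keys listed here would still need a backport to branch-2.10.
print(only_on(branch_2, branch_2_10))
```

Matching on JIRA keys rather than commit hashes is deliberate in this sketch,
since a cherry-picked commit gets a different hash on each branch.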

On Wed, Jan 1, 2020 at 2:38 AM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Jim,
> Thanks for catching that. I have configured the build to run on branch-2.10.
>
> -Ayush
>
> On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
> wrote:
>
>> It looks like QBT tests are still being run on branch-2 (
>> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
>> and they are not very helpful at this point.
>> Can we change the QBT tests to run against branch-2.10 instead?
>>
>> Jim
>>
>> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org>
>> wrote:
>>
>>> Thank you, Ayush.
>>>
>>> I understand we should keep branch-2 as is, as well as master.
>>>
>>> -Akira
>>>
>>>
>>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>>
>>> > Hi Akira
>>> > Seems there was an INFRA ticket for that: INFRA-19581.
>>> > But the INFRA people closed it as "Won't Do", and yes, the branch is
>>> > protected, so we can't delete it directly.
>>> >
>>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>>> >
>>> > -Ayush
>>> >
>>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>>> >
>>> > Thank you for your work, Jonathan.
>>> >
>>> > I found branch-2 has been unintentionally pushed again. Would you
>>> > remove it?
>>> > I think the branch should be protected if possible.
>>> >
>>> > -Akira
>>> >
>>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>>> > please don't try to commit to it)
>>> >
>>> >
>>> > Completed procedure:
>>> >
>>> >
>>> >   - Verified everything in old branch-2.10 was in old branch-2
>>> >   - Deleted old branch-2.10
>>> >   - Renamed branch-2 to (new) branch-2.10
>>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>>> >
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > FYI, starting the rename process, beginning with INFRA-19521.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> >
>>> > Hey guys,
>>> >
>>> >
>>> > I think we diverged a bit from the initial topic of this discussion,
>>> > which is removing branch-2.10, and changing the version of branch-2 from
>>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>>> > 2.x release" confused people.
>>> > It is in fact a wider matter that can be discussed when somebody actually
>>> > proposes to release 2.11, which I understand nobody does at the moment.
>>> >
>>> >
>>> > So if anybody objects to removing branch-2.10 please make an argument.
>>> > Otherwise we should go ahead and just do it next week.
>>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>> >
>>> > wrote:
>>> >
>>> >
>>> > Thanks for the detailed thoughts, everyone.
>>> >
>>> >
>>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> > releases. As for putting features into minor/patch releases, if we keep
>>> > the convention of putting new features only into minor releases, my
>>> > assumption is still that it's unlikely people will want to get them into
>>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue, we
>>> > haven't even really removed support for java 7 in branch-2 (much less
>>> > java 8), so I feel moving to java 11 would go along with a move to
>>> > branch 3. And as you mentioned, if people really want to use java 11 on
>>> > branch-2, we can always revive branch-2. But for now I think the
>>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>>> > below) outweighs the cost of potentially needing to revive branch-2.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>> >
>>> >
>>> > +1 for 2.10.x as last release for 2.x version.
>>> >
>>> >
>>> > Software becomes more compatible when more companies stress test the
>>> > same software and make improvements in trunk. Some may be extra cautious
>>> > about moving up the version because of internal obligations to keep
>>> > things running. Company obligations should not be the driving force for
>>> > maintaining Hadoop branches. There is no proper collaboration in the
>>> > community when every name-brand company maintains its own Hadoop 2.x
>>> > version. I think it would be healthier for the community to reduce
>>> > branch forking and spend energy on trunk to harden the software. This
>>> > will give more confidence to move up the version than trying to fix n
>>> > permutations of breakage, like the Flash fixing the timeline.
>>> >
>>> >
>>> > The Apache license states that there is no warranty of any kind for code
>>> > contributions. Fewer community release processes should improve software
>>> > quality when eyes are on trunk, and help steer toward the same end goals.
>>> >
>>> >
>>> > regards,
>>> >
>>> > Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> >
>>> > <eb...@verizonmedia.com.invalid> wrote:
>>> >
>>> >
>>> > Hello all,
>>> >
>>> >
>>> > Is it written anywhere what the difference is between a minor release and
>>> > a point/dot/maintenance (I'll use "point" from here on out) release? I
>>> > have looked around and I can't find anything other than some
>>> > compatibility documentation in 2.x that has since been removed in 3.x
>>> > [1] [2]. I think this would help shape my opinion on whether or not to
>>> > keep branch-2 alive. My current understanding is that we can't really
>>> > break compatibility in either a minor or point release. But the only
>>> > mention of the difference between minor and point releases is how to
>>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>>> > changing default configuration values. So it seems like there really
>>> > isn't a big official difference between the two. In my mind, the
>>> > functional difference between the two is that the minor releases may
>>> > have added features and rewrites, while the point releases only have bug
>>> > fixes. This might be an incorrect understanding, but that's what I have
>>> > gathered from watching the releases over the last few years. Whether or
>>> > not this is a correct understanding, I think that this needs to be
>>> > documented somewhere, even if it is just a convention.
>>> >
>>> >
>>> > Given my assumed understanding of minor vs point releases, here are the
>>> > pros/cons that I can think of for having a branch-2. Please add on or
>>> > correct me for anything you feel is missing or inadequate.
>>> >
>>> > Pros:
>>> > - Features/rewrites/higher-risk patches are less likely to be put into
>>> >   2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> > Cons:
>>> > - Bug fixes are less likely to be put into 2.10.x
>>> > - An extra branch to maintain
>>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> >     patches to if they should go all the way back to 2.10.x
>>> > - It is less necessary to move to 3.x
>>> >
>>> >
>>> > So on the one hand you get added stability in fewer features being
>>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> > But we don't live in a perfect world and committers will make mistakes
>>> > either because of lack of knowledge or simply because they made a
>>> > mistake. If we have a branch-2, committers will forget, not know to, or
>>> > choose not to (for whatever reason) commit valid bug fixes back all the
>>> > way to branch-2.10. If we don't have a branch-2, committers who want
>>> > their borderline risky feature in the 2.x line will err on the side of
>>> > putting it into branch-2.10 instead of proposing the creation of a
>>> > branch-2. Clearly I have made quite a few assumptions here based on my
>>> > own experiences, so I would like to hear if others have similar or
>>> > opposing views.
>>> >
>>> >
>>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> > branch-2 is due to an effort to push the community towards 3.x. This is
>>> > why I have added movement to 3.x as both a pro and a con. As a community
>>> > trying to move forward, keeping as many companies on similar branches as
>>> > possible is a good way to make sure the code is well-tested. However,
>>> > from a stability point of view, moving to 3.x is still scary and being
>>> > able to stay on 2.x until you are comfortable to move is very nice. The
>>> > 2.10.0 bridge release effort has been very good at making it possible
>>> > for people to move from 2.x to 3.x, but the diff between 2.x and 3.x is
>>> > so large that it is reasonable for companies to want to be extra
>>> > cautious with 3.x due to potential performance degradation at large
>>> > scale.
>>> >
>>> >
>>> > A question I'm pondering is what happens when we move to Java 11 and
>>> > someone is still on 2.x? If they want to backport HADOOP-15338
>>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>> > to 2.x, surely not everyone is going to want that (at least not
>>> > immediately). The 2.10 documentation states, "The JVM requirements will
>>> > not change across point releases within the same minor release except if
>>> > the JVM version under question becomes unsupported" [1], so this would
>>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>>> > could argue that it is already unsupported since Oracle is no longer
>>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>>> >
>>> >
>>> > Not sure if this really leads to any sort of answer from me on whether or
>>> > not we should keep branch-2 alive, but these are the things that I am
>>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>>> > or not is committers not being on the same page with where they should
>>> > commit their patches.
>>> >
>>> >
>>> > Eric
>>> >
>>> >
>>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> >
>>> >
>>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> >
>>> > Hi Konstantin,
>>> >
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at least.
>>> > I worry that some committers may want to put new features into a branch 2
>>> > release, and without a branch-2, they will go directly into 2.10. Since
>>> > we don't always catch corner cases or performance problems for some time
>>> > (usually not until the release is deployed to a busy, 4-thousand node
>>> > cluster), it may be very difficult to back out those changes.
>>> >
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but
>>> > I do have these reservations.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>>> > <shv.hadoop@gmail.com> wrote:
>>> >
>>> > Hi Eric,
>>> >
>>> >
>>> > We had a long discussion on this list regarding making the 2.10 release
>>> > the last of branch-2 releases. We intended 2.10 as a bridge release
>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>>> > not in the picture right now, and many people may object to this idea.
>>> >
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion about which branches people should commit their
>>> >    back-ports to
>>> > 2. save the engineering effort of committing to more branches than
>>> >    necessary
>>> >
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > --Konstantin
>>> >
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>> > wrote:
>>> >
>>> >
>>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>>> > higher than those of a new 2.11 minor release. (There didn't seem to be
>>> > many people outside of our company who expressed interest in getting new
>>> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>>> > after the 2.10.0 release, there are 29 patches that have gone into
>>> > branch-2 and 9 in branch-2.10, so the branches have already diverged
>>> > quite a bit.
>>> >
>>> >
>>> > In any case, we can always reverse this decision if we really need to, by
>>> > recreating branch-2. But this proposal would reduce a lot of confusion
>>> > IMO.
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>> > wrote:
>>> >
>>> >
>>> > Thanks Jonathan for opening the discussion.
>>> >
>>> >
>>> > I am not in favor of this proposal. 2.10 was very recently released, and
>>> > moving to 2.10 will take some time for the community. It seems premature
>>> > to make a decision at this point that there will never be a need for a
>>> > 2.11 release.
>>> >
>>> >
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>>> > <jyhung2357@gmail.com> wrote:
>>> >
>>> >
>>> > Hi folks,
>>> >
>>> >
>>> > Given the release of 2.10.0, and the fact that it's intended to be a
>>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> > minor release line in branch-2. Currently, the main issue is that there
>>> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
>>> > going into branch-2.10 (which will become 2.10.1), so the fixes in
>>> > branch-2 will likely never see the light of day unless they are
>>> > backported to branch-2.10.
>>> >
>>> >
>>> > To do this, I propose we:
>>> >
>>> >
>>> > - Delete branch-2.10
>>> > - Rename branch-2 to branch-2.10
>>> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> >
>>> >
>>> > This way we get all the current branch-2 fixes into the 2.10.x release
>>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> >
>>> >
>>> > Thoughts?
>>> >
>>> >
>>> > Jonathan Hung
>>> >
>>> >
>>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching that. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira
>> > Seems there was an INFRA ticket for that: INFRA-19581.
>> > But the INFRA people closed it as "Won't Do", and yes, the branch is
>> > protected, so we can't delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
>> > please don't try to commit to it)
>> >
>> >
>> > Completed procedure:
>> >
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >   - Deleted old branch-2.10
>> >   - Renamed branch-2 to (new) branch-2.10
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> >
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> >
>> > wrote:
>> >
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> >
>> > Hey guys,
>> >
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> > Sounds like the subject line for this thread "Making 2.10 the last minor
>> > 2.x release" confused people.
>> > It is in fact a wider matter that can be discussed when somebody actually
>> > proposes to release 2.11, which I understand nobody does at the moment.
>> >
>> >
>> > So if anybody objects to removing branch-2.10 please make an argument.
>> > Otherwise we should go ahead and just do it next week.
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> >
>> > Thanks,
>> >
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> >
>> > wrote:
>> >
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> > releases. As for putting features into minor/patch releases, if we keep
>> > the convention of putting new features only into minor releases, my
>> > assumption is still that it's unlikely people will want to get them into
>> > branch-2 (based on the 2.10.0 release process). For the java 11 issue, we
>> > haven't even really removed support for java 7 in branch-2 (much less
>> > java 8), so I feel moving to java 11 would go along with a move to
>> > branch 3. And as you mentioned, if people really want to use java 11 on
>> > branch-2, we can always revive branch-2. But for now I think the
>> > convenience of not needing to port to both branch-2 and branch-2.10 (and
>> > below) outweighs the cost of potentially needing to revive branch-2.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> >
>> > +1 for 2.10.x as last release for 2.x version.
>> >
>> >
>> > Software becomes more compatible when more companies stress test the
>> > same software and make improvements in trunk. Some may be extra cautious
>> > about moving up the version because of internal obligations to keep
>> > things running. Company obligations should not be the driving force for
>> > maintaining Hadoop branches. There is no proper collaboration in the
>> > community when every name-brand company maintains its own Hadoop 2.x
>> > version. I think it would be healthier for the community to reduce
>> > branch forking and spend energy on trunk to harden the software. This
>> > will give more confidence to move up the version than trying to fix n
>> > permutations of breakage, like the Flash fixing the timeline.
>> >
>> >
>> > The Apache license states that there is no warranty of any kind for code
>> > contributions. Fewer community release processes should improve software
>> > quality when eyes are on trunk, and help steer toward the same end goals.
>> >
>> >
>> > regards,
>> >
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> >
>> > <eb...@verizonmedia.com.invalid> wrote:
>> >
>> >
>> > Hello all,
>> >
>> >
>> > Is it written anywhere what the difference is between a minor release and
>> > a point/dot/maintenance (I'll use "point" from here on out) release? I
>> > have looked around and I can't find anything other than some
>> > compatibility documentation in 2.x that has since been removed in 3.x
>> > [1] [2]. I think this would help shape my opinion on whether or not to
>> > keep branch-2 alive. My current understanding is that we can't really
>> > break compatibility in either a minor or point release. But the only
>> > mention of the difference between minor and point releases is how to
>> > deal with Stable, Evolving, and Unstable tags, and how to deal with
>> > changing default configuration values. So it seems like there really
>> > isn't a big official difference between the two. In my mind, the
>> > functional difference between the two is that the minor releases may
>> > have added features and rewrites, while the point releases only have bug
>> > fixes. This might be an incorrect understanding, but that's what I have
>> > gathered from watching the releases over the last few years. Whether or
>> > not this is a correct understanding, I think that this needs to be
>> > documented somewhere, even if it is just a convention.
>> >
>> >
>> > Given my assumed understanding of minor vs point releases, here are the
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> > - Features/rewrites/higher-risk patches are less likely to be put into
>> >   2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> > Cons:
>> > - Bug fixes are less likely to be put into 2.10.x
>> > - An extra branch to maintain
>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>> >     patches to if they should go all the way back to 2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > So on the one hand you get added stability in fewer features being
>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>> > committed. In a perfect world, we wouldn't have to make this tradeoff.
>> > But we don't live in a perfect world and committers will make mistakes
>> > either because of lack of knowledge or simply because they made a
>> > mistake. If we have a branch-2, committers will forget, not know to, or
>> > choose not to (for whatever reason) commit valid bug fixes back all the
>> > way to branch-2.10. If we don't have a branch-2, committers who want
>> > their borderline risky feature in the 2.x line will err on the side of
>> > putting it into branch-2.10 instead of proposing the creation of a
>> > branch-2. Clearly I have made quite a few assumptions here based on my
>> > own experiences, so I would like to hear if others have similar or
>> > opposing views.
>> >
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>> > branch-2 is due to an effort to push the community towards 3.x. This is
>> > why I have added movement to 3.x as both a pro and a con. As a community
>> > trying to move forward, keeping as many companies on similar branches as
>> > possible is a good way to make sure the code is well-tested. However,
>> > from a stability point of view, moving to 3.x is still scary and being
>> > able to stay on 2.x until you are comfortable to move is very nice. The
>> > 2.10.0 bridge release effort has been very good at making it possible
>> > for people to move from 2.x to 3.x, but the diff between 2.x and 3.x is
>> > so large that it is reasonable for companies to want to be extra
>> > cautious with 3.x due to potential performance degradation at large
>> > scale.
>> >
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>> > to 2.x, surely not everyone is going to want that (at least not
>> > immediately). The 2.10 documentation states, "The JVM requirements will
>> > not change across point releases within the same minor release except if
>> > the JVM version under question becomes unsupported" [1], so this would
>> > warrant a 2.11 release until Java 8 becomes unsupported (though one
>> > could argue that it is already unsupported since Oracle is no longer
>> > giving public Java 8 updates). If we don't keep branch-2 around now,
>> > would a Java 11 backport be the catalyst for a branch-2 revival?
>> >
>> >
>> > Not sure if this really leads to any sort of answer from me on whether or
>> > not we should keep branch-2 alive, but these are the things that I am
>> > weighing in my mind. For me, the bigger problem beyond having branch-2
>> > or not is committers not being on the same page with where they should
>> > commit their patches.
>> >
>> >
>> > Eric
>> >
>> >
>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> >
>> > Hi Konstantin,
>> >
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about
>> >
>> > the
>> >
>> > stability of 2.10, since we will be on it for a couple of years at
>> >
>> > least.
>> >
>> > I worry
>> >
>> > that some committers may want to put new features into a branch 2
>> >
>> > release,
>> >
>> > and without a branch-2, they will go directly into 2.10. Since we
>> >
>> > don't
>> >
>> > always
>> >
>> > catch corner cases or performance problems for some time (usually
>> >
>> > not
>> >
>> > until
>> >
>> > the release is deployed to a busy, 4-thousand node cluster), it
>> >
>> > may
>> >
>> > be
>> >
>> > very
>> >
>> > difficult to back out those changes.
>> >
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the
>> >
>> > idea,
>> >
>> > but I
>> >
>> > do
>> >
>> > have these reservations.
>> >
>> >
>> > Thanks,
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release the
>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>> > the picture right now, and many people may object to this idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion about which branches people should commit their
>> >    back-ports to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>> > outside of our company who expressed interest in getting new features to
>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>> > branch-2.10, so it's already diverged quite a bit.
>> >
>> > In any case, we can always reverse this decision if we really need to, by
>> > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org> wrote:
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> > I am not in favor of this proposal. 2.10 was very recently released, and
>> > moving to 2.10 will take some time for the community. It seems premature to
>> > make a decision at this point that there will never be a need for a 2.11
>> > release.
>> >
>> > -Eric
>> >
>> >
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2357@gmail.com> wrote:
>> >
>> > Hi folks,
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>> > release line in branch-2. Currently, the main issue is that there's many
>> > fixes going into branch-2 (the theoretical 2.11.0) that's not going into
>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>> > likely never see the light of day unless they are backported to branch-2.10.
>> >
>> > To do this, I propose we:
>> >
>> >    - Delete branch-2.10
>> >    - Rename branch-2 to branch-2.10
>> >    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x release
>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> > Thoughts?
>> >
>> > Jonathan Hung
>> >
>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching this. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira,
>> > It seems there was an INFRA ticket for that: INFRA-19581.
>> > But the INFRA people closed it as "won't do", and yes, the branch is protected,
>> > so we can’t delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> > don't try to commit to it).
>> >
>> > Completed procedure:
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >   - Deleted old branch-2.10
>> >   - Renamed branch-2 to (new) branch-2.10
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> >
>> >
>> >
>> > Jonathan Hung
>> >
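The first step of the completed procedure (checking that everything in the old branch-2.10 was also in the old branch-2) can be approximated by comparing JIRA keys in commit subject lines, since cherry-picked commits do not share hashes across branches. What follows is only a minimal sketch and not the check that was actually run: the helper names and the sample subject lines are hypothetical, and it assumes every commit subject mentions its JIRA key.

```python
import re

# Matches Hadoop-style JIRA keys such as HADOOP-15338 or YARN-1234.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the JIRA keys mentioned in an iterable of commit subject lines."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def missing_from(superset_subjects, subset_subjects):
    """Return keys found in subset_subjects but absent from superset_subjects."""
    return sorted(jira_keys(subset_subjects) - jira_keys(superset_subjects))


# Hypothetical subject lines standing in for each branch's commits.
branch_2 = ["HADOOP-1. Fix A.", "YARN-2. Fix B.", "HDFS-3. Fix C."]
branch_2_10 = ["HADOOP-1. Fix A.", "HDFS-4. Fix D."]

# Any key listed here would need a backport before the rename.
print(missing_from(branch_2, branch_2_10))
```

In practice the two lists could be produced with `git log --format=%s` over each branch since the 2.10.0 release; an empty result suggests the old branch-2.10 is a subset of the old branch-2.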
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>> >
>> > Hey guys,
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> > It sounds like the subject line for this thread, "Making 2.10 the last minor
>> > 2.x release", confused people.
>> > It is in fact a wider matter that can be discussed when somebody actually
>> > proposes to release 2.11, which I understand nobody does at the moment.
>> >
>> > So if anybody objects to removing branch-2.10, please make an argument.
>> > Otherwise we should go ahead and just do it next week.
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> > releases. As for putting features into minor/patch releases, if we keep the
>> > convention of putting new features only into minor releases, my assumption
>> > is still that it's unlikely people will want to get them into branch-2
>> > (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>> > even really removed support for Java 7 in branch-2 (much less Java 8), so I
>> > feel moving to Java 11 would go along with a move to branch 3. And as you
>> > mentioned, if people really want to use Java 11 on branch-2, we can always
>> > revive branch-2. But for now I think the convenience of not needing to port
>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> > potentially needing to revive branch-2.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> >
>> > +1 for 2.10.x as the last release for the 2.x version.
>> >
>> > Software becomes more compatible when more companies stress test the same
>> > software and make improvements in trunk. Some may be extra cautious about
>> > moving up the version because of internal obligations to keep things
>> > running. Company obligations should not be the driving force to maintain
>> > Hadoop branches. There is no proper collaboration in the community when
>> > every name-brand company maintains its own Hadoop 2.x version. I think it
>> > would be healthier for the community to reduce the branch forking and spend
>> > energy on trunk to harden the software. This will give more confidence to
>> > move up the version than trying to fix n permutations of breakage, like the
>> > Flash fixing the timeline.
>> >
>> > The Apache license states that there is no warranty of any kind for code
>> > contributions. Fewer community release processes should improve software
>> > quality when eyes are on trunk, and help steer toward the same end goals.
>> >
>> > regards,
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
>> >
>> > Hello all,
>> >
>> > Is it written anywhere what the difference is between a minor release and a
>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>> > looked around and I can't find anything other than some compatibility
>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>> > My current understanding is that we can't really break compatibility in
>> > either a minor or point release. But the only mention of the difference
>> > between minor and point releases is how to deal with Stable, Evolving, and
>> > Unstable tags, and how to deal with changing default configuration values.
>> > So it seems like there really isn't a big official difference between the
>> > two. In my mind, the functional difference between the two is that the
>> > minor releases may have added features and rewrites, while the point
>> > releases only have bug fixes. This might be an incorrect understanding, but
>> > that's what I have gathered from watching the releases over the last few
>> > years. Whether or not this is a correct understanding, I think that this
>> > needs to be documented somewhere, even if it is just a convention.
>> >
>> >
>> > Given my assumed understanding of minor vs point releases, here are the
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> > - Features/rewrites/higher-risk patches are less likely to be put into
>> >   2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> > Cons:
>> > - Bug fixes are less likely to be put into 2.10.x
>> > - An extra branch to maintain
>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>> >     patches to if they should go all the way back to 2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > So on the one hand you get added stability in fewer features being
>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>> > we don't live in a perfect world and committers will make mistakes either
>> > because of lack of knowledge or simply because they made a mistake. If we
>> > have a branch-2, committers will forget, not know to, or choose not to (for
>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>> > we don't have a branch-2, committers who want their borderline risky
>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>> > a few assumptions here based on my own experiences, so I would like to hear
>> > if others have similar or opposing views.
>> >
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>> > I have added movement to 3.x as both a pro and a con. As a community trying
>> > to move forward, keeping as many companies on similar branches as possible
>> > is a good way to make sure the code is well-tested. However, from a
>> > stability point of view, moving to 3.x is still scary and being able to
>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>> > bridge release effort has been very good at making it possible for people
>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>> > potential performance degradation at large scale.
>> >
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
>> > 2.x, surely not everyone is going to want that (at least not immediately).
>> > The 2.10 documentation states, "The JVM requirements will not change across
>> > point releases within the same minor release except if the JVM version
>> > under question becomes unsupported" [1], so this would warrant a 2.11
>> > release until Java 8 becomes unsupported (though one could argue that it is
>> > already unsupported since Oracle is no longer giving public Java 8 updates).
>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>> > catalyst for a branch-2 revival?
>> >
>> >
>> > Not sure if this really leads to any sort of answer from me on whether or
>> > not we should keep branch-2 alive, but these are the things that I am
>> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
>> > not is committers not being on the same page with where they should commit
>> > their patches.
>> >
>> > Eric
>> >
>> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org> wrote:
>> >
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about the
>> > stability of 2.10, since we will be on it for a couple of years at least.
>> > I worry that some committers may want to put new features into a branch 2
>> > release, and without a branch-2, they will go directly into 2.10. Since we
>> > don't always catch corner cases or performance problems for some time
>> > (usually not until the release is deployed to a busy, 4-thousand node
>> > cluster), it may be very difficult to back out those changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
>> > do have these reservations.
>> >
>> > Thanks,
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release the
>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
>> > the picture right now, and many people may object to this idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion about which branches people should commit their
>> >    back-ports to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel the pros
>> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
>> > higher than a new 2.11 minor release. (There didn't seem to be many people
>> > outside of our company who expressed interest in getting new features to
>> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
>> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
>> > branch-2.10, so it's already diverged quite a bit.
>> >
>> > In any case, we can always reverse this decision if we really need to, by
>> > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org> wrote:
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> > I am not in favor of this proposal. 2.10 was very recently released, and
>> > moving to 2.10 will take some time for the community. It seems premature to
>> > make a decision at this point that there will never be a need for a 2.11
>> > release.
>> >
>> > -Eric
>> >
>> >
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jyhung2357@gmail.com> wrote:
>> >
>> > Hi folks,
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
>> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
>> > release line in branch-2. Currently, the main issue is that there's many
>> > fixes going into branch-2 (the theoretical 2.11.0) that's not going into
>> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>> > likely never see the light of day unless they are backported to branch-2.10.
>> >
>> > To do this, I propose we:
>> >
>> >    - Delete branch-2.10
>> >    - Rename branch-2 to branch-2.10
>> >    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x release
>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> > Thoughts?
>> >
>> > Jonathan Hung
>> >
>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching this. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim
>
> On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:
>
>> Thank you, Ayush.
>>
>> I understand we should keep branch-2 as is, as well as master.
>>
>> -Akira
>>
>>
>> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>>
>> > Hi Akira,
>> > It seems there was an INFRA ticket for that: INFRA-19581.
>> > But the INFRA people closed it as "won't do", and yes, the branch is protected,
>> > so we can’t delete it directly.
>> >
>> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
>> >
>> > -Ayush
>> >
>> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>> >
>> > Thank you for your work, Jonathan.
>> >
>> > I found branch-2 has been unintentionally pushed again. Would you remove
>> > it?
>> > I think the branch should be protected if possible.
>> >
>> > -Akira
>> >
>> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> > don't try to commit to it).
>> >
>> > Completed procedure:
>> >
>> >   - Verified everything in old branch-2.10 was in old branch-2
>> >   - Deleted old branch-2.10
>> >   - Renamed branch-2 to (new) branch-2.10
>> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>> >   - Renamed fix versions from 2.11.0 to 2.10.1
>> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> >
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>> >
>> > FYI, starting the rename process, beginning with INFRA-19521.
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com> wrote:
>> >
>> > Hey guys,
>> >
>> > I think we diverged a bit from the initial topic of this discussion,
>> > which is removing branch-2.10, and changing the version of branch-2 from
>> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> > It sounds like the subject line for this thread, "Making 2.10 the last minor
>> > 2.x release", confused people.
>> > It is in fact a wider matter that can be discussed when somebody actually
>> > proposes to release 2.11, which I understand nobody does at the moment.
>> >
>> > So if anybody objects to removing branch-2.10, please make an argument.
>> > Otherwise we should go ahead and just do it next week.
>> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>> >
>> > Thanks for the detailed thoughts, everyone.
>> >
>> > Eric (Badger), my understanding is the same as yours re. minor vs patch
>> > releases. As for putting features into minor/patch releases, if we keep the
>> > convention of putting new features only into minor releases, my assumption
>> > is still that it's unlikely people will want to get them into branch-2
>> > (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>> > even really removed support for Java 7 in branch-2 (much less Java 8), so I
>> > feel moving to Java 11 would go along with a move to branch 3. And as you
>> > mentioned, if people really want to use Java 11 on branch-2, we can always
>> > revive branch-2. But for now I think the convenience of not needing to port
>> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> > potentially needing to revive branch-2.
>> >
>> >
>> > Jonathan Hung
>> >
>> >
>> >
>> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>> >
>> >
>> > +1 for 2.10.x as the last release for the 2.x version.
>> >
>> > Software becomes more compatible when more companies stress test the same
>> > software and make improvements in trunk. Some may be extra cautious about
>> > moving up the version because of internal obligations to keep things
>> > running. Company obligations should not be the driving force to maintain
>> > Hadoop branches. There is no proper collaboration in the community when
>> > every name-brand company maintains its own Hadoop 2.x version. I think it
>> > would be healthier for the community to reduce the branch forking and spend
>> > energy on trunk to harden the software. This will give more confidence to
>> > move up the version than trying to fix n permutations of breakage, like the
>> > Flash fixing the timeline.
>> >
>> > The Apache license states that there is no warranty of any kind for code
>> > contributions. Fewer community release processes should improve software
>> > quality when eyes are on trunk, and help steer toward the same end goals.
>> >
>> > regards,
>> > Eric
>> >
>> >
>> >
>> >
>> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger <eb...@verizonmedia.com.invalid> wrote:
>> >
>> > Hello all,
>> >
>> > Is it written anywhere what the difference is between a minor release and a
>> > point/dot/maintenance (I'll use "point" from here on out) release? I have
>> > looked around and I can't find anything other than some compatibility
>> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>> > this would help shape my opinion on whether or not to keep branch-2 alive.
>> > My current understanding is that we can't really break compatibility in
>> > either a minor or point release. But the only mention of the difference
>> > between minor and point releases is how to deal with Stable, Evolving, and
>> > Unstable tags, and how to deal with changing default configuration values.
>> > So it seems like there really isn't a big official difference between the
>> > two. In my mind, the functional difference between the two is that the
>> > minor releases may have added features and rewrites, while the point
>> > releases only have bug fixes. This might be an incorrect understanding, but
>> > that's what I have gathered from watching the releases over the last few
>> > years. Whether or not this is a correct understanding, I think that this
>> > needs to be documented somewhere, even if it is just a convention.
>> >
>> >
>> > Given my assumed understanding of minor vs point releases, here are the
>> > pros/cons that I can think of for having a branch-2. Please add on or
>> > correct me for anything you feel is missing or inadequate.
>> >
>> > Pros:
>> > - Features/rewrites/higher-risk patches are less likely to be put into
>> >   2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> > Cons:
>> > - Bug fixes are less likely to be put into 2.10.x
>> > - An extra branch to maintain
>> >   - Committers have an extra branch (5 vs 4 total branches) to commit
>> >     patches to if they should go all the way back to 2.10.x
>> > - It is less necessary to move to 3.x
>> >
>> >
>> > So on the one hand you get added stability in fewer features being
>> > committed to 2.10.x, but then on the other you get fewer bug fixes being
>> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
>> > we don't live in a perfect world and committers will make mistakes either
>> > because of lack of knowledge or simply because they made a mistake. If we
>> > have a branch-2, committers will forget, not know to, or choose not to (for
>> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
>> > we don't have a branch-2, committers who want their borderline risky
>> > feature in the 2.x line will err on the side of putting it into branch-2.10
>> > instead of proposing the creation of a branch-2. Clearly I have made quite
>> > a few assumptions here based on my own experiences, so I would like to hear
>> > if others have similar or opposing views.
>> >
>> >
>> > As far as 3.x goes, to me it seems like some of the reasoning for killing
>> > branch-2 is due to an effort to push the community towards 3.x. This is why
>> > I have added movement to 3.x as both a pro and a con. As a community trying
>> > to move forward, keeping as many companies on similar branches as possible
>> > is a good way to make sure the code is well-tested. However, from a
>> > stability point of view, moving to 3.x is still scary and being able to
>> > stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>> > bridge release effort has been very good at making it possible for people
>> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
>> > it is reasonable for companies to want to be extra cautious with 3.x due to
>> > potential performance degradation at large scale.
>> >
>> >
>> > A question I'm pondering is what happens when we move to Java 11 and
>> >
>> > someone is still on 2.x? If they want to backport HADOOP-15338
>> >
>> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>> >
>> > support to
>> >
>> > 2.x, surely not everyone is going to want that (at least not
>> >
>> > immediately).
>> >
>> > The 2.10 documentation states, "The JVM requirements will not change
>> >
>> > across
>> >
>> > point releases within the same minor release except if the JVM
>> >
>> > version
>> >
>> > under question becomes unsupported" [1], so this would warrant a 2.11
>> >
>> > release until Java 8 becomes unsupported (though one could argue that
>> >
>> > it is
>> >
>> > already unsupported since Oracle is no longer giving public Java 8
>> >
>> > update).
>> >
>> > If we don't keep branch-2 around now, would a Java 11 backport be the
>> >
>> > catalyst for a branch-2 revival?
>> >
>> >
>> > Not sure if this really leads to any sort of answer from me on
>> >
>> > whether
>> >
>> > or
>> >
>> > not we should keep branch-2 alive, but these are the things that I am
>> >
>> > weighing in my mind. For me, the bigger problem beyond having
>> >
>> > branch-2
>> >
>> > or
>> >
>> > not is committers not being on the same page with where they should
>> >
>> > commit
>> >
>> > their patches.
>> >
>> >
>> > Eric
>> >
>> >
>> > [1]
>> >
>> >
>> >
>> >
>> >
>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> > [2]
>> >
>> >
>> >
>> >
>> >
>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> >
>> >
>> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about
>> > the stability of 2.10, since we will be on it for a couple of years at
>> > least. I worry that some committers may want to put new features into
>> > a branch 2 release, and without a branch-2, they will go directly into
>> > 2.10. Since we don't always catch corner cases or performance problems
>> > for some time (usually not until the release is deployed to a busy,
>> > 4-thousand node cluster), it may be very difficult to back out those
>> > changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>> > but I do have these reservations.
>> >
>> > Thanks,
>> > -Eric
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> > <shv.hadoop@gmail.com> wrote:
>> >
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10
>> > release the last of branch-2 releases. We intended 2.10 as a bridge
>> > release between Hadoop 2 and 3. We may have bug-fix releases or 2.10,
>> > but 2.11 is not in the picture right now, and many people may object
>> > this idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion which branches people should commit their
>> >    back-ports to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>> > wrote:
>> >
>> > Thanks Eric for the comments - regarding your concerns, I feel the
>> > pros outweigh the cons. To me, the chances of patch releases on 2.10.x
>> > are much higher than a new 2.11 minor release. (There didn't seem to
>> > be many people outside of our company who expressed interest in
>> > getting new features to branch-2 prior to the 2.10.0 release.) Even
>> > now, a few weeks after 2.10.0 release, there's 29 patches that have
>> > gone into branch-2 and 9 in branch-2.10, so it's already diverged
>> > quite a bit.
>> >
>> > In any case, we can always reverse this decision if we really need to,
>> > by recreating branch-2. But this proposal would reduce a lot of
>> > confusion IMO.
>> >
>> > Jonathan Hung
>> >
>> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>> > wrote:
>> >
>> > Thanks Jonathan for opening the discussion.
>> >
>> > I am not in favor of this proposal. 2.10 was very recently released,
>> > and moving to 2.10 will take some time for the community. It seems
>> > premature to make a decision at this point that there will never be a
>> > need for a 2.11 release.
>> >
>> > -Eric
>> >
>> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> > <jyhung2357@gmail.com> wrote:
>> >
>> > Hi folks,
>> >
>> > Given the release of 2.10.0, and the fact that it's intended to be a
>> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>> > last minor release line in branch-2. Currently, the main issue is
>> > that there's many fixes going into branch-2 (the theoretical 2.11.0)
>> > that's not going into branch-2.10 (which will become 2.10.1), so the
>> > fixes in branch-2 will likely never see the light of day unless they
>> > are backported to branch-2.10.
>> >
>> > To do this, I propose we:
>> >
>> >    - Delete branch-2.10
>> >    - Rename branch-2 to branch-2.10
>> >    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> >
>> > This way we get all the current branch-2 fixes into the 2.10.x release
>> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> >
>> > Thoughts?
>> >
>> > Jonathan Hung
>> >
>> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Jim,
Thanks for catching that. I have configured the build to run on branch-2.10.

-Ayush

On Tue, 31 Dec 2019 at 22:50, Jim Brennan <ja...@verizonmedia.com>
wrote:

> It looks like QBT tests are still being run on branch-2 (
> https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
> and they are not very helpful at this point.
> Can we change the QBT tests to run against branch-2.10 instead?
>
> Jim

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira
> > Seems there was an INFRA ticket for that. INFRA-19581,
> > But the INFRA people closed as wont do and yes, the branch is protected,
> > we can’t delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists,
> > please don't try to commit to it)
> >
> > Completed procedure:
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >   - Delete old branch-2.10
> >   - Rename branch-2 to (new) branch-2.10
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
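
The first step in the procedure above (checking that everything in the old branch-2.10 was also in the old branch-2) could be approximated by comparing the JIRA keys that appear in commit subjects, since cherry-picked commits carry different hashes on each branch. This is only a minimal sketch and not the check that was actually run: it assumes commit subjects mention their JIRA key and were exported beforehand (for instance with `git log --format=%s`), and the function names and sample subjects are hypothetical.

```python
import re

# Matches Hadoop-style JIRA keys such as HADOOP-123 or YARN-4567.
JIRA_KEY = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Return the set of JIRA keys found in an iterable of commit subjects."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.search(subject)
        if match:
            keys.add(match.group(0))
    return keys


def missing_from_superset(subset_subjects, superset_subjects):
    """Return keys seen in the first list of subjects but not in the second."""
    return sorted(jira_keys(subset_subjects) - jira_keys(superset_subjects))


# Hypothetical sample data, not real Hadoop history.
old_branch_2_10 = ["HADOOP-1. Fix A.", "YARN-2. Fix B."]
old_branch_2 = ["HADOOP-1. Fix A.", "HDFS-3. Fix C."]

missing = missing_from_superset(old_branch_2_10, old_branch_2)
print(missing)  # prints ['YARN-2']
```

An empty result would suggest the rename loses nothing. Any key that is listed would need a manual look, because a commit without a JIRA key in its subject is invisible to this comparison.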
> >
> > Jonathan Hung
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko
> > <shv.hadoop@gmail.com> wrote:
> >
> > Hey guys,
> >
> > I think we diverged a bit from the initial topic of this discussion,
> > which is removing branch-2.10, and changing the version of branch-2 from
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> > 2.x release" confused people.
> > It is in fact a wider matter that can be discussed when somebody actually
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> > So if anybody objects removing branch-2.10 please make an argument.
> > Otherwise we should go ahead and just do it next week.
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> > Thanks,
> > --Konstantin
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > Thanks for the detailed thoughts, everyone.
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> > releases. As for putting features into minor/patch releases, if we keep
> > the convention of putting new features only into minor releases, my
> > assumption is still that it's unlikely people will want to get them
> > into branch-2 (based on the 2.10.0 release process). For the java 11
> > issue, we haven't even really removed support for java 7 in branch-2
> > (much less java 8), so I feel moving to java 11 would go along with a
> > move to branch 3. And as you mentioned, if people really want to use
> > java 11 on branch-2, we can always revive branch-2. But for now I think
> > the convenience of not needing to port to both branch-2 and branch-2.10
> > (and below) outweighs the cost of potentially needing to revive
> > branch-2.
> >
> > Jonathan Hung
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> > +1 for 2.10.x as last release for 2.x version.
> >
> > Software would become more compatible when more companies stress test
> > the same software and making improvements in trunk.  Some may be extra
> > caution on moving up the version because obligation internally to keep
> > things running.  Company obligation should not be the driving force to
> > maintain Hadoop branches.  There is no proper collaboration in the
> > community when every name brand company maintains its own Hadoop 2.x
> > version.  I think it would be more healthy for the community to reduce
> > the branch forking and spend energy on trunk to harden the software.
> > This will give more confidence to move up the version than trying to
> > fix n permutations breakage like Flash fixing the timeline.
> >
> > Apache license stated, there is no warranty of any kind for code
> > contributions.  Fewer community release process should improve software
> > quality when eyes are on trunk, and help steering toward the same end
> > goals.
> >
> > regards,
> > Eric
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> > Hello all,
> >
> > Is it written anywhere what the difference is between a minor release
> > and a point/dot/maintenance (I'll use "point" from here on out)
> > release? I have looked around and I can't find anything other than some
> > compatibility documentation in 2.x that has since been removed in 3.x
> > [1] [2]. I think this would help shape my opinion on whether or not to
> > keep branch-2 alive.
> > My current understanding is that we can't really break compatibility in
> > either a minor or point release. But the only mention of the difference
> > between minor and point releases is how to deal with Stable, Evolving,
> > and Unstable tags, and how to deal with changing default configuration
> > values.
> > So it seems like there really isn't a big official difference between
> > the two. In my mind, the functional difference between the two is that
> > the minor releases may have added features and rewrites, while the
> > point releases only have bug fixes. This might be an incorrect
> > understanding, but that's what I have gathered from watching the
> > releases over the last few years. Whether or not this is a correct
> > understanding, I think that this needs to be documented somewhere, even
> > if it is just a convention.
> >
> > Given my assumed understanding of minor vs point releases, here are the
> > pros/cons that I can think of for having a branch-2. Please add on or
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> > - Features/rewrites/higher-risk patches are less likely to be put into
> >   2.10.x
> > - It is less necessary to move to 3.x
> >
> > Cons:
> > - Bug fixes are less likely to be put into 2.10.x
> > - An extra branch to maintain
> >   - Committers have an extra branch (5 vs 4 total branches) to commit
> >     patches to if they should go all the way back to 2.10.x
> > - It is less necessary to move to 3.x
> >
> > So on the one hand you get added stability in fewer features being
> > committed to 2.10.x, but then on the other you get fewer bug fixes being
> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
> > we don't live in a perfect world and committers will make mistakes, either
> > because of lack of knowledge or simply through oversight. If we have a
> > branch-2, committers will forget, not know to, or choose not to (for
> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
> > we don't have a branch-2, committers who want their borderline risky
> > feature in the 2.x line will err on the side of putting it into branch-2.10
> > instead of proposing the creation of a branch-2. Clearly I have made quite
> > a few assumptions here based on my own experiences, so I would like to hear
> > if others have similar or opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for killing
> > branch-2 is due to an effort to push the community towards 3.x. This is why
> > I have added movement to 3.x as both a pro and a con. As a community trying
> > to move forward, keeping as many companies on similar branches as possible
> > is a good way to make sure the code is well-tested. However, from a
> > stability point of view, moving to 3.x is still scary and being able to
> > stay on 2.x until you are comfortable moving is very nice. The 2.10.0
> > bridge release effort has been very good at making it possible for people
> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> > it is reasonable for companies to want to be extra cautious with 3.x due to
> > potential performance degradation at large scale.
> >
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> > someone is still on 2.x? If they want to backport HADOOP-15338
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
> > 2.x, surely not everyone is going to want that (at least not immediately).
> > The 2.10 documentation states, "The JVM requirements will not change across
> > point releases within the same minor release except if the JVM version
> > under question becomes unsupported" [1], so this would warrant a 2.11
> > release until Java 8 becomes unsupported (though one could argue that it is
> > already unsupported since Oracle is no longer giving public Java 8
> > updates). If we don't keep branch-2 around now, would a Java 11 backport be
> > the catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on whether or
> > not we should keep branch-2 alive, but these are the things that I am
> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
> > not is committers not being on the same page with where they should commit
> > their patches.
> >
> >
> > Eric
> >
> >
> > [1]
> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> > [2]
> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> >
> > Hi Konstantin,
> >
> >
> > Sure, I understand those concerns. On the other hand, I worry about the
> > stability of 2.10, since we will be on it for a couple of years at least.
> > I worry that some committers may want to put new features into a branch 2
> > release, and without a branch-2, they will go directly into 2.10. Since we
> > don't always catch corner cases or performance problems for some time
> > (usually not until the release is deployed to a busy, 4-thousand node
> > cluster), it may be very difficult to back out those changes.
> >
> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> > do have these reservations.
> >
> >
> > Thanks,
> >
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> >
> > <
> >
> > shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> >
> > We had a long discussion on this list regarding making the 2.10 release the
> > last of branch-2 releases. We intended 2.10 as a bridge release between
> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> > the picture right now, and many people may object to this idea.
> >
> > I understand Jonathan's proposal as an attempt to
> > 1. eliminate confusion about which branches people should commit their
> >    back-ports to
> > 2. save engineering effort committing to more branches than necessary
> >
> > "Branches are cheap" as our founder used to say. If we ever decide to
> > release 2.11 we can resurrect the branch.
> > Until then I am in favor of Jonathan's proposal +1.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> >
> > jyhung2357@gmail.com
> >
> >
> > wrote:
> >
> >
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> > higher than a new 2.11 minor release. (There didn't seem to be many people
> > outside of our company who expressed interest in getting new features to
> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> > branch-2.10, so they've already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to, by
> > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >
> > epayne@apache.org>
> >
> > wrote:
> >
> >
> > Thanks Jonathan for opening the discussion.
> >
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature to
> > make a decision at this point that there will never be a need for a 2.11
> > release.
> >
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> >
> > <
> >
> > jyhung2357@gmail.com> wrote:
> >
> >
> > Hi folks,
> >
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> > release line in branch-2. Currently, the main issue is that there are many
> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> > likely never see the light of day unless they are backported to
> > branch-2.10.
> >
> > To do this, I propose we:
> >
> > - Delete branch-2.10
> > - Rename branch-2 to branch-2.10
> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> >
> > Thoughts?
> >
> >
> > Jonathan Hung
> >
> >
> > [1]
> >
> >
> >
> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira
> > There seems to have been an INFRA ticket for that, INFRA-19581, but the
> > INFRA team closed it as "Won't Do". And yes, the branch is protected, so
> > we can't delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> > don't try to commit to it)
> >
> > Completed procedure:
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >   - Deleted old branch-2.10
> >   - Renamed branch-2 to (new) branch-2.10
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
> >
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> >
> > shv.hadoop@gmail.com>
> >
> > wrote:
> >
> >
> > Hey guys,
> >
> >
> > I think we diverged a bit from the initial topic of this discussion,
> > which is removing branch-2.10, and changing the version of branch-2 from
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> > 2.x release" confused people.
> > It is in fact a wider matter that can be discussed when somebody actually
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> > So if anybody objects to removing branch-2.10 please make an argument.
> > Otherwise we should go ahead and just do it next week.
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > Thanks for the detailed thoughts, everyone.
> >
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> > releases. As for putting features into minor/patch releases, if we keep the
> > convention of putting new features only into minor releases, my assumption
> > is still that it's unlikely people will want to get them into branch-2
> > (based on the 2.10.0 release process). For the Java 11 issue, we haven't
> > even really removed support for Java 7 in branch-2 (much less Java 8), so I
> > feel moving to Java 11 would go along with a move to branch 3. And as you
> > mentioned, if people really want to use Java 11 on branch-2, we can always
> > revive branch-2. But for now I think the convenience of not needing to port
> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
> > potentially needing to revive branch-2.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> >
> > +1 for 2.10.x as the last release line for the 2.x version.
> >
> > Software becomes more compatible when more companies stress test the same
> > software and make improvements in trunk.  Some may be extra cautious about
> > moving up the version because of internal obligations to keep things
> > running.  Company obligations should not be the driving force for
> > maintaining Hadoop branches.  There is no proper collaboration in the
> > community when every name-brand company maintains its own Hadoop 2.x
> > version.  I think it would be healthier for the community to reduce the
> > branch forking and spend energy on trunk to harden the software.  This will
> > give more confidence to move up the version than trying to fix n
> > permutations of breakage, like the Flash fixing the timeline.
> >
> > The Apache license states that there is no warranty of any kind for code
> > contributions.  Having fewer community release processes should improve
> > software quality when eyes are on trunk, and help steer toward the same end
> > goals.
> >
> >
> > regards,
> >
> > Eric
> >
> >
> >
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> >
> > Hello all,
> >
> >
> > Is it written anywhere what the difference is between a minor release and a
> > point/dot/maintenance (I'll use "point" from here on out) release? I have
> > looked around and I can't find anything other than some compatibility
> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
> > this would help shape my opinion on whether or not to keep branch-2 alive.
> > My current understanding is that we can't really break compatibility in
> > either a minor or point release. But the only mention of the difference
> > between minor and point releases is how to deal with Stable, Evolving, and
> > Unstable tags, and how to deal with changing default configuration values.
> > So it seems like there really isn't a big official difference between the
> > two. In my mind, the functional difference between the two is that the
> > minor releases may have added features and rewrites, while the point
> > releases only have bug fixes. This might be an incorrect understanding, but
> > that's what I have gathered from watching the releases over the last few
> > years. Whether or not this is a correct understanding, I think that this
> > needs to be documented somewhere, even if it is just a convention.
> >
> >
> > Given my assumed understanding of minor vs point releases, here are the
> > pros/cons that I can think of for having a branch-2. Please add on or
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> > - Features/rewrites/higher-risk patches are less likely to be put into
> >   2.10.x
> > - It is less necessary to move to 3.x
> >
> > Cons:
> > - Bug fixes are less likely to be put into 2.10.x
> > - An extra branch to maintain
> >   - Committers have an extra branch (5 vs 4 total branches) to commit
> >     patches to if they should go all the way back to 2.10.x
> > - It is less necessary to move to 3.x
> >
> >
> > So on the one hand you get added stability in fewer features being
> > committed to 2.10.x, but then on the other you get fewer bug fixes being
> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
> > we don't live in a perfect world and committers will make mistakes, either
> > because of lack of knowledge or simply through oversight. If we have a
> > branch-2, committers will forget, not know to, or choose not to (for
> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
> > we don't have a branch-2, committers who want their borderline risky
> > feature in the 2.x line will err on the side of putting it into branch-2.10
> > instead of proposing the creation of a branch-2. Clearly I have made quite
> > a few assumptions here based on my own experiences, so I would like to hear
> > if others have similar or opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for killing
> > branch-2 is due to an effort to push the community towards 3.x. This is why
> > I have added movement to 3.x as both a pro and a con. As a community trying
> > to move forward, keeping as many companies on similar branches as possible
> > is a good way to make sure the code is well-tested. However, from a
> > stability point of view, moving to 3.x is still scary and being able to
> > stay on 2.x until you are comfortable moving is very nice. The 2.10.0
> > bridge release effort has been very good at making it possible for people
> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> > it is reasonable for companies to want to be extra cautious with 3.x due to
> > potential performance degradation at large scale.
> >
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> > someone is still on 2.x? If they want to backport HADOOP-15338
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
> > 2.x, surely not everyone is going to want that (at least not immediately).
> > The 2.10 documentation states, "The JVM requirements will not change across
> > point releases within the same minor release except if the JVM version
> > under question becomes unsupported" [1], so this would warrant a 2.11
> > release until Java 8 becomes unsupported (though one could argue that it is
> > already unsupported since Oracle is no longer giving public Java 8
> > updates). If we don't keep branch-2 around now, would a Java 11 backport be
> > the catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on whether or
> > not we should keep branch-2 alive, but these are the things that I am
> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
> > not is committers not being on the same page with where they should commit
> > their patches.
> >
> >
> > Eric
> >
> >
> > [1]
> > https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> > [2]
> > https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> >
> > Hi Konstantin,
> >
> >
> > Sure, I understand those concerns. On the other hand, I worry about the
> > stability of 2.10, since we will be on it for a couple of years at least.
> > I worry that some committers may want to put new features into a branch 2
> > release, and without a branch-2, they will go directly into 2.10. Since we
> > don't always catch corner cases or performance problems for some time
> > (usually not until the release is deployed to a busy, 4-thousand node
> > cluster), it may be very difficult to back out those changes.
> >
> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> > do have these reservations.
> >
> >
> > Thanks,
> >
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> >
> > <
> >
> > shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> >
> > We had a long discussion on this list regarding making the 2.10 release the
> > last of branch-2 releases. We intended 2.10 as a bridge release between
> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> > the picture right now, and many people may object to this idea.
> >
> > I understand Jonathan's proposal as an attempt to
> > 1. eliminate confusion about which branches people should commit their
> >    back-ports to
> > 2. save engineering effort committing to more branches than necessary
> >
> > "Branches are cheap" as our founder used to say. If we ever decide to
> > release 2.11 we can resurrect the branch.
> > Until then I am in favor of Jonathan's proposal +1.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> >
> > jyhung2357@gmail.com
> >
> >
> > wrote:
> >
> >
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> > higher than a new 2.11 minor release. (There didn't seem to be many people
> > outside of our company who expressed interest in getting new features to
> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> > branch-2.10, so they've already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to, by
> > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >
> > epayne@apache.org>
> >
> > wrote:
> >
> >
> > Thanks Jonathan for opening the discussion.
> >
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature to
> > make a decision at this point that there will never be a need for a 2.11
> > release.
> >
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> >
> > <
> >
> > jyhung2357@gmail.com> wrote:
> >
> >
> > Hi folks,
> >
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> > release line in branch-2. Currently, the main issue is that there are many
> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> > likely never see the light of day unless they are backported to
> > branch-2.10.
> >
> > To do this, I propose we:
> >
> > - Delete branch-2.10
> > - Rename branch-2 to branch-2.10
> > - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> >
> > Thoughts?
> >
> >
> > Jonathan Hung
> >
> >
> > [1]
> >
> >
> >
> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>
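
The verification step in the completed procedure quoted above (checking that
everything in the old branch-2.10 was also in the old branch-2) could be
approximated along the following lines. This is only a sketch under stated
assumptions, not the procedure that was actually run: it compares commit
subject lines, on the assumption that cherry-picked backports keep their
subject but get a different hash on each branch. The helper names
`commit_subjects` and `missing_from`, the repository path, and the base ref
are all hypothetical.

```python
import subprocess


def commit_subjects(repo_dir, base, branch):
    """Return the set of commit subject lines reachable from `branch`
    but not from `base` (assumption: `base` is a ref both branches share,
    such as the 2.10.0 release tag; the exact name is not given in the thread).
    """
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%s", f"{base}..{branch}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return {line for line in out.splitlines() if line.strip()}


def missing_from(superset_subjects, subset_subjects):
    """Return, sorted, the subjects found in the subset branch that are
    absent from the branch expected to be a superset."""
    return sorted(set(subset_subjects) - set(superset_subjects))


# Usage with made-up subjects (placeholders, not real Hadoop commits):
old_branch_2_10 = {"FIX-A", "FIX-B"}
old_branch_2 = {"FIX-A", "FIX-B", "FIX-C"}
print(missing_from(old_branch_2, old_branch_2_10))  # prints []
```

An empty result would suggest that the old branch-2 was a superset of the old
branch-2.10, so the rename loses nothing. Matching on subject lines is a
heuristic: a backport whose subject was edited would show up as missing and
would need a manual look.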

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>
> On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:
>
> > Hi Akira
> > There seems to have been an INFRA ticket for that, INFRA-19581, but the
> > INFRA team closed it as "Won't Do". And yes, the branch is protected, so
> > we can't delete it directly.
> >
> > Ref: https://issues.apache.org/jira/browse/INFRA-19581
> >
> > -Ayush
> >
> > On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> >
> > Thank you for your work, Jonathan.
> >
> > I found branch-2 has been unintentionally pushed again. Would you remove
> > it?
> > I think the branch should be protected if possible.
> >
> > -Akira
> >
> > On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> > wrote:
> >
> > It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> > branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> > don't try to commit to it)
> >
> > Completed procedure:
> >
> >   - Verified everything in old branch-2.10 was in old branch-2
> >   - Deleted old branch-2.10
> >   - Renamed branch-2 to (new) branch-2.10
> >   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
> >   - Renamed fix versions from 2.11.0 to 2.10.1
> >   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
> >
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> >
> > shv.hadoop@gmail.com>
> >
> > wrote:
> >
> >
> > Hey guys,
> >
> >
> > I think we diverged a bit from the initial topic of this discussion,
> > which is removing branch-2.10, and changing the version of branch-2 from
> > 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> > Sounds like the subject line for this thread "Making 2.10 the last minor
> > 2.x release" confused people.
> > It is in fact a wider matter that can be discussed when somebody actually
> > proposes to release 2.11, which I understand nobody does at the moment.
> >
> > So if anybody objects to removing branch-2.10 please make an argument.
> > Otherwise we should go ahead and just do it next week.
> > I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >
> >
> > Thanks,
> >
> > --Konstantin
> >
> >
> > On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >
> > wrote:
> >
> >
> > Thanks for the detailed thoughts, everyone.
> >
> >
> > Eric (Badger), my understanding is the same as yours re. minor vs patch
> > releases. As for putting features into minor/patch releases, if we keep the
> > convention of putting new features only into minor releases, my assumption
> > is still that it's unlikely people will want to get them into branch-2
> > (based on the 2.10.0 release process). For the Java 11 issue, we haven't
> > even really removed support for Java 7 in branch-2 (much less Java 8), so I
> > feel moving to Java 11 would go along with a move to branch 3. And as you
> > mentioned, if people really want to use Java 11 on branch-2, we can always
> > revive branch-2. But for now I think the convenience of not needing to port
> > to both branch-2 and branch-2.10 (and below) outweighs the cost of
> > potentially needing to revive branch-2.
> >
> >
> > Jonathan Hung
> >
> >
> >
> > On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >
> >
> > +1 for 2.10.x as the last release line for the 2.x version.
> >
> > Software becomes more compatible when more companies stress test the same
> > software and make improvements in trunk.  Some may be extra cautious about
> > moving up the version because of internal obligations to keep things
> > running.  Company obligations should not be the driving force for
> > maintaining Hadoop branches.  There is no proper collaboration in the
> > community when every name-brand company maintains its own Hadoop 2.x
> > version.  I think it would be healthier for the community to reduce the
> > branch forking and spend energy on trunk to harden the software.  This will
> > give more confidence to move up the version than trying to fix n
> > permutations of breakage, like the Flash fixing the timeline.
> >
> > The Apache license states that there is no warranty of any kind for code
> > contributions.  Having fewer community release processes should improve
> > software quality when eyes are on trunk, and help steer toward the same end
> > goals.
> >
> >
> > regards,
> >
> > Eric
> >
> >
> >
> >
> > On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >
> > <eb...@verizonmedia.com.invalid> wrote:
> >
> >
> > Hello all,
> >
> >
> > Is it written anywhere what the difference is between a minor release and a
> > point/dot/maintenance (I'll use "point" from here on out) release? I have
> > looked around and I can't find anything other than some compatibility
> > documentation in 2.x that has since been removed in 3.x [1] [2]. I think
> > this would help shape my opinion on whether or not to keep branch-2 alive.
> > My current understanding is that we can't really break compatibility in
> > either a minor or point release. But the only mention of the difference
> > between minor and point releases is how to deal with Stable, Evolving, and
> > Unstable tags, and how to deal with changing default configuration values.
> > So it seems like there really isn't a big official difference between the
> > two. In my mind, the functional difference between the two is that the
> > minor releases may have added features and rewrites, while the point
> > releases only have bug fixes. This might be an incorrect understanding, but
> > that's what I have gathered from watching the releases over the last few
> > years. Whether or not this is a correct understanding, I think that this
> > needs to be documented somewhere, even if it is just a convention.
> >
> >
> > Given my assumed understanding of minor vs point releases, here are the
> > pros/cons that I can think of for having a branch-2. Please add on or
> > correct me for anything you feel is missing or inadequate.
> >
> > Pros:
> > - Features/rewrites/higher-risk patches are less likely to be put into
> >   2.10.x
> > - It is less necessary to move to 3.x
> >
> > Cons:
> > - Bug fixes are less likely to be put into 2.10.x
> > - An extra branch to maintain
> >   - Committers have an extra branch (5 vs 4 total branches) to commit
> >     patches to if they should go all the way back to 2.10.x
> > - It is less necessary to move to 3.x
> >
> >
> > So on the one hand you get added stability in fewer features being
> > committed to 2.10.x, but then on the other you get fewer bug fixes being
> > committed. In a perfect world, we wouldn't have to make this tradeoff. But
> > we don't live in a perfect world and committers will make mistakes, either
> > because of lack of knowledge or simply through oversight. If we have a
> > branch-2, committers will forget, not know to, or choose not to (for
> > whatever reason) commit valid bug fixes back all the way to branch-2.10. If
> > we don't have a branch-2, committers who want their borderline risky
> > feature in the 2.x line will err on the side of putting it into branch-2.10
> > instead of proposing the creation of a branch-2. Clearly I have made quite
> > a few assumptions here based on my own experiences, so I would like to hear
> > if others have similar or opposing views.
> >
> >
> > As far as 3.x goes, to me it seems like some of the reasoning for killing
> > branch-2 is due to an effort to push the community towards 3.x. This is why
> > I have added movement to 3.x as both a pro and a con. As a community trying
> > to move forward, keeping as many companies on similar branches as possible
> > is a good way to make sure the code is well-tested. However, from a
> > stability point of view, moving to 3.x is still scary and being able to
> > stay on 2.x until you are comfortable moving is very nice. The 2.10.0
> > bridge release effort has been very good at making it possible for people
> > to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> > it is reasonable for companies to want to be extra cautious with 3.x due to
> > potential performance degradation at large scale.
> >
> >
> > A question I'm pondering is what happens when we move to Java 11 and
> > someone is still on 2.x? If they want to backport HADOOP-15338
> > <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
> > 2.x, surely not everyone is going to want that (at least not immediately).
> > The 2.10 documentation states, "The JVM requirements will not change across
> > point releases within the same minor release except if the JVM version
> > under question becomes unsupported" [1], so this would warrant a 2.11
> > release until Java 8 becomes unsupported (though one could argue that it is
> > already unsupported since Oracle is no longer giving public Java 8
> > updates). If we don't keep branch-2 around now, would a Java 11 backport be
> > the catalyst for a branch-2 revival?
> >
> >
> > Not sure if this really leads to any sort of answer from me on whether or
> > not we should keep branch-2 alive, but these are the things that I am
> > weighing in my mind. For me, the bigger problem beyond having branch-2 or
> > not is committers not being on the same page with where they should commit
> > their patches.
> >
> >
> > Eric
> >
> >
> > [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> > [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >
> >
> > On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> > Hi Konstantin,
> >
> > Sure, I understand those concerns. On the other hand, I worry about the
> > stability of 2.10, since we will be on it for a couple of years at least.
> > I worry that some committers may want to put new features into a branch 2
> > release, and without a branch-2, they will go directly into 2.10. Since we
> > don't always catch corner cases or performance problems for some time
> > (usually not until the release is deployed to a busy, 4-thousand node
> > cluster), it may be very difficult to back out those changes.
> >
> > It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> > do have these reservations.
> >
> > Thanks,
> > -Eric
> >
> >
> >
> >
> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> > <shv.hadoop@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > We had a long discussion on this list regarding making the 2.10 release the
> > last of branch-2 releases. We intended 2.10 as a bridge release between
> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> > the picture right now, and many people may object to this idea.
> >
> > I understand Jonathan's proposal as an attempt to
> > 1. eliminate confusion about which branches people should commit their
> >    back-ports to
> > 2. save engineering effort committing to more branches than necessary
> >
> > "Branches are cheap" as our founder used to say. If we ever decide to
> > release 2.11 we can resurrect the branch.
> > Until then I am in favor of Jonathan's proposal +1.
> >
> > Thanks,
> > --Konstantin
> >
> >
> >
> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> > wrote:
> >
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> > higher than a new 2.11 minor release. (There didn't seem to be many people
> > outside of our company who expressed interest in getting new features to
> > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> > 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> > branch-2.10, so it's already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to, by
> > recreating branch-2. But this proposal would reduce a lot of confusion IMO.
> >
> > Jonathan Hung
> >
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> > wrote:
> >
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature to
> > make a decision at this point that there will never be a need for a 2.11
> > release.
> >
> > -Eric
> >
> >
> >
> > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> > <jyhung2357@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> > release line in branch-2. Currently, the main issue is that there's many
> > fixes going into branch-2 (the theoretical 2.11.0) that's not going into
> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> > likely never see the light of day unless they are backported to branch-2.10.
> >
> > To do this, I propose we:
> >
> >    - Delete branch-2.10
> >    - Rename branch-2 to branch-2.10
> >    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> >
> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >
> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jim Brennan <ja...@verizonmedia.com.INVALID>.
It looks like QBT tests are still being run on branch-2 (
https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-branch2-java7-linux-x86/),
and they are not very helpful at this point.
Can we change the QBT tests to run against branch-2.10 instead?

Jim

On Mon, Dec 23, 2019 at 7:44 PM Akira Ajisaka <aa...@apache.org> wrote:

> Thank you, Ayush.
>
> I understand we should keep branch-2 as is, as well as master.
>
> -Akira
>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira
> Seems there was an INFRA ticket for that, INFRA-19581.
> But the INFRA people closed it as "Won't Do", and yes, the branch is
> protected, so we can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>
> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>
> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>   - Verified everything in old branch-2.10 was in old branch-2
>   - Deleted old branch-2.10
>   - Renamed branch-2 to (new) branch-2.10
>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>   - Renamed fix versions from 2.11.0 to 2.10.1
>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
>
> Jonathan Hung
>
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>
> FYI, starting the rename process, beginning with INFRA-19521.
>
>
> Jonathan Hung
>
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
> wrote:
>
> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>
> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the Java 11 issue, we haven't
> even really removed support for Java 7 in branch-2 (much less Java 8), so I
> feel moving to Java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use Java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
>
> +1 for 2.10.x as the last release for the 2.x version.
>
> Software becomes more compatible when more companies stress test the same
> software and make improvements in trunk. Some may be extra cautious about
> moving up the version because of internal obligations to keep things
> running. Company obligation should not be the driving force to maintain
> Hadoop branches. There is no proper collaboration in the community when
> every name brand company maintains its own Hadoop 2.x version. I think it
> would be healthier for the community to reduce the branch forking and spend
> energy on trunk to harden the software. This will give more confidence to
> move up the version than trying to fix n permutations of breakage, like the
> Flash fixing the timeline.
>
> The Apache license states that there is no warranty of any kind for code
> contributions. Fewer community release processes should improve software
> quality when eyes are on trunk, and help steer toward the same end goals.
>
> regards,
> Eric
>
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
> Hello all,
>
> Is it written anywhere what the difference is between a minor release and a
> point/dot/maintenance (I'll use "point" from here on out) release? I have
> looked around and I can't find anything other than some compatibility
> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
> this would help shape my opinion on whether or not to keep branch-2 alive.
> My current understanding is that we can't really break compatibility in
> either a minor or point release. But the only mention of the difference
> between minor and point releases is how to deal with Stable, Evolving, and
> Unstable tags, and how to deal with changing default configuration values.
> So it seems like there really isn't a big official difference between the
> two. In my mind, the functional difference between the two is that the
> minor releases may have added features and rewrites, while the point
> releases only have bug fixes. This might be an incorrect understanding, but
> that's what I have gathered from watching the releases over the last few
> years. Whether or not this is a correct understanding, I think that this
> needs to be documented somewhere, even if it is just a convention.
>
>
> Given my assumed understanding of minor vs point releases, here are the
> pros/cons that I can think of for having a branch-2. Please add on or
> correct me for anything you feel is missing or inadequate.
>
> Pros:
> - Features/rewrites/higher-risk patches are less likely to be put into
>   2.10.x
> - It is less necessary to move to 3.x
>
> Cons:
> - Bug fixes are less likely to be put into 2.10.x
> - An extra branch to maintain
>   - Committers have an extra branch (5 vs 4 total branches) to commit
>     patches to if they should go all the way back to 2.10.x
> - It is less necessary to move to 3.x
>
>
> So on the one hand you get added stability in fewer features being
> committed to 2.10.x, but then on the other you get fewer bug fixes being
> committed. In a perfect world, we wouldn't have to make this tradeoff. But
> we don't live in a perfect world and committers will make mistakes, either
> from lack of knowledge or through simple oversight. If we have a branch-2,
> committers will forget, not know to, or choose not to (for whatever reason)
> commit valid bug fixes back all the way to branch-2.10. If we don't have a
> branch-2, committers who want their borderline risky feature in the 2.x
> line will err on the side of putting it into branch-2.10 instead of
> proposing the creation of a branch-2. Clearly I have made quite a few
> assumptions here based on my own experiences, so I would like to hear if
> others have similar or opposing views.
>
>
> As far as 3.x goes, to me it seems like some of the reasoning for killing
> branch-2 is due to an effort to push the community towards 3.x. This is why
> I have added movement to 3.x as both a pro and a con. As a community trying
> to move forward, keeping as many companies on similar branches as possible
> is a good way to make sure the code is well-tested. However, from a
> stability point of view, moving to 3.x is still scary and being able to
> stay on 2.x until you are comfortable moving is very nice. The 2.10.0
> bridge release effort has been very good at making it possible for people
> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> it is reasonable for companies to want to be extra cautious with 3.x due to
> potential performance degradation at large scale.
>
>
> A question I'm pondering is what happens when we move to Java 11 and
>
> someone is still on 2.x? If they want to backport HADOOP-15338
>
> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>
> support to
>
> 2.x, surely not everyone is going to want that (at least not
>
> immediately).
>
> The 2.10 documentation states, "The JVM requirements will not change
>
> across
>
> point releases within the same minor release except if the JVM
>
> version
>
> under question becomes unsupported" [1], so this would warrant a 2.11
>
> release until Java 8 becomes unsupported (though one could argue that
>
> it is
>
> already unsupported since Oracle is no longer giving public Java 8
>
> update).
>
> If we don't keep branch-2 around now, would a Java 11 backport be the
>
> catalyst for a branch-2 revival?
>
>
> Not sure if this really leads to any sort of answer from me on
>
> whether
>
> or
>
> not we should keep branch-2 alive, but these are the things that I am
>
> weighing in my mind. For me, the bigger problem beyond having
>
> branch-2
>
> or
>
> not is committers not being on the same page with where they should
>
> commit
>
> their patches.
>
>
> Eric
>
>
> [1]
>
>
>
>
> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
> [2]
>
>
>
>
> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
>
> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry that some committers may want to put new features into a branch 2
> release, and without a branch-2, they will go directly into 2.10. Since we
> don't always catch corner cases or performance problems for some time
> (usually not until the release is deployed to a busy, 4-thousand node
> cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do have these reservations.
>
> Thanks,
> -Eric
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <shv.hadoop@gmail.com> wrote:
>
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> the picture right now, and many people may object to this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion about which branches people should commit their
>    back-ports to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> wrote:
>
> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
> - Delete branch-2.10
> - Rename branch-2 to branch-2.10
> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>
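
Editor's illustration: the "5 vs 4 total branches" point in Eric Badger's quoted message and the commit chain in the original proposal can be sketched in a few lines. This is a minimal sketch, not project tooling. The function name backport_targets is hypothetical, and the old chain (with branch-2 sitting between branch-3.1 and branch-2.10) is an assumption inferred from the "5 vs 4" remark rather than something the thread spells out.

```python
# New chain as stated in the proposal; trunk is newest, branch-2.8 oldest.
NEW_CHAIN = ["trunk", "branch-3.2", "branch-3.1",
             "branch-2.10", "branch-2.9", "branch-2.8"]

# Assumed old chain: branch-2 placed just above branch-2.10 (inferred, not quoted).
OLD_CHAIN = ["trunk", "branch-3.2", "branch-3.1", "branch-2",
             "branch-2.10", "branch-2.9", "branch-2.8"]


def backport_targets(chain, oldest_branch):
    """Return every branch a fix must be committed to, newest first,
    so that it reaches oldest_branch without skipping a newer line."""
    if oldest_branch not in chain:
        raise ValueError(f"unknown branch: {oldest_branch}")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    for label, chain in (("with branch-2", OLD_CHAIN),
                         ("without branch-2", NEW_CHAIN)):
        targets = backport_targets(chain, "branch-2.10")
        print(f"{label}: {len(targets)} branches: {' -> '.join(targets)}")
```

Under these assumptions a fix destined for 2.10.x touches five branches while branch-2 exists and four once it is folded into branch-2.10.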

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira
> Seems there was an INFRA ticket for that, INFRA-19581,
> but the INFRA people closed it as "won't do", and yes, the branch is
> protected, so we can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>
> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>   - Verified everything in old branch-2.10 was in old branch-2
>   - Deleted old branch-2.10
>   - Renamed branch-2 to (new) branch-2.10
>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>   - Renamed fix versions from 2.11.0 to 2.10.1
>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
> Jonathan Hung
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>
> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
> wrote:
>
> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion,
> which is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>
> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the Java 11 issue, we haven't
> even really removed support for Java 7 in branch-2 (much less Java 8), so I
> feel moving to Java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use Java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test
> the same software and make improvements in trunk. Some may be extra
> cautious about moving up the version because of internal obligations to
> keep things running. Company obligation should not be the driving force to
> maintain Hadoop branches. There is no proper collaboration in the
> community when every name brand company maintains its own Hadoop 2.x
> version. I think it would be more healthy for the community to reduce the
> branch forking and spend energy on trunk to harden the software. This will
> give more confidence to move up the version than trying to fix n
> permutations of breakage, like the Flash fixing the timeline.
>
> The Apache license states that there is no warranty of any kind for code
> contributions. Fewer community release processes should improve software
> quality when eyes are on trunk, and help steer toward the same end
> goals.
>
> regards,
> Eric
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
> Hello all,
>
> Is it written anywhere what the difference is between a minor release and a
> point/dot/maintenance (I'll use "point" from here on out) release? I have
> looked around and I can't find anything other than some compatibility
> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
> this would help shape my opinion on whether or not to keep branch-2 alive.
> My current understanding is that we can't really break compatibility in
> either a minor or point release. But the only mention of the difference
> between minor and point releases is how to deal with Stable, Evolving, and
> Unstable tags, and how to deal with changing default configuration values.
> So it seems like there really isn't a big official difference between the
> two. In my mind, the functional difference between the two is that the
> minor releases may have added features and rewrites, while the point
> releases only have bug fixes. This might be an incorrect understanding, but
> that's what I have gathered from watching the releases over the last few
> years. Whether or not this is a correct understanding, I think that this
> needs to be documented somewhere, even if it is just a convention.
>
> Given my assumed understanding of minor vs point releases, here are the
> pros/cons that I can think of for having a branch-2. Please add on or
> correct me for anything you feel is missing or inadequate.
> Pros:
> - Features/rewrites/higher-risk patches are less likely to be put into
>   2.10.x
> - It is less necessary to move to 3.x
>
> Cons:
> - Bug fixes are less likely to be put into 2.10.x
> - An extra branch to maintain
>   - Committers have an extra branch (5 vs 4 total branches) to commit
>     patches to if they should go all the way back to 2.10.x
> - It is less necessary to move to 3.x
>
> So on the one hand you get added stability in fewer features being
> committed to 2.10.x, but then on the other you get fewer bug fixes being
> committed. In a perfect world, we wouldn't have to make this tradeoff. But
> we don't live in a perfect world and committers will make mistakes, either
> because of lack of knowledge or simply by oversight. If we
> have a branch-2, committers will forget, not know to, or choose not to (for
> whatever reason) commit valid bug fixes back all the way to branch-2.10. If
> we don't have a branch-2, committers who want their borderline risky
> feature in the 2.x line will err on the side of putting it into branch-2.10
> instead of proposing the creation of a branch-2. Clearly I have made quite
> a few assumptions here based on my own experiences, so I would like to hear
> if others have similar or opposing views.
>
> As far as 3.x goes, to me it seems like some of the reasoning for killing
> branch-2 is due to an effort to push the community towards 3.x. This is why
> I have added movement to 3.x as both a pro and a con. As a community trying
> to move forward, keeping as many companies on similar branches as possible
> is a good way to make sure the code is well-tested. However, from a
> stability point of view, moving to 3.x is still scary and being able to
> stay on 2.x until you are comfortable moving is very nice. The 2.10.0
> bridge release effort has been very good at making it possible for people
> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
> it is reasonable for companies to want to be extra cautious with 3.x due to
> potential performance degradation at large scale.
>
> A question I'm pondering is what happens when we move to Java 11 and
> someone is still on 2.x? If they want to backport HADOOP-15338
> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
> 2.x, surely not everyone is going to want that (at least not immediately).
> The 2.10 documentation states, "The JVM requirements will not change across
> point releases within the same minor release except if the JVM version
> under question becomes unsupported" [1], so this would warrant a 2.11
> release until Java 8 becomes unsupported (though one could argue that it is
> already unsupported since Oracle is no longer giving public Java 8 updates).
> If we don't keep branch-2 around now, would a Java 11 backport be the
> catalyst for a branch-2 revival?
>
> Not sure if this really leads to any sort of answer from me on whether or
> not we should keep branch-2 alive, but these are the things that I am
> weighing in my mind. For me, the bigger problem beyond having branch-2 or
> not is committers not being on the same page with where they should commit
> their patches.
>
> Eric
>
> [1]
> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> [2]
> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry that some committers may want to put new features into a branch 2
> release, and without a branch-2, they will go directly into 2.10. Since we
> don't always catch corner cases or performance problems for some time
> (usually not until the release is deployed to a busy, 4-thousand node
> cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do have these reservations.
>
> Thanks,
> -Eric
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <shv.hadoop@gmail.com> wrote:
>
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> the picture right now, and many people may object to this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion about which branches people should commit their
>    back-ports to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> wrote:
>
> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the
> 2.10.0 release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
> - Delete branch-2.10
> - Rename branch-2 to branch-2.10
> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
>
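
Editor's illustration: the first step of the completed procedure quoted in this message ("Verified everything in old branch-2.10 was in old branch-2") amounts to a superset check. A minimal sketch follows, assuming the check is done by comparing JIRA IDs found in commit subject lines; the thread does not say how the verification was actually performed. The helper names jira_ids and missing_from are hypothetical, and the sample subjects use made-up issue numbers. In practice the subject lists could come from something like `git log --format=%s <branch>`.

```python
import re

# Issue keys for the four Hadoop JIRA projects named in the thread.
JIRA_ID = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_ids(subjects):
    """Collect every JIRA ID mentioned in a list of commit subject lines."""
    ids = set()
    for subject in subjects:
        ids.update(JIRA_ID.findall(subject))
    return ids


def missing_from(superset_subjects, subset_subjects):
    """Return IDs present in subset_subjects but absent from superset_subjects."""
    return sorted(jira_ids(subset_subjects) - jira_ids(superset_subjects))


if __name__ == "__main__":
    # Placeholder data: these issue numbers are invented for the example.
    old_branch_2 = ["HADOOP-10001. Fix A.", "YARN-20002. Fix B.",
                    "HDFS-30003. Fix C."]
    old_branch_2_10 = ["HADOOP-10001. Fix A.", "HDFS-30004. Fix D."]
    # Anything printed here would need a backport before branch-2.10 is replaced.
    print(missing_from(old_branch_2, old_branch_2_10))
```

Matching on IDs alone ignores reverts and addendum commits, so a real check would still want a manual pass over anything the comparison flags.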

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira
> Seems there was an INFRA ticket for that. INFRA-19581,
> But the INFRA people closed as wont do and yes, the branch is protected,
> we can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>
> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>
> don't try to commit to it)
>
>
> Completed procedure:
>
>
>   - Verified everything in old branch-2.10 was in old branch-2
>
>   - Delete old branch-2.10
>
>   - Rename branch-2 to (new) branch-2.10
>
>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>
>   - Renamed fix versions from 2.11.0 to 2.10.1
>
>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
>
> Jonathan Hung
>
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>
> wrote:
>
>
> FYI, starting the rename process, beginning with INFRA-19521.
>
>
> Jonathan Hung
>
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>
> shv.hadoop@gmail.com>
>
> wrote:
>
>
> Hey guys,
>
>
> I think we diverged a bit from the initial topic of this discussion,
>
> which is removing branch-2.10, and changing the version of branch-2 from
>
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>
> Sounds like the subject line for this thread "Making 2.10 the last minor
>
> 2.x release" confused people.
>
> It is in fact a wider matter that can be discussed when somebody
>
> actually
>
> proposes to release 2.11, which I understand nobody does at the moment.
>
>
> So if anybody objects removing branch-2.10 please make an argument.
>
> Otherwise we should go ahead and just do it next week.
>
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
>
> Thanks,
>
> --Konstantin
>
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>
> wrote:
>
>
> Thanks for the detailed thoughts, everyone.
>
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
>
> releases. As for putting features into minor/patch releases, if we
>
> keep the
>
> convention of putting new features only into minor releases, my
>
> assumption
>
> is still that it's unlikely people will want to get them into branch-2
>
> (based on the 2.10.0 release process). For the java 11 issue, we
>
> haven't
>
> even really removed support for java 7 in branch-2 (much less java 8),
>
> so I
>
> feel moving to java 11 would go along with a move to branch 3. And as
>
> you
>
> mentioned, if people really want to use java 11 on branch-2, we can
>
> always
>
> revive branch-2. But for now I think the convenience of not needing to
>
> port
>
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>
> potentially needing to revive branch-2.
>
>
> Jonathan Hung
>
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
>
> +1 for 2.10.x as last release for 2.x version.
>
>
> Software would become more compatible when more companies stress test
>
> the same software and making improvements in trunk.  Some may be extra
>
> caution on moving up the version because obligation internally to keep
>
> things running.  Company obligation should not be the driving force to
>
> maintain Hadoop branches.  There is no proper collaboration in the
>
> community when every name brand company maintains its own Hadoop 2.x
>
> version.  I think it would be more healthy for the community to
>
> reduce the
>
> branch forking and spend energy on trunk to harden the software.
>
> This will
>
> give more confidence to move up the version than trying to fix n
>
> permutations breakage like Flash fixing the timeline.
>
>
> Apache license stated, there is no warranty of any kind for code
>
> contributions.  Fewer community release process should improve
>
> software
>
> quality when eyes are on trunk, and help steering toward the same end
>
> goals.
>
>
> regards,
>
> Eric
>
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>
> <eb...@verizonmedia.com.invalid> wrote:
>
>
> Hello all,
>
>
> Is it written anywhere what the difference is between a minor release
>
> and a
>
> point/dot/maintenance (I'll use "point" from here on out) release? I
>
> have
>
> looked around and I can't find anything other than some compatibility
>
> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>
> think
>
> this would help shape my opinion on whether or not to keep branch-2
>
> alive.
>
> My current understanding is that we can't really break compatibility
>
> in
>
> either a minor or point release. But the only mention of the
>
> difference
>
> between minor and point releases is how to deal with Stable,
>
> Evolving,
>
> and
>
> Unstable tags, and how to deal with changing default configuration
>
> values.
>
> So it seems like there really isn't a big official difference between
>
> the
>
> two. In my mind, the functional difference between the two is that
>
> the
>
> minor releases may have added features and rewrites, while the point
>
> releases only have bug fixes. This might be an incorrect
>
> understanding, but
>
> that's what I have gathered from watching the releases over the last
>
> few
>
> years. Whether or not this is a correct understanding, I think that
>
> this
>
> needs to be documented somewhere, even if it is just a convention.
>
>
> Given my assumed understanding of minor vs point releases, here are
>
> the
>
> pros/cons that I can think of for having a branch-2. Please add on or
>
> correct me for anything you feel is missing or inadequate.
>
> Pros:
>
> - Features/rewrites/higher-risk patches are less likely to be put
>
> into
>
> 2.10.x
>
> - It is less necessary to move to 3.x
>
>
> Cons:
>
> - Bug fixes are less likely to be put into 2.10.x
>
> - An extra branch to maintain
>
>  - Committers have an extra branch (5 vs 4 total branches) to commit
>
> patches to if they should go all the way back to 2.10.x
>
> - It is less necessary to move to 3.x
>
>
> So on the one hand you get added stability in fewer features being
>
> committed to 2.10.x, but then on the other you get fewer bug fixes
>
> being
>
> committed. In a perfect world, we wouldn't have to make this
>
> tradeoff.
>
> But
>
> we don't live in a perfect world and committers will make mistakes
>
> either
>
> because of lack of knowledge or simply because they made a mistake.
>
> If
>
> we
>
> have a branch-2, committers will forget, not know to, or choose not
>
> to
>
> (for
>
> whatever reason) commit valid bug fixes back all the way to
>
> branch-2.10. If
>
> we don't have a branch-2, committers who want their borderline risky
>
> feature in the 2.x line will err on the side of putting it into
>
> branch-2.10
>
> instead of proposing the creation of a branch-2. Clearly I have made
>
> quite
>
> a few assumptions here based on my own experiences, so I would like
>
> to
>
> hear
>
> if others have similar or opposing views.
>
>
> As far as 3.x goes, to me it seems like some of the reasoning for
>
> killing
>
> branch-2 is due to an effort to push the community towards 3.x. This
>
> is why
>
> I have added movement to 3.x as both a pro and a con. As a community
>
> trying
>
> to move forward, keeping as many companies on similar branches as
>
> possible
>
> is a good way to make sure the code is well-tested. However, from a
>
> stability point of view, moving to 3.x is still scary and being able
>
> to
>
> stay on 2.x until you are comfortable to move is very nice. The
>
> 2.10.0
>
> bridge release effort has been very good at making it possible for
>
> people
>
> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>
> that
>
> it is reasonable for companies to want to be extra cautious with 3.x
>
> due to
>
> potential performance degradation at large scale.
>
>
> A question I'm pondering is what happens when we move to Java 11 and
>
> someone is still on 2.x? If they want to backport HADOOP-15338
>
> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>
> support to
>
> 2.x, surely not everyone is going to want that (at least not
>
> immediately).
>
> The 2.10 documentation states, "The JVM requirements will not change
>
> across
>
> point releases within the same minor release except if the JVM
>
> version
>
> under question becomes unsupported" [1], so this would warrant a 2.11
>
> release until Java 8 becomes unsupported (though one could argue that
>
> it is
>
> already unsupported since Oracle is no longer giving public Java 8
>
> update).
>
> If we don't keep branch-2 around now, would a Java 11 backport be the
>
> catalyst for a branch-2 revival?
>
>
> Not sure if this really leads to any sort of answer from me on
>
> whether
>
> or
>
> not we should keep branch-2 alive, but these are the things that I am
>
> weighing in my mind. For me, the bigger problem beyond having
>
> branch-2
>
> or
>
> not is committers not being on the same page with where they should
>
> commit
>
> their patches.
>
>
> Eric
>
>
> [1]
>
>
>
>
> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
> [2]
>
>
>
>
> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
>
> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>
>
> wrote:
>
>
> Hi Konstantin,
>
>
> Sure, I understand those concerns. On the other hand, I worry about
>
> the
>
> stability of 2.10, since we will be on it for a couple of years at
>
> least.
>
> I worry
>
> that some committers may want to put new features into a branch 2
>
> release,
>
> and without a branch-2, they will go directly into 2.10. Since we
>
> don't
>
> always
>
> catch corner cases or performance problems for some time (usually
>
> not
>
> until
>
> the release is deployed to a busy, 4-thousand node cluster), it
>
> may
>
> be
>
> very
>
> difficult to back out those changes.
>
>
> It sounds like I'm in the minority here, so I'm not nixing the
>
> idea,
>
> but I
>
> do
>
> have these reservations.
>
>
> Thanks,
>
> -Eric
>
>
>
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>
> <
>
> shv.hadoop@gmail.com> wrote:
>
> Hi Eric,
>
>
> We had a long discussion on this list regarding making the 2.10
>
> release the
>
> last of branch-2 releases. We intended 2.10 as a bridge release
>
> between
>
> Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is
>
> not in
>
> the picture right now, and many people may object this idea.
>
>
> I understand Jonathan's proposal as an attempt to
>
> 1. eliminate confusion which branches people should commit their
>
> back-ports
>
> to
>
> 2. save engineering effort committing to more branches than
>
> necessary
>
>
> "Branches are cheap" as our founder used to say. If we ever decide
>
> to
>
> release 2.11 we can resurrect the branch.
>
> Until then I am in favor of Jonathan's proposal +1.
>
>
> Thanks,
>
> --Konstantin
>
>
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>
> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
> release, there's 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there's many
> fixes going into branch-2 (the theoretical 2.11.0) that's not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
>    - Delete branch-2.10
>    - Rename branch-2 to branch-2.10
>    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you, Ayush.

I understand we should keep branch-2 as is, as well as master.

-Akira


On Mon, Dec 23, 2019 at 9:14 PM Ayush Saxena <ay...@gmail.com> wrote:

> Hi Akira,
> It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
> people closed it as "won't do". And yes, the branch is protected, so we
> can’t delete it directly.
>
> Ref: https://issues.apache.org/jira/browse/INFRA-19581
>
> -Ayush
>
> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
>
> Thank you for your work, Jonathan.
>
> I found branch-2 has been unintentionally pushed again. Would you remove
> it?
> I think the branch should be protected if possible.
>
> -Akira
>
> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>   - Verified everything in old branch-2.10 was in old branch-2
>   - Delete old branch-2.10
>   - Rename branch-2 to (new) branch-2.10
>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>   - Renamed fix versions from 2.11.0 to 2.10.1
>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
> Jonathan Hung
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:
>
> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <shv.hadoop@gmail.com>
> wrote:
>
> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion,
> which is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:
>
> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test
> the same software and making improvements in trunk.  Some may be extra
> caution on moving up the version because obligation internally to keep
> things running.  Company obligation should not be the driving force to
> maintain Hadoop branches.  There is no proper collaboration in the
> community when every name brand company maintains its own Hadoop 2.x
> version.  I think it would be more healthy for the community to reduce the
> branch forking and spend energy on trunk to harden the software.  This will
> give more confidence to move up the version than trying to fix n
> permutations breakage like Flash fixing the timeline.
>
> Apache license stated, there is no warranty of any kind for code
> contributions.  Fewer community release process should improve software
> quality when eyes are on trunk, and help steering toward the same end goals.
>
> regards,
> Eric
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
> Hello all,
>
> Is it written anywhere what the difference is between a minor release and a
> point/dot/maintenance (I'll use "point" from here on out) release? I have
> looked around and I can't find anything other than some compatibility
> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
> this would help shape my opinion on whether or not to keep branch-2 alive.
> My current understanding is that we can't really break compatibility in
> either a minor or point release. But the only mention of the difference
> between minor and point releases is how to deal with Stable, Evolving, and
> Unstable tags, and how to deal with changing default configuration values.
> So it seems like there really isn't a big official difference between the
> two. In my mind, the functional difference between the two is that the
> minor releases may have added features and rewrites, while the point
> releases only have bug fixes. This might be an incorrect understanding, but
> that's what I have gathered from watching the releases over the last few
> years. Whether or not this is a correct understanding, I think that this
> needs to be documented somewhere, even if it is just a convention.
>
> Given my assumed understanding of minor vs point releases, here are the
> pros/cons that I can think of for having a branch-2. Please add on or
> correct me for anything you feel is missing or inadequate.
> Pros:
> - Features/rewrites/higher-risk patches are less likely to be put into
>   2.10.x
> - It is less necessary to move to 3.x
>
> Cons:
> - Bug fixes are less likely to be put into 2.10.x
> - An extra branch to maintain
>  - Committers have an extra branch (5 vs 4 total branches) to commit
>    patches to if they should go all the way back to 2.10.x
> - It is less necessary to move to 3.x
>
> So on the one hand you get added stability in fewer features being
> committed to 2.10.x, but then on the other you get fewer bug fixes being
> committed. In a perfect world, we wouldn't have to make this tradeoff. But
> we don't live in a perfect world and committers will make mistakes either
> because of lack of knowledge or simply because they made a mistake. If we
> have a branch-2, committers will forget, not know to, or choose not to (for
> whatever reason) commit valid bug fixes back all the way to branch-2.10. If
> we don't have a branch-2, committers who want their borderline risky
> feature in the 2.x line will err on the side of putting it into branch-2.10
> instead of proposing the creation of a branch-2. Clearly I have made quite
> a few assumptions here based on my own experiences, so I would like to hear
> if others have similar or opposing views.
>
> As far as 3.x goes, to me it seems like some of the reasoning for killing
> branch-2 is due to an effort to push the community towards 3.x. This is why
> I have added movement to 3.x as both a pro and a con. As a community trying
> to move forward, keeping as many companies on similar branches as possible
> is a good way to make sure the code is well-tested. However, from a
> stability point of view, moving to 3.x is still scary and being able to
> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
> bridge release effort has been very good at making it possible for people
> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large that
> it is reasonable for companies to want to be extra cautious with 3.x due to
> potential performance degradation at large scale.
>
> A question I'm pondering is what happens when we move to Java 11 and
> someone is still on 2.x? If they want to backport HADOOP-15338
> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
> 2.x, surely not everyone is going to want that (at least not immediately).
> The 2.10 documentation states, "The JVM requirements will not change across
> point releases within the same minor release except if the JVM version
> under question becomes unsupported" [1], so this would warrant a 2.11
> release until Java 8 becomes unsupported (though one could argue that it is
> already unsupported since Oracle is no longer giving public Java 8 update).
> If we don't keep branch-2 around now, would a Java 11 backport be the
> catalyst for a branch-2 revival?
>
> Not sure if this really leads to any sort of answer from me on whether or
> not we should keep branch-2 alive, but these are the things that I am
> weighing in my mind. For me, the bigger problem beyond having branch-2 or
> not is committers not being on the same page with where they should commit
> their patches.
>
> Eric
>
> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>
> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org> wrote:
>
> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry that some committers may want to put new features into a branch 2
> release, and without a branch-2, they will go directly into 2.10. Since we
> don't always catch corner cases or performance problems for some time
> (usually not until the release is deployed to a busy, 4-thousand node
> cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do have these reservations.
>
> Thanks,
> -Eric
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <shv.hadoop@gmail.com> wrote:
>
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is not in
> the picture right now, and many people may object this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion which branches people should commit their back-ports
>    to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com> wrote:
>
> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
> release, there's 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> wrote:
>
> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there's many
> fixes going into branch-2 (the theoretical 2.11.0) that's not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
>    - Delete branch-2.10
>    - Rename branch-2 to branch-2.10
>    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>
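
The first step of the completed procedure quoted above ("Verified everything
in old branch-2.10 was in old branch-2") could be checked roughly as follows.
This is only a sketch under stated assumptions, not the procedure that was
actually run: it assumes each commit subject starts with its JIRA key, that
cherry-picked commits are matched by that key rather than by hash, and the
subject lines and the names jira_keys and missing_from are hypothetical.

```python
import re

# JIRA keys for the four projects named in the thread.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the JIRA keys mentioned in an iterable of commit subject lines."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def missing_from(superset_subjects, subset_subjects):
    """Return, sorted, the keys found in subset_subjects but not in superset_subjects."""
    return sorted(jira_keys(subset_subjects) - jira_keys(superset_subjects))


# Hypothetical subject lines, for illustration only.
old_branch_2 = [
    "HADOOP-1001. Fix A.",
    "YARN-2002. Fix B.",
    "HDFS-3003. Fix C.",
]
old_branch_2_10 = [
    "HADOOP-1001. Fix A.",
    "HDFS-3003. Fix C.",
    "MAPREDUCE-4004. Fix D.",
]

# Anything listed here would need a backport before old branch-2.10 is deleted.
print(missing_from(old_branch_2, old_branch_2_10))
```

An empty list would mean the old branch-2 is a superset of the old
branch-2.10 by JIRA key. Reverts and commits without a key are not handled
by this sketch and would need a manual look.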

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Akira,
It seems there was an INFRA ticket for that, INFRA-19581, but the INFRA
people closed it as "won't do". And yes, the branch is protected, so we
can’t delete it directly.

Ref: https://issues.apache.org/jira/browse/INFRA-19581

-Ayush

> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> 
> Thank you for your work, Jonathan.
> 
> I found branch-2 has been unintentionally pushed again. Would you remove it?
> I think the branch should be protected if possible.
> 
> -Akira
> 
>> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>> 
>> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> don't try to commit to it)
>> 
>> Completed procedure:
>> 
>>   - Verified everything in old branch-2.10 was in old branch-2
>>   - Delete old branch-2.10
>>   - Rename branch-2 to (new) branch-2.10
>>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>   - Renamed fix versions from 2.11.0 to 2.10.1
>>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> 
>> 
>> Jonathan Hung
>> 
>> 
>> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> wrote:
>> 
>>> FYI, starting the rename process, beginning with INFRA-19521.
>>> 
>>> Jonathan Hung
>>> 
>>> 
>>> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>> shv.hadoop@gmail.com>
>>> wrote:
>>> 
>>>> Hey guys,
>>>> 
>>>> I think we diverged a bit from the initial topic of this discussion,
>>>> which is removing branch-2.10, and changing the version of branch-2 from
>>>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> 2.x release" confused people.
>>>> It is in fact a wider matter that can be discussed when somebody
>> actually
>>>> proposes to release 2.11, which I understand nobody does at the moment.
>>>> 
>>>> So if anybody objects removing branch-2.10 please make an argument.
>>>> Otherwise we should go ahead and just do it next week.
>>>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> 
>>>> Thanks,
>>>> --Konstantin
>>>> 
>>>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for the detailed thoughts, everyone.
>>>>> 
>>>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> releases. As for putting features into minor/patch releases, if we
>> keep the
>>>>> convention of putting new features only into minor releases, my
>> assumption
>>>>> is still that it's unlikely people will want to get them into branch-2
>>>>> (based on the 2.10.0 release process). For the java 11 issue, we
>> haven't
>>>>> even really removed support for java 7 in branch-2 (much less java 8),
>> so I
>>>>> feel moving to java 11 would go along with a move to branch 3. And as
>> you
>>>>> mentioned, if people really want to use java 11 on branch-2, we can
>> always
>>>>> revive branch-2. But for now I think the convenience of not needing to
>> port
>>>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> potentially needing to revive branch-2.
>>>>> 
>>>>> Jonathan Hung
>>>>> 
>>>>> 
>>>>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>>> 
>>>>>> +1 for 2.10.x as last release for 2.x version.
>>>>>> 
>>>>>> Software would become more compatible when more companies stress test
>>>>>> the same software and making improvements in trunk.  Some may be extra
>>>>>> caution on moving up the version because obligation internally to keep
>>>>>> things running.  Company obligation should not be the driving force to
>>>>>> maintain Hadoop branches.  There is no proper collaboration in the
>>>>>> community when every name brand company maintains its own Hadoop 2.x
>>>>>> version.  I think it would be more healthy for the community to
>> reduce the
>>>>>> branch forking and spend energy on trunk to harden the software.
>> This will
>>>>>> give more confidence to move up the version than trying to fix n
>>>>>> permutations breakage like Flash fixing the timeline.
>>>>>> 
>>>>>> Apache license stated, there is no warranty of any kind for code
>>>>>> contributions.  Fewer community release process should improve
>> software
>>>>>> quality when eyes are on trunk, and help steering toward the same end
>> goals.
>>>>>> 
>>>>>> regards,
>>>>>> Eric
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> Is it written anywhere what the difference is between a minor release
>>>>>>> and a
>>>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>>>> have
>>>>>>> looked around and I can't find anything other than some compatibility
>>>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>>>> think
>>>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>>>> alive.
>>>>>>> My current understanding is that we can't really break compatibility
>> in
>>>>>>> either a minor or point release. But the only mention of the
>> difference
>>>>>>> between minor and point releases is how to deal with Stable,
>> Evolving,
>>>>>>> and
>>>>>>> Unstable tags, and how to deal with changing default configuration
>>>>>>> values.
>>>>>>> So it seems like there really isn't a big official difference between
>>>>>>> the
>>>>>>> two. In my mind, the functional difference between the two is that
>> the
>>>>>>> minor releases may have added features and rewrites, while the point
>>>>>>> releases only have bug fixes. This might be an incorrect
>>>>>>> understanding, but
>>>>>>> that's what I have gathered from watching the releases over the last
>>>>>>> few
>>>>>>> years. Whether or not this is a correct understanding, I think that
>>>>>>> this
>>>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>>> 
>>>>>>> Given my assumed understanding of minor vs point releases, here are
>> the
>>>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>>>> correct me for anything you feel is missing or inadequate.
>>>>>>> Pros:
>>>>>>> - Features/rewrites/higher-risk patches are less likely to be put
>> into
>>>>>>> 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> Cons:
>>>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>>>> - An extra branch to maintain
>>>>>>>  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>>> patches to if they should go all the way back to 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> So on the one hand you get added stability in fewer features being
>>>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>>>> being
>>>>>>> committed. In a perfect world, we wouldn't have to make this
>> tradeoff.
>>>>>>> But
>>>>>>> we don't live in a perfect world and committers will make mistakes
>>>>>>> either
>>>>>>> because of lack of knowledge or simply because they made a mistake.
>> If
>>>>>>> we
>>>>>>> have a branch-2, committers will forget, not know to, or choose not
>> to
>>>>>>> (for
>>>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>>>> branch-2.10. If
>>>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>>>> feature in the 2.x line will err on the side of putting it into
>>>>>>> branch-2.10
>>>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>>>> quite
>>>>>>> a few assumptions here based on my own experiences, so I would like
>> to
>>>>>>> hear
>>>>>>> if others have similar or opposing views.
>>>>>>> 
>>>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>>>> killing
>>>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>>>> is why
>>>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>>>> trying
>>>>>>> to move forward, keeping as many companies on similar branches as
>>>>>>> possible
>>>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>>>> stability point of view, moving to 3.x is still scary and being able
>> to
>>>>>>> stay on 2.x until you are comfortable to move is very nice. The
>> 2.10.0
>>>>>>> bridge release effort has been very good at making it possible for
>>>>>>> people
>>>>>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>>>>> that
>>>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>>>> due to
>>>>>>> potential performance degradation at large scale.
>>>>>>> 
>>>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>>>> support to
>>>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>>>> immediately).
>>>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>>>> across
>>>>>>> point releases within the same minor release except if the JVM
>> version
>>>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>>>> it is
>>>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>>>> update).
>>>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>>>> catalyst for a branch-2 revival?
>>>>>>> 
>>>>>>> Not sure if this really leads to any sort of answer from me on
>> whether
>>>>>>> or
>>>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>>>> weighing in my mind. For me, the bigger problem beyond having
>> branch-2
>>>>>>> or
>>>>>>> not is committers not being on the same page with where they should
>>>>>>> commit
>>>>>>> their patches.
>>>>>>> 
>>>>>>> Eric
>>>>>>> 
>>>>>>> [1]
>>>>>>> 
>>>>>>> 
>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> [2]
>>>>>>> 
>>>>>>> 
>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> 
>>>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Konstantin,
>>>>>>>> 
>>>>>>>> Sure, I understand those concerns. On the other hand, I worry about
>>>>>>> the
>>>>>>>> stability of 2.10, since we will be on it for a couple of years at
>>>>>>> least.
>>>>>>>> I worry
>>>>>>>> that some committers may want to put new features into a branch 2
>>>>>>> release,
>>>>>>>> and without a branch-2, they will go directly into 2.10. Since we
>>>>>>> don't
>>>>>>>> always
>>>>>>>> catch corner cases or performance problems for some time (usually
>>>>>>> not
>>>>>>>> until
>>>>>>>> the release is deployed to a busy, 4-thousand node cluster), it
>> may
>>>>>>> be
>>>>>>>> very
>>>>>>>> difficult to back out those changes.
>>>>>>>> 
>>>>>>>> It sounds like I'm in the minority here, so I'm not nixing the
>> idea,
>>>>>>> but I
>>>>>>>> do
>>>>>>>> have these reservations.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Eric
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> <
>>>>>>>> shv.hadoop@gmail.com> wrote:
>>>>>>>> Hi Eric,
>>>>>>>> 
>>>>>>>> We had a long discussion on this list regarding making the 2.10
>>>>>>> release the
>>>>>>>> last of branch-2 releases. We intended 2.10 as a bridge release
>>>>>>> between
>>>>>>>> Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is
>>>>>>> not in
>>>>>>>> the picture right now, and many people may object this idea.
>>>>>>>> 
>>>>>>>> I understand Jonathan's proposal as an attempt to
>>>>>>>> 1. eliminate confusion which branches people should commit their
>>>>>>> back-ports
>>>>>>>> to
>>>>>>>> 2. save engineering effort committing to more branches than
>> necessary
>>>>>>>> 
>>>>>>>> "Branches are cheap" as our founder used to say. If we ever decide
>> to
>>>>>>>> release 2.11 we can resurrect the branch.
>>>>>>>> Until then I am in favor of Jonathan's proposal +1.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> --Konstantin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>> jyhung2357@gmail.com
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Eric for the comments - regarding your concerns, I feel
>> the
>>>>>>> pros
>>>>>>>>> outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>>>> are
>>>>>>>> much
>>>>>>>>> higher than a new 2.11 minor release. (There didn't seem to be
>> many
>>>>>>>> people
>>>>>>>>> outside of our company who expressed interest in getting new
>>>>>>> features to
>>>>>>>>> branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>> after
>>>>>>> 2.10.0
>>>>>>>>> release, there's 29 patches that have gone into branch-2 and 9 in
>>>>>>>>> branch-2.10, so it's already diverged quite a bit.
>>>>>>>>> 
>>>>>>>>> In any case, we can always reverse this decision if we really
>> need
>>>>>>> to, by
>>>>>>>>> recreating branch-2. But this proposal would reduce a lot of
>>>>>>> confusion
>>>>>>>> IMO.
>>>>>>>>> 
>>>>>>>>> Jonathan Hung
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>>>> epayne@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Jonathan for opening the discussion.
>>>>>>>>>> 
>>>>>>>>>> I am not in favor of this proposal. 2.10 was very recently
>>>>>>> released,
>>>>>>>> and
>>>>>>>>>> moving to 2.10 will take some time for the community. It seems
>>>>>>>> premature
>>>>>>>>> to
>>>>>>>>>> make a decision at this point that there will never be a need
>>>>>>> for a
>>>>>>>> 2.11
>>>>>>>>>> release.
>>>>>>>>>> 
>>>>>>>>>> -Eric
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> <
>>>>>>>>>> jyhung2357@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> Given the release of 2.10.0, and the fact that it's intended to
>>>>>>> be a
>>>>>>>>> bridge
>>>>>>>>>> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>> last
>>>>>>> minor
>>>>>>>>>> release line in branch-2. Currently, the main issue is that
>>>>>>> there's
>>>>>>>> many
>>>>>>>>>> fixes going into branch-2 (the theoretical 2.11.0) that's not
>>>>>>> going
>>>>>>>> into
>>>>>>>>>> branch-2.10 (which will become 2.10.1), so the fixes in
>> branch-2
>>>>>>> will
>>>>>>>>>> likely never see the light of day unless they are backported to
>>>>>>>>>> branch-2.10.
>>>>>>>>>> 
>>>>>>>>>> To do this, I propose we:
>>>>>>>>>> 
>>>>>>>>>> - Delete branch-2.10
>>>>>>>>>> - Rename branch-2 to branch-2.10
>>>>>>>>>> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>>>>>>> 
>>>>>>>>>> This way we get all the current branch-2 fixes into the 2.10.x
>>>>>>> release
>>>>>>>>>> line. Then the commit chain will look like: trunk -> branch-3.2
>>>>>>> ->
>>>>>>>>>> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>>>>>>> 
>>>>>>>>>> Thoughts?
>>>>>>>>>> 
>>>>>>>>>> Jonathan Hung
>>>>>>>>>> 
>>>>>>>>>> [1]
>>>>>>>>> 
>>>>>>> 
>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>> 

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Akira
It seems there was an INFRA ticket for that, INFRA-19581,
but the INFRA team closed it as "won't do". And yes, the branch is protected, so we can't delete it directly.

Ref: https://issues.apache.org/jira/browse/INFRA-19581

-Ayush

> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> 
> Thank you for your work, Jonathan.
> 
> I found branch-2 has been unintentionally pushed again. Would you remove it?
> I think the branch should be protected if possible.
> 
> -Akira
> 
>> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>> 
>> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> don't try to commit to it)
>> 
>> Completed procedure:
>> 
>>   - Verified everything in old branch-2.10 was in old branch-2
>>   - Delete old branch-2.10
>>   - Rename branch-2 to (new) branch-2.10
>>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>   - Renamed fix versions from 2.11.0 to 2.10.1
>>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> 
>> 
>> Jonathan Hung
>> 
>> 
>> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> wrote:
>> 
>>> FYI, starting the rename process, beginning with INFRA-19521.
>>> 
>>> Jonathan Hung
>>> 
>>> 
>>> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>> shv.hadoop@gmail.com>
>>> wrote:
>>> 
>>>> Hey guys,
>>>> 
>>>> I think we diverged a bit from the initial topic of this discussion,
>>>> which is removing branch-2.10, and changing the version of branch-2 from
>>>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> 2.x release" confused people.
>>>> It is in fact a wider matter that can be discussed when somebody actually
>>>> proposes to release 2.11, which I understand nobody does at the moment.
>>>> 
>>>> So if anybody objects to removing branch-2.10 please make an argument.
>>>> Otherwise we should go ahead and just do it next week.
>>>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> 
>>>> Thanks,
>>>> --Konstantin
>>>> 
>>>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for the detailed thoughts, everyone.
>>>>> 
>>>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> releases. As for putting features into minor/patch releases, if we keep
>>>>> the convention of putting new features only into minor releases, my
>>>>> assumption is still that it's unlikely people will want to get them into
>>>>> branch-2 (based on the 2.10.0 release process). For the java 11 issue, we
>>>>> haven't even really removed support for java 7 in branch-2 (much less
>>>>> java 8), so I feel moving to java 11 would go along with a move to
>>>>> branch 3. And as you mentioned, if people really want to use java 11 on
>>>>> branch-2, we can always revive branch-2. But for now I think the
>>>>> convenience of not needing to port to both branch-2 and branch-2.10 (and
>>>>> below) outweighs the cost of potentially needing to revive branch-2.
>>>>> 
>>>>> Jonathan Hung
>>>>> 
>>>>> 
>>>>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>>> 
>>>>>> +1 for 2.10.x as last release for 2.x version.
>>>>>> 
>>>>>> Software becomes more compatible when more companies stress test
>>>>>> the same software and make improvements in trunk.  Some may be extra
>>>>>> cautious about moving up a version because of internal obligations to
>>>>>> keep things running.  Company obligations should not be the driving
>>>>>> force for maintaining Hadoop branches.  There is no proper collaboration
>>>>>> in the community when every name-brand company maintains its own Hadoop
>>>>>> 2.x version.  I think it would be healthier for the community to reduce
>>>>>> branch forking and spend energy on trunk to harden the software.  This
>>>>>> will give more confidence to move up a version than trying to fix n
>>>>>> permutations of breakage, like the Flash fixing the timeline.
>>>>>> 
>>>>>> The Apache license states that there is no warranty of any kind for
>>>>>> code contributions.  Fewer community release processes should improve
>>>>>> software quality when eyes are on trunk, and help steer toward the same
>>>>>> end goals.
>>>>>> 
>>>>>> regards,
>>>>>> Eric
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> Is it written anywhere what the difference is between a minor release
>>>>>>> and a point/dot/maintenance (I'll use "point" from here on out)
>>>>>>> release? I have looked around and I can't find anything other than
>>>>>>> some compatibility documentation in 2.x that has since been removed in
>>>>>>> 3.x [1] [2]. I think this would help shape my opinion on whether or
>>>>>>> not to keep branch-2 alive.
>>>>>>> My current understanding is that we can't really break compatibility
>>>>>>> in either a minor or point release. But the only mention of the
>>>>>>> difference between minor and point releases is how to deal with
>>>>>>> Stable, Evolving, and Unstable tags, and how to deal with changing
>>>>>>> default configuration values.
>>>>>>> So it seems like there really isn't a big official difference between
>>>>>>> the two. In my mind, the functional difference between the two is that
>>>>>>> the minor releases may have added features and rewrites, while the
>>>>>>> point releases only have bug fixes. This might be an incorrect
>>>>>>> understanding, but that's what I have gathered from watching the
>>>>>>> releases over the last few years. Whether or not this is a correct
>>>>>>> understanding, I think that this needs to be documented somewhere,
>>>>>>> even if it is just a convention.
>>>>>>> 
>>>>>>> Given my assumed understanding of minor vs point releases, here are
>>>>>>> the pros/cons that I can think of for having a branch-2. Please add on
>>>>>>> or correct me for anything you feel is missing or inadequate.
>>>>>>> Pros:
>>>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>>>>   2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> Cons:
>>>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>>>> - An extra branch to maintain
>>>>>>>  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>>>    patches to if they should go all the way back to 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> So on the one hand you get added stability in fewer features being
>>>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>>>> being committed. In a perfect world, we wouldn't have to make this
>>>>>>> tradeoff. But we don't live in a perfect world and committers will
>>>>>>> make mistakes either because of lack of knowledge or simply because
>>>>>>> they made a mistake. If we have a branch-2, committers will forget,
>>>>>>> not know to, or choose not to (for whatever reason) commit valid bug
>>>>>>> fixes back all the way to branch-2.10. If we don't have a branch-2,
>>>>>>> committers who want their borderline risky feature in the 2.x line
>>>>>>> will err on the side of putting it into branch-2.10 instead of
>>>>>>> proposing the creation of a branch-2. Clearly I have made quite a few
>>>>>>> assumptions here based on my own experiences, so I would like to hear
>>>>>>> if others have similar or opposing views.
>>>>>>> 
>>>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>>>> killing branch-2 is due to an effort to push the community towards
>>>>>>> 3.x. This is why I have added movement to 3.x as both a pro and a con.
>>>>>>> As a community trying to move forward, keeping as many companies on
>>>>>>> similar branches as possible is a good way to make sure the code is
>>>>>>> well-tested. However, from a stability point of view, moving to 3.x is
>>>>>>> still scary and being able to stay on 2.x until you are comfortable to
>>>>>>> move is very nice. The 2.10.0 bridge release effort has been very good
>>>>>>> at making it possible for people to move from 2.x to 3.x, but the diff
>>>>>>> between 2.x and 3.x is so large that it is reasonable for companies to
>>>>>>> want to be extra cautious with 3.x due to potential performance
>>>>>>> degradation at large scale.
>>>>>>> 
>>>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>>>> support to 2.x, surely not everyone is going to want that (at least
>>>>>>> not immediately). The 2.10 documentation states, "The JVM requirements
>>>>>>> will not change across point releases within the same minor release
>>>>>>> except if the JVM version under question becomes unsupported" [1], so
>>>>>>> this would warrant a 2.11 release until Java 8 becomes unsupported
>>>>>>> (though one could argue that it is already unsupported since Oracle is
>>>>>>> no longer giving public Java 8 updates).
>>>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>>>> catalyst for a branch-2 revival?
>>>>>>> 
>>>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>>>> or not we should keep branch-2 alive, but these are the things that I
>>>>>>> am weighing in my mind. For me, the bigger problem beyond having
>>>>>>> branch-2 or not is committers not being on the same page with where
>>>>>>> they should commit their patches.
>>>>>>> 
>>>>>>> Eric
>>>>>>> 
>>>>>>> [1]
>>>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> [2]
>>>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> 
>>>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Konstantin,
>>>>>>>> 
>>>>>>>> Sure, I understand those concerns. On the other hand, I worry about
>>>>>>>> the stability of 2.10, since we will be on it for a couple of years
>>>>>>>> at least. I worry that some committers may want to put new features
>>>>>>>> into a branch 2 release, and without a branch-2, they will go
>>>>>>>> directly into 2.10. Since we don't always catch corner cases or
>>>>>>>> performance problems for some time (usually not until the release is
>>>>>>>> deployed to a busy, 4-thousand node cluster), it may be very
>>>>>>>> difficult to back out those changes.
>>>>>>>> 
>>>>>>>> It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>>>>> but I do have these reservations.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Eric
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> <
>>>>>>>> shv.hadoop@gmail.com> wrote:
>>>>>>>> Hi Eric,
>>>>>>>> 
>>>>>>>> We had a long discussion on this list regarding making the 2.10
>>>>>>>> release the last of branch-2 releases. We intended 2.10 as a bridge
>>>>>>>> release between Hadoop 2 and 3. We may have bug-fix releases of 2.10,
>>>>>>>> but 2.11 is not in the picture right now, and many people may object
>>>>>>>> to this idea.
>>>>>>>> 
>>>>>>>> I understand Jonathan's proposal as an attempt to
>>>>>>>> 1. eliminate confusion about which branches people should commit
>>>>>>>>    their back-ports to
>>>>>>>> 2. save engineering effort committing to more branches than necessary
>>>>>>>> 
>>>>>>>> "Branches are cheap" as our founder used to say. If we ever decide to
>>>>>>>> release 2.11 we can resurrect the branch.
>>>>>>>> Until then I am in favor of Jonathan's proposal +1.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> --Konstantin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Eric for the comments - regarding your concerns, I feel the
>>>>>>>>> pros outweigh the cons. To me, the chances of patch releases on
>>>>>>>>> 2.10.x are much higher than a new 2.11 minor release. (There didn't
>>>>>>>>> seem to be many people outside of our company who expressed
>>>>>>>>> interest in getting new features to branch-2 prior to the 2.10.0
>>>>>>>>> release.) Even now, a few weeks after 2.10.0 release, there's 29
>>>>>>>>> patches that have gone into branch-2 and 9 in branch-2.10, so it's
>>>>>>>>> already diverged quite a bit.
>>>>>>>>> 
>>>>>>>>> In any case, we can always reverse this decision if we really need
>>>>>>>>> to, by recreating branch-2. But this proposal would reduce a lot of
>>>>>>>>> confusion IMO.
>>>>>>>>> 
>>>>>>>>> Jonathan Hung
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>>>> epayne@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Jonathan for opening the discussion.
>>>>>>>>>> 
>>>>>>>>>> I am not in favor of this proposal. 2.10 was very recently
>>>>>>>>>> released, and moving to 2.10 will take some time for the
>>>>>>>>>> community. It seems premature to make a decision at this point
>>>>>>>>>> that there will never be a need for a 2.11 release.
>>>>>>>>>> 
>>>>>>>>>> -Eric
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> <
>>>>>>>>>> jyhung2357@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> Given the release of 2.10.0, and the fact that it's intended to
>>>>>>>>>> be a bridge release to Hadoop 3.x [1], I'm proposing we make
>>>>>>>>>> 2.10.x the last minor release line in branch-2. Currently, the
>>>>>>>>>> main issue is that there's many fixes going into branch-2 (the
>>>>>>>>>> theoretical 2.11.0) that's not going into branch-2.10 (which will
>>>>>>>>>> become 2.10.1), so the fixes in branch-2 will likely never see
>>>>>>>>>> the light of day unless they are backported to branch-2.10.
>>>>>>>>>> 
>>>>>>>>>> To do this, I propose we:
>>>>>>>>>> 
>>>>>>>>>> - Delete branch-2.10
>>>>>>>>>> - Rename branch-2 to branch-2.10
>>>>>>>>>> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>>>>>>> 
>>>>>>>>>> This way we get all the current branch-2 fixes into the 2.10.x
>>>>>>>>>> release line. Then the commit chain will look like: trunk ->
>>>>>>>>>> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 ->
>>>>>>>>>> branch-2.8
>>>>>>>>>> 
>>>>>>>>>> Thoughts?
>>>>>>>>>> 
>>>>>>>>>> Jonathan Hung
>>>>>>>>>> 
>>>>>>>>>> [1]
>>>>>>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>> 

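A note on the superset check from the completed procedure quoted above
("Verified everything in old branch-2.10 was in old branch-2"): backported
commits get new hashes, so comparing commit hashes between the two branches
is not enough. One possible way to approximate the check is to compare the
JIRA keys at the start of commit subjects. The sketch below is only an
illustration, not the procedure that was actually used. The helper names and
the sample subjects are hypothetical, and it assumes each commit subject
starts with a key such as HADOOP-1234 (for example, as printed by
`git log --format=%s`).

```python
import re

# Assumption: commit subjects begin with a JIRA key from one of these projects.
JIRA_KEY = re.compile(r"^(HADOOP|HDFS|YARN|MAPREDUCE)-\d+")


def jira_keys(subjects):
    """Collect the leading JIRA key of every subject that has one."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.match(subject.strip())
        if match:
            keys.add(match.group(0))
    return keys


def missing_from(new_branch_subjects, old_branch_subjects):
    """Return JIRA keys present in the old branch but absent from the new one."""
    return sorted(jira_keys(old_branch_subjects) - jira_keys(new_branch_subjects))


# Hypothetical sample data standing in for the two branches' commit subjects.
old_branch_2_10 = ["YARN-1001. Example fix A.", "HDFS-2002. Example fix B."]
old_branch_2 = ["HDFS-2002. Example fix B.", "HADOOP-3003. Example fix C."]

# An empty list would mean branch-2 covers everything in branch-2.10.
print(missing_from(old_branch_2, old_branch_2_10))
```

Reverts and commits without a JIRA key are not handled here, so an empty
result would still need a manual review before deleting a branch.
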
Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Ayush Saxena <ay...@gmail.com>.
Hi Akira
It seems there was an INFRA ticket for that, INFRA-19581,
but the INFRA team closed it as "won't do". And yes, the branch is protected, so we can't delete it directly.

Ref: https://issues.apache.org/jira/browse/INFRA-19581

-Ayush

> On 23-Dec-2019, at 5:03 PM, Akira Ajisaka <aa...@apache.org> wrote:
> 
> Thank you for your work, Jonathan.
> 
> I found branch-2 has been unintentionally pushed again. Would you remove it?
> I think the branch should be protected if possible.
> 
> -Akira
> 
>> On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:
>> 
>> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
>> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
>> don't try to commit to it)
>> 
>> Completed procedure:
>> 
>>   - Verified everything in old branch-2.10 was in old branch-2
>>   - Delete old branch-2.10
>>   - Rename branch-2 to (new) branch-2.10
>>   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>>   - Renamed fix versions from 2.11.0 to 2.10.1
>>   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>> 
>> 
>> Jonathan Hung
>> 
>> 
>> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
>> wrote:
>> 
>>> FYI, starting the rename process, beginning with INFRA-19521.
>>> 
>>> Jonathan Hung
>>> 
>>> 
>>> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
>> shv.hadoop@gmail.com>
>>> wrote:
>>> 
>>>> Hey guys,
>>>> 
>>>> I think we diverged a bit from the initial topic of this discussion,
>>>> which is removing branch-2.10, and changing the version of branch-2 from
>>>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>>>> Sounds like the subject line for this thread "Making 2.10 the last minor
>>>> 2.x release" confused people.
>>>> It is in fact a wider matter that can be discussed when somebody
>> actually
>>>> proposes to release 2.11, which I understand nobody does at the moment.
>>>> 
>>>> So if anybody objects removing branch-2.10 please make an argument.
>>>> Otherwise we should go ahead and just do it next week.
>>>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>>> 
>>>> Thanks,
>>>> --Konstantin
>>>> 
>>>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Thanks for the detailed thoughts, everyone.
>>>>> 
>>>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>>>> releases. As for putting features into minor/patch releases, if we
>> keep the
>>>>> convention of putting new features only into minor releases, my
>> assumption
>>>>> is still that it's unlikely people will want to get them into branch-2
>>>>> (based on the 2.10.0 release process). For the java 11 issue, we
>> haven't
>>>>> even really removed support for java 7 in branch-2 (much less java 8),
>> so I
>>>>> feel moving to java 11 would go along with a move to branch 3. And as
>> you
>>>>> mentioned, if people really want to use java 11 on branch-2, we can
>> always
>>>>> revive branch-2. But for now I think the convenience of not needing to
>> port
>>>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>>>> potentially needing to revive branch-2.
>>>>> 
>>>>> Jonathan Hung
>>>>> 
>>>>> 
>>>>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>>>> 
>>>>>> +1 for 2.10.x as last release for 2.x version.
>>>>>> 
>>>>>> Software would become more compatible when more companies stress test
>>>>>> the same software and making improvements in trunk.  Some may be extra
>>>>>> caution on moving up the version because obligation internally to keep
>>>>>> things running.  Company obligation should not be the driving force to
>>>>>> maintain Hadoop branches.  There is no proper collaboration in the
>>>>>> community when every name brand company maintains its own Hadoop 2.x
>>>>>> version.  I think it would be more healthy for the community to
>> reduce the
>>>>>> branch forking and spend energy on trunk to harden the software.
>> This will
>>>>>> give more confidence to move up the version than trying to fix n
>>>>>> permutations breakage like Flash fixing the timeline.
>>>>>> 
>>>>>> Apache license stated, there is no warranty of any kind for code
>>>>>> contributions.  Fewer community release process should improve
>> software
>>>>>> quality when eyes are on trunk, and help steering toward the same end
>> goals.
>>>>>> 
>>>>>> regards,
>>>>>> Eric
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> Is it written anywhere what the difference is between a minor release
>>>>>>> and a
>>>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>>>> have
>>>>>>> looked around and I can't find anything other than some compatibility
>>>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>>>> think
>>>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>>>> alive.
>>>>>>> My current understanding is that we can't really break compatibility
>> in
>>>>>>> either a minor or point release. But the only mention of the
>> difference
>>>>>>> between minor and point releases is how to deal with Stable,
>> Evolving,
>>>>>>> and
>>>>>>> Unstable tags, and how to deal with changing default configuration
>>>>>>> values.
>>>>>>> So it seems like there really isn't a big official difference between
>>>>>>> the
>>>>>>> two. In my mind, the functional difference between the two is that
>> the
>>>>>>> minor releases may have added features and rewrites, while the point
>>>>>>> releases only have bug fixes. This might be an incorrect
>>>>>>> understanding, but
>>>>>>> that's what I have gathered from watching the releases over the last
>>>>>>> few
>>>>>>> years. Whether or not this is a correct understanding, I think that
>>>>>>> this
>>>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>>> 
>>>>>>> Given my assumed understanding of minor vs point releases, here are
>> the
>>>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>>>> correct me for anything you feel is missing or inadequate.
>>>>>>> Pros:
>>>>>>> - Features/rewrites/higher-risk patches are less likely to be put
>> into
>>>>>>> 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> Cons:
>>>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>>>> - An extra branch to maintain
>>>>>>>  - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>>> patches to if they should go all the way back to 2.10.x
>>>>>>> - It is less necessary to move to 3.x
>>>>>>> 
>>>>>>> So on the one hand you get added stability in fewer features being
>>>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>>>> being
>>>>>>> committed. In a perfect world, we wouldn't have to make this
>> tradeoff.
>>>>>>> But
>>>>>>> we don't live in a perfect world and committers will make mistakes
>>>>>>> either
>>>>>>> because of lack of knowledge or simply because they made a mistake.
>> If
>>>>>>> we
>>>>>>> have a branch-2, committers will forget, not know to, or choose not
>> to
>>>>>>> (for
>>>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>>>> branch-2.10. If
>>>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>>>> feature in the 2.x line will err on the side of putting it into
>>>>>>> branch-2.10
>>>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>>>> quite
>>>>>>> a few assumptions here based on my own experiences, so I would like
>> to
>>>>>>> hear
>>>>>>> if others have similar or opposing views.
>>>>>>> 
>>>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>>>> killing
>>>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>>>> is why
>>>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>>>> trying
>>>>>>> to move forward, keeping as many companies on similar branches as
>>>>>>> possible
>>>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>>>> stability point of view, moving to 3.x is still scary and being able
>> to
>>>>>>> stay on 2.x until you are comfortable to move is very nice. The
>> 2.10.0
>>>>>>> bridge release effort has been very good at making it possible for
>>>>>>> people
>>>>>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>>>>> that
>>>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>>>> due to
>>>>>>> potential performance degradation at large scale.
>>>>>>> 
>>>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>>>> support to
>>>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>>>> immediately).
>>>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>>>> across
>>>>>>> point releases within the same minor release except if the JVM
>> version
>>>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>>>> it is
>>>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>>>> update).
>>>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>>>> catalyst for a branch-2 revival?
>>>>>>> 
>>>>>>> Not sure if this really leads to any sort of answer from me on
>> whether
>>>>>>> or
>>>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>>>> weighing in my mind. For me, the bigger problem beyond having
>> branch-2
>>>>>>> or
>>>>>>> not is committers not being on the same page with where they should
>>>>>>> commit
>>>>>>> their patches.
>>>>>>> 
>>>>>>> Eric
>>>>>>> 
>>>>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>>> 
>>>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
>>> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Konstantin,
>>>>>>>> 
>>>>>>>> Sure, I understand those concerns. On the other hand, I worry about
>>>>>>> the
>>>>>>>> stability of 2.10, since we will be on it for a couple of years at
>>>>>>> least.
>>>>>>>> I worry
>>>>>>>> that some committers may want to put new features into a branch 2
>>>>>>> release,
>>>>>>>> and without a branch-2, they will go directly into 2.10. Since we
>>>>>>> don't
>>>>>>>> always
>>>>>>>> catch corner cases or performance problems for some time (usually
>>>>>>> not
>>>>>>>> until
>>>>>>>> the release is deployed to a busy, 4-thousand node cluster), it
>> may
>>>>>>> be
>>>>>>>> very
>>>>>>>> difficult to back out those changes.
>>>>>>>> 
>>>>>>>> It sounds like I'm in the minority here, so I'm not nixing the
>> idea,
>>>>>>> but I
>>>>>>>> do
>>>>>>>> have these reservations.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> -Eric
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
>> <
>>>>>>>> shv.hadoop@gmail.com> wrote:
>>>>>>>> Hi Eric,
>>>>>>>> 
>>>>>>>> We had a long discussion on this list regarding making the 2.10 release
>>>>>>>> the last of the branch-2 releases. We intended 2.10 as a bridge release
>>>>>>>> between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
>>>>>>>> not in the picture right now, and many people may object to this idea.
>>>>>>>> 
>>>>>>>> I understand Jonathan's proposal as an attempt to
>>>>>>>> 1. eliminate confusion which branches people should commit their
>>>>>>> back-ports
>>>>>>>> to
>>>>>>>> 2. save engineering effort committing to more branches than
>> necessary
>>>>>>>> 
>>>>>>>> "Branches are cheap" as our founder used to say. If we ever decide
>> to
>>>>>>>> release 2.11 we can resurrect the branch.
>>>>>>>> Until then I am in favor of Jonathan's proposal +1.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> --Konstantin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
>> jyhung2357@gmail.com
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Eric for the comments - regarding your concerns, I feel
>> the
>>>>>>> pros
>>>>>>>>> outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>>>> are
>>>>>>>> much
>>>>>>>>> higher than a new 2.11 minor release. (There didn't seem to be
>> many
>>>>>>>> people
>>>>>>>>> outside of our company who expressed interest in getting new
>>>>>>> features to
>>>>>>>>> branch-2 prior to the 2.10.0 release.) Even now, a few weeks
>> after
>>>>>>> 2.10.0
>>>>>>>>> release, there's 29 patches that have gone into branch-2 and 9 in
>>>>>>>>> branch-2.10, so it's already diverged quite a bit.
>>>>>>>>> 
>>>>>>>>> In any case, we can always reverse this decision if we really
>> need
>>>>>>> to, by
>>>>>>>>> recreating branch-2. But this proposal would reduce a lot of
>>>>>>> confusion
>>>>>>>> IMO.
>>>>>>>>> 
>>>>>>>>> Jonathan Hung
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>>>> epayne@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Jonathan for opening the discussion.
>>>>>>>>>> 
>>>>>>>>>> I am not in favor of this proposal. 2.10 was very recently
>>>>>>> released,
>>>>>>>> and
>>>>>>>>>> moving to 2.10 will take some time for the community. It seems
>>>>>>>> premature
>>>>>>>>> to
>>>>>>>>>> make a decision at this point that there will never be a need
>>>>>>> for a
>>>>>>>> 2.11
>>>>>>>>>> release.
>>>>>>>>>> 
>>>>>>>>>> -Eric
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
>> <
>>>>>>>>>> jyhung2357@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi folks,
>>>>>>>>>> 
>>>>>>>>>> Given the release of 2.10.0, and the fact that it's intended to be a
>>>>>>>>>> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>>>>>>> minor release line in branch-2. Currently, the main issue is that there's
>>>>>>>>>> many fixes going into branch-2 (the theoretical 2.11.0) that's not going
>>>>>>>>>> into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
>>>>>>>>>> likely never see the light of day unless they are backported to
>>>>>>>>>> branch-2.10.
>>>>>>>>>> 
>>>>>>>>>> To do this, I propose we:
>>>>>>>>>> 
>>>>>>>>>> - Delete branch-2.10
>>>>>>>>>> - Rename branch-2 to branch-2.10
>>>>>>>>>> - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>>>>>>> 
>>>>>>>>>> This way we get all the current branch-2 fixes into the 2.10.x
>>>>>>> release
>>>>>>>>>> line. Then the commit chain will look like: trunk -> branch-3.2
>>>>>>> ->
>>>>>>>>>> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>>>>>>> 
>>>>>>>>>> Thoughts?
>>>>>>>>>> 
>>>>>>>>>> Jonathan Hung
>>>>>>>>>> 
>>>>>>>>>> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>> 
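
For anyone unsure where a back-port should go after the rename, here is a
minimal Python sketch of the commit chain quoted above. The helper name
backport_targets and the "commit to every branch in between" policy are
assumptions made for illustration, not an official Hadoop tool or rule.

```python
# Commit chain as stated in the thread after the rename, newest first.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch):
    """Return every branch a fix lands on, from trunk down to oldest_branch.

    Hypothetical helper: it assumes a fix is committed to each branch in the
    chain between trunk and the oldest branch that needs it.
    """
    if oldest_branch not in COMMIT_CHAIN:
        # branch-2 was renamed to branch-2.10, so it is rejected here.
        raise ValueError("unknown or retired branch: " + oldest_branch)
    return COMMIT_CHAIN[: COMMIT_CHAIN.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(backport_targets("branch-2.10"))
```

Rejecting "branch-2" outright mirrors the request in the thread not to
commit to it any more.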

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you for your work, Jonathan.

I found branch-2 has been unintentionally pushed again. Would you remove it?
I think the branch should be protected if possible.

-Akira

On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:

> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>    - Verified everything in old branch-2.10 was in old branch-2
>    - Delete old branch-2.10
>    - Rename branch-2 to (new) branch-2.10
>    - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>    - Renamed fix versions from 2.11.0 to 2.10.1
>    - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
> Jonathan Hung
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> shv.hadoop@gmail.com>
> > wrote:
> >
> >> Hey guys,
> >>
> >> I think we diverged a bit from the initial topic of this discussion,
> >> which is removing branch-2.10, and changing the version of branch-2 from
> >> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >> Sounds like the subject line for this thread "Making 2.10 the last minor
> >> 2.x release" confused people.
> >> It is in fact a wider matter that can be discussed when somebody
> actually
> >> proposes to release 2.11, which I understand nobody does at the moment.
> >>
> >> So if anybody objects removing branch-2.10 please make an argument.
> >> Otherwise we should go ahead and just do it next week.
> >> I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the detailed thoughts, everyone.
> >>>
> >>> Eric (Badger), my understanding is the same as yours re. minor vs patch
> >>> releases. As for putting features into minor/patch releases, if we
> keep the
> >>> convention of putting new features only into minor releases, my
> assumption
> >>> is still that it's unlikely people will want to get them into branch-2
> >>> (based on the 2.10.0 release process). For the java 11 issue, we
> haven't
> >>> even really removed support for java 7 in branch-2 (much less java 8),
> so I
> >>> feel moving to java 11 would go along with a move to branch 3. And as
> you
> >>> mentioned, if people really want to use java 11 on branch-2, we can
> always
> >>> revive branch-2. But for now I think the convenience of not needing to
> port
> >>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >>> potentially needing to revive branch-2.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >>>
> >>>> +1 for 2.10.x as last release for 2.x version.
> >>>>
> >>>> Software would become more compatible when more companies stress test
> >>>> the same software and making improvements in trunk.  Some may be extra
> >>>> caution on moving up the version because obligation internally to keep
> >>>> things running.  Company obligation should not be the driving force to
> >>>> maintain Hadoop branches.  There is no proper collaboration in the
> >>>> community when every name brand company maintains its own Hadoop 2.x
> >>>> version.  I think it would be more healthy for the community to
> reduce the
> >>>> branch forking and spend energy on trunk to harden the software.
> This will
> >>>> give more confidence to move up the version than trying to fix n
> >>>> permutations breakage like Flash fixing the timeline.
> >>>>
> >>>> Apache license stated, there is no warranty of any kind for code
> >>>> contributions.  Fewer community release process should improve
> software
> >>>> quality when eyes are on trunk, and help steering toward the same end
> goals.
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >>>> <eb...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> Is it written anywhere what the difference is between a minor release
> >>>>> and a
> >>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
> >>>>> have
> >>>>> looked around and I can't find anything other than some compatibility
> >>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
> >>>>> think
> >>>>> this would help shape my opinion on whether or not to keep branch-2
> >>>>> alive.
> >>>>> My current understanding is that we can't really break compatibility
> in
> >>>>> either a minor or point release. But the only mention of the
> difference
> >>>>> between minor and point releases is how to deal with Stable,
> Evolving,
> >>>>> and
> >>>>> Unstable tags, and how to deal with changing default configuration
> >>>>> values.
> >>>>> So it seems like there really isn't a big official difference between
> >>>>> the
> >>>>> two. In my mind, the functional difference between the two is that
> the
> >>>>> minor releases may have added features and rewrites, while the point
> >>>>> releases only have bug fixes. This might be an incorrect
> >>>>> understanding, but
> >>>>> that's what I have gathered from watching the releases over the last
> >>>>> few
> >>>>> years. Whether or not this is a correct understanding, I think that
> >>>>> this
> >>>>> needs to be documented somewhere, even if it is just a convention.
> >>>>>
> >>>>> Given my assumed understanding of minor vs point releases, here are
> the
> >>>>> pros/cons that I can think of for having a branch-2. Please add on or
> >>>>> correct me for anything you feel is missing or inadequate.
> >>>>> Pros:
> >>>>> - Features/rewrites/higher-risk patches are less likely to be put
> into
> >>>>> 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> Cons:
> >>>>> - Bug fixes are less likely to be put into 2.10.x
> >>>>> - An extra branch to maintain
> >>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
> >>>>> patches to if they should go all the way back to 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> So on the one hand you get added stability in fewer features being
> >>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
> >>>>> being
> >>>>> committed. In a perfect world, we wouldn't have to make this
> tradeoff.
> >>>>> But
> >>>>> we don't live in a perfect world and committers will make mistakes
> >>>>> either
> >>>>> because of lack of knowledge or simply because they made a mistake.
> If
> >>>>> we
> >>>>> have a branch-2, committers will forget, not know to, or choose not
> to
> >>>>> (for
> >>>>> whatever reason) commit valid bug fixes back all the way to
> >>>>> branch-2.10. If
> >>>>> we don't have a branch-2, committers who want their borderline risky
> >>>>> feature in the 2.x line will err on the side of putting it into
> >>>>> branch-2.10
> >>>>> instead of proposing the creation of a branch-2. Clearly I have made
> >>>>> quite
> >>>>> a few assumptions here based on my own experiences, so I would like
> to
> >>>>> hear
> >>>>> if others have similar or opposing views.
> >>>>>
> >>>>> As far as 3.x goes, to me it seems like some of the reasoning for
> >>>>> killing
> >>>>> branch-2 is due to an effort to push the community towards 3.x. This
> >>>>> is why
> >>>>> I have added movement to 3.x as both a pro and a con. As a community
> >>>>> trying
> >>>>> to move forward, keeping as many companies on similar branches as
> >>>>> possible
> >>>>> is a good way to make sure the code is well-tested. However, from a
> >>>>> stability point of view, moving to 3.x is still scary and being able
> to
> >>>>> stay on 2.x until you are comfortable to move is very nice. The
> 2.10.0
> >>>>> bridge release effort has been very good at making it possible for
> >>>>> people
> >>>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
> >>>>> that
> >>>>> it is reasonable for companies to want to be extra cautious with 3.x
> >>>>> due to
> >>>>> potential performance degradation at large scale.
> >>>>>
> >>>>> A question I'm pondering is what happens when we move to Java 11 and
> >>>>> someone is still on 2.x? If they want to backport HADOOP-15338
> >>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >>>>> support to
> >>>>> 2.x, surely not everyone is going to want that (at least not
> >>>>> immediately).
> >>>>> The 2.10 documentation states, "The JVM requirements will not change
> >>>>> across
> >>>>> point releases within the same minor release except if the JVM
> version
> >>>>> under question becomes unsupported" [1], so this would warrant a 2.11
> >>>>> release until Java 8 becomes unsupported (though one could argue that
> >>>>> it is
> >>>>> already unsupported since Oracle is no longer giving public Java 8
> >>>>> update).
> >>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
> >>>>> catalyst for a branch-2 revival?
> >>>>>
> >>>>> Not sure if this really leads to any sort of answer from me on
> whether
> >>>>> or
> >>>>> not we should keep branch-2 alive, but these are the things that I am
> >>>>> weighing in my mind. For me, the bigger problem beyond having
> branch-2
> >>>>> or
> >>>>> not is committers not being on the same page with where they should
> >>>>> commit
> >>>>> their patches.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>>
> >>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
> >
> >>>>> wrote:
> >>>>>
> >>>>> > Hi Konstantin,
> >>>>> >
> >>>>> > Sure, I understand those concerns. On the other hand, I worry about
> >>>>> the
> >>>>> > stability of 2.10, since we will be on it for a couple of years at
> >>>>> least.
> >>>>> > I worry
> >>>>> >  that some committers may want to put new features into a branch 2
> >>>>> release,
> >>>>> >  and without a branch-2, they will go directly into 2.10. Since we
> >>>>> don't
> >>>>> > always
> >>>>> >  catch corner cases or performance problems for some time (usually
> >>>>> not
> >>>>> > until
> >>>>> >  the release is deployed to a busy, 4-thousand node cluster), it
> may
> >>>>> be
> >>>>> > very
> >>>>> >  difficult to back out those changes.
> >>>>> >
> >>>>> > It sounds like I'm in the minority here, so I'm not nixing the
> idea,
> >>>>> but I
> >>>>> > do
> >>>>> >  have these reservations.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > -Eric
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <
> >>>>> > shv.hadoop@gmail.com> wrote:
> >>>>> > Hi Eric,
> >>>>> >
> >>>>> > We had a long discussion on this list regarding making the 2.10 release
> >>>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
> >>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is
> >>>>> > not in the picture right now, and many people may object to this idea.
> >>>>> >
> >>>>> > I understand Jonathan's proposal as an attempt to
> >>>>> > 1. eliminate confusion which branches people should commit their
> >>>>> back-ports
> >>>>> > to
> >>>>> > 2. save engineering effort committing to more branches than
> necessary
> >>>>> >
> >>>>> > "Branches are cheap" as our founder used to say. If we ever decide
> to
> >>>>> > release 2.11 we can resurrect the branch.
> >>>>> > Until then I am in favor of Jonathan's proposal +1.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > --Konstantin
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> jyhung2357@gmail.com
> >>>>> >
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks Eric for the comments - regarding your concerns, I feel
> the
> >>>>> pros
> >>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
> >>>>> are
> >>>>> > much
> >>>>> > > higher than a new 2.11 minor release. (There didn't seem to be
> many
> >>>>> > people
> >>>>> > > outside of our company who expressed interest in getting new
> >>>>> features to
> >>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> after
> >>>>> 2.10.0
> >>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
> >>>>> > > branch-2.10, so it's already diverged quite a bit.
> >>>>> > >
> >>>>> > > In any case, we can always reverse this decision if we really
> need
> >>>>> to, by
> >>>>> > > recreating branch-2. But this proposal would reduce a lot of
> >>>>> confusion
> >>>>> > IMO.
> >>>>> > >
> >>>>> > > Jonathan Hung
> >>>>> > >
> >>>>> > >
> >>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >>>>> epayne@apache.org>
> >>>>> > > wrote:
> >>>>> > >
> >>>>> > > > Thanks Jonathan for opening the discussion.
> >>>>> > > >
> >>>>> > > > I am not in favor of this proposal. 2.10 was very recently
> >>>>> released,
> >>>>> > and
> >>>>> > > > moving to 2.10 will take some time for the community. It seems
> >>>>> > premature
> >>>>> > > to
> >>>>> > > > make a decision at this point that there will never be a need
> >>>>> for a
> >>>>> > 2.11
> >>>>> > > > release.
> >>>>> > > >
> >>>>> > > > -Eric
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <
> >>>>> > > > jyhung2357@gmail.com> wrote:
> >>>>> > > >
> >>>>> > > > Hi folks,
> >>>>> > > >
> >>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
> >>>>> > > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> >>>>> > > > minor release line in branch-2. Currently, the main issue is that there's
> >>>>> > > > many fixes going into branch-2 (the theoretical 2.11.0) that's not going
> >>>>> > > > into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> >>>>> > > > likely never see the light of day unless they are backported to
> >>>>> > > > branch-2.10.
> >>>>> > > >
> >>>>> > > > To do this, I propose we:
> >>>>> > > >
> >>>>> > > >  - Delete branch-2.10
> >>>>> > > >  - Rename branch-2 to branch-2.10
> >>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >>>>> > > >
> >>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
> >>>>> release
> >>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
> >>>>> ->
> >>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >>>>> > > >
> >>>>> > > > Thoughts?
> >>>>> > > >
> >>>>> > > > Jonathan Hung
> >>>>> > > >
> >>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
>
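
The first step of the completed procedure (verifying that everything in the
old branch-2.10 was also in the old branch-2) could be checked roughly as
below. This is only a sketch: it assumes each commit subject starts with its
JIRA key, the function names are hypothetical, and the sample subjects are
made-up placeholders rather than real commits. In practice the subject lists
might come from something like `git log --format=%s` on each branch.

```python
import re

# JIRA keys for the four projects named in the thread.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the JIRA keys mentioned in a list of commit subject lines."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def missing_from(superset_subjects, subset_subjects):
    """Return keys found in subset_subjects but absent from superset_subjects."""
    return sorted(jira_keys(subset_subjects) - jira_keys(superset_subjects))


if __name__ == "__main__":
    # Placeholder subjects for illustration only; not real Hadoop commits.
    old_branch_2 = [
        "YARN-00001. Placeholder fix A.",
        "HDFS-00002. Placeholder fix B.",
        "HADOOP-00003. Placeholder fix C.",
    ]
    old_branch_2_10 = [
        "YARN-00001. Placeholder fix A.",
        "MAPREDUCE-00004. Placeholder fix D.",
    ]
    # Anything listed here would need a back-port before branch-2.10 is deleted.
    print(missing_from(old_branch_2, old_branch_2_10))
```

An empty result would mean the old branch-2 is a superset of the old
branch-2.10 by JIRA key; it says nothing about whether the patches are
identical in content.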

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you for your work, Jonathan.

I found branch-2 has been unintentionally pushed again. Would you remove it?
I think the branch should be protected if possible.

-Akira

On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:

> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>    - Verified everything in old branch-2.10 was in old branch-2
>    - Delete old branch-2.10
>    - Rename branch-2 to (new) branch-2.10
>    - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>    - Renamed fix versions from 2.11.0 to 2.10.1
>    - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
> Jonathan Hung
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> shv.hadoop@gmail.com>
> > wrote:
> >
> >> Hey guys,
> >>
> >> I think we diverged a bit from the initial topic of this discussion,
> >> which is removing branch-2.10, and changing the version of branch-2 from
> >> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >> Sounds like the subject line for this thread "Making 2.10 the last minor
> >> 2.x release" confused people.
> >> It is in fact a wider matter that can be discussed when somebody
> >> actually proposes to release 2.11, which I understand nobody does at
> >> the moment.
> >>
> >> So if anybody objects to removing branch-2.10, please make an argument.
> >> Otherwise we should go ahead and just do it next week.
> >> I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the detailed thoughts, everyone.
> >>>
> >>> Eric (Badger), my understanding is the same as yours re. minor vs patch
> >>> releases. As for putting features into minor/patch releases, if we keep
> >>> the convention of putting new features only into minor releases, my
> >>> assumption is still that it's unlikely people will want to get them into
> >>> branch-2 (based on the 2.10.0 release process). For the Java 11 issue, we
> >>> haven't even really removed support for Java 7 in branch-2 (much less
> >>> Java 8), so I feel moving to Java 11 would go along with a move to
> >>> branch 3. And as you mentioned, if people really want to use Java 11 on
> >>> branch-2, we can always revive branch-2. But for now I think the
> >>> convenience of not needing to port to both branch-2 and branch-2.10 (and
> >>> below) outweighs the cost of potentially needing to revive branch-2.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >>>
> >>>> +1 for 2.10.x as the last release for the 2.x version.
> >>>>
> >>>> Software becomes more compatible when more companies stress test the
> >>>> same software and make improvements in trunk. Some may be extra
> >>>> cautious about moving up the version because of internal obligations to
> >>>> keep things running. Company obligations should not be the driving force
> >>>> for maintaining Hadoop branches. There is no proper collaboration in the
> >>>> community when every name-brand company maintains its own Hadoop 2.x
> >>>> version. I think it would be healthier for the community to reduce
> >>>> branch forking and spend energy on trunk to harden the software. This
> >>>> will give more confidence to move up the version than trying to fix n
> >>>> permutations of breakage, like Flash fixing the timeline.
> >>>>
> >>>> The Apache license states that there is no warranty of any kind for code
> >>>> contributions. Fewer community release processes should improve software
> >>>> quality when eyes are on trunk, and help steer toward the same end
> >>>> goals.
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >>>> <eb...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> Is it written anywhere what the difference is between a minor release
> >>>>> and a point/dot/maintenance (I'll use "point" from here on out)
> >>>>> release? I have looked around and I can't find anything other than some
> >>>>> compatibility documentation in 2.x that has since been removed in 3.x
> >>>>> [1] [2]. I think this would help shape my opinion on whether or not to
> >>>>> keep branch-2 alive. My current understanding is that we can't really
> >>>>> break compatibility in either a minor or point release. But the only
> >>>>> mention of the difference between minor and point releases is how to
> >>>>> deal with Stable, Evolving, and Unstable tags, and how to deal with
> >>>>> changing default configuration values. So it seems like there really
> >>>>> isn't a big official difference between the two. In my mind, the
> >>>>> functional difference between the two is that the minor releases may
> >>>>> have added features and rewrites, while the point releases only have bug
> >>>>> fixes. This might be an incorrect understanding, but that's what I have
> >>>>> gathered from watching the releases over the last few years. Whether or
> >>>>> not this is a correct understanding, I think that this needs to be
> >>>>> documented somewhere, even if it is just a convention.
> >>>>>
> >>>>> Given my assumed understanding of minor vs point releases, here are the
> >>>>> pros/cons that I can think of for having a branch-2. Please add on or
> >>>>> correct me for anything you feel is missing or inadequate.
> >>>>> Pros:
> >>>>> - Features/rewrites/higher-risk patches are less likely to be put into
> >>>>>   2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> Cons:
> >>>>> - Bug fixes are less likely to be put into 2.10.x
> >>>>> - An extra branch to maintain
> >>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
> >>>>>     patches to if they should go all the way back to 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
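The "5 vs 4 total branches" count in the cons above can be worked out as follows. This is a rough sketch that assumes the commit chain from the original proposal (trunk -> branch-3.2 -> branch-3.1 -> branch-2.10), with branch-2 sitting between branch-3.1 and branch-2.10 while it exists; the helper name `backport_targets` is hypothetical.

```python
def backport_targets(chain, oldest):
    """Return every branch a fix must be committed to, from trunk back to `oldest`."""
    return chain[: chain.index(oldest) + 1]


# Assumed chains, newest branch first.
with_branch_2 = ["trunk", "branch-3.2", "branch-3.1", "branch-2", "branch-2.10"]
without_branch_2 = ["trunk", "branch-3.2", "branch-3.1", "branch-2.10"]

print(len(backport_targets(with_branch_2, "branch-2.10")))     # 5
print(len(backport_targets(without_branch_2, "branch-2.10")))  # 4
```

Under these assumptions, every fix meant for the 2.10.x line costs one extra commit for as long as branch-2 is kept alive.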
> >>>>> So on the one hand you get added stability from fewer features being
> >>>>> committed to 2.10.x, but on the other you get fewer bug fixes being
> >>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
> >>>>> But we don't live in a perfect world, and committers will make mistakes,
> >>>>> either from lack of knowledge or through simple oversight. If we have a
> >>>>> branch-2, committers will forget, not know to, or choose not to (for
> >>>>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
> >>>>> If we don't have a branch-2, committers who want their borderline risky
> >>>>> feature in the 2.x line will err on the side of putting it into
> >>>>> branch-2.10 instead of proposing the creation of a branch-2. Clearly I
> >>>>> have made quite a few assumptions here based on my own experiences, so I
> >>>>> would like to hear if others have similar or opposing views.
> >>>>>
> >>>>> As far as 3.x goes, to me it seems like some of the reasoning for
> >>>>> killing branch-2 is due to an effort to push the community towards 3.x.
> >>>>> This is why I have added movement to 3.x as both a pro and a con. As a
> >>>>> community trying to move forward, keeping as many companies on similar
> >>>>> branches as possible is a good way to make sure the code is well-tested.
> >>>>> However, from a stability point of view, moving to 3.x is still scary
> >>>>> and being able to stay on 2.x until you are comfortable moving is very
> >>>>> nice. The 2.10.0 bridge release effort has been very good at making it
> >>>>> possible for people to move from 2.x to 3.x, but the diff between 2.x
> >>>>> and 3.x is so large that it is reasonable for companies to want to be
> >>>>> extra cautious with 3.x due to potential performance degradation at
> >>>>> large scale.
> >>>>>
> >>>>> A question I'm pondering is what happens when we move to Java 11 and
> >>>>> someone is still on 2.x? If they want to backport HADOOP-15338
> >>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
> >>>>> to 2.x, surely not everyone is going to want that (at least not
> >>>>> immediately). The 2.10 documentation states, "The JVM requirements will
> >>>>> not change across point releases within the same minor release except if
> >>>>> the JVM version under question becomes unsupported" [1], so this would
> >>>>> warrant a 2.11 release until Java 8 becomes unsupported (though one
> >>>>> could argue that it is already unsupported since Oracle is no longer
> >>>>> giving public Java 8 updates). If we don't keep branch-2 around now,
> >>>>> would a Java 11 backport be the catalyst for a branch-2 revival?
> >>>>>
> >>>>> Not sure if this really leads to any sort of answer from me on whether
> >>>>> or not we should keep branch-2 alive, but these are the things that I am
> >>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
> >>>>> or not is committers not being on the same page about where they should
> >>>>> commit their patches.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> [1]
> >>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>> [2]
> >>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>>
> >>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>> > Hi Konstantin,
> >>>>> >
> >>>>> > Sure, I understand those concerns. On the other hand, I worry about
> >>>>> > the stability of 2.10, since we will be on it for a couple of years
> >>>>> > at least. I worry that some committers may want to put new features
> >>>>> > into a branch 2 release, and without a branch-2, they will go
> >>>>> > directly into 2.10. Since we don't always catch corner cases or
> >>>>> > performance problems for some time (usually not until the release is
> >>>>> > deployed to a busy, 4-thousand node cluster), it may be very
> >>>>> > difficult to back out those changes.
> >>>>> >
> >>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
> >>>>> > but I do have these reservations.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > -Eric
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> >>>>> > <shv.hadoop@gmail.com> wrote:
> >>>>> > Hi Eric,
> >>>>> >
> >>>>> > We had a long discussion on this list regarding making the 2.10
> >>>>> > release the last of the branch-2 releases. We intended 2.10 as a
> >>>>> > bridge release between Hadoop 2 and 3. We may have bug-fix releases
> >>>>> > of 2.10, but 2.11 is not in the picture right now, and many people
> >>>>> > may object to this idea.
> >>>>> >
> >>>>> > I understand Jonathan's proposal as an attempt to
> >>>>> > 1. eliminate confusion about which branches people should commit
> >>>>> >    their back-ports to
> >>>>> > 2. save engineering effort committing to more branches than necessary
> >>>>> >
> >>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
> >>>>> > release 2.11 we can resurrect the branch.
> >>>>> > Until then I am in favor of Jonathan's proposal +1.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > --Konstantin
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
> >>>>> > > pros outweigh the cons. To me, the chances of patch releases on
> >>>>> > > 2.10.x are much higher than a new 2.11 minor release. (There didn't
> >>>>> > > seem to be many people outside of our company who expressed
> >>>>> > > interest in getting new features to branch-2 prior to the 2.10.0
> >>>>> > > release.) Even now, a few weeks after the 2.10.0 release, there are
> >>>>> > > 29 patches that have gone into branch-2 and 9 in branch-2.10, so
> >>>>> > > it's already diverged quite a bit.
> >>>>> > >
> >>>>> > > In any case, we can always reverse this decision if we really need
> >>>>> > > to, by recreating branch-2. But this proposal would reduce a lot of
> >>>>> > > confusion IMO.
> >>>>> > >
> >>>>> > > Jonathan Hung
> >>>>> > >
> >>>>> > >
> >>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
> >>>>> > > wrote:
> >>>>> > >
> >>>>> > > > Thanks Jonathan for opening the discussion.
> >>>>> > > >
> >>>>> > > > I am not in favor of this proposal. 2.10 was very recently
> >>>>> > > > released, and moving to 2.10 will take some time for the
> >>>>> > > > community. It seems premature to make a decision at this point
> >>>>> > > > that there will never be a need for a 2.11 release.
> >>>>> > > >
> >>>>> > > > -Eric
> >>>>> > > >
> >>>>> > > >
> >>>>> > > > On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> >>>>> > > > <jyhung2357@gmail.com> wrote:
> >>>>> > > >
> >>>>> > > > Hi folks,
> >>>>> > > >
> >>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
> >>>>> > > > be a bridge release to Hadoop 3.x [1], I'm proposing we make
> >>>>> > > > 2.10.x the last minor release line in branch-2. Currently, the
> >>>>> > > > main issue is that there are many fixes going into branch-2 (the
> >>>>> > > > theoretical 2.11.0) that are not going into branch-2.10 (which
> >>>>> > > > will become 2.10.1), so the fixes in branch-2 will likely never
> >>>>> > > > see the light of day unless they are backported to branch-2.10.
> >>>>> > > >
> >>>>> > > > To do this, I propose we:
> >>>>> > > >
> >>>>> > > >  - Delete branch-2.10
> >>>>> > > >  - Rename branch-2 to branch-2.10
> >>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >>>>> > > >
> >>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
> >>>>> > > > release line. Then the commit chain will look like: trunk ->
> >>>>> > > > branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 ->
> >>>>> > > > branch-2.8
> >>>>> > > >
> >>>>> > > > Thoughts?
> >>>>> > > >
> >>>>> > > > Jonathan Hung
> >>>>> > > >
> >>>>> > > > [1]
> >>>>> > > > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Akira Ajisaka <aa...@apache.org>.
Thank you for your work, Jonathan.

I found branch-2 has been unintentionally pushed again. Would you remove it?
I think the branch should be protected if possible.

-Akira

On Tue, Dec 10, 2019 at 5:17 AM Jonathan Hung <jy...@gmail.com> wrote:

> It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
> branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
> don't try to commit to it)
>
> Completed procedure:
>
>    - Verified everything in old branch-2.10 was in old branch-2
>    - Delete old branch-2.10
>    - Rename branch-2 to (new) branch-2.10
>    - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
>    - Renamed fix versions from 2.11.0 to 2.10.1
>    - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
>
>
> Jonathan Hung
>
>
> On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > FYI, starting the rename process, beginning with INFRA-19521.
> >
> > Jonathan Hung
> >
> >
> > On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <
> shv.hadoop@gmail.com>
> > wrote:
> >
> >> Hey guys,
> >>
> >> I think we diverged a bit from the initial topic of this discussion,
> >> which is removing branch-2.10, and changing the version of branch-2 from
> >> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> >> Sounds like the subject line for this thread "Making 2.10 the last minor
> >> 2.x release" confused people.
> >> It is in fact a wider matter that can be discussed when somebody
> actually
> >> proposes to release 2.11, which I understand nobody does at the moment.
> >>
> >> So if anybody objects removing branch-2.10 please make an argument.
> >> Otherwise we should go ahead and just do it next week.
> >> I see people still struggling to keep branch-2 and branch-2.10 in sync.
> >>
> >> Thanks,
> >> --Konstantin
> >>
> >> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the detailed thoughts, everyone.
> >>>
> >>> Eric (Badger), my understanding is the same as yours re. minor vs patch
> >>> releases. As for putting features into minor/patch releases, if we keep the
> >>> convention of putting new features only into minor releases, my assumption
> >>> is still that it's unlikely people will want to get them into branch-2
> >>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> >>> even really removed support for java 7 in branch-2 (much less java 8), so I
> >>> feel moving to java 11 would go along with a move to branch 3. And as you
> >>> mentioned, if people really want to use java 11 on branch-2, we can always
> >>> revive branch-2. But for now I think the convenience of not needing to port
> >>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> >>> potentially needing to revive branch-2.
> >>>
> >>> Jonathan Hung
> >>>
> >>>
> >>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
> >>>
> >>>> +1 for 2.10.x as last release for 2.x version.
> >>>>
> >>>> Software becomes more compatible when more companies stress test
> >>>> the same software and make improvements in trunk.  Some may be extra
> >>>> cautious about moving up the version because of internal obligations to
> >>>> keep things running.  Company obligations should not be the driving force
> >>>> to maintain Hadoop branches.  There is no proper collaboration in the
> >>>> community when every name-brand company maintains its own Hadoop 2.x
> >>>> version.  I think it would be healthier for the community to reduce the
> >>>> branch forking and spend energy on trunk to harden the software.  This will
> >>>> give more confidence to move up the version than trying to fix n
> >>>> permutations of breakage, like Flash fixing the timeline.
> >>>>
> >>>> The Apache license states that there is no warranty of any kind for code
> >>>> contributions.  Fewer community release processes should improve software
> >>>> quality when eyes are on trunk, and help steer toward the same end goals.
> >>>>
> >>>> regards,
> >>>> Eric
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> >>>> <eb...@verizonmedia.com.invalid> wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> Is it written anywhere what the difference is between a minor release
> >>>>> and a
> >>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
> >>>>> have
> >>>>> looked around and I can't find anything other than some compatibility
> >>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
> >>>>> think
> >>>>> this would help shape my opinion on whether or not to keep branch-2
> >>>>> alive.
> >>>>> My current understanding is that we can't really break compatibility
> in
> >>>>> either a minor or point release. But the only mention of the
> difference
> >>>>> between minor and point releases is how to deal with Stable,
> Evolving,
> >>>>> and
> >>>>> Unstable tags, and how to deal with changing default configuration
> >>>>> values.
> >>>>> So it seems like there really isn't a big official difference between
> >>>>> the
> >>>>> two. In my mind, the functional difference between the two is that
> the
> >>>>> minor releases may have added features and rewrites, while the point
> >>>>> releases only have bug fixes. This might be an incorrect
> >>>>> understanding, but
> >>>>> that's what I have gathered from watching the releases over the last
> >>>>> few
> >>>>> years. Whether or not this is a correct understanding, I think that
> >>>>> this
> >>>>> needs to be documented somewhere, even if it is just a convention.
> >>>>>
> >>>>> Given my assumed understanding of minor vs point releases, here are
> the
> >>>>> pros/cons that I can think of for having a branch-2. Please add on or
> >>>>> correct me for anything you feel is missing or inadequate.
> >>>>> Pros:
> >>>>> - Features/rewrites/higher-risk patches are less likely to be put
> into
> >>>>> 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> Cons:
> >>>>> - Bug fixes are less likely to be put into 2.10.x
> >>>>> - An extra branch to maintain
> >>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
> >>>>> patches to if they should go all the way back to 2.10.x
> >>>>> - It is less necessary to move to 3.x
> >>>>>
> >>>>> So on the one hand you get added stability in fewer features being
> >>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
> >>>>> being
> >>>>> committed. In a perfect world, we wouldn't have to make this
> tradeoff.
> >>>>> But
> >>>>> we don't live in a perfect world and committers will make mistakes
> >>>>> either
> >>>>> because of lack of knowledge or simply because they made a mistake.
> If
> >>>>> we
> >>>>> have a branch-2, committers will forget, not know to, or choose not
> to
> >>>>> (for
> >>>>> whatever reason) commit valid bug fixes back all the way to
> >>>>> branch-2.10. If
> >>>>> we don't have a branch-2, committers who want their borderline risky
> >>>>> feature in the 2.x line will err on the side of putting it into
> >>>>> branch-2.10
> >>>>> instead of proposing the creation of a branch-2. Clearly I have made
> >>>>> quite
> >>>>> a few assumptions here based on my own experiences, so I would like
> to
> >>>>> hear
> >>>>> if others have similar or opposing views.
> >>>>>
> >>>>> As far as 3.x goes, to me it seems like some of the reasoning for
> >>>>> killing
> >>>>> branch-2 is due to an effort to push the community towards 3.x. This
> >>>>> is why
> >>>>> I have added movement to 3.x as both a pro and a con. As a community
> >>>>> trying
> >>>>> to move forward, keeping as many companies on similar branches as
> >>>>> possible
> >>>>> is a good way to make sure the code is well-tested. However, from a
> >>>>> stability point of view, moving to 3.x is still scary and being able
> to
> >>>>> stay on 2.x until you are comfortable to move is very nice. The
> 2.10.0
> >>>>> bridge release effort has been very good at making it possible for
> >>>>> people
> >>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
> >>>>> that
> >>>>> it is reasonable for companies to want to be extra cautious with 3.x
> >>>>> due to
> >>>>> potential performance degradation at large scale.
> >>>>>
> >>>>> A question I'm pondering is what happens when we move to Java 11 and
> >>>>> someone is still on 2.x? If they want to backport HADOOP-15338
> >>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
> >>>>> support to
> >>>>> 2.x, surely not everyone is going to want that (at least not
> >>>>> immediately).
> >>>>> The 2.10 documentation states, "The JVM requirements will not change
> >>>>> across
> >>>>> point releases within the same minor release except if the JVM
> version
> >>>>> under question becomes unsupported" [1], so this would warrant a 2.11
> >>>>> release until Java 8 becomes unsupported (though one could argue that
> >>>>> it is
> >>>>> already unsupported since Oracle is no longer giving public Java 8
> >>>>> updates).
> >>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
> >>>>> catalyst for a branch-2 revival?
> >>>>>
> >>>>> Not sure if this really leads to any sort of answer from me on
> whether
> >>>>> or
> >>>>> not we should keep branch-2 alive, but these are the things that I am
> >>>>> weighing in my mind. For me, the bigger problem beyond having
> branch-2
> >>>>> or
> >>>>> not is committers not being on the same page with where they should
> >>>>> commit
> >>>>> their patches.
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>> [1]
> >>>>>
> >>>>>
> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>> [2]
> >>>>>
> >>>>>
> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
> >>>>>
> >>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <epayne@apache.org
> >
> >>>>> wrote:
> >>>>>
> >>>>> > Hi Konstantin,
> >>>>> >
> >>>>> > Sure, I understand those concerns. On the other hand, I worry about
> >>>>> the
> >>>>> > stability of 2.10, since we will be on it for a couple of years at
> >>>>> least.
> >>>>> > I worry
> >>>>> >  that some committers may want to put new features into a branch 2
> >>>>> release,
> >>>>> >  and without a branch-2, they will go directly into 2.10. Since we
> >>>>> don't
> >>>>> > always
> >>>>> >  catch corner cases or performance problems for some time (usually
> >>>>> not
> >>>>> > until
> >>>>> >  the release is deployed to a busy, 4-thousand node cluster), it
> may
> >>>>> be
> >>>>> > very
> >>>>> >  difficult to back out those changes.
> >>>>> >
> >>>>> > It sounds like I'm in the minority here, so I'm not nixing the
> idea,
> >>>>> but I
> >>>>> > do
> >>>>> >  have these reservations.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > -Eric
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko
> <
> >>>>> > shv.hadoop@gmail.com> wrote:
> >>>>> > Hi Eric,
> >>>>> >
> >>>>> > We had a long discussion on this list regarding making the 2.10
> >>>>> release the
> >>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
> >>>>> between
> >>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
> >>>>> not in
> >>>>> > the picture right now, and many people may object to this idea.
> >>>>> >
> >>>>> > I understand Jonathan's proposal as an attempt to
> >>>>> > 1. eliminate confusion which branches people should commit their
> >>>>> back-ports
> >>>>> > to
> >>>>> > 2. save engineering effort committing to more branches than
> necessary
> >>>>> >
> >>>>> > "Branches are cheap" as our founder used to say. If we ever decide
> to
> >>>>> > release 2.11 we can resurrect the branch.
> >>>>> > Until then I am in favor of Jonathan's proposal +1.
> >>>>> >
> >>>>> > Thanks,
> >>>>> > --Konstantin
> >>>>> >
> >>>>> >
> >>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <
> jyhung2357@gmail.com
> >>>>> >
> >>>>> > wrote:
> >>>>> >
> >>>>> > > Thanks Eric for the comments - regarding your concerns, I feel
> the
> >>>>> pros
> >>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
> >>>>> are
> >>>>> > much
> >>>>> > > higher than a new 2.11 minor release. (There didn't seem to be
> many
> >>>>> > people
> >>>>> > > outside of our company who expressed interest in getting new
> >>>>> features to
> >>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> after
> >>>>> 2.10.0
> >>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
> >>>>> > > branch-2.10, so it's already diverged quite a bit.
> >>>>> > >
> >>>>> > > In any case, we can always reverse this decision if we really
> need
> >>>>> to, by
> >>>>> > > recreating branch-2. But this proposal would reduce a lot of
> >>>>> confusion
> >>>>> > IMO.
> >>>>> > >
> >>>>> > > Jonathan Hung
> >>>>> > >
> >>>>> > >
> >>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
> >>>>> epayne@apache.org>
> >>>>> > > wrote:
> >>>>> > >
> >>>>> > > > Thanks Jonathan for opening the discussion.
> >>>>> > > >
> >>>>> > > > I am not in favor of this proposal. 2.10 was very recently
> >>>>> released,
> >>>>> > and
> >>>>> > > > moving to 2.10 will take some time for the community. It seems
> >>>>> > premature
> >>>>> > > to
> >>>>> > > > make a decision at this point that there will never be a need
> >>>>> for a
> >>>>> > 2.11
> >>>>> > > > release.
> >>>>> > > >
> >>>>> > > > -Eric
> >>>>> > > >
> >>>>> > > >
> >>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung
> <
> >>>>> > > > jyhung2357@gmail.com> wrote:
> >>>>> > > >
> >>>>> > > > Hi folks,
> >>>>> > > >
> >>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
> >>>>> be a
> >>>>> > > bridge
> >>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
> last
> >>>>> minor
> >>>>> > > > release line in branch-2. Currently, the main issue is that
> >>>>> there's
> >>>>> > many
> >>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not
> >>>>> going
> >>>>> > into
> >>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in
> branch-2
> >>>>> will
> >>>>> > > > likely never see the light of day unless they are backported to
> >>>>> > > > branch-2.10.
> >>>>> > > >
> >>>>> > > > To do this, I propose we:
> >>>>> > > >
> >>>>> > > >  - Delete branch-2.10
> >>>>> > > >  - Rename branch-2 to branch-2.10
> >>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >>>>> > > >
> >>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
> >>>>> release
> >>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
> >>>>> ->
> >>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >>>>> > > >
> >>>>> > > > Thoughts?
> >>>>> > > >
> >>>>> > > > Jonathan Hung
> >>>>> > > >
> >>>>> > > > [1]
> >>>>> > >
> >>>>>
> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >>>>> > > >
> >>>>> > >
> >>>>> >
> >>>>> >
> ---------------------------------------------------------------------
> >>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists; please
don't try to commit to it).
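
For illustration only, the commit chain above can be written down as data. This is a minimal sketch, not project tooling: the helper name branches_to_commit is hypothetical, and it assumes a fix is always committed to trunk first and then backported branch by branch in the order listed in this message.

```python
# Branch order copied from the message above, newest first.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def branches_to_commit(oldest_target):
    """Return every branch a fix lands on, from trunk down to oldest_target.

    Hypothetical helper: assumes backports never skip a branch in the chain.
    """
    if oldest_target not in COMMIT_CHAIN:
        # branch-2 was renamed away, so it is rejected here as well.
        raise ValueError(f"{oldest_target!r} is not in the commit chain")
    return COMMIT_CHAIN[: COMMIT_CHAIN.index(oldest_target) + 1]
```

Under these assumptions, a fix that should reach the 2.10.x line touches four branches (trunk, branch-3.2, branch-3.1, branch-2.10), and asking for "branch-2" raises an error because that branch no longer exists.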

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Deleted old branch-2.10
   - Renamed branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
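
The first step (checking that old branch-2.10 was a subset of old branch-2) could be sketched roughly as follows. This is not the procedure that was actually run; it is an assumption-laden illustration. Backported commits get new hashes, so the sketch compares JIRA issue keys found in commit subject lines instead of hashes. The function names are hypothetical, and the subject lines are assumed to come from something like `git log --format=%s` over each branch since the point where the two diverged.

```python
import re

# Matches issue keys such as HADOOP-15338 in a commit subject line.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def issue_keys(subjects):
    """Collect the JIRA keys mentioned in a list of commit subject lines."""
    keys = set()
    for subject in subjects:
        keys.update(JIRA_KEY.findall(subject))
    return keys


def missing_from_superset(subset_subjects, superset_subjects):
    """Return keys seen in the first log but absent from the second, sorted."""
    return sorted(issue_keys(subset_subjects) - issue_keys(superset_subjects))
```

An empty result would suggest the rename loses no fixes. The comparison is deliberately coarse: a revert commit still mentions its issue key, so reverts and partial backports would need a manual look.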


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects to removing branch-2.10, please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software becomes more compatible when more companies stress test
>>>> the same software and make improvements in trunk.  Some may be extra
>>>> cautious about moving up the version because of internal obligations to
>>>> keep things running.  Company obligations should not be the driving force
>>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name-brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be healthier for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations of breakage, like Flash fixing the timeline.
>>>>
>>>> The Apache license states that there is no warranty of any kind for code
>>>> contributions.  Fewer community release processes should improve software
>>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability in fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But
>>>>> we don't live in a perfect world and committers will make mistakes
>>>>> either
>>>>> because of lack of knowledge or simply because they made a mistake. If
>>>>> we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for
>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If
>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>> feature in the 2.x line will err on the side of putting it into
>>>>> branch-2.10
>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>> quite
>>>>> a few assumptions here based on my own experiences, so I would like to
>>>>> hear
>>>>> if others have similar or opposing views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing
>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>> is why
>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>> trying
>>>>> to move forward, keeping as many companies on similar branches as
>>>>> possible
>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> bridge release effort has been very good at making it possible for
>>>>> people
>>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>>> that
>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>> due to
>>>>> potential performance degradation at large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> support to
>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>> immediately).
>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>> across
>>>>> point releases within the same minor release except if the JVM version
>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>> it is
>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>> updates).
>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or
>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or
>>>>> not is committers not being on the same page with where they should
>>>>> commit
>>>>> their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1]
>>>>>
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2]
>>>>>
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> least.
>>>>> > I worry
>>>>> >  that some committers may want to put new features into a branch 2
>>>>> release,
>>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>>> don't
>>>>> > always
>>>>> >  catch corner cases or performance problems for some time (usually
>>>>> not
>>>>> > until
>>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>>> be
>>>>> > very
>>>>> >  difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> but I
>>>>> > do
>>>>> >  have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
>>>>> not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion which branches people should commit their
>>>>> back-ports
>>>>> > to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>>> pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> are
>>>>> > much
>>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people
>>>>> > > outside of our company who expressed interest in getting new
>>>>> features to
>>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>>> 2.10.0
>>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need
>>>>> to, by
>>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>>> confusion
>>>>> > IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>>> released,
>>>>> > and
>>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>>> > premature
>>>>> > > to
>>>>> > > > make a decision at this point that there will never be a need
>>>>> for a
>>>>> > 2.11
>>>>> > > > release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>>> be a
>>>>> > > bridge
>>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> minor
>>>>> > > > release line in branch-2. Currently, the main issue is that
>>>>> there's
>>>>> > many
>>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not
>>>>> going
>>>>> > into
>>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>>> will
>>>>> > > > likely never see the light of day unless they are backported to
>>>>> > > > branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1]
>>>>> > >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists; please
don't try to commit to it).

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Deleted old branch-2.10
   - Renamed branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects to removing branch-2.10, please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software becomes more compatible when more companies stress test
>>>> the same software and make improvements in trunk.  Some may be extra
>>>> cautious about moving up the version because of internal obligations to
>>>> keep things running.  Company obligations should not be the driving force
>>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name-brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be healthier for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations of breakage, like Flash fixing the timeline.
>>>>
>>>> The Apache license states that there is no warranty of any kind for code
>>>> contributions.  Fewer community release processes should improve software
>>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>>
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>>   2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>     patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability from fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But we don't live in a perfect world and committers will make mistakes,
>>>>> either because of lack of knowledge or simply through oversight. If we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If we don't have a branch-2, committers who want their
>>>>> borderline risky feature in the 2.x line will err on the side of putting
>>>>> it into branch-2.10 instead of proposing the creation of a branch-2.
>>>>> Clearly I have made quite a few assumptions here based on my own
>>>>> experiences, so I would like to hear if others have similar or opposing
>>>>> views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing branch-2 is due to an effort to push the community towards 3.x.
>>>>> This is why I have added movement to 3.x as both a pro and a con. As a
>>>>> community trying to move forward, keeping as many companies on similar
>>>>> branches as possible is a good way to make sure the code is well-tested.
>>>>> However, from a stability point of view, moving to 3.x is still scary
>>>>> and being able to stay on 2.x until you are comfortable moving is very
>>>>> nice. The 2.10.0 bridge release effort has been very good at making it
>>>>> possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>> and 3.x is so large that it is reasonable for companies to want to be
>>>>> extra cautious with 3.x due to potential performance degradation at
>>>>> large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>> to 2.x, surely not everyone is going to want that (at least not
>>>>> immediately). The 2.10 documentation states, "The JVM requirements will
>>>>> not change across point releases within the same minor release except if
>>>>> the JVM version under question becomes unsupported" [1], so this would
>>>>> warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>> could argue that it is already unsupported since Oracle is no longer
>>>>> giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>> would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or not is committers not being on the same page with where they should
>>>>> commit their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> > least. I worry that some committers may want to put new features into a
>>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>>> > for some time (usually not until the release is deployed to a busy,
>>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>>> > changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> > but I do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>>> > is not in the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save the engineering effort of committing to more branches than
>>>>> >    necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>> > > much higher than a new 2.11 minor release. (There didn't seem to be
>>>>> > > many people outside of our company who expressed interest in getting
>>>>> > > new features to branch-2 prior to the 2.10.0 release.) Even now, a few
>>>>> > > weeks after the 2.10.0 release, there are 29 patches that have gone
>>>>> > > into branch-2 and 9 in branch-2.10, so it's already diverged quite a
>>>>> > > bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need to,
>>>>> > > by recreating branch-2. But this proposal would reduce a lot of
>>>>> > > confusion IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently released,
>>>>> > > > and moving to 2.10 will take some time for the community. It seems
>>>>> > > > premature to make a decision at this point that there will never be a
>>>>> > > > need for a 2.11 release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>>> > > > last minor release line in branch-2. Currently, the main issue is
>>>>> > > > that there's many fixes going into branch-2 (the theoretical 2.11.0)
>>>>> > > > that's not going into branch-2.10 (which will become 2.10.1), so the
>>>>> > > > fixes in branch-2 will likely never see the light of day unless they
>>>>> > > > are backported to branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
don't try to commit to it)
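
As a rough illustration only (not part of the procedure), the chain can be
written down as data so a script can list which branches a patch has to land
on. The branch names come from this thread; the helper name backport_targets
is hypothetical and exists only for this sketch.

```python
# Branch names are taken from the commit chain in this thread; the helper
# itself (backport_targets) is a hypothetical name used only for illustration.
COMMIT_CHAIN = [
    "trunk",
    "branch-3.2",
    "branch-3.1",
    "branch-2.10",
    "branch-2.9",
    "branch-2.8",
]


def backport_targets(oldest_branch, chain=COMMIT_CHAIN):
    """Return every branch a patch must land on, newest first, down to oldest_branch."""
    if oldest_branch not in chain:
        # branch-2 is intentionally absent, so asking for it fails loudly.
        raise ValueError(f"{oldest_branch} is not in the commit chain")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(" -> ".join(backport_targets("branch-2.10")))
```

Keeping the chain as an ordered list means a request for the removed
branch-2 raises an error instead of silently returning a partial answer.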

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Deleted old branch-2.10
   - Renamed branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE
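
For anyone repeating the first step on another pair of branches, here is a
minimal sketch of a superset check. It assumes commits are matched by the JIRA
key in the subject line, since cherry-picked commits get new hashes; it is not
necessarily how the verification above was done. The function names and the
sample subject lines are hypothetical.

```python
import re

# JIRA keys such as HADOOP-15338 or YARN-1234 anywhere in a commit subject.
JIRA_KEY = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the set of JIRA keys mentioned in a list of commit subject lines."""
    keys = set()
    for subject in subjects:
        match = JIRA_KEY.search(subject)
        if match:
            keys.add(match.group(0))
    return keys


def missing_from(superset_subjects, subset_subjects):
    """Return JIRA keys present in the subset branch but absent from the superset branch."""
    return sorted(jira_keys(subset_subjects) - jira_keys(superset_subjects))


if __name__ == "__main__":
    # Hypothetical subject lines; in practice they could come from
    # `git log --format=%s <base>..<branch>` run once per branch.
    old_branch_2 = ["YARN-1111. Fix A.", "HDFS-2222. Fix B.", "HADOOP-3333. Fix C."]
    old_branch_2_10 = ["YARN-1111. Fix A.", "HDFS-2222. Fix B."]
    print(missing_from(old_branch_2, old_branch_2_10))
```

An empty result means every keyed commit in the smaller branch also appears in
the larger one. Commits without a JIRA key are ignored by this sketch and
would need a manual look.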


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects to removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>>> even really removed support for Java 7 in branch-2 (much less Java 8), so I
>>> feel moving to Java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use Java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as the last release line for 2.x.
>>>>
>>>> Software becomes more compatible when more companies stress test
>>>> the same software and make improvements in trunk.  Some may be extra
>>>> cautious about moving up the version because of internal obligations to
>>>> keep things running.  Company obligations should not be the driving force
>>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be healthier for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations of breakage, like the Flash fixing the timeline.
>>>>
>>>> As the Apache license states, there is no warranty of any kind for code
>>>> contributions.  Fewer community release processes should improve software
>>>> quality by keeping eyes on trunk, and help steer toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a point/dot/maintenance (I'll use "point" from here on out) release?
>>>>> I have looked around and I can't find anything other than some
>>>>> compatibility documentation in 2.x that has since been removed in 3.x
>>>>> [1] [2]. I think this would help shape my opinion on whether or not to
>>>>> keep branch-2 alive. My current understanding is that we can't really
>>>>> break compatibility in either a minor or point release. But the only
>>>>> mention of the difference between minor and point releases is how to
>>>>> deal with Stable, Evolving, and Unstable tags, and how to deal with
>>>>> changing default configuration values. So it seems like there really
>>>>> isn't a big official difference between the two. In my mind, the
>>>>> functional difference between the two is that the minor releases may
>>>>> have added features and rewrites, while the point releases only have bug
>>>>> fixes. This might be an incorrect understanding, but that's what I have
>>>>> gathered from watching the releases over the last few years. Whether or
>>>>> not this is a correct understanding, I think that this needs to be
>>>>> documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>>
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>>   2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>>     patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability from fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But we don't live in a perfect world and committers will make mistakes,
>>>>> either because of lack of knowledge or simply through oversight. If we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If we don't have a branch-2, committers who want their
>>>>> borderline risky feature in the 2.x line will err on the side of putting
>>>>> it into branch-2.10 instead of proposing the creation of a branch-2.
>>>>> Clearly I have made quite a few assumptions here based on my own
>>>>> experiences, so I would like to hear if others have similar or opposing
>>>>> views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing branch-2 is due to an effort to push the community towards 3.x.
>>>>> This is why I have added movement to 3.x as both a pro and a con. As a
>>>>> community trying to move forward, keeping as many companies on similar
>>>>> branches as possible is a good way to make sure the code is well-tested.
>>>>> However, from a stability point of view, moving to 3.x is still scary
>>>>> and being able to stay on 2.x until you are comfortable moving is very
>>>>> nice. The 2.10.0 bridge release effort has been very good at making it
>>>>> possible for people to move from 2.x to 3.x, but the diff between 2.x
>>>>> and 3.x is so large that it is reasonable for companies to want to be
>>>>> extra cautious with 3.x due to potential performance degradation at
>>>>> large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>>>>> to 2.x, surely not everyone is going to want that (at least not
>>>>> immediately). The 2.10 documentation states, "The JVM requirements will
>>>>> not change across point releases within the same minor release except if
>>>>> the JVM version under question becomes unsupported" [1], so this would
>>>>> warrant a 2.11 release until Java 8 becomes unsupported (though one
>>>>> could argue that it is already unsupported since Oracle is no longer
>>>>> giving public Java 8 updates). If we don't keep branch-2 around now,
>>>>> would a Java 11 backport be the catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or not is committers not being on the same page with where they should
>>>>> commit their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1] https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2] https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> > least. I worry that some committers may want to put new features into a
>>>>> > branch 2 release, and without a branch-2, they will go directly into
>>>>> > 2.10. Since we don't always catch corner cases or performance problems
>>>>> > for some time (usually not until the release is deployed to a busy,
>>>>> > 4-thousand node cluster), it may be very difficult to back out those
>>>>> > changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> > but I do have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10 release
>>>>> > the last of the branch-2 releases. We intended 2.10 as a bridge release
>>>>> > between Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11
>>>>> > is not in the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> >    back-ports to
>>>>> > 2. save the engineering effort of committing to more branches than
>>>>> >    necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com>
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>>>> > > much higher than a new 2.11 minor release. (There didn't seem to be
>>>>> > > many people outside of our company who expressed interest in getting
>>>>> > > new features to branch-2 prior to the 2.10.0 release.) Even now, a few
>>>>> > > weeks after the 2.10.0 release, there are 29 patches that have gone
>>>>> > > into branch-2 and 9 in branch-2.10, so it's already diverged quite a
>>>>> > > bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need to,
>>>>> > > by recreating branch-2. But this proposal would reduce a lot of
>>>>> > > confusion IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently released,
>>>>> > > > and moving to 2.10 will take some time for the community. It seems
>>>>> > > > premature to make a decision at this point that there will never be a
>>>>> > > > need for a 2.11 release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
>>>>> > > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the
>>>>> > > > last minor release line in branch-2. Currently, the main issue is
>>>>> > > > that there's many fixes going into branch-2 (the theoretical 2.11.0)
>>>>> > > > that's not going into branch-2.10 (which will become 2.10.1), so the
>>>>> > > > fixes in branch-2 will likely never see the light of day unless they
>>>>> > > > are backported to branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
It's done. The new commit chain is: trunk -> branch-3.2 -> branch-3.1 ->
branch-2.10 -> branch-2.9 -> branch-2.8 (branch-2 no longer exists, please
don't try to commit to it)

Completed procedure:

   - Verified everything in old branch-2.10 was in old branch-2
   - Delete old branch-2.10
   - Rename branch-2 to (new) branch-2.10
   - Set version in new branch-2.10 to 2.10.1-SNAPSHOT
   - Renamed fix versions from 2.11.0 to 2.10.1
   - Removed 2.11.0 as a version in HADOOP/YARN/HDFS/MAPREDUCE


Jonathan Hung


On Wed, Dec 4, 2019 at 10:55 AM Jonathan Hung <jy...@gmail.com> wrote:

> FYI, starting the rename process, beginning with INFRA-19521.
>
> Jonathan Hung
>
>
> On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
> wrote:
>
>> Hey guys,
>>
>> I think we diverged a bit from the initial topic of this discussion,
>> which is removing branch-2.10, and changing the version of branch-2 from
>> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
>> Sounds like the subject line for this thread "Making 2.10 the last minor
>> 2.x release" confused people.
>> It is in fact a wider matter that can be discussed when somebody actually
>> proposes to release 2.11, which I understand nobody does at the moment.
>>
>> So if anybody objects removing branch-2.10 please make an argument.
>> Otherwise we should go ahead and just do it next week.
>> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>>
>> Thanks,
>> --Konstantin
>>
>> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
>> wrote:
>>
>>> Thanks for the detailed thoughts, everyone.
>>>
>>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>>> releases. As for putting features into minor/patch releases, if we keep the
>>> convention of putting new features only into minor releases, my assumption
>>> is still that it's unlikely people will want to get them into branch-2
>>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>>> even really removed support for java 7 in branch-2 (much less java 8), so I
>>> feel moving to java 11 would go along with a move to branch 3. And as you
>>> mentioned, if people really want to use java 11 on branch-2, we can always
>>> revive branch-2. But for now I think the convenience of not needing to port
>>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>>> potentially needing to revive branch-2.
>>>
>>> Jonathan Hung
>>>
>>>
>>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>>
>>>> +1 for 2.10.x as last release for 2.x version.
>>>>
>>>> Software would become more compatible when more companies stress test
>>>> the same software and making improvements in trunk.  Some may be extra
>>>> caution on moving up the version because obligation internally to keep
>>>> things running.  Company obligation should not be the driving force to
>>>> maintain Hadoop branches.  There is no proper collaboration in the
>>>> community when every name brand company maintains its own Hadoop 2.x
>>>> version.  I think it would be more healthy for the community to reduce the
>>>> branch forking and spend energy on trunk to harden the software.  This will
>>>> give more confidence to move up the version than trying to fix n
>>>> permutations breakage like Flash fixing the timeline.
>>>>
>>>> Apache license stated, there is no warranty of any kind for code
>>>> contributions.  Fewer community release process should improve software
>>>> quality when eyes are on trunk, and help steering toward the same end goals.
>>>>
>>>> regards,
>>>> Eric
>>>>
>>>>
>>>>
>>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>>> <eb...@verizonmedia.com.invalid> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Is it written anywhere what the difference is between a minor release
>>>>> and a
>>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>>> have
>>>>> looked around and I can't find anything other than some compatibility
>>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I
>>>>> think
>>>>> this would help shape my opinion on whether or not to keep branch-2
>>>>> alive.
>>>>> My current understanding is that we can't really break compatibility in
>>>>> either a minor or point release. But the only mention of the difference
>>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>>> and
>>>>> Unstable tags, and how to deal with changing default configuration
>>>>> values.
>>>>> So it seems like there really isn't a big official difference between
>>>>> the
>>>>> two. In my mind, the functional difference between the two is that the
>>>>> minor releases may have added features and rewrites, while the point
>>>>> releases only have bug fixes. This might be an incorrect
>>>>> understanding, but
>>>>> that's what I have gathered from watching the releases over the last
>>>>> few
>>>>> years. Whether or not this is a correct understanding, I think that
>>>>> this
>>>>> needs to be documented somewhere, even if it is just a convention.
>>>>>
>>>>> Given my assumed understanding of minor vs point releases, here are the
>>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>>> correct me for anything you feel is missing or inadequate.
>>>>> Pros:
>>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>>> 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> Cons:
>>>>> - Bug fixes are less likely to be put into 2.10.x
>>>>> - An extra branch to maintain
>>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>>> patches to if they should go all the way back to 2.10.x
>>>>> - It is less necessary to move to 3.x
>>>>>
>>>>> So on the one hand you get added stability in fewer features being
>>>>> committed to 2.10.x, but then on the other you get fewer bug fixes
>>>>> being
>>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>>> But
>>>>> we don't live in a perfect world and committers will make mistakes
>>>>> either
>>>>> because of lack of knowledge or simply because they made a mistake. If
>>>>> we
>>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>>> (for
>>>>> whatever reason) commit valid bug fixes back all the way to
>>>>> branch-2.10. If
>>>>> we don't have a branch-2, committers who want their borderline risky
>>>>> feature in the 2.x line will err on the side of putting it into
>>>>> branch-2.10
>>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>>> quite
>>>>> a few assumptions here based on my own experiences, so I would like to
>>>>> hear
>>>>> if others have similar or opposing views.
>>>>>
>>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>>> killing
>>>>> branch-2 is due to an effort to push the community towards 3.x. This
>>>>> is why
>>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>>> trying
>>>>> to move forward, keeping as many companies on similar branches as
>>>>> possible
>>>>> is a good way to make sure the code is well-tested. However, from a
>>>>> stability point of view, moving to 3.x is still scary and being able to
>>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>>> bridge release effort has been very good at making it possible for
>>>>> people
>>>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>>>> that
>>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>>> due to
>>>>> potential performance degradation at large scale.
>>>>>
>>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>>> support to
>>>>> 2.x, surely not everyone is going to want that (at least not
>>>>> immediately).
>>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>>> across
>>>>> point releases within the same minor release except if the JVM version
>>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>>> release until Java 8 becomes unsupported (though one could argue that
>>>>> it is
>>>>> already unsupported since Oracle is no longer giving public Java 8
>>>>> update).
>>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>>> catalyst for a branch-2 revival?
>>>>>
>>>>> Not sure if this really leads to any sort of answer from me on whether
>>>>> or
>>>>> not we should keep branch-2 alive, but these are the things that I am
>>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>>> or
>>>>> not is committers not being on the same page with where they should
>>>>> commit
>>>>> their patches.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1]
>>>>>
>>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>> [2]
>>>>>
>>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>>
>>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>>> wrote:
>>>>>
>>>>> > Hi Konstantin,
>>>>> >
>>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>>> the
>>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>>> least.
>>>>> > I worry
>>>>> >  that some committers may want to put new features into a branch 2
>>>>> release,
>>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>>> don't
>>>>> > always
>>>>> >  catch corner cases or performance problems for some time (usually
>>>>> not
>>>>> > until
>>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>>> be
>>>>> > very
>>>>> >  difficult to back out those changes.
>>>>> >
>>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>>> but I
>>>>> > do
>>>>> >  have these reservations.
>>>>> >
>>>>> > Thanks,
>>>>> > -Eric
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>>> > shv.hadoop@gmail.com> wrote:
>>>>> > Hi Eric,
>>>>> >
>>>>> > We had a long discussion on this list regarding making the 2.10
>>>>> release the
>>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>>> between
>>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is
>>>>> not in
>>>>> > the picture right now, and many people may object to this idea.
>>>>> >
>>>>> > I understand Jonathan's proposal as an attempt to
>>>>> > 1. eliminate confusion about which branches people should commit their
>>>>> back-ports
>>>>> > to
>>>>> > 2. save engineering effort committing to more branches than necessary
>>>>> >
>>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>>> > release 2.11 we can resurrect the branch.
>>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>>> >
>>>>> > Thanks,
>>>>> > --Konstantin
>>>>> >
>>>>> >
>>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jyhung2357@gmail.com
>>>>> >
>>>>> > wrote:
>>>>> >
>>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>>> pros
>>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>>> are
>>>>> > much
>>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>>> > people
>>>>> > > outside of our company who expressed interest in getting new
>>>>> features to
>>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>>> 2.10.0
>>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>>> > >
>>>>> > > In any case, we can always reverse this decision if we really need
>>>>> to, by
>>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>>> confusion
>>>>> > IMO.
>>>>> > >
>>>>> > > Jonathan Hung
>>>>> > >
>>>>> > >
>>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>>> epayne@apache.org>
>>>>> > > wrote:
>>>>> > >
>>>>> > > > Thanks Jonathan for opening the discussion.
>>>>> > > >
>>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>>> released,
>>>>> > and
>>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>>> > premature
>>>>> > > to
>>>>> > > > make a decision at this point that there will never be a need
>>>>> for a
>>>>> > 2.11
>>>>> > > > release.
>>>>> > > >
>>>>> > > > -Eric
>>>>> > > >
>>>>> > > >
>>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>>> > > > jyhung2357@gmail.com> wrote:
>>>>> > > >
>>>>> > > > Hi folks,
>>>>> > > >
>>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>>> be a
>>>>> > > bridge
>>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>>> minor
>>>>> > > > release line in branch-2. Currently, the main issue is that
>>>>> there's
>>>>> > many
>>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not
>>>>> going
>>>>> > into
>>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>>> will
>>>>> > > > likely never see the light of day unless they are backported to
>>>>> > > > branch-2.10.
>>>>> > > >
>>>>> > > > To do this, I propose we:
>>>>> > > >
>>>>> > > >  - Delete branch-2.10
>>>>> > > >  - Rename branch-2 to branch-2.10
>>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>>> > > >
>>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>>> release
>>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2
>>>>> ->
>>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>>> > > >
>>>>> > > > Thoughts?
>>>>> > > >
>>>>> > > > Jonathan Hung
>>>>> > > >
>>>>> > > > [1]
>>>>> > >
>>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>>> > > >
>>>>> > >
>>>>> >
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
FYI, starting the rename process, beginning with INFRA-19521.

Jonathan Hung


On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
wrote:

> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10, please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Thanks for the detailed thoughts, everyone.
>>
>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>> releases. As for putting features into minor/patch releases, if we keep the
>> convention of putting new features only into minor releases, my assumption
>> is still that it's unlikely people will want to get them into branch-2
>> (based on the 2.10.0 release process). For the Java 11 issue, we haven't
>> even really removed support for Java 7 in branch-2 (much less Java 8), so I
>> feel moving to Java 11 would go along with a move to branch 3. And as you
>> mentioned, if people really want to use Java 11 on branch-2, we can always
>> revive branch-2. But for now I think the convenience of not needing to port
>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> potentially needing to revive branch-2.
>>
>> Jonathan Hung
>>
>>
>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>
>>> +1 for 2.10.x as the last release line for 2.x.
>>>
>>> Software becomes more compatible when more companies stress test
>>> the same software and make improvements in trunk.  Some may be extra
>>> cautious about moving up the version because of internal obligations to
>>> keep things running.  Company obligations should not be the driving force
>>> for maintaining Hadoop branches.  There is no proper collaboration in the
>>> community when every name-brand company maintains its own Hadoop 2.x
>>> version.  I think it would be healthier for the community to reduce
>>> branch forking and spend energy on trunk to harden the software.  This will
>>> give more confidence to move up the version than trying to fix n
>>> permutations of breakage, like the Flash fixing the timeline.
>>>
>>> The Apache license states that there is no warranty of any kind for code
>>> contributions.  Fewer community release processes should improve software
>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>
>>> regards,
>>> Eric
>>>
>>>
>>>
>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> <eb...@verizonmedia.com.invalid> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Is it written anywhere what the difference is between a minor release
>>>> and a
>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> have
>>>> looked around and I can't find anything other than some compatibility
>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> this would help shape my opinion on whether or not to keep branch-2
>>>> alive.
>>>> My current understanding is that we can't really break compatibility in
>>>> either a minor or point release. But the only mention of the difference
>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>> and
>>>> Unstable tags, and how to deal with changing default configuration
>>>> values.
>>>> So it seems like there really isn't a big official difference between
>>>> the
>>>> two. In my mind, the functional difference between the two is that the
>>>> minor releases may have added features and rewrites, while the point
>>>> releases only have bug fixes. This might be an incorrect understanding,
>>>> but
>>>> that's what I have gathered from watching the releases over the last few
>>>> years. Whether or not this is a correct understanding, I think that this
>>>> needs to be documented somewhere, even if it is just a convention.
>>>>
>>>> Given my assumed understanding of minor vs point releases, here are the
>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>> correct me for anything you feel is missing or inadequate.
>>>> Pros:
>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>> 2.10.x
>>>> - It is less necessary to move to 3.x
>>>>
>>>> Cons:
>>>> - Bug fixes are less likely to be put into 2.10.x
>>>> - An extra branch to maintain
>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> patches to if they should go all the way back to 2.10.x
>>>> - It is less necessary to move to 3.x
>>>>
>>>> So on the one hand you get added stability in fewer features being
>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>> But
>>>> we don't live in a perfect world and committers will make mistakes
>>>> either
>>>> because of lack of knowledge or simply because they made a mistake. If
>>>> we
>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>> (for
>>>> whatever reason) commit valid bug fixes back all the way to
>>>> branch-2.10. If
>>>> we don't have a branch-2, committers who want their borderline risky
>>>> feature in the 2.x line will err on the side of putting it into
>>>> branch-2.10
>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>> quite
>>>> a few assumptions here based on my own experiences, so I would like to
>>>> hear
>>>> if others have similar or opposing views.
>>>>
>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>> killing
>>>> branch-2 is due to an effort to push the community towards 3.x. This is
>>>> why
>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>> trying
>>>> to move forward, keeping as many companies on similar branches as
>>>> possible
>>>> is a good way to make sure the code is well-tested. However, from a
>>>> stability point of view, moving to 3.x is still scary and being able to
>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>> bridge release effort has been very good at making it possible for
>>>> people
>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>> that
>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>> due to
>>>> potential performance degradation at large scale.
>>>>
>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> support to
>>>> 2.x, surely not everyone is going to want that (at least not
>>>> immediately).
>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>> across
>>>> point releases within the same minor release except if the JVM version
>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>> release until Java 8 becomes unsupported (though one could argue that
>>>> it is
>>>> already unsupported since Oracle is no longer giving public Java 8
>>>> updates).
>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> catalyst for a branch-2 revival?
>>>>
>>>> Not sure if this really leads to any sort of answer from me on whether
>>>> or
>>>> not we should keep branch-2 alive, but these are the things that I am
>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>> or
>>>> not is committers not being on the same page with where they should
>>>> commit
>>>> their patches.
>>>>
>>>> Eric
>>>>
>>>> [1]
>>>>
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> [2]
>>>>
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>
>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>> wrote:
>>>>
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> the
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> least.
>>>> > I worry
>>>> >  that some committers may want to put new features into a branch 2
>>>> release,
>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>> don't
>>>> > always
>>>> >  catch corner cases or performance problems for some time (usually not
>>>> > until
>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>> be
>>>> > very
>>>> >  difficult to back out those changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>> but I
>>>> > do
>>>> >  have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>> > shv.hadoop@gmail.com> wrote:
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10
>>>> release the
>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>> between
>>>> > Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not
>>>> in
>>>> > the picture right now, and many people may object to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> back-ports
>>>> > to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>> pros
>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>> are
>>>> > much
>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people
>>>> > > outside of our company who expressed interest in getting new
>>>> features to
>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>> 2.10.0
>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>> > >
>>>> > > In any case, we can always reverse this decision if we really need
>>>> to, by
>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>> confusion
>>>> > IMO.
>>>> > >
>>>> > > Jonathan Hung
>>>> > >
>>>> > >
>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>> epayne@apache.org>
>>>> > > wrote:
>>>> > >
>>>> > > > Thanks Jonathan for opening the discussion.
>>>> > > >
>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>> released,
>>>> > and
>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>> > premature
>>>> > > to
>>>> > > > make a decision at this point that there will never be a need for
>>>> a
>>>> > 2.11
>>>> > > > release.
>>>> > > >
>>>> > > > -Eric
>>>> > > >
>>>> > > >
>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>> > > > jyhung2357@gmail.com> wrote:
>>>> > > >
>>>> > > > Hi folks,
>>>> > > >
>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>> be a
>>>> > > bridge
>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> minor
>>>> > > > release line in branch-2. Currently, the main issue is that
>>>> there's
>>>> > many
>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not
>>>> going
>>>> > into
>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>> will
>>>> > > > likely never see the light of day unless they are backported to
>>>> > > > branch-2.10.
>>>> > > >
>>>> > > > To do this, I propose we:
>>>> > > >
>>>> > > >  - Delete branch-2.10
>>>> > > >  - Rename branch-2 to branch-2.10
>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> > > >
>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>> release
>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> > > >
>>>> > > > Thoughts?
>>>> > > >
>>>> > > > Jonathan Hung
>>>> > > >
>>>> > > > [1]
>>>> > >
>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> > > >
>>>> > >
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>> >
>>>> >
>>>>
>>>
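
For readers following along, the three steps proposed at the start of this
thread (delete branch-2.10, rename branch-2 to branch-2.10, set the version to
2.10.1-SNAPSHOT) might look roughly like the sketch below. It runs against a
throwaway local repository rather than the real Hadoop tree; the repository,
commit messages, and identity settings are made up for illustration, and the
Maven command in the comments is an assumption about how the version bump
could be done, not something stated in the thread.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the real one (hypothetical setup).
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.email "dev@example.invalid"
git -C "$repo" config user.name "Example Dev"
git -C "$repo" commit -q --allow-empty -m "2.10.0 release"

# The existing release branch, cut at the 2.10.0 release.
git -C "$repo" branch branch-2.10

# The development branch, which has since picked up an extra fix.
git -C "$repo" checkout -q -b branch-2
git -C "$repo" commit -q --allow-empty -m "fix only in branch-2"

# Step 1: delete the old branch-2.10.
# -D forces deletion even if the branch holds commits not merged elsewhere,
# so any fixes unique to it should be verified or ported first.
git -C "$repo" branch -D branch-2.10 >/dev/null

# Step 2: rename branch-2 to branch-2.10.
git -C "$repo" branch -m branch-2 branch-2.10

# Step 3: set the version in the new branch-2.10.
# In a Maven project this could be done with the versions plugin, e.g.
#   mvn versions:set -DnewVersion=2.10.1-SNAPSHOT
# (assumed command; not run here because this repository has no pom.xml).

git -C "$repo" branch --list
```

On a shared remote, the deletion and rename would also have to be pushed (for
example with `git push origin --delete <branch>`), and that may need help from
whoever administers the repository; the sketch does not attempt it.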

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
FYI, starting the rename process, beginning with INFRA-19521.

Jonathan Hung


On Wed, Nov 27, 2019 at 12:15 PM Konstantin Shvachko <sh...@gmail.com>
wrote:

> Hey guys,
>
> I think we diverged a bit from the initial topic of this discussion, which
> is removing branch-2.10, and changing the version of branch-2 from
> 2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
> Sounds like the subject line for this thread "Making 2.10 the last minor
> 2.x release" confused people.
> It is in fact a wider matter that can be discussed when somebody actually
> proposes to release 2.11, which I understand nobody does at the moment.
>
> So if anybody objects to removing branch-2.10 please make an argument.
> Otherwise we should go ahead and just do it next week.
> I see people still struggling to keep branch-2 and branch-2.10 in sync.
>
> Thanks,
> --Konstantin
>
> On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com>
> wrote:
>
>> Thanks for the detailed thoughts, everyone.
>>
>> Eric (Badger), my understanding is the same as yours re. minor vs patch
>> releases. As for putting features into minor/patch releases, if we keep the
>> convention of putting new features only into minor releases, my assumption
>> is still that it's unlikely people will want to get them into branch-2
>> (based on the 2.10.0 release process). For the java 11 issue, we haven't
>> even really removed support for java 7 in branch-2 (much less java 8), so I
>> feel moving to java 11 would go along with a move to branch 3. And as you
>> mentioned, if people really want to use java 11 on branch-2, we can always
>> revive branch-2. But for now I think the convenience of not needing to port
>> to both branch-2 and branch-2.10 (and below) outweighs the cost of
>> potentially needing to revive branch-2.
>>
>> Jonathan Hung
>>
>>
>> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>>
>>> +1 for 2.10.x as the last release for the 2.x version.
>>>
>>> Software becomes more compatible when more companies stress test
>>> the same software and make improvements in trunk.  Some may be extra
>>> cautious about moving up the version because of internal obligations to
>>> keep things running.  Company obligations should not be the driving force
>>> to maintain Hadoop branches.  There is no proper collaboration in the
>>> community when every name brand company maintains its own Hadoop 2.x
>>> version.  I think it would be healthier for the community to reduce the
>>> branch forking and spend energy on trunk to harden the software.  This will
>>> give more confidence to move up the version than trying to fix n
>>> permutations of breakage like Flash fixing the timeline.
>>>
>>> The Apache license states that there is no warranty of any kind for code
>>> contributions.  Fewer community release processes should improve software
>>> quality when eyes are on trunk, and help steer toward the same end goals.
>>>
>>> regards,
>>> Eric
>>>
>>>
>>>
>>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>>> <eb...@verizonmedia.com.invalid> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Is it written anywhere what the difference is between a minor release
>>>> and a
>>>> point/dot/maintenance (I'll use "point" from here on out) release? I
>>>> have
>>>> looked around and I can't find anything other than some compatibility
>>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>>> this would help shape my opinion on whether or not to keep branch-2
>>>> alive.
>>>> My current understanding is that we can't really break compatibility in
>>>> either a minor or point release. But the only mention of the difference
>>>> between minor and point releases is how to deal with Stable, Evolving,
>>>> and
>>>> Unstable tags, and how to deal with changing default configuration
>>>> values.
>>>> So it seems like there really isn't a big official difference between
>>>> the
>>>> two. In my mind, the functional difference between the two is that the
>>>> minor releases may have added features and rewrites, while the point
>>>> releases only have bug fixes. This might be an incorrect understanding,
>>>> but
>>>> that's what I have gathered from watching the releases over the last few
>>>> years. Whether or not this is a correct understanding, I think that this
>>>> needs to be documented somewhere, even if it is just a convention.
>>>>
>>>> Given my assumed understanding of minor vs point releases, here are the
>>>> pros/cons that I can think of for having a branch-2. Please add on or
>>>> correct me for anything you feel is missing or inadequate.
>>>> Pros:
>>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>>> 2.10.x
>>>> - It is less necessary to move to 3.x
>>>>
>>>> Cons:
>>>> - Bug fixes are less likely to be put into 2.10.x
>>>> - An extra branch to maintain
>>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>>> patches to if they should go all the way back to 2.10.x
>>>> - It is less necessary to move to 3.x
>>>>
>>>> So on the one hand you get added stability in fewer features being
>>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>>> But
>>>> we don't live in a perfect world and committers will make mistakes
>>>> either
>>>> because of lack of knowledge or simply because they made a mistake. If
>>>> we
>>>> have a branch-2, committers will forget, not know to, or choose not to
>>>> (for
>>>> whatever reason) commit valid bug fixes back all the way to
>>>> branch-2.10. If
>>>> we don't have a branch-2, committers who want their borderline risky
>>>> feature in the 2.x line will err on the side of putting it into
>>>> branch-2.10
>>>> instead of proposing the creation of a branch-2. Clearly I have made
>>>> quite
>>>> a few assumptions here based on my own experiences, so I would like to
>>>> hear
>>>> if others have similar or opposing views.
>>>>
>>>> As far as 3.x goes, to me it seems like some of the reasoning for
>>>> killing
>>>> branch-2 is due to an effort to push the community towards 3.x. This is
>>>> why
>>>> I have added movement to 3.x as both a pro and a con. As a community
>>>> trying
>>>> to move forward, keeping as many companies on similar branches as
>>>> possible
>>>> is a good way to make sure the code is well-tested. However, from a
>>>> stability point of view, moving to 3.x is still scary and being able to
>>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>>> bridge release effort has been very good at making it possible for
>>>> people
>>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>>> that
>>>> it is reasonable for companies to want to be extra cautious with 3.x
>>>> due to
>>>> potential performance degradation at large scale.
>>>>
>>>> A question I'm pondering is what happens when we move to Java 11 and
>>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>>> support to
>>>> 2.x, surely not everyone is going to want that (at least not
>>>> immediately).
>>>> The 2.10 documentation states, "The JVM requirements will not change
>>>> across
>>>> point releases within the same minor release except if the JVM version
>>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>>> release until Java 8 becomes unsupported (though one could argue that
>>>> it is
>>>> already unsupported since Oracle is no longer giving public Java 8
>>>> updates).
>>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>>> catalyst for a branch-2 revival?
>>>>
>>>> Not sure if this really leads to any sort of answer from me on whether
>>>> or
>>>> not we should keep branch-2 alive, but these are the things that I am
>>>> weighing in my mind. For me, the bigger problem beyond having branch-2
>>>> or
>>>> not is committers not being on the same page with where they should
>>>> commit
>>>> their patches.
>>>>
>>>> Eric
>>>>
>>>> [1]
>>>>
>>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> [2]
>>>>
>>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>>
>>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>>> wrote:
>>>>
>>>> > Hi Konstantin,
>>>> >
>>>> > Sure, I understand those concerns. On the other hand, I worry about
>>>> the
>>>> > stability of 2.10, since we will be on it for a couple of years at
>>>> least.
>>>> > I worry
>>>> >  that some committers may want to put new features into a branch 2
>>>> release,
>>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>>> don't
>>>> > always
>>>> >  catch corner cases or performance problems for some time (usually not
>>>> > until
>>>> >  the release is deployed to a busy, 4-thousand node cluster), it may
>>>> be
>>>> > very
>>>> >  difficult to back out those changes.
>>>> >
>>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>>> but I
>>>> > do
>>>> >  have these reservations.
>>>> >
>>>> > Thanks,
>>>> > -Eric
>>>> >
>>>> >
>>>> >
>>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>>> > shv.hadoop@gmail.com> wrote:
>>>> > Hi Eric,
>>>> >
>>>> > We had a long discussion on this list regarding making the 2.10
>>>> release the
>>>> > last of branch-2 releases. We intended 2.10 as a bridge release
>>>> between
>>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not
>>>> in
>>>> > the picture right now, and many people may object to this idea.
>>>> >
>>>> > I understand Jonathan's proposal as an attempt to
>>>> > 1. eliminate confusion about which branches people should commit their
>>>> back-ports
>>>> > to
>>>> > 2. save engineering effort committing to more branches than necessary
>>>> >
>>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>>> > release 2.11 we can resurrect the branch.
>>>> > Until then I am in favor of Jonathan's proposal +1.
>>>> >
>>>> > Thanks,
>>>> > --Konstantin
>>>> >
>>>> >
>>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>>> pros
>>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x
>>>> are
>>>> > much
>>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>>> > people
>>>> > > outside of our company who expressed interest in getting new
>>>> features to
>>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>>> 2.10.0
>>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>>> > > branch-2.10, so it's already diverged quite a bit.
>>>> > >
>>>> > > In any case, we can always reverse this decision if we really need
>>>> to, by
>>>> > > recreating branch-2. But this proposal would reduce a lot of
>>>> confusion
>>>> > IMO.
>>>> > >
>>>> > > Jonathan Hung
>>>> > >
>>>> > >
>>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>>> epayne@apache.org>
>>>> > > wrote:
>>>> > >
>>>> > > > Thanks Jonathan for opening the discussion.
>>>> > > >
>>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>>> released,
>>>> > and
>>>> > > > moving to 2.10 will take some time for the community. It seems
>>>> > premature
>>>> > > to
>>>> > > > make a decision at this point that there will never be a need for
>>>> a
>>>> > 2.11
>>>> > > > release.
>>>> > > >
>>>> > > > -Eric
>>>> > > >
>>>> > > >
>>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>>> > > > jyhung2357@gmail.com> wrote:
>>>> > > >
>>>> > > > Hi folks,
>>>> > > >
>>>> > > > Given the release of 2.10.0, and the fact that it's intended to
>>>> be a
>>>> > > bridge
>>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>>> minor
>>>> > > > release line in branch-2. Currently, the main issue is that
>>>> there's
>>>> > many
>>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not
>>>> going
>>>> > into
>>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>>> will
>>>> > > > likely never see the light of day unless they are backported to
>>>> > > > branch-2.10.
>>>> > > >
>>>> > > > To do this, I propose we:
>>>> > > >
>>>> > > >  - Delete branch-2.10
>>>> > > >  - Rename branch-2 to branch-2.10
>>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>>> > > >
>>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>>> release
>>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>>> > > >
>>>> > > > Thoughts?
>>>> > > >
>>>> > > > Jonathan Hung
>>>> > > >
>>>> > > > [1]
>>>> > >
>>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>>> > > >
>>>> > >
>>>> >
>>>> >
>>>>
>>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Konstantin Shvachko <sh...@gmail.com>.
Hey guys,

I think we diverged a bit from the initial topic of this discussion, which
is removing branch-2.10, and changing the version of branch-2 from
2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
Sounds like the subject line for this thread "Making 2.10 the last minor
2.x release" confused people.
It is in fact a wider matter that can be discussed when somebody actually
proposes to release 2.11, which I understand nobody does at the moment.

So if anybody objects to removing branch-2.10 please make an argument.
Otherwise we should go ahead and just do it next week.
I see people still struggling to keep branch-2 and branch-2.10 in sync.

Thanks,
--Konstantin
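
For anyone who wants to see how far branch-2 and branch-2.10 have drifted before the rename, here is a minimal sketch (not an official Hadoop tool). It assumes commit subjects start with a JIRA key, as is the convention in this project, and that the subject lines were collected separately, for example with `git log --format=%s` on each branch. The function names and the sample JIRA numbers below are hypothetical and only for illustration.

```python
import re

# Matches JIRA keys for the four Hadoop projects, e.g. HADOOP-15338.
JIRA_KEY = re.compile(r"\b(?:HADOOP|HDFS|YARN|MAPREDUCE)-\d+\b")


def jira_keys(subjects):
    """Collect the JIRA keys mentioned in an iterable of commit subject lines."""
    keys = set()
    for line in subjects:
        keys.update(JIRA_KEY.findall(line))
    return keys


def missing_from(source_subjects, target_subjects):
    """Return the sorted JIRA keys present in source but absent from target."""
    return sorted(jira_keys(source_subjects) - jira_keys(target_subjects))


# Hypothetical sample data; real input would be the commit subjects of each
# branch since the 2.10.0 release.
branch_2_subjects = [
    "HADOOP-90001. Hypothetical fix A.",
    "YARN-90002. Hypothetical fix B.",
]
branch_2_10_subjects = [
    "HADOOP-90001. Hypothetical fix A.",
    "HDFS-90003. Hypothetical fix C.",
]

# Fixes that exist only in branch-2.10 and would be lost by the rename
# unless they are ported to branch-2 first.
print(missing_from(branch_2_10_subjects, branch_2_subjects))

# Fixes that exist only in branch-2 and would newly enter the 2.10.x line.
print(missing_from(branch_2_subjects, branch_2_10_subjects))
```

Matching on JIRA keys rather than commit hashes is deliberate, since cherry-picked commits get new hashes on each branch. Reverts and addendum commits mention the same key, so the output is a starting point for manual review rather than a definitive answer.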

On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
>> +1 for 2.10.x as the last release for the 2.x version.
>>
>> Software becomes more compatible when more companies stress test the
>> same software and make improvements in trunk.  Some may be extra cautious
>> about moving up the version because of internal obligations to keep things
>> running.  Company obligations should not be the driving force to maintain
>> Hadoop branches.  There is no proper collaboration in the community when
>> every name brand company maintains its own Hadoop 2.x version.  I think it
>> would be healthier for the community to reduce the branch forking and
>> spend energy on trunk to harden the software.  This will give more
>> confidence to move up the version than trying to fix n permutations
>> of breakage like Flash fixing the timeline.
>>
>> The Apache license states that there is no warranty of any kind for code
>> contributions.  Fewer community release processes should improve software
>> quality when eyes are on trunk, and help steer toward the same end goals.
>>
>> regards,
>> Eric
>>
>>
>>
>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> <eb...@verizonmedia.com.invalid> wrote:
>>
>>> Hello all,
>>>
>>> Is it written anywhere what the difference is between a minor release
>>> and a
>>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> looked around and I can't find anything other than some compatibility
>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> this would help shape my opinion on whether or not to keep branch-2
>>> alive.
>>> My current understanding is that we can't really break compatibility in
>>> either a minor or point release. But the only mention of the difference
>>> between minor and point releases is how to deal with Stable, Evolving,
>>> and
>>> Unstable tags, and how to deal with changing default configuration
>>> values.
>>> So it seems like there really isn't a big official difference between the
>>> two. In my mind, the functional difference between the two is that the
>>> minor releases may have added features and rewrites, while the point
>>> releases only have bug fixes. This might be an incorrect understanding,
>>> but
>>> that's what I have gathered from watching the releases over the last few
>>> years. Whether or not this is a correct understanding, I think that this
>>> needs to be documented somewhere, even if it is just a convention.
>>>
>>> Given my assumed understanding of minor vs point releases, here are the
>>> pros/cons that I can think of for having a branch-2. Please add on or
>>> correct me for anything you feel is missing or inadequate.
>>> Pros:
>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>> 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> Cons:
>>> - Bug fixes are less likely to be put into 2.10.x
>>> - An extra branch to maintain
>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> patches to if they should go all the way back to 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> So on the one hand you get added stability in fewer features being
>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> But
>>> we don't live in a perfect world and committers will make mistakes either
>>> because of lack of knowledge or simply because they made a mistake. If we
>>> have a branch-2, committers will forget, not know to, or choose not to
>>> (for
>>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
>>> If
>>> we don't have a branch-2, committers who want their borderline risky
>>> feature in the 2.x line will err on the side of putting it into
>>> branch-2.10
>>> instead of proposing the creation of a branch-2. Clearly I have made
>>> quite
>>> a few assumptions here based on my own experiences, so I would like to
>>> hear
>>> if others have similar or opposing views.
>>>
>>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> branch-2 is due to an effort to push the community towards 3.x. This is
>>> why
>>> I have added movement to 3.x as both a pro and a con. As a community
>>> trying
>>> to move forward, keeping as many companies on similar branches as
>>> possible
>>> is a good way to make sure the code is well-tested. However, from a
>>> stability point of view, moving to 3.x is still scary and being able to
>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>> bridge release effort has been very good at making it possible for people
>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>> that
>>> it is reasonable for companies to want to be extra cautious with 3.x due
>>> to
>>> potential performance degradation at large scale.
>>>
>>> A question I'm pondering is what happens when we move to Java 11 and
>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>> support to
>>> 2.x, surely not everyone is going to want that (at least not
>>> immediately).
>>> The 2.10 documentation states, "The JVM requirements will not change
>>> across
>>> point releases within the same minor release except if the JVM version
>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>> release until Java 8 becomes unsupported (though one could argue that it
>>> is
>>> already unsupported since Oracle is no longer giving public Java 8
>>> updates).
>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>> catalyst for a branch-2 revival?
>>>
>>> Not sure if this really leads to any sort of answer from me on whether or
>>> not we should keep branch-2 alive, but these are the things that I am
>>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> not is committers not being on the same page with where they should
>>> commit
>>> their patches.
>>>
>>> Eric
>>>
>>> [1]
>>>
>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> [2]
>>>
>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>
>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>> wrote:
>>>
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at
>>> least.
>>> > I worry
>>> >  that some committers may want to put new features into a branch 2
>>> release,
>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>> don't
>>> > always
>>> >  catch corner cases or performance problems for some time (usually not
>>> > until
>>> >  the release is deployed to a busy, 4-thousand node cluster), it may be
>>> > very
>>> >  difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>> but I
>>> > do
>>> >  have these reservations.
>>> >
>>> > Thanks,
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>> > shv.hadoop@gmail.com> wrote:
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10
>>> release the
>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not
>>> in
>>> > the picture right now, and many people may object to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion which branches people should commit their
>>> back-ports
>>> > to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>> pros
>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>> > much
>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>> > people
>>> > > outside of our company who expressed interest in getting new
>>> features to
>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>> 2.10.0
>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>> > > branch-2.10, so it's already diverged quite a bit.
>>> > >
>>> > > In any case, we can always reverse this decision if we really need
>>> to, by
>>> > > recreating branch-2. But this proposal would reduce a lot of
>>> confusion
>>> > IMO.
>>> > >
>>> > > Jonathan Hung
>>> > >
>>> > >
>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>> epayne@apache.org>
>>> > > wrote:
>>> > >
>>> > > > Thanks Jonathan for opening the discussion.
>>> > > >
>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>> released,
>>> > and
>>> > > > moving to 2.10 will take some time for the community. It seems
>>> > premature
>>> > > to
>>> > > > make a decision at this point that there will never be a need for a
>>> > 2.11
>>> > > > release.
>>> > > >
>>> > > > -Eric
>>> > > >
>>> > > >
>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>> > > > jyhung2357@gmail.com> wrote:
>>> > > >
>>> > > > Hi folks,
>>> > > >
>>> > > > Given the release of 2.10.0, and the fact that it's intended to be
>>> a
>>> > > bridge
>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> minor
>>> > > > release line in branch-2. Currently, the main issue is that there's
>>> > many
>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not going
>>> > into
>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>> will
>>> > > > likely never see the light of day unless they are backported to
>>> > > > branch-2.10.
>>> > > >
>>> > > > To do this, I propose we:
>>> > > >
>>> > > >  - Delete branch-2.10
>>> > > >  - Rename branch-2 to branch-2.10
>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> > > >
>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>> release
>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> > > >
>>> > > > Thoughts?
>>> > > >
>>> > > > Jonathan Hung
>>> > > >
>>> > > > [1]
>>> > >
>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> > > >
>>> > >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Konstantin Shvachko <sh...@gmail.com>.
Hey guys,

I think we diverged a bit from the initial topic of this discussion, which
is removing branch-2.10, and changing the version of branch-2 from
2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
Sounds like the subject line for this thread "Making 2.10 the last minor
2.x release" confused people.
It is in fact a wider matter that can be discussed when somebody actually
proposes to release 2.11, which I understand nobody does at the moment.

So if anybody objects to removing branch-2.10, please make an argument.
Otherwise we should go ahead and just do it next week.
I see people still struggling to keep branch-2 and branch-2.10 in sync.

Thanks,
--Konstantin

On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
>> +1 for 2.10.x as last release for 2.x version.
>>
>> Software would become more compatible when more companies stress test the
>> same software and make improvements in trunk.  Some may be extra cautious
>> about moving up the version because of internal obligations to keep things
>> running.  Company obligations should not be the driving force to maintain
>> Hadoop branches.  There is no proper collaboration in the community when
>> every name brand company maintains its own Hadoop 2.x version.  I think it
>> would be healthier for the community to reduce the branch forking and
>> spend energy on trunk to harden the software.  This will give more
>> confidence to move up the version than trying to fix n permutations of
>> breakage, like the Flash fixing the timeline.
>>
>> The Apache license states that there is no warranty of any kind for code
>> contributions.  Fewer community release processes should improve software
>> quality when eyes are on trunk, and help steer toward the same end goals.
>>
>> regards,
>> Eric
>>
>>
>>
>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> <eb...@verizonmedia.com.invalid> wrote:
>>
>>> Hello all,
>>>
>>> Is it written anywhere what the difference is between a minor release
>>> and a
>>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> looked around and I can't find anything other than some compatibility
>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> this would help shape my opinion on whether or not to keep branch-2
>>> alive.
>>> My current understanding is that we can't really break compatibility in
>>> either a minor or point release. But the only mention of the difference
>>> between minor and point releases is how to deal with Stable, Evolving,
>>> and
>>> Unstable tags, and how to deal with changing default configuration
>>> values.
>>> So it seems like there really isn't a big official difference between the
>>> two. In my mind, the functional difference between the two is that the
>>> minor releases may have added features and rewrites, while the point
>>> releases only have bug fixes. This might be an incorrect understanding,
>>> but
>>> that's what I have gathered from watching the releases over the last few
>>> years. Whether or not this is a correct understanding, I think that this
>>> needs to be documented somewhere, even if it is just a convention.
>>>
>>> Given my assumed understanding of minor vs point releases, here are the
>>> pros/cons that I can think of for having a branch-2. Please add on or
>>> correct me for anything you feel is missing or inadequate.
>>> Pros:
>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>> 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> Cons:
>>> - Bug fixes are less likely to be put into 2.10.x
>>> - An extra branch to maintain
>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> patches to if they should go all the way back to 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> So on the one hand you get added stability in fewer features being
>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> But
>>> we don't live in a perfect world and committers will make mistakes either
>>> because of lack of knowledge or simply because they made a mistake. If we
>>> have a branch-2, committers will forget, not know to, or choose not to
>>> (for
>>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
>>> If
>>> we don't have a branch-2, committers who want their borderline risky
>>> feature in the 2.x line will err on the side of putting it into
>>> branch-2.10
>>> instead of proposing the creation of a branch-2. Clearly I have made
>>> quite
>>> a few assumptions here based on my own experiences, so I would like to
>>> hear
>>> if others have similar or opposing views.
>>>
>>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> branch-2 is due to an effort to push the community towards 3.x. This is
>>> why
>>> I have added movement to 3.x as both a pro and a con. As a community
>>> trying
>>> to move forward, keeping as many companies on similar branches as
>>> possible
>>> is a good way to make sure the code is well-tested. However, from a
>>> stability point of view, moving to 3.x is still scary and being able to
>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>> bridge release effort has been very good at making it possible for people
>>> to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large
>>> that
>>> it is reasonable for companies to want to be extra cautious with 3.x due
>>> to
>>> potential performance degradation at large scale.
>>>
>>> A question I'm pondering is what happens when we move to Java 11 and
>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>> support to
>>> 2.x, surely not everyone is going to want that (at least not
>>> immediately).
>>> The 2.10 documentation states, "The JVM requirements will not change
>>> across
>>> point releases within the same minor release except if the JVM version
>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>> release until Java 8 becomes unsupported (though one could argue that it
>>> is
>>> already unsupported since Oracle is no longer giving public Java 8
>>> updates).
>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>> catalyst for a branch-2 revival?
>>>
>>> Not sure if this really leads to any sort of answer from me on whether or
>>> not we should keep branch-2 alive, but these are the things that I am
>>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> not is committers not being on the same page with where they should
>>> commit
>>> their patches.
>>>
>>> Eric
>>>
>>> [1]
>>>
>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> [2]
>>>
>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>
>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>> wrote:
>>>
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at
>>> least.
>>> > I worry
>>> >  that some committers may want to put new features into a branch 2
>>> release,
>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>> don't
>>> > always
>>> >  catch corner cases or performance problems for some time (usually not
>>> > until
>>> >  the release is deployed to a busy, 4-thousand node cluster), it may be
>>> > very
>>> >  difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>> but I
>>> > do
>>> >  have these reservations.
>>> >
>>> > Thanks,
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>> > shv.hadoop@gmail.com> wrote:
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10
>>> release the
>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>> > Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not
>>> in
>>> > the picture right now, and many people may object to this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion which branches people should commit their
>>> back-ports
>>> > to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>> pros
>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>> > much
>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>> > people
>>> > > outside of our company who expressed interest in getting new
>>> features to
>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>> 2.10.0
>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>> > > branch-2.10, so it's already diverged quite a bit.
>>> > >
>>> > > In any case, we can always reverse this decision if we really need
>>> to, by
>>> > > recreating branch-2. But this proposal would reduce a lot of
>>> confusion
>>> > IMO.
>>> > >
>>> > > Jonathan Hung
>>> > >
>>> > >
>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>> epayne@apache.org>
>>> > > wrote:
>>> > >
>>> > > > Thanks Jonathan for opening the discussion.
>>> > > >
>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>> released,
>>> > and
>>> > > > moving to 2.10 will take some time for the community. It seems
>>> > premature
>>> > > to
>>> > > > make a decision at this point that there will never be a need for a
>>> > 2.11
>>> > > > release.
>>> > > >
>>> > > > -Eric
>>> > > >
>>> > > >
>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>> > > > jyhung2357@gmail.com> wrote:
>>> > > >
>>> > > > Hi folks,
>>> > > >
>>> > > > Given the release of 2.10.0, and the fact that it's intended to be
>>> a
>>> > > bridge
>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> minor
>>> > > > release line in branch-2. Currently, the main issue is that there's
>>> > many
>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not going
>>> > into
>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>> will
>>> > > > likely never see the light of day unless they are backported to
>>> > > > branch-2.10.
>>> > > >
>>> > > > To do this, I propose we:
>>> > > >
>>> > > >  - Delete branch-2.10
>>> > > >  - Rename branch-2 to branch-2.10
>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> > > >
>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>> release
>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> > > >
>>> > > > Thoughts?
>>> > > >
>>> > > > Jonathan Hung
>>> > > >
>>> > > > [1]
>>> > >
>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> > > >
>>> > >
>>> >
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Steve Loughran <st...@cloudera.com.INVALID>.
On Thu, Nov 21, 2019 at 11:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>


I don't think we should ever support Java 11 in branch-2.

"If you want to use a recent version of java - use a recent version of
hadoop"

I've not been backporting my stuff for a long, long time, and my general
stance with branch-2 bug reports on bits of the code that I work on is
"what does it do on hadoop 3.2?"

>>> > > >
>>> > >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Konstantin Shvachko <sh...@gmail.com>.
Hey guys,

I think we diverged a bit from the initial topic of this discussion, which
is removing branch-2.10, and changing the version of branch-2 from
2.11.0-SNAPSHOT to 2.10.1-SNAPSHOT.
Sounds like the subject line for this thread "Making 2.10 the last minor
2.x release" confused people.
It is in fact a wider matter that can be discussed when somebody actually
proposes to release 2.11, which I understand nobody does at the moment.

So if anybody objects to removing branch-2.10, please make an argument.
Otherwise we should go ahead and just do it next week.
I see people still struggling to keep branch-2 and branch-2.10 in sync.

Thanks,
--Konstantin

On Thu, Nov 21, 2019 at 3:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>
> Jonathan Hung
>
>
> On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:
>
>> +1 for 2.10.x as last release for 2.x version.
>>
>> Software would become more compatible when more companies stress test the
>> same software and making improvements in trunk.  Some may be extra caution
>> on moving up the version because obligation internally to keep things
>> running.  Company obligation should not be the driving force to maintain
>> Hadoop branches.  There is no proper collaboration in the community when
>> every name brand company maintains its own Hadoop 2.x version.  I think it
>> would be more healthy for the community to reduce the branch forking and
>> spend energy on trunk to harden the software.  This will give more
>> confidence to move up the version than trying to fix n permutations
>> breakage like Flash fixing the timeline.
>>
>> Apache license stated, there is no warranty of any kind for code
>> contributions.  Fewer community release process should improve software
>> quality when eyes are on trunk, and help steering toward the same end goals.
>>
>> regards,
>> Eric
>>
>>
>>
>> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
>> <eb...@verizonmedia.com.invalid> wrote:
>>
>>> Hello all,
>>>
>>> Is it written anywhere what the difference is between a minor release
>>> and a
>>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>>> looked around and I can't find anything other than some compatibility
>>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>>> this would help shape my opinion on whether or not to keep branch-2
>>> alive.
>>> My current understanding is that we can't really break compatibility in
>>> either a minor or point release. But the only mention of the difference
>>> between minor and point releases is how to deal with Stable, Evolving,
>>> and
>>> Unstable tags, and how to deal with changing default configuration
>>> values.
>>> So it seems like there really isn't a big official difference between the
>>> two. In my mind, the functional difference between the two is that the
>>> minor releases may have added features and rewrites, while the point
>>> releases only have bug fixes. This might be an incorrect understanding,
>>> but
>>> that's what I have gathered from watching the releases over the last few
>>> years. Whether or not this is a correct understanding, I think that this
>>> needs to be documented somewhere, even if it is just a convention.
>>>
>>> Given my assumed understanding of minor vs point releases, here are the
>>> pros/cons that I can think of for having a branch-2. Please add on or
>>> correct me for anything you feel is missing or inadequate.
>>> Pros:
>>> - Features/rewrites/higher-risk patches are less likely to be put into
>>> 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> Cons:
>>> - Bug fixes are less likely to be put into 2.10.x
>>> - An extra branch to maintain
>>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>>> patches to if they should go all the way back to 2.10.x
>>> - It is less necessary to move to 3.x
>>>
>>> So on the one hand you get added stability in fewer features being
>>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>>> committed. In a perfect world, we wouldn't have to make this tradeoff.
>>> But
>>> we don't live in a perfect world and committers will make mistakes either
>>> because of lack of knowledge or simply because they made a mistake. If we
>>> have a branch-2, committers will forget, not know to, or choose not to
>>> (for
>>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
>>> If
>>> we don't have a branch-2, committers who want their borderline risky
>>> feature in the 2.x line will err on the side of putting it into
>>> branch-2.10
>>> instead of proposing the creation of a branch-2. Clearly I have made
>>> quite
>>> a few assumptions here based on my own experiences, so I would like to
>>> hear
>>> if others have similar or opposing views.
>>>
>>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>>> branch-2 is due to an effort to push the community towards 3.x. This is
>>> why
>>> I have added movement to 3.x as both a pro and a con. As a community
>>> trying
>>> to move forward, keeping as many companies on similar branches as
>>> possible
>>> is a good way to make sure the code is well-tested. However, from a
>>> stability point of view, moving to 3.x is still scary and being able to
>>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>>> bridge release effort has been very good at making it possible for people
>>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large
>>> that
>>> it is reasonable for companies to want to be extra cautious with 3.x due
>>> to
>>> potential performance degradation at large scale.
>>>
>>> A question I'm pondering is what happens when we move to Java 11 and
>>> someone is still on 2.x? If they want to backport HADOOP-15338
>>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11
>>> support to
>>> 2.x, surely not everyone is going to want that (at least not
>>> immediately).
>>> The 2.10 documentation states, "The JVM requirements will not change
>>> across
>>> point releases within the same minor release except if the JVM version
>>> under question becomes unsupported" [1], so this would warrant a 2.11
>>> release until Java 8 becomes unsupported (though one could argue that it
>>> is
>>> already unsupported since Oracle is no longer giving public Java 8
>>> update).
>>> If we don't keep branch-2 around now, would a Java 11 backport be the
>>> catalyst for a branch-2 revival?
>>>
>>> Not sure if this really leads to any sort of answer from me on whether or
>>> not we should keep branch-2 alive, but these are the things that I am
>>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>>> not is committers not being on the same page with where they should
>>> commit
>>> their patches.
>>>
>>> Eric
>>>
>>> [1]
>>>
>>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>> [2]
>>>
>>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>>
>>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>>> wrote:
>>>
>>> > Hi Konstantin,
>>> >
>>> > Sure, I understand those concerns. On the other hand, I worry about the
>>> > stability of 2.10, since we will be on it for a couple of years at
>>> least.
>>> > I worry
>>> >  that some committers may want to put new features into a branch 2
>>> release,
>>> >  and without a branch-2, they will go directly into 2.10. Since we
>>> don't
>>> > always
>>> >  catch corner cases or performance problems for some time (usually not
>>> > until
>>> >  the release is deployed to a busy, 4-thousand node cluster), it may be
>>> > very
>>> >  difficult to back out those changes.
>>> >
>>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>>> but I
>>> > do
>>> >  have these reservations.
>>> >
>>> > Thanks,
>>> > -Eric
>>> >
>>> >
>>> >
>>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>>> > shv.hadoop@gmail.com> wrote:
>>> > Hi Eric,
>>> >
>>> > We had a long discussion on this list regarding making the 2.10
>>> release the
>>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is not
>>> in
>>> > the picture right now, and many people may object this idea.
>>> >
>>> > I understand Jonathan's proposal as an attempt to
>>> > 1. eliminate confusion which branches people should commit their
>>> back-ports
>>> > to
>>> > 2. save engineering effort committing to more branches than necessary
>>> >
>>> > "Branches are cheap" as our founder used to say. If we ever decide to
>>> > release 2.11 we can resurrect the branch.
>>> > Until then I am in favor of Jonathan's proposal +1.
>>> >
>>> > Thanks,
>>> > --Konstantin
>>> >
>>> >
>>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>>> > wrote:
>>> >
>>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>>> pros
>>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>>> > much
>>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>>> > people
>>> > > outside of our company who expressed interest in getting new
>>> features to
>>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>>> 2.10.0
>>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>>> > > branch-2.10, so it's already diverged quite a bit.
>>> > >
>>> > > In any case, we can always reverse this decision if we really need
>>> to, by
>>> > > recreating branch-2. But this proposal would reduce a lot of
>>> confusion
>>> > IMO.
>>> > >
>>> > > Jonathan Hung
>>> > >
>>> > >
>>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <
>>> epayne@apache.org>
>>> > > wrote:
>>> > >
>>> > > > Thanks Jonathan for opening the discussion.
>>> > > >
>>> > > > I am not in favor of this proposal. 2.10 was very recently
>>> released,
>>> > and
>>> > > > moving to 2.10 will take some time for the community. It seems
>>> > premature
>>> > > to
>>> > > > make a decision at this point that there will never be a need for a
>>> > 2.11
>>> > > > release.
>>> > > >
>>> > > > -Eric
>>> > > >
>>> > > >
>>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>>> > > > jyhung2357@gmail.com> wrote:
>>> > > >
>>> > > > Hi folks,
>>> > > >
>>> > > > Given the release of 2.10.0, and the fact that it's intended to be
>>> a
>>> > > bridge
>>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>>> minor
>>> > > > release line in branch-2. Currently, the main issue is that there's
>>> > many
>>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not going
>>> > into
>>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>>> will
>>> > > > likely never see the light of day unless they are backported to
>>> > > > branch-2.10.
>>> > > >
>>> > > > To do this, I propose we:
>>> > > >
>>> > > >  - Delete branch-2.10
>>> > > >  - Rename branch-2 to branch-2.10
>>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>>> > > >
>>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>>> release
>>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>>> > > >
>>> > > > Thoughts?
>>> > > >
>>> > > > Jonathan Hung
>>> > > >
>>> > > > [1]
>>> > >
>>> https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>>> > > >
>>> > >
>>> >
>>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>> >
>>> >
>>>
>>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Steve Loughran <st...@cloudera.com.INVALID>.
On Thu, Nov 21, 2019 at 11:49 PM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks for the detailed thoughts, everyone.
>
> Eric (Badger), my understanding is the same as yours re. minor vs patch
> releases. As for putting features into minor/patch releases, if we keep the
> convention of putting new features only into minor releases, my assumption
> is still that it's unlikely people will want to get them into branch-2
> (based on the 2.10.0 release process). For the java 11 issue, we haven't
> even really removed support for java 7 in branch-2 (much less java 8), so I
> feel moving to java 11 would go along with a move to branch 3. And as you
> mentioned, if people really want to use java 11 on branch-2, we can always
> revive branch-2. But for now I think the convenience of not needing to port
> to both branch-2 and branch-2.10 (and below) outweighs the cost of
> potentially needing to revive branch-2.
>


I don't think we should ever support Java 11 in branch-2.

"If you want to use a recent version of java - use a recent version of
hadoop"

I've not been backporting my stuff for a long, long time, and my general
stance with branch-2 bug reports on bits of the code that I work on is
"what does it do on hadoop 3.2?"

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Thanks for the detailed thoughts, everyone.

Eric (Badger), my understanding is the same as yours re. minor vs patch
releases. As for putting features into minor/patch releases, if we keep the
convention of putting new features only into minor releases, my assumption
is still that it's unlikely people will want to get them into branch-2
(based on the 2.10.0 release process). For the Java 11 issue, we haven't
even really removed support for Java 7 in branch-2 (much less Java 8), so I
feel moving to Java 11 would go along with a move to branch 3. And as you
mentioned, if people really want to use Java 11 on branch-2, we can always
revive branch-2. But for now I think the convenience of not needing to port
to both branch-2 and branch-2.10 (and below) outweighs the cost of
potentially needing to revive branch-2.

Jonathan Hung


On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:

> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test the
> same software and making improvements in trunk.  Some may be extra caution
> on moving up the version because obligation internally to keep things
> running.  Company obligation should not be the driving force to maintain
> Hadoop branches.  There is no proper collaboration in the community when
> every name brand company maintains its own Hadoop 2.x version.  I think it
> would be more healthy for the community to reduce the branch forking and
> spend energy on trunk to harden the software.  This will give more
> confidence to move up the version than trying to fix n permutations
> breakage like Flash fixing the timeline.
>
> Apache license stated, there is no warranty of any kind for code
> contributions.  Fewer community release process should improve software
> quality when eyes are on trunk, and help steering toward the same end goals.
>
> regards,
> Eric
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
>> Hello all,
>>
>> Is it written anywhere what the difference is between a minor release and
>> a
>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>> looked around and I can't find anything other than some compatibility
>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>> this would help shape my opinion on whether or not to keep branch-2 alive.
>> My current understanding is that we can't really break compatibility in
>> either a minor or point release. But the only mention of the difference
>> between minor and point releases is how to deal with Stable, Evolving, and
>> Unstable tags, and how to deal with changing default configuration values.
>> So it seems like there really isn't a big official difference between the
>> two. In my mind, the functional difference between the two is that the
>> minor releases may have added features and rewrites, while the point
>> releases only have bug fixes. This might be an incorrect understanding,
>> but
>> that's what I have gathered from watching the releases over the last few
>> years. Whether or not this is a correct understanding, I think that this
>> needs to be documented somewhere, even if it is just a convention.
>>
>> Given my assumed understanding of minor vs point releases, here are the
>> pros/cons that I can think of for having a branch-2. Please add on or
>> correct me for anything you feel is missing or inadequate.
>> Pros:
>> - Features/rewrites/higher-risk patches are less likely to be put into
>> 2.10.x
>> - It is less necessary to move to 3.x
>>
>> Cons:
>> - Bug fixes are less likely to be put into 2.10.x
>> - An extra branch to maintain
>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>> patches to if they should go all the way back to 2.10.x
>> - It is less necessary to move to 3.x
>>
>> So on the one hand you get added stability in fewer features being
>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>> committed. In a perfect world, we wouldn't have to make this tradeoff. But
>> we don't live in a perfect world and committers will make mistakes either
>> because of lack of knowledge or simply because they made a mistake. If we
>> have a branch-2, committers will forget, not know to, or choose not to
>> (for
>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
>> If
>> we don't have a branch-2, committers who want their borderline risky
>> feature in the 2.x line will err on the side of putting it into
>> branch-2.10
>> instead of proposing the creation of a branch-2. Clearly I have made quite
>> a few assumptions here based on my own experiences, so I would like to
>> hear
>> if others have similar or opposing views.
>>
>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>> branch-2 is due to an effort to push the community towards 3.x. This is
>> why
>> I have added movement to 3.x as both a pro and a con. As a community
>> trying
>> to move forward, keeping as many companies on similar branches as possible
>> is a good way to make sure the code is well-tested. However, from a
>> stability point of view, moving to 3.x is still scary and being able to
>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>> bridge release effort has been very good at making it possible for people
>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large that
>> it is reasonable for companies to want to be extra cautious with 3.x due
>> to
>> potential performance degradation at large scale.
>>
>> A question I'm pondering is what happens when we move to Java 11 and
>> someone is still on 2.x? If they want to backport HADOOP-15338
>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>> to
>> 2.x, surely not everyone is going to want that (at least not immediately).
>> The 2.10 documentation states, "The JVM requirements will not change
>> across
>> point releases within the same minor release except if the JVM version
>> under question becomes unsupported" [1], so this would warrant a 2.11
>> release until Java 8 becomes unsupported (though one could argue that it
>> is
>> already unsupported since Oracle is no longer giving public Java 8
>> update).
>> If we don't keep branch-2 around now, would a Java 11 backport be the
>> catalyst for a branch-2 revival?
>>
>> Not sure if this really leads to any sort of answer from me on whether or
>> not we should keep branch-2 alive, but these are the things that I am
>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>> not is committers not being on the same page with where they should commit
>> their patches.
>>
>> Eric
>>
>> [1]
>>
>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> [2]
>>
>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>
>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>> wrote:
>>
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about the
>> > stability of 2.10, since we will be on it for a couple of years at
>> least.
>> > I worry
>> >  that some committers may want to put new features into a branch 2
>> release,
>> >  and without a branch-2, they will go directly into 2.10. Since we don't
>> > always
>> >  catch corner cases or performance problems for some time (usually not
>> > until
>> >  the release is deployed to a busy, 4-thousand node cluster), it may be
>> > very
>> >  difficult to back out those changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>> but I
>> > do
>> >  have these reservations.
>> >
>> > Thanks,
>> > -Eric
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>> > shv.hadoop@gmail.com> wrote:
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release
>> the
>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is not in
>> > the picture right now, and many people may object this idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion which branches people should commit their
>> back-ports
>> > to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>> pros
>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>> > much
>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>> > people
>> > > outside of our company who expressed interest in getting new features
>> to
>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>> 2.10.0
>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>> > > branch-2.10, so it's already diverged quite a bit.
>> > >
>> > > In any case, we can always reverse this decision if we really need
>> to, by
>> > > recreating branch-2. But this proposal would reduce a lot of confusion
>> > IMO.
>> > >
>> > > Jonathan Hung
>> > >
>> > >
>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org
>> >
>> > > wrote:
>> > >
>> > > > Thanks Jonathan for opening the discussion.
>> > > >
>> > > > I am not in favor of this proposal. 2.10 was very recently released,
>> > and
>> > > > moving to 2.10 will take some time for the community. It seems
>> > premature
>> > > to
>> > > > make a decision at this point that there will never be a need for a
>> > 2.11
>> > > > release.
>> > > >
>> > > > -Eric
>> > > >
>> > > >
>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>> > > > jyhung2357@gmail.com> wrote:
>> > > >
>> > > > Hi folks,
>> > > >
>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
>> > > bridge
>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>> minor
>> > > > release line in branch-2. Currently, the main issue is that there's
>> > many
>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not going
>> > into
>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>> will
>> > > > likely never see the light of day unless they are backported to
>> > > > branch-2.10.
>> > > >
>> > > > To do this, I propose we:
>> > > >
>> > > >  - Delete branch-2.10
>> > > >  - Rename branch-2 to branch-2.10
>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> > > >
>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>> release
>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > Jonathan Hung
>> > > >
>> > > > [1]
>> > > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> > > >
>> > >
>> >
>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>> >
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Thanks for the detailed thoughts, everyone.

Eric (Badger), my understanding is the same as yours re. minor vs patch
releases. As for putting features into minor/patch releases, if we keep the
convention of putting new features only into minor releases, my assumption
is still that it's unlikely people will want to get them into branch-2
(based on the 2.10.0 release process). For the Java 11 issue, we haven't
even really removed support for Java 7 in branch-2 (much less Java 8), so I
feel moving to Java 11 would go along with a move to branch 3. And as you
mentioned, if people really want to use Java 11 on branch-2, we can always
revive branch-2. But for now I think the convenience of not needing to port
to both branch-2 and branch-2.10 (and below) outweighs the cost of
potentially needing to revive branch-2.

Jonathan Hung


On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:

> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test the
> same software and making improvements in trunk.  Some may be extra caution
> on moving up the version because obligation internally to keep things
> running.  Company obligation should not be the driving force to maintain
> Hadoop branches.  There is no proper collaboration in the community when
> every name brand company maintains its own Hadoop 2.x version.  I think it
> would be more healthy for the community to reduce the branch forking and
> spend energy on trunk to harden the software.  This will give more
> confidence to move up the version than trying to fix n permutations
> breakage like Flash fixing the timeline.
>
> Apache license stated, there is no warranty of any kind for code
> contributions.  Fewer community release process should improve software
> quality when eyes are on trunk, and help steering toward the same end goals.
>
> regards,
> Eric
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
>> Hello all,
>>
>> Is it written anywhere what the difference is between a minor release and
>> a
>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>> looked around and I can't find anything other than some compatibility
>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>> this would help shape my opinion on whether or not to keep branch-2 alive.
>> My current understanding is that we can't really break compatibility in
>> either a minor or point release. But the only mention of the difference
>> between minor and point releases is how to deal with Stable, Evolving, and
>> Unstable tags, and how to deal with changing default configuration values.
>> So it seems like there really isn't a big official difference between the
>> two. In my mind, the functional difference between the two is that the
>> minor releases may have added features and rewrites, while the point
>> releases only have bug fixes. This might be an incorrect understanding,
>> but
>> that's what I have gathered from watching the releases over the last few
>> years. Whether or not this is a correct understanding, I think that this
>> needs to be documented somewhere, even if it is just a convention.
>>
>> Given my assumed understanding of minor vs point releases, here are the
>> pros/cons that I can think of for having a branch-2. Please add on or
>> correct me for anything you feel is missing or inadequate.
>> Pros:
>> - Features/rewrites/higher-risk patches are less likely to be put into
>> 2.10.x
>> - It is less necessary to move to 3.x
>>
>> Cons:
>> - Bug fixes are less likely to be put into 2.10.x
>> - An extra branch to maintain
>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>> patches to if they should go all the way back to 2.10.x
>> - It is less necessary to move to 3.x
>>
>> So on the one hand you get added stability in fewer features being
>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>> committed. In a perfect world, we wouldn't have to make this tradeoff. But
>> we don't live in a perfect world and committers will make mistakes either
>> because of lack of knowledge or simply because they made a mistake. If we
>> have a branch-2, committers will forget, not know to, or choose not to
>> (for
>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
>> If
>> we don't have a branch-2, committers who want their borderline risky
>> feature in the 2.x line will err on the side of putting it into
>> branch-2.10
>> instead of proposing the creation of a branch-2. Clearly I have made quite
>> a few assumptions here based on my own experiences, so I would like to
>> hear
>> if others have similar or opposing views.
>>
>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>> branch-2 is due to an effort to push the community towards 3.x. This is
>> why
>> I have added movement to 3.x as both a pro and a con. As a community
>> trying
>> to move forward, keeping as many companies on similar branches as possible
>> is a good way to make sure the code is well-tested. However, from a
>> stability point of view, moving to 3.x is still scary and being able to
>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>> bridge release effort has been very good at making it possible for people
>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large that
>> it is reasonable for companies to want to be extra cautious with 3.x due
>> to
>> potential performance degradation at large scale.
>>
>> A question I'm pondering is what happens when we move to Java 11 and
>> someone is still on 2.x? If they want to backport HADOOP-15338
>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>> to
>> 2.x, surely not everyone is going to want that (at least not immediately).
>> The 2.10 documentation states, "The JVM requirements will not change
>> across
>> point releases within the same minor release except if the JVM version
>> under question becomes unsupported" [1], so this would warrant a 2.11
>> release until Java 8 becomes unsupported (though one could argue that it
>> is
>> already unsupported since Oracle is no longer giving public Java 8
>> update).
>> If we don't keep branch-2 around now, would a Java 11 backport be the
>> catalyst for a branch-2 revival?
>>
>> Not sure if this really leads to any sort of answer from me on whether or
>> not we should keep branch-2 alive, but these are the things that I am
>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>> not is committers not being on the same page with where they should commit
>> their patches.
>>
>> Eric
>>
>> [1]
>>
>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> [2]
>>
>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>
>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>> wrote:
>>
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about the
>> > stability of 2.10, since we will be on it for a couple of years at
>> least.
>> > I worry
>> >  that some committers may want to put new features into a branch 2
>> release,
>> >  and without a branch-2, they will go directly into 2.10. Since we don't
>> > always
>> >  catch corner cases or performance problems for some time (usually not
>> > until
>> >  the release is deployed to a busy, 4-thousand node cluster), it may be
>> > very
>> >  difficult to back out those changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>> but I
>> > do
>> >  have these reservations.
>> >
>> > Thanks,
>> > -Eric
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>> > shv.hadoop@gmail.com> wrote:
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release
>> the
>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is not in
>> > the picture right now, and many people may object this idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion which branches people should commit their
>> back-ports
>> > to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>> pros
>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>> > much
>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>> > people
>> > > outside of our company who expressed interest in getting new features
>> to
>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>> 2.10.0
>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>> > > branch-2.10, so it's already diverged quite a bit.
>> > >
>> > > In any case, we can always reverse this decision if we really need
>> to, by
>> > > recreating branch-2. But this proposal would reduce a lot of confusion
>> > IMO.
>> > >
>> > > Jonathan Hung
>> > >
>> > >
>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org
>> >
>> > > wrote:
>> > >
>> > > > Thanks Jonathan for opening the discussion.
>> > > >
>> > > > I am not in favor of this proposal. 2.10 was very recently released,
>> > and
>> > > > moving to 2.10 will take some time for the community. It seems
>> > premature
>> > > to
>> > > > make a decision at this point that there will never be a need for a
>> > 2.11
>> > > > release.
>> > > >
>> > > > -Eric
>> > > >
>> > > >
>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>> > > > jyhung2357@gmail.com> wrote:
>> > > >
>> > > > Hi folks,
>> > > >
>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
>> > > bridge
>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>> minor
>> > > > release line in branch-2. Currently, the main issue is that there's
>> > many
>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not going
>> > into
>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>> will
>> > > > likely never see the light of day unless they are backported to
>> > > > branch-2.10.
>> > > >
>> > > > To do this, I propose we:
>> > > >
>> > > >  - Delete branch-2.10
>> > > >  - Rename branch-2 to branch-2.10
>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> > > >
>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>> release
>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > Jonathan Hung
>> > > >
>> > > > [1]
>> > > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> > > >
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>> >
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Thanks for the detailed thoughts, everyone.

Eric (Badger), my understanding of minor vs. patch releases is the same as
yours. As for putting features into minor/patch releases, if we keep the
convention of putting new features only into minor releases, my assumption
is still that it's unlikely people will want to get them into branch-2
(based on the 2.10.0 release process). For the Java 11 issue, we haven't
even really removed support for Java 7 in branch-2 (much less Java 8), so I
feel moving to Java 11 would go along with a move to branch 3. And as you
mentioned, if people really want to use Java 11 on branch-2, we can always
revive branch-2. But for now I think the convenience of not needing to port
to both branch-2 and branch-2.10 (and below) outweighs the cost of
potentially needing to revive branch-2.

Jonathan Hung


On Wed, Nov 20, 2019 at 10:50 AM Eric Yang <ey...@cloudera.com> wrote:

> +1 for 2.10.x as last release for 2.x version.
>
> Software would become more compatible when more companies stress test the
> same software and making improvements in trunk.  Some may be extra caution
> on moving up the version because obligation internally to keep things
> running.  Company obligation should not be the driving force to maintain
> Hadoop branches.  There is no proper collaboration in the community when
> every name brand company maintains its own Hadoop 2.x version.  I think it
> would be more healthy for the community to reduce the branch forking and
> spend energy on trunk to harden the software.  This will give more
> confidence to move up the version than trying to fix n permutations
> breakage like Flash fixing the timeline.
>
> Apache license stated, there is no warranty of any kind for code
> contributions.  Fewer community release process should improve software
> quality when eyes are on trunk, and help steering toward the same end goals.
>
> regards,
> Eric
>
>
>
> On Tue, Nov 19, 2019 at 3:03 PM Eric Badger
> <eb...@verizonmedia.com.invalid> wrote:
>
>> Hello all,
>>
>> Is it written anywhere what the difference is between a minor release and
>> a
>> point/dot/maintenance (I'll use "point" from here on out) release? I have
>> looked around and I can't find anything other than some compatibility
>> documentation in 2.x that has since been removed in 3.x [1] [2]. I think
>> this would help shape my opinion on whether or not to keep branch-2 alive.
>> My current understanding is that we can't really break compatibility in
>> either a minor or point release. But the only mention of the difference
>> between minor and point releases is how to deal with Stable, Evolving, and
>> Unstable tags, and how to deal with changing default configuration values.
>> So it seems like there really isn't a big official difference between the
>> two. In my mind, the functional difference between the two is that the
>> minor releases may have added features and rewrites, while the point
>> releases only have bug fixes. This might be an incorrect understanding,
>> but
>> that's what I have gathered from watching the releases over the last few
>> years. Whether or not this is a correct understanding, I think that this
>> needs to be documented somewhere, even if it is just a convention.
>>
>> Given my assumed understanding of minor vs point releases, here are the
>> pros/cons that I can think of for having a branch-2. Please add on or
>> correct me for anything you feel is missing or inadequate.
>> Pros:
>> - Features/rewrites/higher-risk patches are less likely to be put into
>> 2.10.x
>> - It is less necessary to move to 3.x
>>
>> Cons:
>> - Bug fixes are less likely to be put into 2.10.x
>> - An extra branch to maintain
>>   - Committers have an extra branch (5 vs 4 total branches) to commit
>> patches to if they should go all the way back to 2.10.x
>> - It is less necessary to move to 3.x
>>
>> So on the one hand you get added stability in fewer features being
>> committed to 2.10.x, but then on the other you get fewer bug fixes being
>> committed. In a perfect world, we wouldn't have to make this tradeoff. But
>> we don't live in a perfect world and committers will make mistakes either
>> because of lack of knowledge or simply because they made a mistake. If we
>> have a branch-2, committers will forget, not know to, or choose not to
>> (for
>> whatever reason) commit valid bug fixes back all the way to branch-2.10.
>> If
>> we don't have a branch-2, committers who want their borderline risky
>> feature in the 2.x line will err on the side of putting it into
>> branch-2.10
>> instead of proposing the creation of a branch-2. Clearly I have made quite
>> a few assumptions here based on my own experiences, so I would like to
>> hear
>> if others have similar or opposing views.
>>
>> As far as 3.x goes, to me it seems like some of the reasoning for killing
>> branch-2 is due to an effort to push the community towards 3.x. This is
>> why
>> I have added movement to 3.x as both a pro and a con. As a community
>> trying
>> to move forward, keeping as many companies on similar branches as possible
>> is a good way to make sure the code is well-tested. However, from a
>> stability point of view, moving to 3.x is still scary and being able to
>> stay on 2.x until you are comfortable to move is very nice. The 2.10.0
>> bridge release effort has been very good at making it possible for people
>> to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large that
>> it is reasonable for companies to want to be extra cautious with 3.x due
>> to
>> potential performance degradation at large scale.
>>
>> A question I'm pondering is what happens when we move to Java 11 and
>> someone is still on 2.x? If they want to backport HADOOP-15338
>> <https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support
>> to
>> 2.x, surely not everyone is going to want that (at least not immediately).
>> The 2.10 documentation states, "The JVM requirements will not change
>> across
>> point releases within the same minor release except if the JVM version
>> under question becomes unsupported" [1], so this would warrant a 2.11
>> release until Java 8 becomes unsupported (though one could argue that it
>> is
>> already unsupported since Oracle is no longer giving public Java 8
>> update).
>> If we don't keep branch-2 around now, would a Java 11 backport be the
>> catalyst for a branch-2 revival?
>>
>> Not sure if this really leads to any sort of answer from me on whether or
>> not we should keep branch-2 alive, but these are the things that I am
>> weighing in my mind. For me, the bigger problem beyond having branch-2 or
>> not is committers not being on the same page with where they should commit
>> their patches.
>>
>> Eric
>>
>> [1]
>>
>> https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
>> [2]
>>
>> https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html
>>
>> On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org>
>> wrote:
>>
>> > Hi Konstantin,
>> >
>> > Sure, I understand those concerns. On the other hand, I worry about the
>> > stability of 2.10, since we will be on it for a couple of years at
>> least.
>> > I worry
>> >  that some committers may want to put new features into a branch 2
>> release,
>> >  and without a branch-2, they will go directly into 2.10. Since we don't
>> > always
>> >  catch corner cases or performance problems for some time (usually not
>> > until
>> >  the release is deployed to a busy, 4-thousand node cluster), it may be
>> > very
>> >  difficult to back out those changes.
>> >
>> > It sounds like I'm in the minority here, so I'm not nixing the idea,
>> but I
>> > do
>> >  have these reservations.
>> >
>> > Thanks,
>> > -Eric
>> >
>> >
>> >
>> > On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
>> > shv.hadoop@gmail.com> wrote:
>> > Hi Eric,
>> >
>> > We had a long discussion on this list regarding making the 2.10 release
>> the
>> > last of branch-2 releases. We intended 2.10 as a bridge release between
>> > Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is not in
>> > the picture right now, and many people may object this idea.
>> >
>> > I understand Jonathan's proposal as an attempt to
>> > 1. eliminate confusion which branches people should commit their
>> back-ports
>> > to
>> > 2. save engineering effort committing to more branches than necessary
>> >
>> > "Branches are cheap" as our founder used to say. If we ever decide to
>> > release 2.11 we can resurrect the branch.
>> > Until then I am in favor of Jonathan's proposal +1.
>> >
>> > Thanks,
>> > --Konstantin
>> >
>> >
>> > On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
>> > wrote:
>> >
>> > > Thanks Eric for the comments - regarding your concerns, I feel the
>> pros
>> > > outweigh the cons. To me, the chances of patch releases on 2.10.x are
>> > much
>> > > higher than a new 2.11 minor release. (There didn't seem to be many
>> > people
>> > > outside of our company who expressed interest in getting new features
>> to
>> > > branch-2 prior to the 2.10.0 release.) Even now, a few weeks after
>> 2.10.0
>> > > release, there's 29 patches that have gone into branch-2 and 9 in
>> > > branch-2.10, so it's already diverged quite a bit.
>> > >
>> > > In any case, we can always reverse this decision if we really need
>> to, by
>> > > recreating branch-2. But this proposal would reduce a lot of confusion
>> > IMO.
>> > >
>> > > Jonathan Hung
>> > >
>> > >
>> > > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <epayne@apache.org
>> >
>> > > wrote:
>> > >
>> > > > Thanks Jonathan for opening the discussion.
>> > > >
>> > > > I am not in favor of this proposal. 2.10 was very recently released,
>> > and
>> > > > moving to 2.10 will take some time for the community. It seems
>> > premature
>> > > to
>> > > > make a decision at this point that there will never be a need for a
>> > 2.11
>> > > > release.
>> > > >
>> > > > -Eric
>> > > >
>> > > >
>> > > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
>> > > > jyhung2357@gmail.com> wrote:
>> > > >
>> > > > Hi folks,
>> > > >
>> > > > Given the release of 2.10.0, and the fact that it's intended to be a
>> > > bridge
>> > > > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
>> minor
>> > > > release line in branch-2. Currently, the main issue is that there's
>> > many
>> > > > fixes going into branch-2 (the theoretical 2.11.0) that's not going
>> > into
>> > > > branch-2.10 (which will become 2.10.1), so the fixes in branch-2
>> will
>> > > > likely never see the light of day unless they are backported to
>> > > > branch-2.10.
>> > > >
>> > > > To do this, I propose we:
>> > > >
>> > > >  - Delete branch-2.10
>> > > >  - Rename branch-2 to branch-2.10
>> > > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>> > > >
>> > > > This way we get all the current branch-2 fixes into the 2.10.x
>> release
>> > > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
>> > > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>> > > >
>> > > > Thoughts?
>> > > >
>> > > > Jonathan Hung
>> > > >
>> > > > [1]
>> > > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>> > > >
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>> > For additional commands, e-mail: common-dev-help@hadoop.apache.org
>> >
>> >
>>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
Hello all,

Is it written anywhere what the difference is between a minor release and a
point/dot/maintenance (I'll use "point" from here on out) release? I have
looked around and I can't find anything other than some compatibility
documentation in 2.x that has since been removed in 3.x [1] [2]. I think
this would help shape my opinion on whether or not to keep branch-2 alive.
My current understanding is that we can't really break compatibility in
either a minor or point release. But the only mention of the difference
between minor and point releases is how to deal with Stable, Evolving, and
Unstable tags, and how to deal with changing default configuration values.
So it seems like there really isn't a big official difference between the
two. In my mind, the functional difference between the two is that the
minor releases may have added features and rewrites, while the point
releases only have bug fixes. This might be an incorrect understanding, but
that's what I have gathered from watching the releases over the last few
years. Whether or not this is a correct understanding, I think that this
needs to be documented somewhere, even if it is just a convention.
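
To make the convention described above concrete, here is a minimal Python
sketch that classifies the step between two release versions as major, minor,
or point. The function name `release_kind` and the rule that only the first
differing version component matters are assumptions made for illustration;
they are not taken from the Hadoop compatibility documentation.

```python
def release_kind(old: str, new: str) -> str:
    """Classify the bump from `old` to `new` (both "major.minor.point").

    Hypothetical helper: "point" here means a bug-fix-only release such as
    2.10.0 -> 2.10.1, "minor" means a release such as 2.10.0 -> 2.11.0.
    """
    old_parts = tuple(int(p) for p in old.split("."))
    new_parts = tuple(int(p) for p in new.split("."))
    if len(old_parts) != 3 or len(new_parts) != 3:
        raise ValueError("expected versions of the form major.minor.point")
    if new_parts <= old_parts:
        raise ValueError("new version must be greater than old version")
    # The first component that differs decides the kind of release.
    if new_parts[0] != old_parts[0]:
        return "major"
    if new_parts[1] != old_parts[1]:
        return "minor"
    return "point"


if __name__ == "__main__":
    for old, new in [("2.10.0", "2.10.1"), ("2.10.0", "2.11.0"), ("2.10.0", "3.0.0")]:
        print(old, "->", new, "is a", release_kind(old, new), "release")
```

Under this reading, 2.10.1 would be a point release of the 2.10 line, while a
hypothetical 2.11.0 would be a new minor release.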

Given my assumed understanding of minor vs point releases, here are the
pros/cons that I can think of for having a branch-2. Please add on or
correct me for anything you feel is missing or inadequate.
Pros:
- Features/rewrites/higher-risk patches are less likely to be put into
2.10.x
- It is less necessary to move to 3.x

Cons:
- Bug fixes are less likely to be put into 2.10.x
- An extra branch to maintain
  - Committers have an extra branch (5 vs 4 total branches) to commit
patches to if they should go all the way back to 2.10.x
- It is less necessary to move to 3.x
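
The "5 vs 4 total branches" count in the list above can be checked with a
small Python sketch. The chain without branch-2 follows the commit chain from
the proposal; the position of branch-2 between branch-3.1 and branch-2.10, and
the helper name `branches_to_commit`, are assumptions for illustration.

```python
# Commit chain from the proposal, newest line first, down to branch-2.10.
CHAIN_WITHOUT_BRANCH_2 = ["trunk", "branch-3.2", "branch-3.1", "branch-2.10"]

# Same chain with branch-2 kept alive ahead of branch-2.10 (assumed ordering).
CHAIN_WITH_BRANCH_2 = ["trunk", "branch-3.2", "branch-3.1", "branch-2", "branch-2.10"]


def branches_to_commit(chain, oldest_target):
    """Return every branch a patch must land on to reach `oldest_target`."""
    if oldest_target not in chain:
        raise ValueError(f"{oldest_target} is not in the chain")
    # A backport walks the chain from the head down to the oldest target.
    return chain[: chain.index(oldest_target) + 1]


if __name__ == "__main__":
    for chain in (CHAIN_WITH_BRANCH_2, CHAIN_WITHOUT_BRANCH_2):
        targets = branches_to_commit(chain, "branch-2.10")
        print(len(targets), "branches:", " -> ".join(targets))
```

A fix that should reach 2.10.x touches five branches while branch-2 exists and
four once it is gone.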

So on the one hand you get added stability in fewer features being
committed to 2.10.x, but then on the other you get fewer bug fixes being
committed. In a perfect world, we wouldn't have to make this tradeoff. But
we don't live in a perfect world and committers will make mistakes, either
because of lack of knowledge or through simple oversight. If we
have a branch-2, committers will forget, not know to, or choose not to (for
whatever reason) commit valid bug fixes back all the way to branch-2.10. If
we don't have a branch-2, committers who want their borderline risky
feature in the 2.x line will err on the side of putting it into branch-2.10
instead of proposing the creation of a branch-2. Clearly I have made quite
a few assumptions here based on my own experiences, so I would like to hear
if others have similar or opposing views.

As far as 3.x goes, to me it seems like some of the reasoning for killing
branch-2 is due to an effort to push the community towards 3.x. This is why
I have added movement to 3.x as both a pro and a con. As a community trying
to move forward, keeping as many companies on similar branches as possible
is a good way to make sure the code is well-tested. However, from a
stability point of view, moving to 3.x is still scary and being able to
stay on 2.x until you are comfortable to move is very nice. The 2.10.0
bridge release effort has been very good at making it possible for people
to move from 2.x to 3.x, but the diff between 2.x and 3.x is so large that
it is reasonable for companies to want to be extra cautious with 3.x due to
potential performance degradation at large scale.

A question I'm pondering is what happens when we move to Java 11 and
someone is still on 2.x? If they want to backport HADOOP-15338
<https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
2.x, surely not everyone is going to want that (at least not immediately).
The 2.10 documentation states, "The JVM requirements will not change across
point releases within the same minor release except if the JVM version
under question becomes unsupported" [1], so this would warrant a 2.11
release until Java 8 becomes unsupported (though one could argue that it is
already unsupported since Oracle is no longer giving public Java 8 updates).
If we don't keep branch-2 around now, would a Java 11 backport be the
catalyst for a branch-2 revival?

Not sure if this really leads to any sort of answer from me on whether or
not we should keep branch-2 alive, but these are the things that I am
weighing in my mind. For me, the bigger problem beyond having branch-2 or
not is committers not being on the same page with where they should commit
their patches.

Eric

[1]
https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
[2]
https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html

On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org> wrote:

> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry that some committers may want to put new features into a branch 2
> release, and without a branch-2, they will go directly into 2.10. Since we
> don't always catch corner cases or performance problems for some time
> (usually not until the release is deployed to a busy, 4-thousand node
> cluster), it may be very difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do have these reservations.
>
> Thanks,
> -Eric
>
>
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
> shv.hadoop@gmail.com> wrote:
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
> the picture right now, and many people may object to this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion about which branches people should commit their
> back-ports to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
> > much higher than a new 2.11 minor release. (There didn't seem to be many
> > people outside of our company who expressed interest in getting new
> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> > after the 2.10.0 release, there are 29 patches that have gone into
> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to, by
> > recreating branch-2. But this proposal would reduce a lot of confusion
> > IMO.
> >
> > Jonathan Hung
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
> > wrote:
> >
> > > Thanks Jonathan for opening the discussion.
> > >
> > > I am not in favor of this proposal. 2.10 was very recently released,
> > > and moving to 2.10 will take some time for the community. It seems
> > > premature to make a decision at this point that there will never be a
> > > need for a 2.11 release.
> > >
> > > -Eric
> > >
> > >
> > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> > > jyhung2357@gmail.com> wrote:
> > >
> > > Hi folks,
> > >
> > > Given the release of 2.10.0, and the fact that it's intended to be a
> > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > > minor release line in branch-2. Currently, the main issue is that
> > > there's many fixes going into branch-2 (the theoretical 2.11.0) that's
> > > not going into branch-2.10 (which will become 2.10.1), so the fixes in
> > > branch-2 will likely never see the light of day unless they are
> > > backported to branch-2.10.
> > >
> > > To do this, I propose we:
> > >
> > >  - Delete branch-2.10
> > >  - Rename branch-2 to branch-2.10
> > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> > >
> > > This way we get all the current branch-2 fixes into the 2.10.x release
> > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> > >
> > > Thoughts?
> > >
> > > Jonathan Hung
> > >
> > > [1]
> > https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> > >
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Eric Badger <eb...@verizonmedia.com.INVALID>.
Hello all,

Is it written anywhere what the difference is between a minor release and a
point/dot/maintenance (I'll use "point" from here on out) release? I have
looked around and I can't find anything other than some compatibility
documentation in 2.x that has since been removed in 3.x [1] [2]. I think
this would help shape my opinion on whether or not to keep branch-2 alive.
My current understanding is that we can't really break compatibility in
either a minor or point release. But the only mention of the difference
between minor and point releases is how to deal with Stable, Evolving, and
Unstable tags, and how to deal with changing default configuration values.
So it seems like there really isn't a big official difference between the
two. In my mind, the functional difference between the two is that the
minor releases may have added features and rewrites, while the point
releases only have bug fixes. This might be an incorrect understanding, but
that's what I have gathered from watching the releases over the last few
years. Whether or not this is a correct understanding, I think that this
needs to be documented somewhere, even if it is just a convention.

Given my assumed understanding of minor vs point releases, here are the
pros/cons that I can think of for having a branch-2. Please add on or
correct me for anything you feel is missing or inadequate.
Pros:
- Features/rewrites/higher-risk patches are less likely to be put into
2.10.x
- It is less necessary to move to 3.x

Cons:
- Bug fixes are less likely to be put into 2.10.x
- An extra branch to maintain
  - Committers have an extra branch (5 vs 4 total branches) to commit
patches to if they should go all the way back to 2.10.x
- It is less necessary to move to 3.x

So on the one hand you get added stability in fewer features being
committed to 2.10.x, but then on the other you get fewer bug fixes being
committed. In a perfect world, we wouldn't have to make this tradeoff. But
we don't live in a perfect world and committers will make mistakes either
because of lack of knowledge or simply because they made a mistake. If we
have a branch-2, committers will forget, not know to, or choose not to (for
whatever reason) commit valid bug fixes back all the way to branch-2.10. If
we don't have a branch-2, committers who want their borderline risky
feature in the 2.x line will err on the side of putting it into branch-2.10
instead of proposing the creation of a branch-2. Clearly I have made quite
a few assumptions here based on my own experiences, so I would like to hear
if others have similar or opposing views.

As far as 3.x goes, to me it seems like some of the reasoning for killing
branch-2 is due to an effort to push the community towards 3.x. This is why
I have added movement to 3.x as both a pro and a con. As a community trying
to move forward, keeping as many companies on similar branches as possible
is a good way to make sure the code is well-tested. However, from a
stability point of view, moving to 3.x is still scary and being able to
stay on 2.x until you are comfortable to move is very nice. The 2.10.0
bridge release effort has been very good at making it possible for people
to move from 2.x in 3.x, but the diff between 2.x and 3.x is so large that
it is reasonable for companies to want to be extra cautious with 3.x due to
potential performance degradation at large scale.

A question I'm pondering is what happens when we move to Java 11 and
someone is still on 2.x? If they want to backport HADOOP-15338
<https://issues.apache.org/jira/browse/HADOOP-15338> for Java 11 support to
2.x, surely not everyone is going to want that (at least not immediately).
The 2.10 documentation states, "The JVM requirements will not change across
point releases within the same minor release except if the JVM version
under question becomes unsupported" [1], so this would warrant a 2.11
release until Java 8 becomes unsupported (though one could argue that it is
already unsupported since Oracle is no longer giving public Java 8 update).
If we don't keep branch-2 around now, would a Java 11 backport be the
catalyst for a branch-2 revival?

Not sure if this really leads to any sort of answer from me on whether or
not we should keep branch-2 alive, but these are the things that I am
weighing in my mind. For me, the bigger problem beyond having branch-2 or
not is committers not being on the same page with where they should commit
their patches.

Eric

[1]
https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-common/Compatibility.html
[2]
https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html

On Tue, Nov 19, 2019 at 2:49 PM epayne@apache.org <ep...@apache.org> wrote:

> Hi Konstantin,
>
> Sure, I understand those concerns. On the other hand, I worry about the
> stability of 2.10, since we will be on it for a couple of years at least.
> I worry
>  that some committers may want to put new features into a branch 2 release,
>  and without a branch-2, they will go directly into 2.10. Since we don't
> always
>  catch corner cases or performance problems for some time (usually not
> until
>  the release is deployed to a busy, 4-thousand node cluster), it may be
> very
>  difficult to back out those changes.
>
> It sounds like I'm in the minority here, so I'm not nixing the idea, but I
> do
>  have these reservations.
>
> Thanks,
> -Eric
>
>
>
> On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <
> shv.hadoop@gmail.com> wrote:
> Hi Eric,
>
> We had a long discussion on this list regarding making the 2.10 release the
> last of branch-2 releases. We intended 2.10 as a bridge release between
> Hadoop 2 and 3. We may have bug-fix releases or 2.10, but 2.11 is not in
> the picture right now, and many people may object this idea.
>
> I understand Jonathan's proposal as an attempt to
> 1. eliminate confusion which branches people should commit their back-ports
> to
> 2. save engineering effort committing to more branches than necessary
>
> "Branches are cheap" as our founder used to say. If we ever decide to
> release 2.11 we can resurrect the branch.
> Until then I am in favor of Jonathan's proposal +1.
>
> Thanks,
> --Konstantin
>
>
> On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com>
> wrote:
>
> > Thanks Eric for the comments - regarding your concerns, I feel the pros
> > outweigh the cons. To me, the chances of patch releases on 2.10.x are
> > much higher than a new 2.11 minor release. (There didn't seem to be many
> > people outside of our company who expressed interest in getting new
> > features to branch-2 prior to the 2.10.0 release.) Even now, a few weeks
> > after the 2.10.0 release, there are 29 patches that have gone into
> > branch-2 and 9 in branch-2.10, so it's already diverged quite a bit.
> >
> > In any case, we can always reverse this decision if we really need to, by
> > recreating branch-2. But this proposal would reduce a lot of confusion
> > IMO.
> >
> > Jonathan Hung
> >
> >
> > On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
> > wrote:
> >
> > > Thanks Jonathan for opening the discussion.
> > >
> > > I am not in favor of this proposal. 2.10 was very recently released,
> > > and moving to 2.10 will take some time for the community. It seems
> > > premature to make a decision at this point that there will never be a
> > > need for a 2.11 release.
> > >
> > > -Eric
> > >
> > >
> > >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> > > jyhung2357@gmail.com> wrote:
> > >
> > > Hi folks,
> > >
> > > Given the release of 2.10.0, and the fact that it's intended to be a
> > > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > > minor release line in branch-2. Currently, the main issue is that there
> > > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> > > going into branch-2.10 (which will become 2.10.1), so the fixes in
> > > branch-2 will likely never see the light of day unless they are
> > > backported to branch-2.10.
> > >
> > > To do this, I propose we:
> > >
> > >  - Delete branch-2.10
> > >  - Rename branch-2 to branch-2.10
> > >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> > >
> > > This way we get all the current branch-2 fixes into the 2.10.x release
> > > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> > >
> > > Thoughts?
> > >
> > > Jonathan Hung
> > >
> > > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> > >
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by "epayne@apache.org" <ep...@apache.org>.
Hi Konstantin,

Sure, I understand those concerns. On the other hand, I worry about the
stability of 2.10, since we will be on it for a couple of years at least. I
worry that some committers may want to put new features into a branch 2
release, and without a branch-2, they will go directly into 2.10. Since we
don't always catch corner cases or performance problems for some time
(usually not until the release is deployed to a busy, 4-thousand node
cluster), it may be very difficult to back out those changes.

It sounds like I'm in the minority here, so I'm not nixing the idea, but I do
have these reservations.

Thanks,
-Eric



On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko <sh...@gmail.com> wrote: 
Hi Eric,

We had a long discussion on this list regarding making the 2.10 release the
last of branch-2 releases. We intended 2.10 as a bridge release between
Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in
the picture right now, and many people may object to this idea.

I understand Jonathan's proposal as an attempt to
1. eliminate confusion about which branches people should commit their
back-ports to
2. save engineering effort committing to more branches than necessary

"Branches are cheap" as our founder used to say. If we ever decide to
release 2.11 we can resurrect the branch.
Until then I am in favor of Jonathan's proposal +1.

Thanks,
--Konstantin


On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
> release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
> wrote:
>
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature
> > to make a decision at this point that there will never be a need for a
> > 2.11 release.
> >
> > -Eric
> >
> >
> >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> > jyhung2357@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a
> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > minor release line in branch-2. Currently, the main issue is that there
> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> > going into branch-2.10 (which will become 2.10.1), so the fixes in
> > branch-2 will likely never see the light of day unless they are
> > backported to branch-2.10.
> >
> > To do this, I propose we:
> >
> >  - Delete branch-2.10
> >  - Rename branch-2 to branch-2.10
> >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
>



Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Konstantin Shvachko <sh...@gmail.com>.
Hi Eric,

We had a long discussion on this list regarding making the 2.10 release the
last of branch-2 releases. We intended 2.10 as a bridge release between
Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in
the picture right now, and many people may object to this idea.

I understand Jonathan's proposal as an attempt to
1. eliminate confusion about which branches people should commit their
back-ports to
2. save engineering effort committing to more branches than necessary

"Branches are cheap" as our founder used to say. If we ever decide to
release 2.11 we can resurrect the branch.
Until then I am in favor of Jonathan's proposal +1.

Thanks,
--Konstantin


On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung <jy...@gmail.com> wrote:

> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
> release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
>
> On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
> wrote:
>
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature
> > to make a decision at this point that there will never be a need for a
> > 2.11 release.
> >
> > -Eric
> >
> >
> >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> > jyhung2357@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a
> > bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> > minor release line in branch-2. Currently, the main issue is that there
> > are many fixes going into branch-2 (the theoretical 2.11.0) that are not
> > going into branch-2.10 (which will become 2.10.1), so the fixes in
> > branch-2 will likely never see the light of day unless they are
> > backported to branch-2.10.
> >
> > To do this, I propose we:
> >
> >   - Delete branch-2.10
> >   - Rename branch-2 to branch-2.10
> >   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Thanks Eric for the comments - regarding your concerns, I feel the pros
outweigh the cons. To me, the chances of patch releases on 2.10.x are much
higher than a new 2.11 minor release. (There didn't seem to be many people
outside of our company who expressed interest in getting new features to
branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
release, there are 29 patches that have gone into branch-2 and 9 in
branch-2.10, so it's already diverged quite a bit.

In any case, we can always reverse this decision if we really need to, by
recreating branch-2. But this proposal would reduce a lot of confusion IMO.

Jonathan Hung


On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
wrote:

> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
>
>  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
>   - Delete branch-2.10
>   - Rename branch-2 to branch-2.10
>   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Wangda Tan <wh...@gmail.com>.
+1, thanks Jonathan for bringing this up!

On Fri, Nov 15, 2019 at 11:41 AM epayne@apache.org <ep...@apache.org>
wrote:

> Thanks Jonathan for opening the discussion.
>
> I am not in favor of this proposal. 2.10 was very recently released, and
> moving to 2.10 will take some time for the community. It seems premature to
> make a decision at this point that there will never be a need for a 2.11
> release.
>
> -Eric
>
>
>  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> jyhung2357@gmail.com> wrote:
>
> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a bridge
> release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> release line in branch-2. Currently, the main issue is that there are many
> fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to
> branch-2.10.
>
> To do this, I propose we:
>
>   - Delete branch-2.10
>   - Rename branch-2 to branch-2.10
>   - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>
>

Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by "epayne@apache.org" <ep...@apache.org>.
Thanks Jonathan for opening the discussion.

I am not in favor of this proposal. 2.10 was very recently released, and moving to 2.10 will take some time for the community. It seems premature to make a decision at this point that there will never be a need for a 2.11 release.

-Eric


 On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <jy...@gmail.com> wrote: 

Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge
release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
release line in branch-2. Currently, the main issue is that there are many
fixes going into branch-2 (the theoretical 2.11.0) that are not going into
branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

  - Delete branch-2.10
  - Rename branch-2 to branch-2.10
  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release
line. Then the commit chain will look like: trunk -> branch-3.2 ->
branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
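
The three proposed steps could be sketched as a dry-run command plan. This is only an illustration in Python, assuming a remote named `origin` and the versions-maven-plugin `versions:set` goal for the version bump. The function name is hypothetical, nothing is executed, and whether the hosting setup permits deleting these branches is not addressed here:

```python
def rename_plan(remote="origin", old="branch-2", new="branch-2.10",
                version="2.10.1-SNAPSHOT"):
    """Return the commands for the three proposed steps without running them."""
    return [
        # 1. Delete the existing release branch on the remote.
        ["git", "push", remote, "--delete", new],
        # 2. "Rename": publish the old branch under the new name,
        #    then drop the old name.
        ["git", "push", remote, f"{remote}/{old}:refs/heads/{new}"],
        ["git", "push", remote, "--delete", old],
        # 3. Set the Maven version on a checkout of the new branch.
        ["mvn", "versions:set", f"-DnewVersion={version}"],
    ]


# Print the plan for review instead of executing it.
for cmd in rename_plan():
    print(" ".join(cmd))
```

Returning the plan as data keeps the sketch side-effect free, so it can be reviewed before anyone runs the commands by hand.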

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org


Re: [DISCUSS] Making 2.10 the last minor 2.x release

Posted by Jonathan Hung <jy...@gmail.com>.
Some additional items we would need:

   - Change all fix-versions in YARN/HDFS/MAPREDUCE/HADOOP from 2.11.0 to
   2.10.1
   - Remove 2.11.0 as a version in these projects
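
The fix-version change can be illustrated on in-memory records. This is a minimal Python sketch, assuming each issue is a plain dict with a `fix_versions` list. The record layout, function name, and issue keys are hypothetical, and no JIRA API is involved:

```python
def remap_fix_versions(issues, old="2.11.0", new="2.10.1"):
    """Return copies of the issue records with `old` replaced by `new`."""
    updated = []
    for issue in issues:
        versions = []
        for v in issue["fix_versions"]:
            v = new if v == old else v
            if v not in versions:  # avoid listing the new version twice
                versions.append(v)
        updated.append({**issue, "fix_versions": versions})
    return updated


# Made-up records for illustration only.
issues = [
    {"key": "YARN-0001", "fix_versions": ["3.3.0", "2.11.0"]},
    {"key": "HDFS-0002", "fix_versions": ["2.10.1", "2.11.0"]},
]
result = remap_fix_versions(issues)
for issue in result:
    print(issue["key"], issue["fix_versions"])
```

Once no record carries 2.11.0 any more, removing that version from the projects loses no information.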


Jonathan Hung


On Thu, Nov 14, 2019 at 6:51 PM Jonathan Hung <jy...@gmail.com> wrote:

> Hi folks,
>
> Given the release of 2.10.0, and the fact that it's intended to be a
> bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last
> minor release line in branch-2. Currently, the main issue is that there are
> many fixes going into branch-2 (the theoretical 2.11.0) that are not going
> into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> likely never see the light of day unless they are backported to branch-2.10.
>
> To do this, I propose we:
>
>    - Delete branch-2.10
>    - Rename branch-2 to branch-2.10
>    - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
>
> This way we get all the current branch-2 fixes into the 2.10.x release
> line. Then the commit chain will look like: trunk -> branch-3.2 ->
> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
>
> Thoughts?
>
> Jonathan Hung
>
> [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
>
