Posted to common-dev@hadoop.apache.org by Andrew Wang <an...@cloudera.com> on 2016/04/22 01:31:43 UTC

Re: Looking to a Hadoop 3 release

Hi folks,

Very optimistically, we're still on track for a 3.0 alpha this month.
Here's a JIRA query for 3.0 and 2.8:

https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20MAPREDUCE%2C%20YARN)%20AND%20%22Target%20Version%2Fs%22%20in%20(3.0.0%2C%202.8.0)%20AND%20statusCategory%20not%20in%20(Complete)%20ORDER%20BY%20priority

I think two of these are true alpha blockers: HADOOP-12892 and
HADOOP-12893. I'm trying to help push both of those forward.

For the rest, I think it's probably okay to delay until the next alpha,
since we're planning a few alphas leading up to beta. That said, if you are
the owner of a Blocker targeted at 3.0.0, I'd encourage reviving those
patches. The earlier the better for incompatible changes.

In all likelihood, this first release will slip into early May, but I'll be
disappointed if we don't have an RC out before ApacheCon.

Best,
Andrew

On Mon, Feb 22, 2016 at 3:19 PM, Colin P. McCabe <cm...@apache.org> wrote:

> I think starting a 3.0 alpha soon would be a great idea.  As some
> other people commented, this would come with no compatibility
> guarantees, so that we can iron out any issues.
>
> Colin
>
> On Mon, Feb 22, 2016 at 1:26 PM, Zhe Zhang <zh...@cloudera.com> wrote:
> > Thanks Andrew for driving the effort!
> >
> > +1 (non-binding) on starting the 3.0 release process now with 3.0 as an
> > alpha.
> >
> > I wanted to echo Andrew's point that backporting EC to branch-2 is a lot
> > of work. Considering that no concrete backporting plan has been proposed,
> > it seems quite uncertain whether / when it can be released in 2.9. I think
> > we should rather concentrate our EC dev efforts to harden key features
> > under the follow-on umbrella HDFS-8031 and make it solid for a 3.0 release.
> >
> > Sincerely,
> > Zhe
> >
> > On Mon, Feb 22, 2016 at 9:25 AM Colin P. McCabe <cm...@apache.org>
> wrote:
> >
> >> +1 for a release of 3.0.  There are a lot of significant,
> >> compatibility-breaking, but necessary changes in this release... we've
> >> touched on some of them in this thread.
> >>
> >> +1 for a parallel release of 2.8 as well.  I think we are pretty close
> >> to this, barring a dozen or so blockers.
> >>
> >> best,
> >> Colin
> >>
> >> On Mon, Feb 22, 2016 at 2:56 AM, Steve Loughran <stevel@hortonworks.com>
> >> wrote:
> >> >
> >> >> On 20 Feb 2016, at 15:34, Junping Du <jd...@hortonworks.com> wrote:
> >> >>
> >> >> Shall we consolidate effort for 2.8.0 and 3.0.0? It doesn't sound
> >> >> reasonable to have two alpha releases going in parallel. Is the EC
> >> >> feature the main motivation for releasing Hadoop 3 here? If so, I don't
> >> >> understand why this feature cannot land on 2.8.x or 2.9.x as an alpha
> >> >> feature.
> >> >
> >> >
> >> >
> >> >> If we release 3.0 in a month, as the plan below proposes, it means we
> >> >> will have 4 active releases going in parallel - two alpha releases (2.8
> >> >> and 3.0) and two stable releases (2.6.x and 2.7.x). That brings a lot of
> >> >> challenges in issue tracking and patch committing, not to mention the
> >> >> tremendous effort of release verification and voting.
> >> >> I would like to propose that we wait for the 2.8 release to become stable
> >> >> (maybe the 2nd release in the 2.8 branch, since the first release is alpha
> >> >> per the discussion in another email thread); then we can move to 3.0 as
> >> >> the only alpha release. In the meantime, we can bring more significant
> >> >> features (like ATS v2, etc.) to trunk and consolidate stable releases in
> >> >> 2.6.x and 2.7.x. I believe that would make life easier. :)
> >> >> Thoughts?
> >> >>
> >> >
> >> > 2.8.0 is relatively close to shipping. I say relatively as I'm doing
> >> > some work with ATS 1.5 downstream and I'd like to make sure all that
> >> > works. There's also a large collection of S3 and swift patches needing
> >> > attention from any reviewers with time and credentials.
> >> >
> >> > 3.x is going to take multiple iterations to stabilise, and with more
> >> > changes, more significant a rollout. I'd also like to do a complete
> >> > update of all the dependencies before a final release, so we can have
> >> > less pressure to upgrade for a while, and get Sean's classloader patch
> >> > in so it's slightly less visible.
> >> >
> >> > That means 3.0 is going to be an alpha release, not final.
> >> >
> >> > One thing that could be shared is any build.xml automation of the
> >> > release process, to at least take away most of the manual steps in the
> >> > process, to have something more repeatable.
> >> >
> >> > -steve
> >> >
> >> >
> >> >> Thanks,
> >> >>
> >> >> Junping
> >> >> ________________________________________
> >> >> From: Yongjun Zhang <yz...@cloudera.com>
> >> >> Sent: Friday, February 19, 2016 8:05 PM
> >> >> To: hdfs-dev@hadoop.apache.org
> >> >> Cc: common-dev@hadoop.apache.org; mapreduce-dev@hadoop.apache.org;
> >> yarn-dev@hadoop.apache.org
> >> >> Subject: Re: Looking to a Hadoop 3 release
> >> >>
> >> >> Thanks Andrew for initiating the effort!
> >> >>
> >> >> +1 on pushing 3.x with extended alpha cycle, and continuing the more
> >> stable
> >> >> 2.x releases.
> >> >>
> >> >> --Yongjun
> >> >>
> >> >> On Thu, Feb 18, 2016 at 5:58 PM, Andrew Wang <
> andrew.wang@cloudera.com>
> >> >> wrote:
> >> >>
> >> >>> Hi Kai,
> >> >>>
> >> >>> Sure, I'm open to it. It's a new major release, so we're allowed to
> >> make
> >> >>> these kinds of big changes. The idea behind the extended alpha
> cycle is
> >> >>> that downstreams can give us feedback. This way if we do anything
> too
> >> >>> radical, we can address it in the next alpha and have downstreams
> >> re-test.
> >> >>>
> >> >>> Best,
> >> >>> Andrew
> >> >>>
> >> >>> On Thu, Feb 18, 2016 at 5:23 PM, Zheng, Kai <ka...@intel.com>
> >> wrote:
> >> >>>
> >> >>>> Thanks Andrew for driving this. I wonder if this is a good chance for
> >> >>>> HADOOP-12579 (Deprecate and remove WriteableRPCEngine) to go in. Note
> >> >>>> it's not an incompatible change, but it feels better to do it in a
> >> >>>> major release.
> >> >>>>
> >> >>>> Regards,
> >> >>>> Kai
> >> >>>>
> >> >>>> -----Original Message-----
> >> >>>> From: Andrew Wang [mailto:andrew.wang@cloudera.com]
> >> >>>> Sent: Friday, February 19, 2016 7:04 AM
> >> >>>> To: hdfs-dev@hadoop.apache.org; Kihwal Lee <ki...@yahoo-inc.com>
> >> >>>> Cc: mapreduce-dev@hadoop.apache.org; common-dev@hadoop.apache.org;
> >> >>>> yarn-dev@hadoop.apache.org
> >> >>>> Subject: Re: Looking to a Hadoop 3 release
> >> >>>>
> >> >>>> Hi Kihwal,
> >> >>>>
> >> >>>> I think there's still value in continuing the 2.x releases. 3.x
> comes
> >> >>> with
> >> >>>> the incompatible bump to a JDK8 runtime, and also the fact that 3.x
> >> won't
> >> >>>> be beta or GA for some number of months. In the meanwhile, it'd be
> >> good
> >> >>> to
> >> >>>> keep putting out regular, stable 2.x releases.
> >> >>>>
> >> >>>> Best,
> >> >>>> Andrew
> >> >>>>
> >> >>>>
> >> >>>> On Thu, Feb 18, 2016 at 2:50 PM, Kihwal Lee
> >> <kihwal@yahoo-inc.com.invalid
> >> >>>>
> >> >>>> wrote:
> >> >>>>
> >> >>>>> Moving Hadoop 3 forward sounds fine. If EC is one of the main
> >> >>>>> motivations, are we getting rid of branch-2.8?
> >> >>>>>
> >> >>>>> Kihwal
> >> >>>>>
> >> >>>>>      From: Andrew Wang <an...@cloudera.com>
> >> >>>>> To: "common-dev@hadoop.apache.org" <co...@hadoop.apache.org>
> >> >>>>> Cc: "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>; "
> >> >>>>> mapreduce-dev@hadoop.apache.org" <mapreduce-dev@hadoop.apache.org
> >;
> >> >>>>> hdfs-dev <hd...@hadoop.apache.org>
> >> >>>>> Sent: Thursday, February 18, 2016 4:35 PM
> >> >>>>> Subject: Re: Looking to a Hadoop 3 release
> >> >>>>>
> >> >>>>> Hi all,
> >> >>>>>
> >> >>>>> Reviving this thread. I've seen renewed interest in a trunk
> release
> >> >>>>> since HDFS erasure coding has not yet made it to branch-2. Along
> with
> >> >>>>> JDK8, the shell script rewrite, and many other improvements, I
> think
> >> >>>>> it's time to revisit Hadoop 3.0 release plans.
> >> >>>>>
> >> >>>>> My overall plan is still the same as in my original email: a
> series
> >> of
> >> >>>>> regular alpha releases leading up to beta and GA. Alpha releases
> make
> >> >>>>> it easier for downstreams to integrate with our code, and making
> them
> >> >>>>> regular means features can be included when they are ready.
> >> >>>>>
> >> >>>>> I know there are some incompatible changes waiting in the wings
> (i.e.
> >> >>>>> HDFS-6984 making FileStatus a PB rather than Writable, some of
> >> >>>>> HADOOP-9991 bumping dependency versions) that would be good to get
> >> in.
> >> >>>>> If you have changes like this, please set the target version to
> 3.0.0
> >> >>>>> and mark them "Incompatible". We can use this JIRA query to track:
> >> >>>>>
> >> >>>>>
> >> >>>>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20and%20%22Target%20Version%2Fs%22%20%3D%20%223.0.0%22%20and%20resolution%3D%22unresolved%22%20and%20%22Hadoop%20Flags%22%3D%22Incompatible%20change%22%20order%20by%20priority
> >> >>>>>
> >> >>>>> There's some release-related stuff that needs to be sorted out
> >> >>>>> (namely, the new CHANGES.txt and release note generation from
> Yetus),
> >> >>>>> but I'd tentatively like to roll the first alpha a month out, so
> >> third
> >> >>>>> week of March.
> >> >>>>>
> >> >>>>> Best,
> >> >>>>> Andrew
> >> >>>>>
> >> >>>>> On Mon, Mar 9, 2015 at 7:23 PM, Raymie Stata <
> rstata@altiscale.com>
> >> >>>> wrote:
> >> >>>>>
> >> >>>>>> Avoiding the use of JDK8 language features (and, presumably,
> APIs)
> >> >>>>>> means you've abandoned #1, i.e., you haven't (really) bumped the
> JDK
> >> >>>>>> source version to JDK8.
> >> >>>>>>
> >> >>>>>> Also, note that releasing from trunk is a way of achieving #3,
> it's
> >> >>>>>> not a way of abandoning it.
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Mon, Mar 9, 2015 at 7:10 PM, Andrew Wang
> >> >>>>>> <an...@cloudera.com>
> >> >>>>>> wrote:
> >> >>>>>>> Hi Raymie,
> >> >>>>>>>
> >> >>>>>>> Konst proposed just releasing off of trunk rather than cutting a
> >> >>>>>> branch-2,
> >> >>>>>>> and there was general agreement there. So, consider #3
> abandoned.
> >> >>>>>>> 1&2
> >> >>>>> can
> >> >>>>>>> be achieved at the same time, we just need to avoid using JDK8
> >> >>>>>>> language features in trunk so things can be backported.
> >> >>>>>>>
> >> >>>>>>> Best,
> >> >>>>>>> Andrew
> >> >>>>>>>
> >> >>>>>>> On Mon, Mar 9, 2015 at 7:01 PM, Raymie Stata
> >> >>>>>>> <rs...@altiscale.com>
> >> >>>>>> wrote:
> >> >>>>>>>
> >> >>>>>>>> In this (and the related threads), I see the following three
> >> >>>>>> requirements:
> >> >>>>>>>>
> >> >>>>>>>> 1. "Bump the source JDK version to JDK8" (ie, drop JDK7
> support).
> >> >>>>>>>>
> >> >>>>>>>> 2. "We'll still be releasing 2.x releases for a while, with
> >> >>>>>>>> similar feature sets as 3.x."
> >> >>>>>>>>
> >> >>>>>>>> 3. Avoid the "risk of split-brain behavior" by "minimize
> >> >>>>>>>> backporting headaches. Pulling trunk > branch-2 > branch-2.x is
> >> >>>> already tedious.
> >> >>>>>>>> Adding a branch-3, branch-3.x would be obnoxious."
> >> >>>>>>>>
> >> >>>>>>>> These three cannot be achieved at the same time.  Which do we
> >> >>>> abandon?
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia
> >> >>>>>>>> <sa...@gmail.com>
> >> >>>>>>>> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> On Mar 5, 2015, at 3:21 PM, Siddharth Seth <sseth@apache.org
> >
> >> >>>>> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>> 2) Simplification of configs - potentially separating client
> >> >>>>>>>>>> side
> >> >>>>>>>> configs
> >> >>>>>>>>>> and those used by daemons. This is another source of
> perpetual
> >> >>>>>> confusion
> >> >>>>>>>>>> for users.
> >> >>>>>>>>> + 1 on this.
> >> >>>>>>>>>
> >> >>>>>>>>> sanjay
> >> >>>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >> >
> >>
>

Re: Looking to a Hadoop 3 release

Posted by Andrew Wang <an...@cloudera.com>.

A heads up that I think we're getting close on the blockers for the first
alpha. Looking at my list, I see two I'd like to get in still: YARN-5270
and HADOOP-13316. Will cut a branch and roll the release once those go in;
my test builds have looked good thus far.

My original plan was to do alphas and then beta in Aug/Sep, but given how
the create-release and L&N changes delayed us by a few months, it also
pushes out the beta timeframe. Given that Nov/Dec is often a quiet period
of development, I think a realistic new beta date is sometime early next
year (Jan/Feb). FYI.

Thanks,
Andrew

On Thu, May 12, 2016 at 5:20 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> I am with Vinod on avoiding merging mostly_complete_branches to trunk since
> we are not shipping any release off it. If 3.x releases going off of trunk
> is going to help with this, I am fine with that approach. We should still
> make sure to keep trunk-incompat small and not include large features.
>
> On Sat, Apr 23, 2016 at 6:53 PM, Chris Douglas <cd...@apache.org>
> wrote:
>
> > If we're not starting branch-3/trunk, what would distinguish it from
> > trunk/trunk-incompat? Is it the same mechanism with different labels?
> >
> > That may be a reasonable strategy when we create branch-3, as a
> > release branch for beta. Releasing 3.x from trunk will help us figure
> > out which incompatibilities can be called out in an upgrade guide
> > (e.g., "new feature X is incompatible with uncommon configuration Y")
> > and which require code changes (e.g., "data loss upgrading a cluster
> > with feature X"). Given how long trunk has been unreleased, we need
> > more data from deployments to triage. How to manage transitions
> > between major versions will always be case-by-case; consensus on how
> > we'll address generic incompatible changes is not saving any work.
> >
> > Once created, removing functionality from branch-3 (leaving it in
> > trunk) _because_ nobody volunteers cycles to address urgent
> > compatibility issues is fair. It's also more workable than asking that
> > features be committed to a branch that we have no plan to release,
> > even as alpha. -C
> >
> > On Fri, Apr 22, 2016 at 6:50 PM, Vinod Kumar Vavilapalli
> > <vi...@apache.org> wrote:
> > > Tx for your replies, Andrew.
> > >
> > >>> For exit criteria, how about we time box it? My plan was to do
> monthly
> > >> alphas through the summer, leading up to beta in late August / early
> > Sep.
> > >> At that point we freeze and stabilize for GA in Nov/Dec.
> > >
> > >
> > > Time-boxing is a reasonable exit-criterion.
> > >
> > >
> > >> In this case, does trunk-incompat essentially become the new trunk? Or
> > are
> > >> we treating trunk-incompat as a feature branch, which periodically
> > merges
> > >> changes from trunk?
> > >
> > >
> > > It’s the latter. Essentially
> > >  - trunk-incompat = trunk + only incompatible changes, periodically
> > >    kept up-to-date with trunk
> > >  - trunk is always ready to ship
> > >  - and no compatible code gets left behind
> > >
> > > The reason for my proposal like this is to address the tension between
> > > “there is a lot of compatible code in trunk that we are not shipping” and
> > > “don’t ship trunk, it has incompatibilities”. With this, we will not have
> > > (compatible) code not getting shipped to users.
> > >
> > > Obviously, we can forget about all of my proposal completely if
> everyone
> > puts in all compatible code into branch-2 / branch-3 or whatever the main
> > releasable branch is. This didn’t work in practice, have seen this not
> > happening prominently during 0.21, and now 3.x.
> > >
> > > There is another related issue - "my feature is nearly ready, so I’ll
> > just merge it into trunk as we don’t release that anyways, but not the
> > current releasable branch - I’m lazy to fix the last few stability
> related
> > issues”. With this, we will (should) get more disciplined, take feature
> > stability on a branch seriously and merge a feature branch only when it
> is
> > truly ready!
> > >
> > >> For 3.x, my strawman was to release off trunk for the alphas, then
> > branch a
> > >> branch-3 for the beta and onwards.
> > >
> > >
> > > Repeating above, I’m proposing continuing to make GA 3.x releases also
> > off of trunk! This way only incompatible changes don’t get shipped to
> users
> > - by design! Eventually, trunk-incompat will be latest 3.x GA + enough
> > incompatible code to warrant a 4.x, 5.x etc.
> > >
> > > +Vinod
> >
>

Re: Looking to a Hadoop 3 release

Posted by Karthik Kambatla <ka...@cloudera.com>.

I am with Vinod on avoiding merging mostly_complete_branches to trunk since
we are not shipping any release off it. If 3.x releases going off of trunk
is going to help with this, I am fine with that approach. We should still
make sure to keep trunk-incompat small and not include large features.

On Sat, Apr 23, 2016 at 6:53 PM, Chris Douglas <cd...@apache.org> wrote:

> If we're not starting branch-3/trunk, what would distinguish it from
> trunk/trunk-incompat? Is it the same mechanism with different labels?
>
> That may be a reasonable strategy when we create branch-3, as a
> release branch for beta. Releasing 3.x from trunk will help us figure
> out which incompatibilities can be called out in an upgrade guide
> (e.g., "new feature X is incompatible with uncommon configuration Y")
> and which require code changes (e.g., "data loss upgrading a cluster
> with feature X"). Given how long trunk has been unreleased, we need
> more data from deployments to triage. How to manage transitions
> between major versions will always be case-by-case; consensus on how
> we'll address generic incompatible changes is not saving any work.
>
> Once created, removing functionality from branch-3 (leaving it in
> trunk) _because_ nobody volunteers cycles to address urgent
> compatibility issues is fair. It's also more workable than asking that
> features be committed to a branch that we have no plan to release,
> even as alpha. -C
>
> On Fri, Apr 22, 2016 at 6:50 PM, Vinod Kumar Vavilapalli
> <vi...@apache.org> wrote:
> > Tx for your replies, Andrew.
> >
> >>> For exit criteria, how about we time box it? My plan was to do monthly
> >> alphas through the summer, leading up to beta in late August / early
> Sep.
> >> At that point we freeze and stabilize for GA in Nov/Dec.
> >
> >
> > Time-boxing is a reasonable exit-criterion.
> >
> >
> >> In this case, does trunk-incompat essentially become the new trunk? Or
> are
> >> we treating trunk-incompat as a feature branch, which periodically
> merges
> >> changes from trunk?
> >
> >
> > It’s the latter. Essentially
> >  - trunk-incompat = trunk + only incompatible changes, periodically kept
> up to date with trunk
> >  - trunk is always ready to ship
> >  - and no compatible code gets left behind
> >
> > The reason for this proposal is to address the tension between
> “there is a lot of compatible code in trunk that we are not shipping” and
> “don’t ship trunk, it has incompatibilities”. With this, no compatible
> code is left unshipped to users.
> >
> > Obviously, we can forget about my proposal completely if everyone
> puts all compatible code into branch-2 / branch-3 or whatever the main
> releasable branch is. This didn’t work in practice; I have seen it not
> happen, prominently during 0.21 and now with 3.x.
> >
> > There is another related issue - “my feature is nearly ready, so I’ll
> just merge it into trunk as we don’t release that anyways, but not the
> current releasable branch - I’m too lazy to fix the last few stability-related
> issues”. With this, we will (should) get more disciplined, take feature
> stability on a branch seriously and merge a feature branch only when it is
> truly ready!
> >
> >> For 3.x, my strawman was to release off trunk for the alphas, then
> branch a
> >> branch-3 for the beta and onwards.
> >
> >
> > To repeat the above: I’m proposing that we continue to make GA 3.x releases
> off of trunk as well! This way, only incompatible changes don’t get shipped to users
> - by design! Eventually, trunk-incompat will be the latest 3.x GA + enough
> incompatible code to warrant a 4.x, 5.x etc.
> >
> > +Vinod
>

Re: Looking to a Hadoop 3 release

Posted by Chris Douglas <cd...@apache.org>.

If we're not starting branch-3/trunk, what would distinguish it from
trunk/trunk-incompat? Is it the same mechanism with different labels?

That may be a reasonable strategy when we create branch-3, as a
release branch for beta. Releasing 3.x from trunk will help us figure
out which incompatibilities can be called out in an upgrade guide
(e.g., "new feature X is incompatible with uncommon configuration Y")
and which require code changes (e.g., "data loss upgrading a cluster
with feature X"). Given how long trunk has been unreleased, we need
more data from deployments to triage. How to manage transitions
between major versions will always be case-by-case; consensus on how
we'll address generic incompatible changes is not saving any work.

Once created, removing functionality from branch-3 (leaving it in
trunk) _because_ nobody volunteers cycles to address urgent
compatibility issues is fair. It's also more workable than asking that
features be committed to a branch that we have no plan to release,
even as alpha. -C

On Fri, Apr 22, 2016 at 6:50 PM, Vinod Kumar Vavilapalli
<vi...@apache.org> wrote:
> Tx for your replies, Andrew.
>
>>> For exit criteria, how about we time box it? My plan was to do monthly
>> alphas through the summer, leading up to beta in late August / early Sep.
>> At that point we freeze and stabilize for GA in Nov/Dec.
>
>
> Time-boxing is a reasonable exit-criterion.
>
>
>> In this case, does trunk-incompat essentially become the new trunk? Or are
>> we treating trunk-incompat as a feature branch, which periodically merges
>> changes from trunk?
>
>
> It’s the latter. Essentially
>  - trunk-incompat = trunk + only incompatible changes, periodically kept up to date with trunk
>  - trunk is always ready to ship
>  - and no compatible code gets left behind
>
> The reason for this proposal is to address the tension between “there is a lot of compatible code in trunk that we are not shipping” and “don’t ship trunk, it has incompatibilities”. With this, no compatible code is left unshipped to users.
>
> Obviously, we can forget about my proposal completely if everyone puts all compatible code into branch-2 / branch-3 or whatever the main releasable branch is. This didn’t work in practice; I have seen it not happen, prominently during 0.21 and now with 3.x.
>
> There is another related issue - “my feature is nearly ready, so I’ll just merge it into trunk as we don’t release that anyways, but not the current releasable branch - I’m too lazy to fix the last few stability-related issues”. With this, we will (should) get more disciplined, take feature stability on a branch seriously and merge a feature branch only when it is truly ready!
>
>> For 3.x, my strawman was to release off trunk for the alphas, then branch a
>> branch-3 for the beta and onwards.
>
>
> To repeat the above: I’m proposing that we continue to make GA 3.x releases off of trunk as well! This way, only incompatible changes don’t get shipped to users - by design! Eventually, trunk-incompat will be the latest 3.x GA + enough incompatible code to warrant a 4.x, 5.x etc.
>
> +Vinod

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Tx for your replies, Andrew.

>> For exit criteria, how about we time box it? My plan was to do monthly
> alphas through the summer, leading up to beta in late August / early Sep.
> At that point we freeze and stabilize for GA in Nov/Dec.


Time-boxing is a reasonable exit-criterion.


> In this case, does trunk-incompat essentially become the new trunk? Or are
> we treating trunk-incompat as a feature branch, which periodically merges
> changes from trunk?


It’s the latter. Essentially
 - trunk-incompat = trunk + only incompatible changes, periodically kept up to date with trunk
 - trunk is always ready to ship
 - and no compatible code gets left behind
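
To make the three points above concrete, here is a toy Python sketch of the flow, assuming every change can be cleanly flagged as compatible or incompatible. The `Change` and `Branches` names are hypothetical and exist only for illustration; real branches would of course be kept in sync with git merges, not lists.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    """One committed change; `incompatible` marks a compatibility break."""
    name: str
    incompatible: bool = False


@dataclass
class Branches:
    """Toy model (hypothetical) of trunk and trunk-incompat."""
    trunk: list = field(default_factory=list)           # always ready to ship
    trunk_incompat: list = field(default_factory=list)  # trunk + incompatible changes

    def commit(self, change):
        # Incompatible changes land only on trunk-incompat;
        # everything else lands on trunk.
        target = self.trunk_incompat if change.incompatible else self.trunk
        target.append(change)

    def sync(self):
        # Periodic merge: bring trunk-incompat up to date with trunk.
        for change in self.trunk:
            if change not in self.trunk_incompat:
                self.trunk_incompat.append(change)

    def releasable(self):
        # Names of the changes a release cut from trunk would ship.
        return [change.name for change in self.trunk]


if __name__ == "__main__":
    b = Branches()
    b.commit(Change("compatible-fix"))
    b.commit(Change("wire-format-break", incompatible=True))
    b.sync()
    print("ships from trunk:", b.releasable())
    print("trunk-incompat:", [c.name for c in b.trunk_incompat])
```

In this sketch a release cut from trunk never contains an incompatible change, while trunk-incompat accumulates them on top of whatever trunk already has.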

The reason for this proposal is to address the tension between “there is a lot of compatible code in trunk that we are not shipping” and “don’t ship trunk, it has incompatibilities”. With this, no compatible code is left unshipped to users.

Obviously, we can forget about my proposal completely if everyone puts all compatible code into branch-2 / branch-3 or whatever the main releasable branch is. This didn’t work in practice; I have seen it not happen, prominently during 0.21 and now with 3.x.

There is another related issue - “my feature is nearly ready, so I’ll just merge it into trunk as we don’t release that anyways, but not the current releasable branch - I’m too lazy to fix the last few stability-related issues”. With this, we will (should) get more disciplined, take feature stability on a branch seriously and merge a feature branch only when it is truly ready!

> For 3.x, my strawman was to release off trunk for the alphas, then branch a
> branch-3 for the beta and onwards.


To repeat the above: I’m proposing that we continue to make GA 3.x releases off of trunk as well! This way, only incompatible changes don’t get shipped to users - by design! Eventually, trunk-incompat will be the latest 3.x GA + enough incompatible code to warrant a 4.x, 5.x etc.

+Vinod

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Tx for your replies, Andrew.

>> For exit criteria, how about we time box it? My plan was to do monthly
> alphas through the summer, leading up to beta in late August / early Sep.
> At that point we freeze and stabilize for GA in Nov/Dec.


Time-boxing is a reasonable exit-criterion.


> In this case, does trunk-incompat essentially become the new trunk? Or are
> we treating trunk-incompat as a feature branch, which periodically merges
> changes from trunk?


It’s the later. Essentially
 - trunk-incompat = trunk + only incompatible changes, periodically kept up-to-date to trunk
 - trunk is always ready to ship
 - and no compatible code gets left behind

The reason for my proposal like this is to address the tension between “there is lot of compatible code in trunk that we are not shipping” and “don’t ship trunk, it has incompatibilities”. With this, we will not have (compatible) code not getting shipped to users.

Obviously, we can forget about all of my proposal completely if everyone puts in all compatible code into branch-2 / branch-3 or whatever the main releasable branch is. This didn’t work in practice, have seen this not happening prominently during 0.21, and now 3.x.

There is another related issue - "my feature is nearly ready, so I’ll just merge it into trunk as we don’t release that anyways, but not the current releasable branch - I’m lazy to fix the last few stability related issues”. With this, we will (should) get more disciplined, take feature stability on a branch seriously and merge a feature branch only when it is truly ready!

> For 3.x, my strawman was to release off trunk for the alphas, then branch a
> branch-3 for the beta and onwards.


Repeating above, I’m proposing continuing to make GA 3.x releases also off of trunk! This way only incompatible changes don’t get shipped to users - by design! Eventually, trunk-incompat will be latest 3.x GA + enough incompatible code to warrant a 4.x, 5.x etc.

+Vinod

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Tx for your replies, Andrew.

>> For exit criteria, how about we time box it? My plan was to do monthly
> alphas through the summer, leading up to beta in late August / early Sep.
> At that point we freeze and stabilize for GA in Nov/Dec.


Time-boxing is a reasonable exit-criterion.


> In this case, does trunk-incompat essentially become the new trunk? Or are
> we treating trunk-incompat as a feature branch, which periodically merges
> changes from trunk?


It’s the later. Essentially
 - trunk-incompat = trunk + only incompatible changes, periodically kept up-to-date to trunk
 - trunk is always ready to ship
 - and no compatible code gets left behind

The reason for my proposal like this is to address the tension between “there is lot of compatible code in trunk that we are not shipping” and “don’t ship trunk, it has incompatibilities”. With this, we will not have (compatible) code not getting shipped to users.

Obviously, we can forget about all of my proposal completely if everyone puts in all compatible code into branch-2 / branch-3 or whatever the main releasable branch is. This didn’t work in practice, have seen this not happening prominently during 0.21, and now 3.x.

There is another related issue - "my feature is nearly ready, so I’ll just merge it into trunk as we don’t release that anyways, but not the current releasable branch - I’m lazy to fix the last few stability related issues”. With this, we will (should) get more disciplined, take feature stability on a branch seriously and merge a feature branch only when it is truly ready!

> For 3.x, my strawman was to release off trunk for the alphas, then branch a
> branch-3 for the beta and onwards.


Repeating above, I’m proposing continuing to make GA 3.x releases also off of trunk! This way only incompatible changes don’t get shipped to users - by design! Eventually, trunk-incompat will be latest 3.x GA + enough incompatible code to warrant a 4.x, 5.x etc.

+Vinod

Re: Looking to a Hadoop 3 release

Posted by Andrew Wang <an...@cloudera.com>.

Great comments Vinod, thanks for replying.

Since trunk is a superset of branch-2.8, I think the two efforts are mostly
aligned. The 2.8 blockers are likely also 3.0 blockers. For example, the
create-release and L&N JIRAs I mentioned are in this camp. The difference
between the two is the expectation as to the level of quality. Once we get
create-release and L&N settled, I think it's ready for an alpha. Yes, this
means we ship with some known issues, but right now there's no 3.0 artifact
for downstreams to compile and test against. Considering that we're
shipping incompatible changes, I want to give downstreams as much
opportunity to give feedback as possible.

> While welcoming the push for alphas, I think we should set some exit
> criteria. Otherwise, I can imagine us doing 3/4/5 alpha releases, and then
> getting restless about calling it beta or GA or whatever. Essentially,
> instead of today’s questions as to "why we aren’t doing a 3.x release",
> we’d be fielding a "why is 3.x still considered alpha" question. This
> happened with 2.x alpha releases too and it wasn’t fun.

For exit criteria, how about we time box it? My plan was to do monthly
alphas through the summer, leading up to beta in late August / early Sep.
At that point we freeze and stabilize for GA in Nov/Dec.

I think we all have an interest in declaring beta/GA; no one wants eternal
alpha releases.

> On an unrelated note, offline I was pitching to a bunch of contributors
> another idea to deal with rotting trunk post 3.x: *Make 3.x releases off of
> trunk directly*.
>
> What this gains us is that
>  - Trunk is always nearly stable or nearly ready for releases
>  - We no longer have some code lying around in some branch (today’s trunk)
> that is not releasable because it gets mixed with other undesirable and
> incompatible changes.
>  - This needs to be coupled with more discipline on individual features -
> medium to large features are always worked upon in branches and get
> merged into trunk (and a nearing release!) when they are ready
>  - All incompatible changes go into some sort of a trunk-incompat branch
> and stay there till we accumulate enough of those to warrant another major
> release.
>

In this case, does trunk-incompat essentially become the new trunk? Or are
we treating trunk-incompat as a feature branch, which periodically merges
changes from trunk?

Linux has a "next" branch, separate from master, for integrating pending
feature branches. I think this is a good model, and would be even better if
we published artifacts to assist with testing. However, that depends on
someone stepping up to be the maintainer of the integration branch.

I really like a more stringent policy around branch merges and new feature
development. That'd be great.

For 3.x, my strawman was to release off trunk for the alphas, then branch a
branch-3 for the beta and onwards.

Best,
Andrew

Re: Looking to a Hadoop 3 release

Posted by Allen Wittenauer <al...@yahoo.com.INVALID>.

> On Apr 22, 2016, at 6:10 PM, Vinod Kumar Vavilapalli <vi...@apache.org> wrote:
> 
> Nope.
> 
> I’m proposing making a new 3.x release (as has been discussed in this thread) off today’s trunk (instead of creating a fresh branch-3) and creating a new trunk-incompat branch where incompatible changes that we don’t want in 3.x go.
> 
> This is mainly to avoid repeating the “we are not releasing 3.x off trunk” issue when we start thinking about 4.x or any such major release in the future.

	The only difference between “we aren’t releasing 4.x off of trunk” and “we aren’t releasing 4.x off of trunk-incompat” is 10 characters.

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Nope.

I’m proposing making a new 3.x release (as has been discussed in this thread) off today’s trunk (instead of creating a fresh branch-3) and creating a new trunk-incompat branch where incompatible changes that we don’t want in 3.x go.

This is mainly to avoid repeating the “we are not releasing 3.x off trunk” issue when we start thinking about 4.x or any such major release in the future.

We’ll do 2.8.x independently and later figure out if 2.9 is needed or not.

+Vinod

> On Apr 22, 2016, at 5:59 PM, Allen Wittenauer <aw...@apache.org> wrote:
> 
> 
>> On Apr 22, 2016, at 5:38 PM, Vinod Kumar Vavilapalli <vi...@apache.org> wrote:
>> 
>> On an unrelated note, offline I was pitching to a bunch of contributors another idea to deal with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.
>> 
>> What this gains us is that
>> - Trunk is always nearly stable or nearly ready for releases
>> - We no longer have some code lying around in some branch (today’s trunk) that is not releasable because it gets mixed with other undesirable and incompatible changes.
>> - This needs to be coupled with more discipline on individual features - medium to large features are always worked upon in branches and get merged into trunk (and a nearing release!) when they are ready
>> - All incompatible changes go into some sort of a trunk-incompat branch and stay there till we accumulate enough of those to warrant another major release.
>> 
>> Thoughts?
> 
> 	Unless I’m missing something, all this proposal does is (using today’s branch names) effectively rename trunk to trunk-incompat and branch-2 to trunk.  I’m unclear how moving “rotting trunk” to “rotting trunk-incompat” is really progress.
> 
>

Re: Looking to a Hadoop 3 release

Posted by Allen Wittenauer <aw...@apache.org>.

> On Apr 22, 2016, at 5:38 PM, Vinod Kumar Vavilapalli <vi...@apache.org> wrote:
> 
> On an unrelated note, offline I was pitching to a bunch of contributors another idea to deal with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.
> 
> What this gains us is that
> - Trunk is always nearly stable or nearly ready for releases
> - We no longer have some code lying around in some branch (today’s trunk) that is not releasable because it gets mixed with other undesirable and incompatible changes.
> - This needs to be coupled with more discipline on individual features - medium to large features are always worked upon in branches and get merged into trunk (and a nearing release!) when they are ready
> - All incompatible changes go into some sort of a trunk-incompat branch and stay there till we accumulate enough of those to warrant another major release.
> 
> Thoughts?

	Unless I’m missing something, all this proposal does is (using today’s branch names) effectively rename trunk to trunk-incompat and branch-2 to trunk.  I’m unclear how moving “rotting trunk” to “rotting trunk-incompat” is really progress.

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Hi,

While welcoming the push for alphas, I think we should set some exit criteria. Otherwise, I can imagine us doing 3/4/5 alpha releases, and then getting restless about calling it beta or GA or whatever. Essentially, instead of today’s questions as to “why we aren’t doing a 3.x release”, we’d be fielding a “why is 3.x still considered alpha” question. This happened with 2.x alpha releases too and it wasn’t fun.

On an unrelated note, offline I was pitching to a bunch of contributors another idea to deal with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.

What this gains us is that
 - Trunk is always nearly stable or nearly ready for releases
 - We no longer have some code lying around in some branch (today’s trunk) that is not releasable because it gets mixed with other undesirable and incompatible changes.
 - This needs to be coupled with more discipline on individual features - medium to large features are always worked upon in branches and get merged into trunk (and a nearing release!) when they are ready
 - All incompatible changes go into some sort of a trunk-incompat branch and stay there till we accumulate enough of those to warrant another major release.

Thoughts?

+Vinod


> On Apr 21, 2016, at 4:31 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Hi folks,
> 
> Very optimistically, we're still on track for a 3.0 alpha this month.
> Here's a JIRA query for 3.0 and 2.8:
> 
> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20MAPREDUCE%2C%20YARN)%20AND%20%22Target%20Version%2Fs%22%20in%20(3.0.0%2C%202.8.0)%20AND%20statusCategory%20not%20in%20(Complete)%20ORDER%20BY%20priority
> 
> I think two of these are true alpha blockers: HADOOP-12892 and
> HADOOP-12893. I'm trying to help push both of those forward.
> 
> For the rest, I think it's probably okay to delay until the next alpha,
> since we're planning a few alphas leading up to beta. That said, if you are
> the owner of a Blocker targeted at 3.0.0, I'd encourage reviving those
> patches. The earlier the better for incompatible changes.
> 
> In all likelihood, this first release will slip into early May, but I'll be
> disappointed if we don't have an RC out before ApacheCon.
> 
> Best,
> Andrew
> 
> On Mon, Feb 22, 2016 at 3:19 PM, Colin P. McCabe <cm...@apache.org> wrote:
> 
>> I think starting a 3.0 alpha soon would be a great idea.  As some
>> other people commented, this would come with no compatibility
>> guarantees, so that we can iron out any issues.
>> 
>> Colin
>> 
>> On Mon, Feb 22, 2016 at 1:26 PM, Zhe Zhang <zh...@cloudera.com> wrote:
>>> Thanks Andrew for driving the effort!
>>> 
>>> +1 (non-binding) on starting the 3.0 release process now with 3.0 as an
>>> alpha.
>>> 
>>> I wanted to echo Andrew's point that backporting EC to branch-2 is a lot
>>> of work. Considering that no concrete backporting plan has been proposed,
>>> it seems quite uncertain whether / when it can be released in 2.9. I think
>>> we should rather concentrate our EC dev efforts to harden key features
>>> under the follow-on umbrella HDFS-8031 and make it solid for a 3.0 release.
>>> 
>>> Sincerely,
>>> Zhe
>>> 
>>> On Mon, Feb 22, 2016 at 9:25 AM Colin P. McCabe <cm...@apache.org> wrote:
>>> 
>>>> +1 for a release of 3.0.  There are a lot of significant,
>>>> compatibility-breaking, but necessary changes in this release... we've
>>>> touched on some of them in this thread.
>>>> 
>>>> +1 for a parallel release of 2.8 as well.  I think we are pretty close
>>>> to this, barring a dozen or so blockers.
>>>> 
>>>> best,
>>>> Colin
>>>> 
>>>> On Mon, Feb 22, 2016 at 2:56 AM, Steve Loughran <stevel@hortonworks.com> wrote:
>>>>> 
>>>>>> On 20 Feb 2016, at 15:34, Junping Du <jd...@hortonworks.com> wrote:
>>>>>> 
>>>>>> Shall we consolidate effort for 2.8.0 and 3.0.0? It doesn't sound
>>>>>> reasonable to have two alpha releases going in parallel. Is the EC feature the
>>>>>> main motivation for releasing Hadoop 3 here? If so, I don't understand why
>>>>>> this feature cannot land on 2.8.x or 2.9.x as an alpha feature.
>>>>> 
>>>>> 
>>>>> 
>>>>>> If we release 3.0 in a month, as the plan below proposes, it means we will
>>>>>> have 4 active releases going in parallel - two alpha releases (2.8 and 3.0)
>>>>>> and two stable releases (2.6.x and 2.7.x). That brings a lot of challenges in
>>>>>> issue tracking and patch committing, not to mention the tremendous
>>>>>> effort of release verification and voting.
>>>>>> I would like to propose waiting until the 2.8 release becomes stable (maybe the 2nd
>>>>>> release in the 2.8 branch, because the first release is alpha per the discussion in
>>>>>> another email thread); then we can move to 3.0 as the only alpha release.
>>>>>> In the meantime, we can bring more significant features (like ATS v2, etc.)
>>>>>> to trunk and consolidate stable releases in 2.6.x and 2.7.x. I believe that
>>>>>> would make life easier. :)
>>>>>> Thoughts?
>>>>>> 
>>>>> 
>>>>> 2.8.0 is relatively close to shipping. I say relatively as I'm doing
>>>>> some work with ATS 1.5 downstream and I'd like to make sure all that works.
>>>>> There's also a large collection of S3 and swift patches needing attention
>>>>> from any reviewers with time and credentials.
>>>>> 
>>>>> 3.x is going to take multiple iterations to stabilise, and with more
>>>>> changes, more significant a rollout. I'd also like to do a complete update
>>>>> of all the dependencies before a final release, so we can have less
>>>>> pressure to upgrade for a while, and get Sean's classloader patch in so
>>>>> it's slightly less visible.
>>>>> 
>>>>> That means 3.0 is going to be an alpha release, not final.
>>>>> 
>>>>> one thing that could be shared is any build.xml automation of the
>>>>> release process, to at least take away most of the manual steps in the
>>>>> process, to have something more repeatable.
>>>>> 
>>>>> -steve
>>>>> 
>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Junping
>>>>>> ________________________________________
>>>>>> From: Yongjun Zhang <yz...@cloudera.com>
>>>>>> Sent: Friday, February 19, 2016 8:05 PM
>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>> Cc: common-dev@hadoop.apache.org; mapreduce-dev@hadoop.apache.org;
>>>> yarn-dev@hadoop.apache.org
>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>> 
>>>>>> Thanks Andrew for initiating the effort!
>>>>>> 
>>>>>> +1 on pushing 3.x with extended alpha cycle, and continuing the more
>>>> stable
>>>>>> 2.x releases.
>>>>>> 
>>>>>> --Yongjun
>>>>>> 
>>>>>> On Thu, Feb 18, 2016 at 5:58 PM, Andrew Wang <
>> andrew.wang@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi Kai,
>>>>>>> 
>>>>>>> Sure, I'm open to it. It's a new major release, so we're allowed to
>>>> make
>>>>>>> these kinds of big changes. The idea behind the extended alpha
>> cycle is
>>>>>>> that downstreams can give us feedback. This way if we do anything
>> too
>>>>>>> radical, we can address it in the next alpha and have downstreams
>>>> re-test.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Andrew
>>>>>>> 
>>>>>>> On Thu, Feb 18, 2016 at 5:23 PM, Zheng, Kai <ka...@intel.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> Thanks Andrew for driving this. Wonder if it's a good chance for
>>>>>>>> HADOOP-12579 (Deprecate and remove WriteableRPCEngine) to be in.
>> Note
>>>>>>> it's
>>>>>>>> not an incompatible change, but feel better to be done in the major
>>>>>>> release.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Kai
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Andrew Wang [mailto:andrew.wang@cloudera.com]
>>>>>>>> Sent: Friday, February 19, 2016 7:04 AM
>>>>>>>> To: hdfs-dev@hadoop.apache.org; Kihwal Lee <ki...@yahoo-inc.com>
>>>>>>>> Cc: mapreduce-dev@hadoop.apache.org; common-dev@hadoop.apache.org;
>>>>>>>> yarn-dev@hadoop.apache.org
>>>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>>>> 
>>>>>>>> Hi Kihwal,
>>>>>>>> 
>>>>>>>> I think there's still value in continuing the 2.x releases. 3.x
>> comes
>>>>>>> with
>>>>>>>> the incompatible bump to a JDK8 runtime, and also the fact that 3.x
>>>> won't
>>>>>>>> be beta or GA for some number of months. In the meanwhile, it'd be
>>>> good
>>>>>>> to
>>>>>>>> keep putting out regular, stable 2.x releases.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Andrew
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Feb 18, 2016 at 2:50 PM, Kihwal Lee
>>>> <kihwal@yahoo-inc.com.invalid
>>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Moving Hadoop 3 forward sounds fine. If EC is one of the main
>>>>>>>>> motivations, are we getting rid of branch-2.8?
>>>>>>>>> 
>>>>>>>>> Kihwal
>>>>>>>>> 
>>>>>>>>>     From: Andrew Wang <an...@cloudera.com>
>>>>>>>>> To: "common-dev@hadoop.apache.org" <co...@hadoop.apache.org>
>>>>>>>>> Cc: "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>; "
>>>>>>>>> mapreduce-dev@hadoop.apache.org" <mapreduce-dev@hadoop.apache.org
>>> ;
>>>>>>>>> hdfs-dev <hd...@hadoop.apache.org>
>>>>>>>>> Sent: Thursday, February 18, 2016 4:35 PM
>>>>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Reviving this thread. I've seen renewed interest in a trunk
>> release
>>>>>>>>> since HDFS erasure coding has not yet made it to branch-2. Along
>> with
>>>>>>>>> JDK8, the shell script rewrite, and many other improvements, I
>> think
>>>>>>>>> it's time to revisit Hadoop 3.0 release plans.
>>>>>>>>> 
>>>>>>>>> My overall plan is still the same as in my original email: a
>> series
>>>> of
>>>>>>>>> regular alpha releases leading up to beta and GA. Alpha releases
>> make
>>>>>>>>> it easier for downstreams to integrate with our code, and making
>> them
>>>>>>>>> regular means features can be included when they are ready.
>>>>>>>>> 
>>>>>>>>> I know there are some incompatible changes waiting in the wings
>> (i.e.
>>>>>>>>> HDFS-6984 making FileStatus a PB rather than Writable, some of
>>>>>>>>> HADOOP-9991 bumping dependency versions) that would be good to get
>>>> in.
>>>>>>>>> If you have changes like this, please set the target version to
>> 3.0.0
>>>>>>>>> and mark them "Incompatible". We can use this JIRA query to track:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20and%20%22Target%20Version%2Fs%22%20%3D%20%223.0.0%22%20and%20resolution%3D%22unresolved%22%20and%20%22Hadoop%20Flags%22%3D%22Incompatible%20change%22%20order%20by%20priority
>>>>>>>>> 
>>>>>>>>> There's some release-related stuff that needs to be sorted out
>>>>>>>>> (namely, the new CHANGES.txt and release note generation from
>> Yetus),
>>>>>>>>> but I'd tentatively like to roll the first alpha a month out, so
>>>> third
>>>>>>>>> week of March.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>>> On Mon, Mar 9, 2015 at 7:23 PM, Raymie Stata <
>> rstata@altiscale.com>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Avoiding the use of JDK8 language features (and, presumably,
>> APIs)
>>>>>>>>>> means you've abandoned #1, i.e., you haven't (really) bumped the
>> JDK
>>>>>>>>>> source version to JDK8.
>>>>>>>>>> 
>>>>>>>>>> Also, note that releasing from trunk is a way of achieving #3,
>> it's
>>>>>>>>>> not a way of abandoning it.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Mar 9, 2015 at 7:10 PM, Andrew Wang
>>>>>>>>>> <an...@cloudera.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Raymie,
>>>>>>>>>>> 
>>>>>>>>>>> Konst proposed just releasing off of trunk rather than cutting a
>>>>>>>>>> branch-2,
>>>>>>>>>>> and there was general agreement there. So, consider #3
>> abandoned.
>>>>>>>>>>> 1&2
>>>>>>>>> can
>>>>>>>>>>> be achieved at the same time, we just need to avoid using JDK8
>>>>>>>>>>> language features in trunk so things can be backported.
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Andrew
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Mar 9, 2015 at 7:01 PM, Raymie Stata
>>>>>>>>>>> <rs...@altiscale.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> In this (and the related threads), I see the following three
>>>>>>>>>> requirements:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. "Bump the source JDK version to JDK8" (ie, drop JDK7
>> support).
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. "We'll still be releasing 2.x releases for a while, with
>>>>>>>>>>>> similar feature sets as 3.x."
>>>>>>>>>>>> 
>>>>>>>>>>>> 3. Avoid the "risk of split-brain behavior" by "minimize
>>>>>>>>>>>> backporting headaches. Pulling trunk > branch-2 > branch-2.x is
>>>>>>>> already tedious.
>>>>>>>>>>>> Adding a branch-3, branch-3.x would be obnoxious."
>>>>>>>>>>>> 
>>>>>>>>>>>> These three cannot be achieved at the same time.  Which do we
>>>>>>>> abandon?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia
>>>>>>>>>>>> <sa...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mar 5, 2015, at 3:21 PM, Siddharth Seth <sseth@apache.org
>>> 
>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2) Simplification of configs - potentially separating client
>>>>>>>>>>>>>> side
>>>>>>>>>>>> configs
>>>>>>>>>>>>>> and those used by daemons. This is another source of
>> perpetual
>>>>>>>>>> confusion
>>>>>>>>>>>>>> for users.
>>>>>>>>>>>>> + 1 on this.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> sanjay
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Hi,

While welcoming the push for alphas, I think we should set some exit criteria. Otherwise, I can imagine us doing 3/4/5 alpha releases, and then getting restless about calling it beta or GA or whatever. Essentially, instead of today’s question of "why aren’t we doing a 3.x release", we’d be fielding a "why is 3.x still considered alpha" question. This happened with 2.x alpha releases too and it wasn’t fun.

On an unrelated note, offline I was pitching to a bunch of contributors another idea to deal with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.

What this gains us is that
 - Trunk is always nearly stable or nearly ready for releases
 - We no longer have some code lying around in some branch (today’s trunk) that is not releasable because it gets mixed with other undesirable and incompatible changes.
 - This needs to be coupled with more discipline on individual features - medium to large features are always worked on in branches and get merged into trunk (and an upcoming release!) when they are ready
 - All incompatible changes go into some sort of a trunk-incompat branch and stay there till we accumulate enough of those to warrant another major release.

Thoughts?

+Vinod


> On Apr 21, 2016, at 4:31 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Hi folks,
> 
> Very optimistically, we're still on track for a 3.0 alpha this month.
> Here's a JIRA query for 3.0 and 2.8:
> 
> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20MAPREDUCE%2C%20YARN)%20AND%20%22Target%20Version%2Fs%22%20in%20(3.0.0%2C%202.8.0)%20AND%20statusCategory%20not%20in%20(Complete)%20ORDER%20BY%20priority
> 
> I think two of these are true alpha blockers: HADOOP-12892 and
> HADOOP-12893. I'm trying to help push both of those forward.
> 
> For the rest, I think it's probably okay to delay until the next alpha,
> since we're planning a few alphas leading up to beta. That said, if you are
> the owner of a Blocker targeted at 3.0.0, I'd encourage reviving those
> patches. The earlier the better for incompatible changes.
> 
> In all likelihood, this first release will slip into early May, but I'll be
> disappointed if we don't have an RC out before ApacheCon.
> 
> Best,
> Andrew
> 
> On Mon, Feb 22, 2016 at 3:19 PM, Colin P. McCabe <cm...@apache.org> wrote:
> 
>> I think starting a 3.0 alpha soon would be a great idea.  As some
>> other people commented, this would come with no compatibility
>> guarantees, so that we can iron out any issues.
>> 
>> Colin
>> 
>> On Mon, Feb 22, 2016 at 1:26 PM, Zhe Zhang <zh...@cloudera.com> wrote:
>>> Thanks Andrew for driving the effort!
>>> 
>>> +1 (non-binding) on starting the 3.0 release process now with 3.0 as an
>>> alpha.
>>> 
>>> I wanted to echo Andrew's point that backporting EC to branch-2 is a lot of
>>> work. Considering that no concrete backporting plan has been proposed, it
>>> seems quite uncertain whether / when it can be released in 2.9. I think we
>>> should rather concentrate our EC dev efforts to harden key features under
>>> the follow-on umbrella HDFS-8031 and make it solid for a 3.0 release.
>>> 
>>> Sincerely,
>>> Zhe
>>> 
>>> On Mon, Feb 22, 2016 at 9:25 AM Colin P. McCabe <cm...@apache.org> wrote:
>>> 
>>>> +1 for a release of 3.0.  There are a lot of significant,
>>>> compatibility-breaking, but necessary changes in this release... we've
>>>> touched on some of them in this thread.
>>>> 
>>>> +1 for a parallel release of 2.8 as well.  I think we are pretty close
>>>> to this, barring a dozen or so blockers.
>>>> 
>>>> best,
>>>> Colin
>>>> 
>>>> On Mon, Feb 22, 2016 at 2:56 AM, Steve Loughran <stevel@hortonworks.com> wrote:
>>>>> 
>>>>>> On 20 Feb 2016, at 15:34, Junping Du <jd...@hortonworks.com> wrote:
>>>>>> 
>>>>>> Shall we consolidate effort for 2.8.0 and 3.0.0? It doesn't sound
>>>>>> reasonable to have two alpha releases going in parallel. Is the EC feature the
>>>>>> main motivation for releasing Hadoop 3 here? If so, I don't understand why
>>>>>> this feature cannot land on 2.8.x or 2.9.x as an alpha feature.
>>>>> 
>>>>> 
>>>>> 
>>>>>> If we release 3.0 in a month, as the plan below proposes, it means we will
>>>>>> have 4 active releases going in parallel - two alpha releases (2.8 and 3.0)
>>>>>> and two stable releases (2.6.x and 2.7.x). That brings a lot of challenges in
>>>>>> issue tracking and patch committing, not to mention the tremendous
>>>>>> effort of release verification and voting.
>>>>>> I would like to propose waiting until the 2.8 release becomes stable (maybe the 2nd
>>>>>> release in the 2.8 branch, because the first release is alpha per the discussion in
>>>>>> another email thread); then we can move to 3.0 as the only alpha release.
>>>>>> In the meantime, we can bring more significant features (like ATS v2, etc.)
>>>>>> to trunk and consolidate stable releases in 2.6.x and 2.7.x. I believe that
>>>>>> would make life easier. :)
>>>>>> Thoughts?
>>>>>> 
>>>>> 
>>>>> 2.8.0 is relatively close to shipping. I say relatively as I'm doing
>>>>> some work with ATS 1.5 downstream and I'd like to make sure all that works.
>>>>> There's also a large collection of S3 and swift patches needing attention
>>>>> from any reviewers with time and credentials.
>>>>> 
>>>>> 3.x is going to take multiple iterations to stabilise, and with more
>>>>> changes, more significant a rollout. I'd also like to do a complete update
>>>>> of all the dependencies before a final release, so we can have less
>>>>> pressure to upgrade for a while, and get Sean's classloader patch in so
>>>>> it's slightly less visible.
>>>>> 
>>>>> That means 3.0 is going to be an alpha release, not final.
>>>>> 
>>>>> one thing that could be shared is any build.xml automation of the
>>>>> release process, to at least take away most of the manual steps in the
>>>>> process, to have something more repeatable.
>>>>> 
>>>>> -steve
>>>>> 
>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Junping
>>>>>> ________________________________________
>>>>>> From: Yongjun Zhang <yz...@cloudera.com>
>>>>>> Sent: Friday, February 19, 2016 8:05 PM
>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>> Cc: common-dev@hadoop.apache.org; mapreduce-dev@hadoop.apache.org;
>>>> yarn-dev@hadoop.apache.org
>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>> 
>>>>>> Thanks Andrew for initiating the effort!
>>>>>> 
>>>>>> +1 on pushing 3.x with extended alpha cycle, and continuing the more
>>>> stable
>>>>>> 2.x releases.
>>>>>> 
>>>>>> --Yongjun
>>>>>> 
>>>>>> On Thu, Feb 18, 2016 at 5:58 PM, Andrew Wang <
>> andrew.wang@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi Kai,
>>>>>>> 
>>>>>>> Sure, I'm open to it. It's a new major release, so we're allowed to
>>>> make
>>>>>>> these kinds of big changes. The idea behind the extended alpha
>> cycle is
>>>>>>> that downstreams can give us feedback. This way if we do anything
>> too
>>>>>>> radical, we can address it in the next alpha and have downstreams
>>>> re-test.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Andrew
>>>>>>> 
>>>>>>> On Thu, Feb 18, 2016 at 5:23 PM, Zheng, Kai <ka...@intel.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> Thanks Andrew for driving this. Wonder if it's a good chance for
>>>>>>>> HADOOP-12579 (Deprecate and remove WriteableRPCEngine) to be in.
>> Note
>>>>>>> it's
>>>>>>>> not an incompatible change, but feel better to be done in the major
>>>>>>> release.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Kai
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Andrew Wang [mailto:andrew.wang@cloudera.com]
>>>>>>>> Sent: Friday, February 19, 2016 7:04 AM
>>>>>>>> To: hdfs-dev@hadoop.apache.org; Kihwal Lee <ki...@yahoo-inc.com>
>>>>>>>> Cc: mapreduce-dev@hadoop.apache.org; common-dev@hadoop.apache.org;
>>>>>>>> yarn-dev@hadoop.apache.org
>>>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>>>> 
>>>>>>>> Hi Kihwal,
>>>>>>>> 
>>>>>>>> I think there's still value in continuing the 2.x releases. 3.x
>> comes
>>>>>>> with
>>>>>>>> the incompatible bump to a JDK8 runtime, and also the fact that 3.x
>>>> won't
>>>>>>>> be beta or GA for some number of months. In the meanwhile, it'd be
>>>> good
>>>>>>> to
>>>>>>>> keep putting out regular, stable 2.x releases.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Andrew
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Feb 18, 2016 at 2:50 PM, Kihwal Lee <kihwal@yahoo-inc.com.invalid>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Moving Hadoop 3 forward sounds fine. If EC is one of the main
>>>>>>>>> motivations, are we getting rid of branch-2.8?
>>>>>>>>> 
>>>>>>>>> Kihwal
>>>>>>>>> 
>>>>>>>>>     From: Andrew Wang <an...@cloudera.com>
>>>>>>>>> To: "common-dev@hadoop.apache.org" <co...@hadoop.apache.org>
>>>>>>>>> Cc: "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>; "
>>>>>>>>> mapreduce-dev@hadoop.apache.org" <mapreduce-dev@hadoop.apache.org
>>> ;
>>>>>>>>> hdfs-dev <hd...@hadoop.apache.org>
>>>>>>>>> Sent: Thursday, February 18, 2016 4:35 PM
>>>>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Reviving this thread. I've seen renewed interest in a trunk release
>>>>>>>>> since HDFS erasure coding has not yet made it to branch-2. Along with
>>>>>>>>> JDK8, the shell script rewrite, and many other improvements, I think
>>>>>>>>> it's time to revisit Hadoop 3.0 release plans.
>>>>>>>>> 
>>>>>>>>> My overall plan is still the same as in my original email: a series of
>>>>>>>>> regular alpha releases leading up to beta and GA. Alpha releases make
>>>>>>>>> it easier for downstreams to integrate with our code, and making them
>>>>>>>>> regular means features can be included when they are ready.
>>>>>>>>> 
>>>>>>>>> I know there are some incompatible changes waiting in the wings (e.g.
>>>>>>>>> HDFS-6984 making FileStatus a PB rather than Writable, some of
>>>>>>>>> HADOOP-9991 bumping dependency versions) that would be good to get in.
>>>>>>>>> If you have changes like this, please set the target version to 3.0.0
>>>>>>>>> and mark them "Incompatible". We can use this JIRA query to track:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20and%20%22Target%20Version%2Fs%22%20%3D%20%223.0.0%22%20and%20resolution%3D%22unresolved%22%20and%20%22Hadoop%20Flags%22%3D%22Incompatible%20change%22%20order%20by%20priority
>>>>>>>>> 
>>>>>>>>> There's some release-related stuff that needs to be sorted out
>>>>>>>>> (namely, the new CHANGES.txt and release note generation from Yetus),
>>>>>>>>> but I'd tentatively like to roll the first alpha a month out, so third
>>>>>>>>> week of March.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>>> On Mon, Mar 9, 2015 at 7:23 PM, Raymie Stata <rstata@altiscale.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Avoiding the use of JDK8 language features (and, presumably, APIs)
>>>>>>>>>> means you've abandoned #1, i.e., you haven't (really) bumped the JDK
>>>>>>>>>> source version to JDK8.
>>>>>>>>>> 
>>>>>>>>>> Also, note that releasing from trunk is a way of achieving #3, it's
>>>>>>>>>> not a way of abandoning it.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Mar 9, 2015 at 7:10 PM, Andrew Wang
>>>>>>>>>> <an...@cloudera.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Raymie,
>>>>>>>>>>> 
>>>>>>>>>>> Konst proposed just releasing off of trunk rather than cutting a
>>>>>>>>>>> branch-2, and there was general agreement there. So, consider #3
>>>>>>>>>>> abandoned. 1&2 can be achieved at the same time, we just need to avoid
>>>>>>>>>>> using JDK8 language features in trunk so things can be backported.
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Andrew
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Mar 9, 2015 at 7:01 PM, Raymie Stata
>>>>>>>>>>> <rs...@altiscale.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> In this (and the related threads), I see the following three
>>>>>>>>>>>> requirements:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. "Bump the source JDK version to JDK8" (ie, drop JDK7 support).
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. "We'll still be releasing 2.x releases for a while, with
>>>>>>>>>>>> similar feature sets as 3.x."
>>>>>>>>>>>> 
>>>>>>>>>>>> 3. Avoid the "risk of split-brain behavior" by "minimize
>>>>>>>>>>>> backporting headaches. Pulling trunk > branch-2 > branch-2.x is
>>>>>>>>>>>> already tedious. Adding a branch-3, branch-3.x would be obnoxious."
>>>>>>>>>>>> 
>>>>>>>>>>>> These three cannot be achieved at the same time.  Which do we abandon?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia
>>>>>>>>>>>> <sa...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mar 5, 2015, at 3:21 PM, Siddharth Seth <sseth@apache.org> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2) Simplification of configs - potentially separating client
>>>>>>>>>>>>>> side configs and those used by daemons. This is another source of
>>>>>>>>>>>>>> perpetual confusion for users.
>>>>>>>>>>>>> + 1 on this.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> sanjay
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>

Re: Looking to a Hadoop 3 release

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.

Hi,

While welcoming the push for alphas, I think we should set some exit criteria. Otherwise, I can imagine us doing 3/4/5 alpha releases, and then getting restless about calling it beta or GA or whatever. Essentially, instead of today’s questions as to "why we aren’t doing a 3.x release", we’d be fielding a "why is 3.x still considered alpha" question. This happened with 2.x alpha releases too and it wasn’t fun.

On an unrelated note, offline I was pitching to a bunch of contributors another idea to deal with rotting trunk post 3.x: *Make 3.x releases off of trunk directly*.

What this gains us is that
 - Trunk is always nearly stable or nearly ready for releases
 - We no longer have some code lying around in some branch (today’s trunk) that is not releasable because it gets mixed with other undesirable and incompatible changes.
 - This needs to be coupled with more discipline on individual features - medium to large features are always worked upon in branches and get merged into trunk (and a nearing release!) when they are ready
 - All incompatible changes go into some sort of a trunk-incompat branch and stay there till we accumulate enough of those to warrant another major release.

Thoughts?

+Vinod


> On Apr 21, 2016, at 4:31 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Hi folks,
> 
> Very optimistically, we're still on track for a 3.0 alpha this month.
> Here's a JIRA query for 3.0 and 2.8:
> 
> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20MAPREDUCE%2C%20YARN)%20AND%20%22Target%20Version%2Fs%22%20in%20(3.0.0%2C%202.8.0)%20AND%20statusCategory%20not%20in%20(Complete)%20ORDER%20BY%20priority
> 
> I think two of these are true alpha blockers: HADOOP-12892 and
> HADOOP-12893. I'm trying to help push both of those forward.
> 
> For the rest, I think it's probably okay to delay until the next alpha,
> since we're planning a few alphas leading up to beta. That said, if you are
> the owner of a Blocker targeted at 3.0.0, I'd encourage reviving those
> patches. The earlier the better for incompatible changes.
> 
> In all likelihood, this first release will slip into early May, but I'll be
> disappointed if we don't have an RC out before ApacheCon.
> 
> Best,
> Andrew
> 
> On Mon, Feb 22, 2016 at 3:19 PM, Colin P. McCabe <cm...@apache.org> wrote:
> 
>> I think starting a 3.0 alpha soon would be a great idea.  As some
>> other people commented, this would come with no compatibility
>> guarantees, so that we can iron out any issues.
>> 
>> Colin
>> 
>> On Mon, Feb 22, 2016 at 1:26 PM, Zhe Zhang <zh...@cloudera.com> wrote:
>>> Thanks Andrew for driving the effort!
>>> 
>>> +1 (non-binding) on starting the 3.0 release process now with 3.0 as an
>>> alpha.
>>> 
>>> I wanted to echo Andrew's point that backporting EC to branch-2 is a lot of
>>> work. Considering that no concrete backporting plan has been proposed, it
>>> seems quite uncertain whether / when it can be released in 2.9. I think we
>>> should rather concentrate our EC dev efforts to harden key features under
>>> the follow-on umbrella HDFS-8031 and make it solid for a 3.0 release.
>>> 
>>> Sincerely,
>>> Zhe
>>> 
>>> On Mon, Feb 22, 2016 at 9:25 AM Colin P. McCabe <cm...@apache.org> wrote:
>>> 
>>>> +1 for a release of 3.0.  There are a lot of significant,
>>>> compatibility-breaking, but necessary changes in this release... we've
>>>> touched on some of them in this thread.
>>>> 
>>>> +1 for a parallel release of 2.8 as well.  I think we are pretty close
>>>> to this, barring a dozen or so blockers.
>>>> 
>>>> best,
>>>> Colin
>>>> 
>>>> On Mon, Feb 22, 2016 at 2:56 AM, Steve Loughran <stevel@hortonworks.com>
>>>> wrote:
>>>>> 
>>>>>> On 20 Feb 2016, at 15:34, Junping Du <jd...@hortonworks.com> wrote:
>>>>>> 
>>>>>> Shall we consolidate effort for 2.8.0 and 3.0.0? It doesn't sounds
>>>>>> reasonable to have two alpha releases to go in parallel. Is EC feature
>>>>>> the main motivation of releasing hadoop 3 here? If so, I don't
>>>>>> understand why this feature cannot land on 2.8.x or 2.9.x as an alpha
>>>>>> feature.
>>>>> 
>>>>> 
>>>>> 
>>>>>> If we release 3.0 in a month like plan proposed below, it means we will
>>>>>> have 4 active releases going in parallel - two alpha releases (2.8 and
>>>>>> 3.0) and two stable releases (2.6.x and 2.7.x). It brings a lot of
>>>>>> challenges in issues tracking and patch committing, not even mention
>>>>>> the tremendous effort of release verification and voting.
>>>>>> I would like to propose to wait 2.8 release become stable (may be 2nd
>>>>>> release in 2.8 branch cause first release is alpha due to discussion in
>>>>>> another email thread), then we can move to 3.0 as the only alpha release.
>>>>>> In the meantime, we can bring more significant features (like ATS v2,
>>>>>> etc.) to trunk and consolidate stable releases in 2.6.x and 2.7.x. I
>>>>>> believe that make life easier. :)
>>>>>> Thoughts?
>>>>>> 
>>>>> 
>>>>> 2.8.0 is relatively close to shipping. I say relatively as I'm doing
>>>>> some work with ATS 1.5 downstream and I'd like to make sure all that
>>>>> works. There's also a large collection of S3 and swift patches needing
>>>>> attention from any reviewers with time and credentials.
>>>>> 
>>>>> 3.x is going to take multiple iterations to stabilise, and with more
>>>>> changes, more significant a rollout. I'd also like to do a complete
>>>>> update of all the dependencies before a final release, so we can have
>>>>> less pressure to upgrade for a while, and get Sean's classloader patch
>>>>> in so it's slightly less visible.
>>>>> 
>>>>> That means 3.0 is going to be an alpha release, not final.
>>>>> 
>>>>> one thing that could be shared is any build.xml automation of the
>>>>> release process, to at least take away most of the manual steps in the
>>>>> process, to have something more repeatable.
>>>>> 
>>>>> -steve
>>>>> 
>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Junping
>>>>>> ________________________________________
>>>>>> From: Yongjun Zhang <yz...@cloudera.com>
>>>>>> Sent: Friday, February 19, 2016 8:05 PM
>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>> Cc: common-dev@hadoop.apache.org; mapreduce-dev@hadoop.apache.org;
>>>>>> yarn-dev@hadoop.apache.org
>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>> 
>>>>>> Thanks Andrew for initiating the effort!
>>>>>> 
>>>>>> +1 on pushing 3.x with extended alpha cycle, and continuing the more
>>>>>> stable 2.x releases.
>>>>>> 
>>>>>> --Yongjun
>>>>>> 
>>>>>> On Thu, Feb 18, 2016 at 5:58 PM, Andrew Wang <andrew.wang@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi Kai,
>>>>>>> 
>>>>>>> Sure, I'm open to it. It's a new major release, so we're allowed to make
>>>>>>> these kinds of big changes. The idea behind the extended alpha cycle is
>>>>>>> that downstreams can give us feedback. This way if we do anything too
>>>>>>> radical, we can address it in the next alpha and have downstreams re-test.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Andrew
>>>>>>> 
>>>>>>> On Thu, Feb 18, 2016 at 5:23 PM, Zheng, Kai <ka...@intel.com> wrote:
>>>>>>> 
>>>>>>>> Thanks Andrew for driving this. Wonder if it's a good chance for
>>>>>>>> HADOOP-12579 (Deprecate and remove WriteableRPCEngine) to be in. Note
>>>>>>>> it's not an incompatible change, but feel better to be done in the major
>>>>>>>> release.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Kai
>>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Andrew Wang [mailto:andrew.wang@cloudera.com]
>>>>>>>> Sent: Friday, February 19, 2016 7:04 AM
>>>>>>>> To: hdfs-dev@hadoop.apache.org; Kihwal Lee <ki...@yahoo-inc.com>
>>>>>>>> Cc: mapreduce-dev@hadoop.apache.org; common-dev@hadoop.apache.org;
>>>>>>>> yarn-dev@hadoop.apache.org
>>>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>>>> 
>>>>>>>> Hi Kihwal,
>>>>>>>> 
>>>>>>>> I think there's still value in continuing the 2.x releases. 3.x comes
>>>>>>>> with the incompatible bump to a JDK8 runtime, and also the fact that 3.x
>>>>>>>> won't be beta or GA for some number of months. In the meanwhile, it'd be
>>>>>>>> good to keep putting out regular, stable 2.x releases.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Andrew
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Feb 18, 2016 at 2:50 PM, Kihwal Lee <kihwal@yahoo-inc.com.invalid>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Moving Hadoop 3 forward sounds fine. If EC is one of the main
>>>>>>>>> motivations, are we getting rid of branch-2.8?
>>>>>>>>> 
>>>>>>>>> Kihwal
>>>>>>>>> 
>>>>>>>>>     From: Andrew Wang <an...@cloudera.com>
>>>>>>>>> To: "common-dev@hadoop.apache.org" <co...@hadoop.apache.org>
>>>>>>>>> Cc: "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>; "
>>>>>>>>> mapreduce-dev@hadoop.apache.org" <mapreduce-dev@hadoop.apache.org
>>> ;
>>>>>>>>> hdfs-dev <hd...@hadoop.apache.org>
>>>>>>>>> Sent: Thursday, February 18, 2016 4:35 PM
>>>>>>>>> Subject: Re: Looking to a Hadoop 3 release
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Reviving this thread. I've seen renewed interest in a trunk release
>>>>>>>>> since HDFS erasure coding has not yet made it to branch-2. Along with
>>>>>>>>> JDK8, the shell script rewrite, and many other improvements, I think
>>>>>>>>> it's time to revisit Hadoop 3.0 release plans.
>>>>>>>>> 
>>>>>>>>> My overall plan is still the same as in my original email: a series of
>>>>>>>>> regular alpha releases leading up to beta and GA. Alpha releases make
>>>>>>>>> it easier for downstreams to integrate with our code, and making them
>>>>>>>>> regular means features can be included when they are ready.
>>>>>>>>> 
>>>>>>>>> I know there are some incompatible changes waiting in the wings (e.g.
>>>>>>>>> HDFS-6984 making FileStatus a PB rather than Writable, some of
>>>>>>>>> HADOOP-9991 bumping dependency versions) that would be good to get in.
>>>>>>>>> If you have changes like this, please set the target version to 3.0.0
>>>>>>>>> and mark them "Incompatible". We can use this JIRA query to track:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20and%20%22Target%20Version%2Fs%22%20%3D%20%223.0.0%22%20and%20resolution%3D%22unresolved%22%20and%20%22Hadoop%20Flags%22%3D%22Incompatible%20change%22%20order%20by%20priority
>>>>>>>>> 
>>>>>>>>> There's some release-related stuff that needs to be sorted out
>>>>>>>>> (namely, the new CHANGES.txt and release note generation from Yetus),
>>>>>>>>> but I'd tentatively like to roll the first alpha a month out, so third
>>>>>>>>> week of March.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>>> On Mon, Mar 9, 2015 at 7:23 PM, Raymie Stata <rstata@altiscale.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Avoiding the use of JDK8 language features (and, presumably, APIs)
>>>>>>>>>> means you've abandoned #1, i.e., you haven't (really) bumped the JDK
>>>>>>>>>> source version to JDK8.
>>>>>>>>>> 
>>>>>>>>>> Also, note that releasing from trunk is a way of achieving #3, it's
>>>>>>>>>> not a way of abandoning it.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Mon, Mar 9, 2015 at 7:10 PM, Andrew Wang
>>>>>>>>>> <an...@cloudera.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Raymie,
>>>>>>>>>>> 
>>>>>>>>>>> Konst proposed just releasing off of trunk rather than cutting a
>>>>>>>>>>> branch-2, and there was general agreement there. So, consider #3
>>>>>>>>>>> abandoned. 1&2 can be achieved at the same time, we just need to avoid
>>>>>>>>>>> using JDK8 language features in trunk so things can be backported.
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Andrew
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Mar 9, 2015 at 7:01 PM, Raymie Stata
>>>>>>>>>>> <rs...@altiscale.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> In this (and the related threads), I see the following three
>>>>>>>>>>>> requirements:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. "Bump the source JDK version to JDK8" (ie, drop JDK7 support).
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. "We'll still be releasing 2.x releases for a while, with
>>>>>>>>>>>> similar feature sets as 3.x."
>>>>>>>>>>>> 
>>>>>>>>>>>> 3. Avoid the "risk of split-brain behavior" by "minimize
>>>>>>>>>>>> backporting headaches. Pulling trunk > branch-2 > branch-2.x is
>>>>>>>>>>>> already tedious. Adding a branch-3, branch-3.x would be obnoxious."
>>>>>>>>>>>> 
>>>>>>>>>>>> These three cannot be achieved at the same time.  Which do we abandon?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia
>>>>>>>>>>>> <sa...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mar 5, 2015, at 3:21 PM, Siddharth Seth <sseth@apache.org> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2) Simplification of configs - potentially separating client
>>>>>>>>>>>>>> side configs and those used by daemons. This is another source of
>>>>>>>>>>>>>> perpetual confusion for users.
>>>>>>>>>>>>> + 1 on this.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> sanjay
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>