Posted to dev@apex.apache.org by Munagala Ramanath <ra...@datatorrent.com> on 2016/07/11 15:59:48 UTC

Bleeding edge branch ?

We've had a number of issues recently related to dependencies on old versions
of various packages/libraries such as Hadoop itself, Google guava, HTTPClient,
mbassador, etc.

How about we create a "bleeding-edge" branch in both Core and Malhar which
will use the latest versions of these various dependencies, upgrade to Java 8
so we can use the new Java features, etc. ?

This will give us an opportunity to discover these sorts of problems early
and, when we are ready to pull the trigger for a major version, we have a
branch ready for merge with, hopefully, minimal additional effort.

There will be no guarantees w.r.t. this branch so people using it use it at
their own risk.

Ram

Re: Bleeding edge branch ?

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Can we create JIRAs for the issues we are facing by staying with the older
versions and link them to a story JIRA? If we know the particulars of all
the problems, we might be able to chart a finer-grained course.

Thanks

On Wed, Jul 20, 2016 at 10:56 AM, Pradeep A. Dalvi <pr...@apache.org> wrote:

> I agree with Sandesh on the following. The official branch from which
> releases are cut should continue to take EOL into consideration. However,
> we also need to be prepared w.r.t. future releases of Hadoop.
>
> --prad
>
> On Wed, Jul 20, 2016 at 10:43 AM, Sandesh Hegde <sa...@datatorrent.com>
> wrote:
>
> > @Amol
> >
> > EOL is important for the master branch. To start the work on the next
> > version of Hadoop on a different branch (let us call it master++), we
> > should not worry about EOL. Eventually, master++ becomes master, and
> > master++ will continue on a later version of Hadoop.
> >
> >
> >
> > On Wed, Jul 20, 2016 at 10:30 AM Siyuan Hua <si...@datatorrent.com>
> > wrote:
> >
> > > Ok, whether branches or forks, I still think we should have at least some
> > > materialized version of Malhar/Core for the big influencers like Java,
> > > Hadoop or even Kafka. Java 8, for example, is actually not new. We don't
> > > have to be aggressive about trying out new features from those right now,
> > > but we can at least have CI run build/test periodically to make sure our
> > > current code is future-proof and to avoid soon-to-be-deprecated code when
> > > we add new features. Also, if people ask for it, we can have a link to
> > > point them to. BTW, the high-level API can definitely benefit from Java 8. :)
> > >
> > > Regards,
> > > Siyuan
> > >
> > > On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde <sandesh@datatorrent.com>
> > > wrote:
> > >
> > > > Our current model of supporting the oldest supported Hadoop penalizes
> > > > the users of the latest Hadoop versions by favoring the slow movers.
> > > > Also, we won't benefit from the increased maturity of the Hadoop
> > > > platform, as we will be working on a many-years-old version of Hadoop.
> > > > We also need to incentivize our customers to upgrade their Hadoop
> > > > version by making use of new features.
> > > >
> > > > My vote goes to starting the work on Hadoop 2.6 (or any other version)
> > > > in a different branch, without waiting for the EOL policies.
> > > >
> > > > On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise <thomas@datatorrent.com>
> > > > wrote:
> > > >
> > > > > -0
> > > > >
> > > > > I read the thread twice; it is not clear to me what benefit Apex users
> > > > > derive from this exercise. A branch normally contains development work
> > > > > that is eventually brought back to the main line and into a release.
> > > > > Here, the suggestion seems to be an open-ended effort to play with the
> > > > > latest tech. Isn't that something anyone (including a group of folks)
> > > > > can do in a fork? I don't see value in a permanent branch for that. Who
> > > > > is going to maintain such code, and who will ever use it?
> > > > >
> > > > > There was a point that we can find out about potential problems with
> > > > > later versions. The way to find such issues is to take the releases and
> > > > > run them on these later versions (that's what users do), not by
> > > > > changing the code!
> > > > >
> > > > > Regarding Java version: Our users don't use Apex in a vacuum. Please
> > > > > have a look at the EOL policies of ASF Hadoop and the distros. That
> > > > > will answer the question of what Java version is appropriate. I would
> > > > > be surprised if something that works on Java 7 falls flat on its face
> > > > > with Java 8, as a lot of diligence goes into backward compatibility.
> > > > > Again, the way to test this is to run verification with existing Apex
> > > > > releases on a Java 8 based stack.
> > > > >
> > > > > Regarding Hadoop version: This has been discussed off the record
> > > > > several times and there are actual JIRA tickets marked accordingly so
> > > > > that the work is done when we move. It is a separate discussion; no
> > > > > need to mix Java versions and branching with it. I agree with what
> > > > > David said: if someone can show that we can move up to 2.6 based on EOL
> > > > > policies and what known Apex users have in production, then we should
> > > > > work on that upgrade. The way I imagine it would work is that we have a
> > > > > Hadoop-2.6 (or whatever version) branch, make all the upgrade-related
> > > > > changes there (which should be a list of JIRAs) and then merge it back
> > > > > to master when we are satisfied. After that, the branch can be deleted.
> > > > >
> > > > > Thomas
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jul 12, 2016 at 8:36 AM, Chinmay Kolhatkar <
> > > > > chinmay@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > I'm -0 on this idea.
> > > > > >
> > > > > > Here is the reason:
> > > > > > Unless we see a real case where users want to see everything on the
> > > > > > latest, this branch might quickly become low hanging fruit and
> > > > > > eventually get obsolete because it's anyway a "no guarantee" branch.
> > > > > >
> > > > > > We have a bunch of dependencies which we'll have to take care of to
> > > > > > really make it bleeding edge. Especially for Malhar, it's a long list.
> > > > > > That looks like quite significant work.
> > > > > > Moreover, if this branch is going to be in a "may or may not work"
> > > > > > state, I, as a user or developer, would bank on what certainly works.
> > > > > >
> > > > > > I also think that, if it's going to be "no guarantee", then it's worth
> > > > > > spending time on contributions towards master rather than the
> > > > > > bleeding-edge branch.
> > > > > >
> > > > > > If a question of "should we upgrade?" comes up, the community is
> > > > > > mature enough to take that call then and work accordingly.
> > > > > >
> > > > > > -Chinmay.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Jul 12, 2016 at 11:42 AM, Priyanka Gugale <priyag@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > +1 for creating such a branch.
> > > > > > > One of us will have to rebase it on the master branch at intervals.
> > > > > > > I don't think everyone will cherry-pick their commits here. We can
> > > > > > > make it a once-a-month activity. Are we considering updating all
> > > > > > > dependency library versions as well?
> > > > > > >
> > > > > > > -Priyanka
> > > > > > >
> > > > > > > On Tue, Jul 12, 2016 at 2:34 AM, Munagala Ramanath <ram@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Following up on some comments, wanted to clarify what I have in
> > > > > > > > mind for this branch:
> > > > > > > >
> > > > > > > > 1. The main goal is to stay up-to-date with new releases, so if a
> > > > > > > >     question of the form "A new release of X is available, should
> > > > > > > >     we upgrade ?" comes up, the answer is *always* an *emphatic*
> > > > > > > >     yes; otherwise it doesn't bleed enough (:-) as Sanjay points out.
> > > > > > > > 2. Pull requests are submitted as always; there is no requirement
> > > > > > > >     to generate an additional pull request against this branch.
> > > > > > > >     They may get merged/cherry-picked depending on who has the time
> > > > > > > >     and inclination to do it.
> > > > > > > > 3. There is no expectation of dedication of any additional
> > > > > > > >     resources, so people work on it as and when time is available.
> > > > > > > >     ("No guarantee" means exactly that.) So there is no question of
> > > > > > > >     "maintaining" this branch.
> > > > > > > > 4. This branch is not to be encumbered with legacy and/or backward
> > > > > > > >     compatibility issues.
> > > > > > > > 5. This branch is not an experimental sandbox to try out new
> > > > > > > >     algorithms, architectural changes and other such changes.
> > > > > > > >
> > > > > > > > As always, I'm open to other ideas, but that is what I had in mind
> > > > > > > > when I made the suggestion.
> > > > > > > >
> > > > > > > > Ram
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jul 11, 2016 at 1:45 PM, Sanjay Pujare <sanjay@datatorrent.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > As the name suggests the "bleeding-edge" branch ideally should use
> > > > > > > > > bleeding edge versions so I would like to see Java 8 used there
> > > > > > > > > (and Hadoop 3 when it does eventually come out) to make the
> > > > > > > > > maintenance effort worthwhile...
> > > > > > > > >
> > > > > > > > > On Mon, Jul 11, 2016 at 12:05 PM, David Yan <david@datatorrent.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I'm -0 on Java 8, but I'm +1 on the rest, and I'm an especially
> > > > > > > > > > strong +1 for upgrading the Hadoop dependency version.
> > > > > > > > > >
> > > > > > > > > > Here are my reasons:
> > > > > > > > > >
> > > > > > > > > > - Hadoop 3 will require Java 8, but Hadoop 2.7.2 still supports
> > > > > > > > > > Java 7, and it will probably take some time (I'm guessing more
> > > > > > > > > > than one year) for Hadoop 3 to become GA and for major distros to
> > > > > > > > > > support Hadoop 3. The maintenance effort for having two branches,
> > > > > > > > > > one for Java 7 and one for Java 8, is not worth it at this time.
> > > > > > > > > >
> > > > > > > > > > - Apex currently uses Hadoop 2.2 dependencies, marked "provided".
> > > > > > > > > > Hadoop 2.4 was released more than two years ago, and it added a
> > > > > > > > > > lot of features in the API that Apex can make use of. Most distros
> > > > > > > > > > already bundle Hadoop 2.6 or later. Although some old versions of
> > > > > > > > > > Cloudera that include a Hadoop version earlier than 2.4 have not
> > > > > > > > > > reached end-of-life yet, the number of users using those old
> > > > > > > > > > versions is probably very small.
> > > > > > > > > >
> > > > > > > > > > David
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 11, 2016 at 8:59 AM, Munagala Ramanath <ram@datatorrent.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > We've had a number of issues recently related to dependencies on
> > > > > > > > > > > old versions of various packages/libraries such as Hadoop itself,
> > > > > > > > > > > Google guava, HTTPClient, mbassador, etc.
> > > > > > > > > > >
> > > > > > > > > > > How about we create a "bleeding-edge" branch in both Core and
> > > > > > > > > > > Malhar which will use the latest versions of these various
> > > > > > > > > > > dependencies, upgrade to Java 8 so we can use the new Java
> > > > > > > > > > > features, etc. ?
> > > > > > > > > > >
> > > > > > > > > > > This will give us an opportunity to discover these sorts of
> > > > > > > > > > > problems early and, when we are ready to pull the trigger for a
> > > > > > > > > > > major version, we have a branch ready for merge with, hopefully,
> > > > > > > > > > > minimal additional effort.
> > > > > > > > > > >
> > > > > > > > > > > There will be no guarantees w.r.t. this branch so people using it
> > > > > > > > > > > use it at their own risk.
> > > > > > > > > > >
> > > > > > > > > > > Ram
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Bleeding edge branch ?

Posted by "Pradeep A. Dalvi" <pr...@apache.org>.
I agree with Sandesh on the following. The official branch from which
releases are cut should continue to take EOL into consideration. However,
we also need to be prepared w.r.t. future releases of Hadoop.

--prad


Re: Bleeding edge branch ?

Posted by Sandesh Hegde <sa...@datatorrent.com>.
@Amol

EOL is important for the master branch. To start the work on the next
version of Hadoop on a different branch (let us call it master++), we
should not worry about EOL. Eventually, master++ becomes master, and
master++ will continue on a later version of Hadoop.



On Wed, Jul 20, 2016 at 10:30 AM Siyuan Hua <si...@datatorrent.com> wrote:

> Ok, whether branches or forks. I still think we should have at least some
> materialized version of malhar/core for the big influencer like java,
> hadoop or even kafka. Java 8, for example, is actually not new.  We don't
> have to be aggressive to try out new features from those right now. But we
> can at least have some CI run build/test periodically and make sure our
> current code is future-prove and avoid some future-deprecated code when we
> add new features. Also if people ask for it, we can have a link to point
> them to.  BTW, High-level API can definitely benefit from java 8.  :)
>
> Regards,
> Siyuan
>
> On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde <sa...@datatorrent.com>
> wrote:
>
> > Our current model of supporting the oldest supported Hadoop, penalizes
> the
> > users of latest Hadoop versions by favoring the slow movers.
> > Also, we won't benefit from the increased maturity of the Hadoop
> platform,
> > as we will be working on the many years old version of Hadoop.
> > We also need to incentivize our customers to upgrade their Hadoop
> version,
> > by making use of new features.
> >
> > My vote goes to start the work on the Hadoop 2.6 ( or any other version )
> > in a different branch, without waiting for the EOL policies.
> >
> > On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise <th...@datatorrent.com>
> > wrote:
> >
> > > -0
> > >
> > > I read the thread twice, it is not clear to me what benefit Apex users
> > > derive from this exercise. A branch normally contains development work
> > that
> > > is eventually brought back to the main line and into a release. Here,
> the
> > > suggestion seems to be an open ended effort to play with latest tech,
> > isn't
> > > that something anyone (including a group of folks) can do in a fork. I
> > > don't see value in a permanent branch for that, who is going to
> maintain
> > > such code and who will ever use it?
> > >
> > > There was a point that we can find out about potential problems with
> > later
> > > versions. The way to find such issues is to take the releases and run
> > them
> > > on these later versions (that's what users do), not by changing the
> code!
> > >
> > > Regarding Java version: Our users don't use Apex in a vacuum. Please
> > have a
> > > look at ASF Hadoop and the distros EOL policies. That will answer the
> > > question what Java version is appropriate. I would be surprised if
> > > something that works on Java 7 falls flat on the face with Java 8 as a
> > lot
> > > of diligence goes into backward compatibility. Again the way to tests
> > this
> > > is to run verification with existing Apex releases on Java 8 based
> stack.
> > >
> > > Regarding Hadoop version: This has been discussed off record several
> > times
> > > and there are actual JIRA tickets marked accordingly so that the work
> is
> > > done when we move. It is a separate discussion, no need to mix Java
> > > versions and branching with it. I agree with what David said, if
> someone
> > > can show that we can move up to 2.6 based on EOL policies and what
> known
> > > Apex users have in production, then we should work on that upgrade. The
> > way
> > > I imagine it would work is that we have a Hadoop-2.6 (or whatever
> > version)
> > > branch, make all the upgrade related changes there (which should be a
> > list
> > > of JIRAs) and then merge it back to master when we are satisfied. After
> > > that, the branch can be deleted.
> > >
> > > Thomas
> > >
> > >
> > >
> > > On Tue, Jul 12, 2016 at 8:36 AM, Chinmay Kolhatkar <
> > > chinmay@datatorrent.com>
> > > wrote:
> > >
> > > > I'm -0 on this idea.
> > > >
> > > > Here is the reason:
> > > > Unless we see a real case where users want to see everything on
> latest,
> > > > this branch might quickly become low hanging fruit and eventually get
> > > > obsolete because its anyway a "no gaurantee" branch.
> > > >
> > > > We have a bunch of dependencies which we'll have to take care of to
> > > really
> > > > make it bleeding edge. Specially about malhar, its a long list. That
> > > looks
> > > > like quite significant work.
> > > > Moreover, if this branch is going to be in "may or may not work"
> state;
> > > I,
> > > > as a user or developer, would bank on what certainly works.
> > > >
> > > > I also think that, if its going to be "no gaurantee" then its worth
> > > > spending time contributions towards master rather than bleeding-edge
> > > > branch.
> > > >
> > > > If a question of "should we upgrade?" comes, the community is mature
> to
> > > > take that call then and work accordingly.
> > > >
> > > > -Chinmay.

Re: Bleeding edge branch ?

Posted by Siyuan Hua <si...@datatorrent.com>.
OK, whether branches or forks, I still think we should have at least some
materialized version of Malhar/Core for the big influencers like Java,
Hadoop, or even Kafka. Java 8, for example, is actually not new. We don't
have to be aggressive about trying out their new features right now, but we
can at least have CI run the build and tests periodically to make sure our
current code is future-proof, and avoid soon-to-be-deprecated code when we
add new features. Also, if people ask for it, we will have a link to point
them to. BTW, the high-level API can definitely benefit from Java 8. :)

Regards,
Siyuan

On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde <sa...@datatorrent.com>
wrote:

> Our current model of supporting the oldest supported Hadoop version
> penalizes users of the latest Hadoop versions by favoring the slow movers.
> Also, we won't benefit from the increased maturity of the Hadoop platform,
> as we will be working against a version of Hadoop that is many years old.
> We also need to give our customers an incentive to upgrade their Hadoop
> version by making use of new features.
>
> My vote goes to starting the work on Hadoop 2.6 (or any other version)
> in a different branch, without waiting for the EOL policies.

Re: Bleeding edge branch ?

Posted by Amol Kekre <am...@datatorrent.com>.
Sandesh,
Not worrying about EOL is a big deal. It creates problems for current
users, and it also signals to prospective users (pre-adoption) how we will
take care of them. Maintaining two branches and the like needs to be thought
through by all of us in terms of our ability to support them. IMHO, we are
rushing on this topic.

Thanks,
Amol


On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde <sa...@datatorrent.com>
wrote:

> Our current model of supporting the oldest supported Hadoop version
> penalizes users of the latest Hadoop versions by favoring the slow movers.
> Also, we won't benefit from the increased maturity of the Hadoop platform,
> as we will be working against a version of Hadoop that is many years old.
> We also need to give our customers an incentive to upgrade their Hadoop
> version by making use of new features.
>
> My vote goes to starting the work on Hadoop 2.6 (or any other version)
> in a different branch, without waiting for the EOL policies.

Re: Bleeding edge branch ?

Posted by Sandesh Hegde <sa...@datatorrent.com>.
Our current model of supporting the oldest supported Hadoop version
penalizes users of the latest Hadoop versions by favoring the slow movers.
Also, we won't benefit from the increased maturity of the Hadoop platform,
as we will be working against a version of Hadoop that is many years old.
We also need to give our customers an incentive to upgrade their Hadoop
version by making use of new features.

My vote goes to starting the work on Hadoop 2.6 (or any other version)
in a different branch, without waiting for the EOL policies.

On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise <th...@datatorrent.com> wrote:

> -0
>
> I read the thread twice, and it is not clear to me what benefit Apex users
> derive from this exercise. A branch normally contains development work that
> is eventually brought back to the main line and into a release. Here, the
> suggestion seems to be an open-ended effort to play with the latest tech.
> Isn't that something anyone (including a group of folks) can do in a fork? I
> don't see value in a permanent branch for that. Who is going to maintain
> such code, and who will ever use it?
>
> There was a point that we can find out about potential problems with later
> versions. The way to find such issues is to take the releases and run them
> on these later versions (that's what users do), not to change the code!
>
> Regarding Java version: Our users don't use Apex in a vacuum. Please have a
> look at the EOL policies of ASF Hadoop and the distros. That will answer the
> question of which Java version is appropriate. I would be surprised if
> something that works on Java 7 fell flat on its face with Java 8, as a lot
> of diligence goes into backward compatibility. Again, the way to test this
> is to run verification with existing Apex releases on a Java 8 based stack.
>
> Regarding Hadoop version: This has been discussed off the record several
> times, and there are actual JIRA tickets marked accordingly so that the work
> is done when we move. It is a separate discussion; there is no need to mix
> Java versions and branching with it. I agree with what David said: if someone
> can show that we can move up to 2.6 based on EOL policies and what known
> Apex users have in production, then we should work on that upgrade. The way
> I imagine it would work is that we have a Hadoop-2.6 (or whatever version)
> branch, make all the upgrade-related changes there (which should be a list
> of JIRAs), and then merge it back to master when we are satisfied. After
> that, the branch can be deleted.
>
> Thomas

Re: Bleeding edge branch ?

Posted by Thomas Weise <th...@datatorrent.com>.
-0

I read the thread twice, and it is still not clear to me what benefit Apex users
derive from this exercise. A branch normally contains development work that
is eventually brought back to the main line and into a release. Here, the
suggestion seems to be an open-ended effort to play with the latest tech. Isn't
that something anyone (including a group of folks) can do in a fork? I
don't see value in a permanent branch for that. Who is going to maintain
such code, and who will ever use it?

There was a point that we can find out about potential problems with later
versions. The way to find such issues is to take the releases and run them
on these later versions (that's what users do), not by changing the code!

Regarding Java version: Our users don't use Apex in a vacuum. Please have a
look at the EOL policies of ASF Hadoop and the distros. That will answer the
question of what Java version is appropriate. I would be surprised if
something that works on Java 7 falls flat on its face with Java 8, as a lot
of diligence goes into backward compatibility. Again, the way to test this
is to run verification with existing Apex releases on a Java 8 based stack.

Regarding Hadoop version: This has been discussed off record several times
and there are actual JIRA tickets marked accordingly so that the work is
done when we move. It is a separate discussion; there is no need to mix Java
versions and branching with it. I agree with what David said: if someone
can show that we can move up to 2.6 based on EOL policies and what known
Apex users have in production, then we should work on that upgrade. The way
I imagine it would work is that we have a Hadoop-2.6 (or whatever version)
branch, make all the upgrade-related changes there (which should be a list
of JIRAs) and then merge it back to master when we are satisfied. After
that, the branch can be deleted.

Thomas



Re: Bleeding edge branch ?

Posted by Chinmay Kolhatkar <ch...@datatorrent.com>.
I'm -0 on this idea.

Here is the reason:
Unless we see a real case where users want to see everything on latest,
this branch might quickly become low hanging fruit and eventually get
obsolete because it's a "no guarantee" branch anyway.

We have a bunch of dependencies which we'll have to take care of to really
make it bleeding edge. Especially for Malhar, it's a long list. That looks
like quite significant work.
Moreover, if this branch is going to be in a "may or may not work" state, I,
as a user or developer, would bank on what certainly works.

I also think that, if it's going to be "no guarantee", then it's worth
spending time on contributions towards master rather than the bleeding-edge
branch.

If a question of "should we upgrade?" comes up, the community is mature
enough to take that call then and work accordingly.

-Chinmay.



Re: Bleeding edge branch ?

Posted by Priyanka Gugale <pr...@apache.org>.
+1 for creating such a branch.
One of us will have to rebase it on the master branch at intervals. I don't
think everyone will cherry-pick their commits here. We can make it a
once-a-month activity. Are we considering updating all dependency library
versions as well?

-Priyanka

Re: Bleeding edge branch ?

Posted by Munagala Ramanath <ra...@datatorrent.com>.
Following up on some comments, I wanted to clarify what I have in mind for
this branch:

1. The main goal is to stay up-to-date with new releases, so if a question
   of the form "A new release of X is available, should we upgrade?" comes
   up, the answer is *always* an *emphatic* yes; otherwise it doesn't bleed
   enough (:-) as Sanjay points out.
2. Pull requests are submitted as always; there is no requirement to
   generate an additional pull request against this branch. It may get
   merged/cherry-picked depending on who has the time and inclination to
   do it.
3. There is no expectation of dedication of any additional resources, so
   people work on it as and when time is available. ("No guarantee" means
   exactly that.) So there is no question of "maintaining" this branch.
4. This branch is not to be encumbered with legacy and/or backward
   compatibility issues.
5. This branch is not an experimental sandbox to try out new algorithms,
   architectural changes and other such changes.

As always, I'm open to other ideas, but that is what I had in mind when I
made the suggestion.

Ram



Re: Bleeding edge branch ?

Posted by Sanjay Pujare <sa...@datatorrent.com>.
As the name suggests, the "bleeding-edge" branch ideally should use
bleeding-edge versions, so I would like to see Java 8 used there (and Hadoop 3
when it does eventually come out) to make the maintenance effort worthwhile...

Re: Bleeding edge branch ?

Posted by David Yan <da...@datatorrent.com>.
I'm -0 on Java 8, but I'm +1 on the rest, and I'm an especially strong +1 for
upgrading the Hadoop dependency version.

Here are my reasons:

- Hadoop 3 will require Java 8, but Hadoop 2.7.2 still supports Java 7, and
it will probably be some time (I'm guessing more than one year) before
Hadoop 3 becomes GA and major distros support Hadoop 3. The
maintenance effort of having two branches, one for Java 7 and one for Java
8, is not worth it at this time.

- Apex currently uses Hadoop 2.2 dependencies, marked "provided".
Hadoop 2.4 was released more than two years ago, and it added a lot of
features in the API that Apex can make use of. Most distros already bundle
Hadoop 2.6 or later. Although some old versions of Cloudera that include
a Hadoop version earlier than 2.4 have still not reached end-of-life, the
number of users using those old versions is probably very small.

David


Re: Bleeding edge branch ?

Posted by Sandesh Hegde <sa...@datatorrent.com>.
+1 with some variation

Support the next version of Hadoop after the one supported by Apex main,
instead of the latest Hadoop. This makes it easy to move Apex main to the
next version of Hadoop.



Re: Bleeding edge branch ?

Posted by Sanjay Pujare <sa...@datatorrent.com>.
strong +1 (it would be nice to have a dedicated resource to maintain this
branch)

Re: Bleeding edge branch ?

Posted by Siyuan Hua <si...@datatorrent.com>.
+1
