Posted to hdfs-dev@hadoop.apache.org by Arun C Murthy <ac...@hortonworks.com> on 2014/06/24 20:43:21 UTC

Re: Moving to JDK7, JDK8 and new major releases

Andrew,

 Thanks for starting this thread. I'll edit the wiki to provide more context around rolling-upgrades etc. which, as I pointed out in the original thread, are key IMHO.

On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com> wrote:
> https://wiki.apache.org/hadoop/MovingToJdk7and8
> 
> I think based on our current compatibility guidelines, Proposal A is the
> most attractive. We're pretty hamstrung by the requirement to keep the
> classpath the same, which would be solved by either OSGI or shading our
> deps (but that's a different discussion).

I don't see that anywhere in our current compatibility guidelines.

As you can see from http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html we do not have such a policy (pasted here for convenience):

Java Classpath

User applications built against Hadoop might add all Hadoop jars (including Hadoop's library dependencies) to the application's classpath. Adding new dependencies or updating the version of existing dependencies may interfere with those in applications' classpaths.

Policy

Currently, there is NO policy on when Hadoop's dependencies can change.

Furthermore, we have *already* changed our classpath in hadoop-2.x. Again, as I pointed out in the previous thread, here is the precedent:

On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Also, this is something we already have done i.e. we updated some of our software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as dramatic as JDK. Here are some examples:
> https://issues.apache.org/jira/browse/HADOOP-9991
> https://issues.apache.org/jira/browse/HADOOP-10102
> https://issues.apache.org/jira/browse/HADOOP-10103
> https://issues.apache.org/jira/browse/HADOOP-10104
> https://issues.apache.org/jira/browse/HADOOP-10503

thanks,
Arun
-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
That classpath policy was explicitly added because we can't lock down our
dependencies for security/bug fix reasons, and also because if we do update
something explicitly, its transitive dependencies can change, beyond our
control.

https://issues.apache.org/jira/browse/HADOOP-9555 is an example of this: an
update of ZK explicitly to fix an HA problem. Are there changes in its
dependencies? I don't know. But we had no choice but to update if we
wanted NN & RM failover to work reliably, so we have to take any other
changes that went in.

JDK upgrades can be viewed as an extension of this: we are changing the
base platform that Hadoop runs on. More precisely, for the Java 6 -> Java 7
update, we are reflecting the fact that nobody is running in production on
Java 6.

Do you realise we actually moved to Java 6 in 2008?
https://issues.apache.org/jira/browse/HADOOP-2325 . That was six years ago;
half the names on that list are not active on the project any more.

What we did there was issue a warning in 0.18 that it would be the last
Java 5 version; 0.19 moved up. We can do the same for a Hadoop 2.x release
at some point this year.



On 24 June 2014 11:43, Arun C Murthy <ac...@hortonworks.com> wrote:

> Andrew,
>
>  Thanks for starting this thread. I'll edit the wiki to provide more
> context around rolling-upgrades etc. which, as I pointed out in the
> original thread, are key IMHO.
>
> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> > https://wiki.apache.org/hadoop/MovingToJdk7and8
> >
> > I think based on our current compatibility guidelines, Proposal A is the
> > most attractive. We're pretty hamstrung by the requirement to keep the
> > classpath the same, which would be solved by either OSGI or shading our
> > deps (but that's a different discussion).
>
> I don't see that anywhere in our current compatibility guidelines.
>
> As you can see from
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> we do not have such a policy (pasted here for convenience):
>
> Java Classpath
>
> User applications built against Hadoop might add all Hadoop jars
> (including Hadoop's library dependencies) to the application's classpath.
> Adding new dependencies or updating the version of existing dependencies
> may interfere with those in applications' classpaths.
>
> Policy
>
> Currently, there is NO policy on when Hadoop's dependencies can change.
>
> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
> as I pointed out in the previous thread, here is the precedent:
>
> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > Also, this is something we already have done i.e. we updated some of our
> software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> dramatic as JDK. Here are some examples:
> > https://issues.apache.org/jira/browse/HADOOP-9991
> > https://issues.apache.org/jira/browse/HADOOP-10102
> > https://issues.apache.org/jira/browse/HADOOP-10103
> > https://issues.apache.org/jira/browse/HADOOP-10104
> > https://issues.apache.org/jira/browse/HADOOP-10503
>
> thanks,
> Arun


Re: Moving to JDK7, JDK8 and new major releases

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Jun 24, 2014, at 4:22 PM, Andrew Wang <an...@cloudera.com> wrote:


> Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
> is effectively part of our API. I'm sure there are user apps out there that
> will break if we make incompatible changes to the classpath. I haven't read
> up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app out
> there.

I think there is some confusion/misunderstanding here.

With hadoop-2 the user is completely in control of his own classpath (we had a similar, but limited capability in hadoop-1 w/ https://issues.apache.org/jira/browse/MAPREDUCE-1938).

Furthermore, it's probably not well known that in hadoop-2 the user application (MR or otherwise) can also pick the JDK version by using JAVA_HOME env for the container. So, in effect, MR applications can continue to use java6 while YARN is running java7, though this hasn't been tested extensively. This capability did not exist in hadoop-1. We've also made some progress with https://issues.apache.org/jira/browse/MAPREDUCE-1700 to decouple user jar-deps from MR system jars. https://issues.apache.org/jira/browse/MAPREDUCE-4421 also helps by ensuring MR applications can pick the exact version of MR jars they were compiled against, and not rely on cluster installs.
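As a rough illustration of the JAVA_HOME point, the sketch below builds the job properties a client might set so that MR containers run on a different JDK from the one the NodeManager uses. It assumes the Hadoop 2 property names yarn.app.mapreduce.am.env, mapreduce.map.env and mapreduce.reduce.env, each taking a comma-separated VAR=value list; the ContainerJdk class and the JDK path are hypothetical, and whether a given cluster honors the override would need testing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop.
final class ContainerJdk {
    private ContainerJdk() {}

    // Property names assumed from Hadoop 2's mapred-default.xml (not from
    // this thread); each takes a comma-separated list of VAR=value pairs.
    static final String[] ENV_KEYS = {
        "yarn.app.mapreduce.am.env",
        "mapreduce.map.env",
        "mapreduce.reduce.env"
    };

    /** Returns property -> value pairs that point MR containers at javaHome. */
    static Map<String, String> jdkOverrides(String javaHome) {
        Map<String, String> props = new LinkedHashMap<String, String>();
        for (String key : ENV_KEYS) {
            props.put(key, "JAVA_HOME=" + javaHome);
        }
        return props;
    }

    public static void main(String[] args) {
        // Example path only; substitute wherever the JDK lives on the nodes.
        for (Map.Entry<String, String> e : jdkOverrides("/usr/lib/jvm/java-6").entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

In a real job driver each pair would be applied to the job's Configuration with set(key, value) before submission.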

Hope that helps somewhat.

thanks,
Arun



Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
On 24 June 2014 16:22, Andrew Wang <an...@cloudera.com> wrote:

> Steve can do a better job explaining
> this to me, but we haven't bumped things like Jetty or Guava because they
> are on the classpath and are not compatible. There is this line in the
> compat guidelines:
>

The current jetty version has unreliability issues, which can only be
mitigated by updating it or moving away from it. MR did the latter, webHDFS
has yet to do either.

Guava is trouble as it aggressively deprecates things...if we move to a new
version we know that some code that runs against the version we distribute
today will fail.

AFAIK the three uses we make of it are
 (a) @VisibleForTesting
 (b) Preconditions.*
 (c) enhanced collections

If that's the case, we can strip out guava from Hadoop entirely. (a) and
(b) are trivial to replicate in our own code; (c) perhaps cut and paste.
Then we could just drop it and say "please use the version you like"
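
To give a sense of how small (a) and (b) are, here is a minimal sketch of in-house replacements. The names LocalPreconditions and PortParser are hypothetical, the annotation is a local stand-in rather than Guava's own, and only three of the Preconditions methods are shown.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local marker annotation: documents that visibility was widened for tests.
@Documented
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR})
@interface VisibleForTesting {}

// Hypothetical stand-in for the handful of Preconditions.* calls in use.
final class LocalPreconditions {
    private LocalPreconditions() {}

    static void checkArgument(boolean expression, Object errorMessage) {
        if (!expression) {
            throw new IllegalArgumentException(String.valueOf(errorMessage));
        }
    }

    static void checkState(boolean expression, Object errorMessage) {
        if (!expression) {
            throw new IllegalStateException(String.valueOf(errorMessage));
        }
    }

    static <T> T checkNotNull(T reference, Object errorMessage) {
        if (reference == null) {
            throw new NullPointerException(String.valueOf(errorMessage));
        }
        return reference;
    }
}

// Example caller using both replacements.
final class PortParser {
    private PortParser() {}

    @VisibleForTesting
    static int parsePort(String value) {
        LocalPreconditions.checkNotNull(value, "port string");
        int port = Integer.parseInt(value);
        LocalPreconditions.checkArgument(port > 0 && port < 65536, "port out of range: " + port);
        return port;
    }
}
```

The enhanced collections in (c) are a larger surface and are not covered by this sketch.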


Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
Following up on ecosystem, I just took a look at the Apache trunk pom.xml
files for HBase, Flume and Oozie.  All are specifying 1.6 for source and
target in the maven-compiler-plugin configuration, so there may be
additional follow-up required here.  (For example, if HBase has made a
statement that its client will continue to support JDK6, then it wouldn't
be practical for them to link to a JDK7 version of hadoop-common.)
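
To make the class-file side of this concrete: the javac -target setting ends up as the major version in each .class header (50 for Java 6, 51 for Java 7, 52 for Java 8), and a JVM refuses classes newer than it supports with UnsupportedClassVersionError. A minimal sketch, with hypothetical class and method names, of checking what a published class was compiled for:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical utility; sticks to Java 6 APIs on purpose.
final class ClassFileVersion {
    private ClassFileVersion() {}

    /** Reads major_version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] header) {
        if (header.length < 8) {
            throw new IllegalArgumentException("need at least 8 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("missing 0xCAFEBABE magic");
        }
        buf.getShort();                 // minor_version, ignored here
        return buf.getShort() & 0xFFFF; // major_version, read as unsigned
    }

    /** True if a JVM whose newest supported major is runtimeMajor can load the class. */
    static boolean loadableOn(int classMajor, int runtimeMajor) {
        return classMajor <= runtimeMajor;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a .class file extracted from the jar under inspection
        byte[] header = new byte[8];
        DataInputStream in = new DataInputStream(new FileInputStream(args[0]));
        try {
            in.readFully(header);
        } finally {
            in.close();
        }
        int major = majorVersion(header);
        System.out.println(args[0] + ": major=" + major
            + ", loadable on Java 6: " + loadableOn(major, 50));
    }
}
```

This only inspects the bytecode version; it says nothing about whether the class calls APIs that are missing from an older JDK's libraries.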

+1 for the whole plan though.  We can work through these details.

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Fri, Jun 27, 2014 at 3:10 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> +1 to making 2.6 the last JDK6 release.
>
> If we want, 2.7 could be a parallel release or one soon after 2.6. We could
> upgrade other dependencies that require JDK7 as well.
>
>
> On Fri, Jun 27, 2014 at 3:01 PM, Arun C. Murthy <ac...@hortonworks.com>
> wrote:
>
> > Thanks everyone for the discussion. Looks like we have come to a
> > pragmatic and progressive conclusion.
> >
> > In terms of execution of the consensus plan, I think a little bit of
> > caution is in order.
> >
> > Let's give downstream projects more of a runway.
> >
> > I propose we inform HBase, Pig, Hive etc. that we are considering making
> > 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they
> > are comfortable we can pull the trigger in 2.7.
> >
> > thanks,
> > Arun
> >
> >
> > > On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com> wrote:
> > >
> > > As someone else already mentioned, we should announce one future
> > > release (maybe 2.5) as the last JDK6-based release before making the
> > > move to JDK7.
> > >
> > > I am comfortable calling 2.5 the last JDK6 release.
> > >
> > >
> > > On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <andrew.wang@cloudera.com>
> > > wrote:
> > >
> > >> Hi all, responding to multiple messages here,
> > >>
> > >> Arun, thanks for the clarification regarding MR classpaths. It sounds
> > >> like the story there is improved and still improving.
> > >>
> > >> However, I think we still suffer from this at least on the HDFS side.
> > >> We have a single JAR for all of HDFS, and our clients need to have all
> > >> the fun deps like Guava on the classpath. I'm told Spark sticks a newer
> > >> Guava at the front of the classpath and the HDFS client still works
> > >> okay, but this is more happy coincidence than anything else. While
> > >> we're leaking deps, we're in a scary situation.
> > >>
> > >> API compat to me means that an app should be able to run on a new
> > >> minor version of Hadoop and not have anything break. MAPREDUCE-4421
> > >> sounds like it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN
> > >> cluster, but what should also be possible is running an HDFS 2.3 app
> > >> with HDFS 2.4 JARs and have nothing break. If we muck with the
> > >> classpath, my understanding is that this could break.
> > >>
> > >> Owen, bumping the minimum JDK version in a minor release like this
> > >> should be a one-time exception as Tucu stated. A number of people have
> > >> pointed out how painful a forced JDK upgrade is for end users, and
> > >> it's not something we should be springing on them in a minor release
> > >> unless we're *very* confident like in this case.
> > >>
> > >> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized
> > >> on JDK7 across the CDH stack, so I think that's an indication that
> > >> most ecosystem projects are ready to make the jump. Is that sufficient
> > >> in your mind?
> > >>
> > >> For the record, I'm also +1 on the Tucu plan. Is it too late to do
> > >> this for 2.5? I'll offer to help out with some of the mechanics.
> > >>
> > >> Thanks,
> > >> Andrew
> > >>
> > >> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > >> wrote:
> > >>
> > >>> I understood the plan for avoiding JDK7-specific features in our
> > >>> code, and your suggestion to add an extra Jenkins job is a great way
> > >>> to guard against that.  The thing I haven't seen discussed yet is how
> > >>> downstream projects will continue to consume our built artifacts.  If
> > >>> a downstream project upgrades to pick up a bug fix, and the jar
> > >>> switches to 1.7 class files, but their project is still building with
> > >>> 1.6, then it would be a nasty surprise.
> > >>>
> > >>> These are the options I see:
> > >>>
> > >>> 1. Make sure all other projects upgrade first.  This doesn't sound
> > >>> feasible, unless all other ecosystem projects have moved to JDK7
> > >>> already.  If not, then waiting on a single long pole project would
> > >>> hold up our migration indefinitely.
> > >>>
> > >>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > >>> ecosystem upgrades.  I find this undesirable, because in a certain
> > >>> sense, it still leaves a bit of 1.6 lingering in the project.  (I'll
> > >>> assume that end-of-life for JDK6 also means end-of-life for the 1.6
> > >>> bytecode format.)
> > >>>
> > >>> 3. Just declare a clean break on some version (your earlier email
> > >>> said 2.5) and start publishing artifacts built with JDK7 and no
> > >>> -target option.  Overall, this is my preferred option.  However, as a
> > >>> side effect, this sets us up for longer-term maintenance and patch
> > >>> releases off of the 2.4 branch if a downstream project that's still
> > >>> on 1.6 needs to pick up a critical bug fix.
> > >>>
> > >>> Of course, this is all a moot point if all the downstream ecosystem
> > >>> projects have already made the switch to JDK7.  I don't know the
> > >>> status of that off the top of my head.  Maybe someone else out there
> > >>> knows?  If not, then I expect I can free up enough in a few weeks to
> > >>> volunteer for tracking down that information.
> > >>>
> > >>> Chris Nauroth
> > >>> Hortonworks
> > >>> http://hortonworks.com/
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > >>> wrote:
> > >>>
> > >>>> Chris,
> > >>>>
> > >>>> Compiling with jdk7 and doing javac -target 1.6 is not sufficient:
> > >>>> you are still using jdk7 libraries and you could use new APIs, thus
> > >>>> breaking jdk6 both at compile and runtime.
> > >>>>
> > >>>> You need to compile with jdk6 to ensure you are not running into
> > >>>> that scenario. That is why I was suggesting the nightly jdk6
> > >>>> build/test jenkins job.
> > >>>>
> > >>>>
> > >>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > >>>> wrote:
> > >>>>
> > >>>>> I'm also +1 for getting us to JDK7 within the 2.x line after
> > >>>>> reading the proposals and catching up on the discussion in this
> > >>>>> thread.
> > >>>>>
> > >>>>> Has anyone yet considered how to coordinate this change with
> > >>>>> downstream projects?  Would we request downstream projects to
> > >>>>> upgrade to JDK7 first before we make the move?  Would we switch to
> > >>>>> JDK7, but run javac -target 1.6 to maintain compatibility for
> > >>>>> downstream projects during an interim period?
> > >>>>>
> > >>>>> Chris Nauroth
> > >>>>> Hortonworks
> > >>>>> http://hortonworks.com/
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <omalley@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> After reading this thread and thinking a bit about it, I think it
> > >>>>>>> should be OK such move up to JDK7 in Hadoop
> > >>>>>>
> > >>>>>>
> > >>>>>> I agree with Alejandro. Changing minimum JDKs is not an
> > >>>>>> incompatible change and is fine in the 2 branch. (Although I think
> > >>>>>> it would *not* be appropriate for a patch release.) Of course we
> > >>>>>> need to do it with forethought and testing, but moving off of JDK
> > >>>>>> 6, which is EOL'ed, is a good thing. Moving to Java 8 as a minimum
> > >>>>>> seems much too aggressive and I would push back on that.
> > >>>>>>
> > >>>>>> I also think that we need to let the dust settle on the Hadoop 2
> > >>>>>> line for a while before we talk about Hadoop 3. It seems that it
> > >>>>>> has only been in the last 6 months that Hadoop 2 adoption has
> > >>>>>> reached the mainstream users. Our user community needs time to
> > >>>>>> digest the changes in Hadoop 2.x before we fracture the community
> > >>>>>> by starting to discuss Hadoop 3 releases.
> > >>>>>>
> > >>>>>> .. Owen
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Alejandro
> > >>>
> > >>
> >
>


> > format.)
> > >>>
> > >>> 3. Just declare a clean break on some version (your earlier email
> said
> > >> 2.5)
> > >>> and start publishing artifacts built with JDK7 and no -target option.
> > >>> Overall, this is my preferred option.  However, as a side effect,
> this
> > >>> sets us up for longer-term maintenance and patch releases off of the
> > 2.4
> > >>> branch if a downstream project that's still on 1.6 needs to pick up a
> > >>> critical bug fix.
> > >>>
> > >>> Of course, this is all a moot point if all the downstream ecosystem
> > >>> projects have already made the switch to JDK7.  I don't know the
> status
> > >> of
> > >>> that off the top of my head.  Maybe someone else out there knows?  If
> > >> not,
> > >>> then I expect I can free up enough in a few weeks to volunteer for
> > >> tracking
> > >>> down that information.
> > >>>
> > >>> Chris Nauroth
> > >>> Hortonworks
> > >>> http://hortonworks.com/
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <
> tucu@cloudera.com
> > >
> > >>> wrote:
> > >>>
> > >>>> Chris,
> > >>>>
> > >>>> Compiling with jdk7 and doing javac -target 1.6 is not sufficient,
> you
> > >>> are
> > >>>> still using jdk7 libraries and you could use new APIs, thus breaking
> > >> jdk6
> > >>>> both at compile and runtime.
> > >>>>
> > >>>> you need to compile with jdk6 to ensure you are not running into
> that
> > >>>> scenario. that is why i was suggesting the nightly jdk6 build/test
> > >>> jenkins
> > >>>> job.
> > >>>>
> > >>>>
> > >>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <
> > >> cnauroth@hortonworks.com
> > >>>>
> > >>>> wrote:
> > >>>>
> > >>>>> I'm also +1 for getting us to JDK7 within the 2.x line after
> reading
> > >>> the
> > >>>>> proposals and catching up on the discussion in this thread.
> > >>>>>
> > >>>>> Has anyone yet considered how to coordinate this change with
> > >> downstream
> > >>>>> projects?  Would we request downstream projects to upgrade to JDK7
> > >>> first
> > >>>>> before we make the move?  Would we switch to JDK7, but run javac
> > >>> -target
> > >>>>> 1.6 to maintain compatibility for downstream projects during an
> > >> interim
> > >>>>> period?
> > >>>>>
> > >>>>> Chris Nauroth
> > >>>>> Hortonworks
> > >>>>> http://hortonworks.com/
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <
> omalley@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <
> > >>> tucu@cloudera.com
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> After reading this thread and thinking a bit about it, I think it should be
> > >>>>>>> OK to make such a move up to JDK7 in Hadoop
> > >>>>>>
> > >>>>>>
> > >>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > >>>>>> and is fine in the 2 branch. (Although I think it would *not* be
> > >>>>>> appropriate for a patch release.) Of course we need to do it with
> > >>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> > >>>>>> thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> > >>>>>> push back on that.
> > >>>>>>
> > >>>>>> I also think that we need to let the dust settle on the Hadoop 2 line for
> > >>>>>> a while before we talk about Hadoop 3. It seems that it has only been in
> > >>>>>> the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > >>>>>> Our user community needs time to digest the changes in Hadoop 2.x before we
> > >>>>>> fracture the community by starting to discuss Hadoop 3 releases.
> > >>>>>>
> > >>>>>> .. Owen
> > >>>>>
> > >>>>> --
> > >>>>> CONFIDENTIALITY NOTICE
> > >>>>> NOTICE: This message is intended for the use of the individual or
> > >>> entity
> > >>>> to
> > >>>>> which it is addressed and may contain information that is
> > >> confidential,
> > >>>>> privileged and exempt from disclosure under applicable law. If the
> > >>> reader
> > >>>>> of this message is not the intended recipient, you are hereby
> > >> notified
> > >>>> that
> > >>>>> any printing, copying, dissemination, distribution, disclosure or
> > >>>>> forwarding of this communication is strictly prohibited. If you
> have
> > >>>>> received this communication in error, please contact the sender
> > >>>> immediately
> > >>>>> and delete it from your system. Thank You.
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Alejandro
> > >>>
> > >>
> >
> >
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
Following up on ecosystem, I just took a look at the Apache trunk pom.xml
files for HBase, Flume and Oozie.  All are specifying 1.6 for source and
target in the maven-compiler-plugin configuration, so there may be
additional follow-up required here.  (For example, if HBase has made a
statement that its client will continue to support JDK6, then it wouldn't
be practical for them to link to a JDK7 version of hadoop-common.)
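
To make the concern concrete, here is a minimal sketch (not taken from any of
the projects above) of how a downstream build could check which bytecode level
a class in a dependency jar was compiled for. It relies only on the class-file
header: the magic number 0xCAFEBABE, then a minor and a major version, where
major 50 corresponds to Java 6 and 51 to Java 7. The class name
ClassVersionCheck and the helper loadableOnJava6 are hypothetical names chosen
for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassVersionCheck {

    /** Reads the class-file header and returns the major version (50 = Java 6, 51 = Java 7). */
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        data.readUnsignedShort();        // minor version, not needed here
        return data.readUnsignedShort(); // major version
    }

    /** A Java 6 JVM rejects class files whose major version is above 50. */
    static boolean loadableOnJava6(int major) {
        return major <= 50;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built headers standing in for real .class files (illustrative data only).
        byte[] java6Header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50};
        byte[] java7Header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51};

        for (byte[] header : new byte[][] {java6Header, java7Header}) {
            int major = majorVersion(new ByteArrayInputStream(header));
            System.out.println("major=" + major + " loadableOnJava6=" + loadableOnJava6(major));
        }
    }
}
```

In a real check the InputStream would come from an entry in the jar being
consumed rather than from a hand-built byte array. Note that this only detects
the bytecode level; it does not catch use of JDK7-only library APIs, which is
the separate problem Alejandro raised about -target 1.6.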

+1 for the whole plan though.  We can work through these details.

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Fri, Jun 27, 2014 at 3:10 PM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> +1 to making 2.6 the last JDK6 release.
>
> If we want, 2.7 could be a parallel release or one soon after 2.6. We could
> upgrade other dependencies that require JDK7 as well.
>
>
> On Fri, Jun 27, 2014 at 3:01 PM, Arun C. Murthy <ac...@hortonworks.com>
> wrote:
>
> > Thanks everyone for the discussion. Looks like we have come to a
> pragmatic
> > and progressive conclusion.
> >
> > In terms of execution of the consensus plan, I think a little bit of
> > caution is in order.
> >
> > Let's give downstream projects more of a runway.
> >
> > I propose we inform HBase, Pig, Hive etc. that we are considering making
> > 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they
> > are comfortable we can pull the trigger in 2.7.
> >
> > thanks,
> > Arun
> >
> >
> > > On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com>
> > wrote:
> > >
> > > As someone else already mentioned, we should announce one future release
> > > (maybe 2.5) as the last JDK6-based release before making the move to JDK7.
> > >
> > > I am comfortable calling 2.5 the last JDK6 release.
> > >
> > >
> > > On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <
> andrew.wang@cloudera.com>
> > > wrote:
> > >
> > >> Hi all, responding to multiple messages here,
> > >>
> > >> Arun, thanks for the clarification regarding MR classpaths. It sounds
> > like
> > >> the story there is improved and still improving.
> > >>
> > >> However, I think we still suffer from this at least on the HDFS side.
> We
> > >> have a single JAR for all of HDFS, and our clients need to have all
> the
> > fun
> > >> deps like Guava on the classpath. I'm told Spark sticks a newer Guava
> at
> > >> the front of the classpath and the HDFS client still works okay, but
> > this
> > >> is more happy coincidence than anything else. While we're leaking
> deps,
> > >> we're in a scary situation.
> > >>
> > >> API compat to me means that an app should be able to run on a new
> minor
> > >> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds
> > like
> > >> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> > >> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs
> > and
> > >> have nothing break. If we muck with the classpath, my understanding is
> > that
> > >> this could break.
> > >>
> > >> Owen, bumping the minimum JDK version in a minor release like this
> > should
> > >> be a one-time exception as Tucu stated. A number of people have
> pointed
> > out
> > >> how painful a forced JDK upgrade is for end users, and it's not
> > something
> > >> we should be springing on them in a minor release unless we're *very*
> > >> confident like in this case.
> > >>
> > >> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized
> > on
> > >> JDK7 across the CDH stack, so I think that's an indication that most
> > >> ecosystem projects are ready to make the jump. Is that sufficient in
> > your
> > >> mind?
> > >>
> > >> For the record, I'm also +1 on the Tucu plan. Is it too late to do
> this
> > for
> > >> 2.5? I'll offer to help out with some of the mechanics.
> > >>
> > >> Thanks,
> > >> Andrew
> > >>
> > >> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <
> > cnauroth@hortonworks.com>
> > >> wrote:
> > >>
> > >>> I understood the plan for avoiding JDK7-specific features in our
> code,
> > >> and
> > >>> your suggestion to add an extra Jenkins job is a great way to guard
> > >> against
> > >>> that.  The thing I haven't seen discussed yet is how downstream
> > projects
> > >>> will continue to consume our built artifacts.  If a downstream
> project
> > >>> upgrades to pick up a bug fix, and the jar switches to 1.7 class
> files,
> > >> but
> > >>> their project is still building with 1.6, then it would be a nasty
> > >>> surprise.
> > >>>
> > >>> These are the options I see:
> > >>>
> > >>> 1. Make sure all other projects upgrade first.  This doesn't sound
> > >>> feasible, unless all other ecosystem projects have moved to JDK7
> > already.
> > >>> If not, then waiting on a single long pole project would hold up our
> > >>> migration indefinitely.
> > >>>
> > >>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > >>> ecosystem upgrades.  I find this undesirable, because in a certain
> > sense,
> > >>> it still leaves a bit of 1.6 lingering in the project.  (I'll assume
> > that
> > >>> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode
> > format.)
> > >>>
> > >>> 3. Just declare a clean break on some version (your earlier email
> said
> > >> 2.5)
> > >>> and start publishing artifacts built with JDK7 and no -target option.
> > >>> Overall, this is my preferred option.  However, as a side effect,
> this
> > >>> sets us up for longer-term maintenance and patch releases off of the
> > 2.4
> > >>> branch if a downstream project that's still on 1.6 needs to pick up a
> > >>> critical bug fix.
> > >>>
> > >>> Of course, this is all a moot point if all the downstream ecosystem
> > >>> projects have already made the switch to JDK7.  I don't know the
> status
> > >> of
> > >>> that off the top of my head.  Maybe someone else out there knows?  If
> > >> not,
> > >>> then I expect I can free up enough in a few weeks to volunteer for
> > >> tracking
> > >>> down that information.
> > >>>
> > >>> Chris Nauroth
> > >>> Hortonworks
> > >>> http://hortonworks.com/
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <
> tucu@cloudera.com
> > >
> > >>> wrote:
> > >>>
> > >>>> Chris,
> > >>>>
> > >>>> Compiling with jdk7 and doing javac -target 1.6 is not sufficient,
> you
> > >>> are
> > >>>> still using jdk7 libraries and you could use new APIs, thus breaking
> > >> jdk6
> > >>>> both at compile and runtime.
> > >>>>
> > >>>> you need to compile with jdk6 to ensure you are not running into
> that
> > >>>> scenario. that is why i was suggesting the nightly jdk6 build/test
> > >>> jenkins
> > >>>> job.
> > >>>>
> > >>>>
> > >>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <
> > >> cnauroth@hortonworks.com
> > >>>>
> > >>>> wrote:
> > >>>>
> > >>>>> I'm also +1 for getting us to JDK7 within the 2.x line after
> reading
> > >>> the
> > >>>>> proposals and catching up on the discussion in this thread.
> > >>>>>
> > >>>>> Has anyone yet considered how to coordinate this change with
> > >> downstream
> > >>>>> projects?  Would we request downstream projects to upgrade to JDK7
> > >>> first
> > >>>>> before we make the move?  Would we switch to JDK7, but run javac
> > >>> -target
> > >>>>> 1.6 to maintain compatibility for downstream projects during an
> > >> interim
> > >>>>> period?
> > >>>>>
> > >>>>> Chris Nauroth
> > >>>>> Hortonworks
> > >>>>> http://hortonworks.com/
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <
> omalley@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <
> > >>> tucu@cloudera.com
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> After reading this thread and thinking a bit about it, I think it should be
> > >>>>>>> OK to make such a move up to JDK7 in Hadoop
> > >>>>>>
> > >>>>>>
> > >>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > >>>>>> and is fine in the 2 branch. (Although I think it would *not* be
> > >>>>>> appropriate for a patch release.) Of course we need to do it with
> > >>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> > >>>>>> thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> > >>>>>> push back on that.
> > >>>>>>
> > >>>>>> I also think that we need to let the dust settle on the Hadoop 2 line for
> > >>>>>> a while before we talk about Hadoop 3. It seems that it has only been in
> > >>>>>> the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > >>>>>> Our user community needs time to digest the changes in Hadoop 2.x before we
> > >>>>>> fracture the community by starting to discuss Hadoop 3 releases.
> > >>>>>>
> > >>>>>> .. Owen
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Alejandro
> > >>>
> > >>
> >
> >
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by Karthik Kambatla <ka...@cloudera.com>.
+1 to making 2.6 the last JDK6 release.

If we want, 2.7 could be a parallel release or one soon after 2.6. We could
upgrade other dependencies that require JDK7 as well.


On Fri, Jun 27, 2014 at 3:01 PM, Arun C. Murthy <ac...@hortonworks.com> wrote:

> Thanks everyone for the discussion. Looks like we have come to a pragmatic
> and progressive conclusion.
>
> In terms of execution of the consensus plan, I think a little bit of
> caution is in order.
>
> Let's give downstream projects more of a runway.
>
> I propose we inform HBase, Pig, Hive etc. that we are considering making
> 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they
> are comfortable we can pull the trigger in 2.7.
>
> thanks,
> Arun
>
>
> > > On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com> wrote:
> >
> > As someone else already mentioned, we should announce one future release
> > (maybe 2.5) as the last JDK6-based release before making the move to JDK7.
> >
> > I am comfortable calling 2.5 the last JDK6 release.
> >
> >
> > On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> > wrote:
> >
> >> Hi all, responding to multiple messages here,
> >>
> > >> Arun, thanks for the clarification regarding MR classpaths. It sounds like
> > >> the story there is improved and still improving.
> > >>
> > >> However, I think we still suffer from this at least on the HDFS side. We
> > >> have a single JAR for all of HDFS, and our clients need to have all the fun
> > >> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
> > >> the front of the classpath and the HDFS client still works okay, but this
> > >> is more happy coincidence than anything else. While we're leaking deps,
> > >> we're in a scary situation.
> > >>
> > >> API compat to me means that an app should be able to run on a new minor
> > >> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> > >> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> > >> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> > >> have nothing break. If we muck with the classpath, my understanding is that
> > >> this could break.
> > >>
> > >> Owen, bumping the minimum JDK version in a minor release like this should
> > >> be a one-time exception as Tucu stated. A number of people have pointed out
> > >> how painful a forced JDK upgrade is for end users, and it's not something
> > >> we should be springing on them in a minor release unless we're *very*
> > >> confident like in this case.
> > >>
> > >> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> > >> JDK7 across the CDH stack, so I think that's an indication that most
> > >> ecosystem projects are ready to make the jump. Is that sufficient in your
> > >> mind?
> > >>
> > >> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
> > >> 2.5? I'll offer to help out with some of the mechanics.
> >>
> >> Thanks,
> >> Andrew
> >>
> > >> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > >> wrote:
> > >>
> > >>> I understood the plan for avoiding JDK7-specific features in our code, and
> > >>> your suggestion to add an extra Jenkins job is a great way to guard against
> > >>> that.  The thing I haven't seen discussed yet is how downstream projects
> > >>> will continue to consume our built artifacts.  If a downstream project
> > >>> upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
> > >>> their project is still building with 1.6, then it would be a nasty
> > >>> surprise.
> > >>>
> > >>> These are the options I see:
> > >>>
> > >>> 1. Make sure all other projects upgrade first.  This doesn't sound
> > >>> feasible, unless all other ecosystem projects have moved to JDK7 already.
> > >>> If not, then waiting on a single long pole project would hold up our
> > >>> migration indefinitely.
> > >>>
> > >>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > >>> ecosystem upgrades.  I find this undesirable, because in a certain sense,
> > >>> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> > >>> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
> > >>>
> > >>> 3. Just declare a clean break on some version (your earlier email said 2.5)
> > >>> and start publishing artifacts built with JDK7 and no -target option.
> > >>> Overall, this is my preferred option.  However, as a side effect, this
> > >>> sets us up for longer-term maintenance and patch releases off of the 2.4
> > >>> branch if a downstream project that's still on 1.6 needs to pick up a
> > >>> critical bug fix.
> > >>>
> > >>> Of course, this is all a moot point if all the downstream ecosystem
> > >>> projects have already made the switch to JDK7.  I don't know the status of
> > >>> that off the top of my head.  Maybe someone else out there knows?  If not,
> > >>> then I expect I can free up enough in a few weeks to volunteer for tracking
> > >>> down that information.
> >>>
> >>> Chris Nauroth
> >>> Hortonworks
> >>> http://hortonworks.com/
> >>>
> >>>
> >>>
> > >>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > >>> wrote:
> > >>>
> > >>>> Chris,
> > >>>>
> > >>>> Compiling with JDK7 and passing -target 1.6 to javac is not sufficient:
> > >>>> you are still compiling against the JDK7 libraries, so you could use new
> > >>>> APIs and thus break JDK6 both at compile time and at runtime.
> > >>>>
> > >>>> You need to compile with JDK6 to ensure you are not running into that
> > >>>> scenario. That is why I was suggesting the nightly JDK6 build/test Jenkins
> > >>>> job.
> >>>>
> >>>>
> > >>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > >>>> wrote:
> > >>>>
> > >>>>> I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> > >>>>> proposals and catching up on the discussion in this thread.
> > >>>>>
> > >>>>> Has anyone yet considered how to coordinate this change with downstream
> > >>>>> projects?  Would we request downstream projects to upgrade to JDK7 first
> > >>>>> before we make the move?  Would we switch to JDK7, but run javac -target
> > >>>>> 1.6 to maintain compatibility for downstream projects during an interim
> > >>>>> period?
> >>>>>
> >>>>> Chris Nauroth
> >>>>> Hortonworks
> >>>>> http://hortonworks.com/
> >>>>>
> >>>>>
> >>>>>
> > >>>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> After reading this thread and thinking a bit about it, I think it should
> > >>>>>>> be OK to make such a move up to JDK7 in Hadoop
> > >>>>>>
> > >>>>>>
> > >>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible
> > >>>>>> change and is fine in the 2 branch. (Although I think it would *not* be
> > >>>>>> appropriate for a patch release.) Of course we need to do it with
> > >>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed, is a
> > >>>>>> good thing. Moving to Java 8 as a minimum seems much too aggressive and I
> > >>>>>> would push back on that.
> > >>>>>>
> > >>>>>> I also think that we need to let the dust settle on the Hadoop 2 line for
> > >>>>>> a while before we talk about Hadoop 3. It seems that it has only been in
> > >>>>>> the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > >>>>>> Our user community needs time to digest the changes in Hadoop 2.x before
> > >>>>>> we fracture the community by starting to discuss Hadoop 3 releases.
> >>>>>>
> >>>>>> .. Owen
> >>>>>
> >>>>> --
> >>>>> CONFIDENTIALITY NOTICE
> >>>>> NOTICE: This message is intended for the use of the individual or
> >>> entity
> >>>> to
> >>>>> which it is addressed and may contain information that is
> >> confidential,
> >>>>> privileged and exempt from disclosure under applicable law. If the
> >>> reader
> >>>>> of this message is not the intended recipient, you are hereby
> >> notified
> >>>> that
> >>>>> any printing, copying, dissemination, distribution, disclosure or
> >>>>> forwarding of this communication is strictly prohibited. If you have
> >>>>> received this communication in error, please contact the sender
> >>>> immediately
> >>>>> and delete it from your system. Thank You.
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Alejandro
> >>>
> >>> --
> >>> CONFIDENTIALITY NOTICE
> >>> NOTICE: This message is intended for the use of the individual or
> entity
> >> to
> >>> which it is addressed and may contain information that is confidential,
> >>> privileged and exempt from disclosure under applicable law. If the
> reader
> >>> of this message is not the intended recipient, you are hereby notified
> >> that
> >>> any printing, copying, dissemination, distribution, disclosure or
> >>> forwarding of this communication is strictly prohibited. If you have
> >>> received this communication in error, please contact the sender
> >> immediately
> >>> and delete it from your system. Thank You.
> >>
>
>
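
The class-file concern raised in the quoted discussion (a jar switching to 1.7
class files while a consumer still builds with 1.6) can be checked
mechanically. What follows is a minimal sketch and not something proposed in
the thread: ClassFileVersion and its method names are hypothetical, and the
only thing it relies on is the standard class-file header layout, a 0xCAFEBABE
magic number followed by 16-bit minor and major versions, where major 50
corresponds to Java 6 and 51 to Java 7. Note that javac's -target option
controls only this version number. It does not restrict which library APIs the
code compiles against, which is the gap Alejandro describes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Hadoop): reports the class-file version of .class files. */
final class ClassFileVersion {
    private static final int MAGIC = 0xCAFEBABE;

    /** Returns the major version stored in bytes 6-7 of a class-file header. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        ByteBuffer header = ByteBuffer.wrap(classBytes); // big-endian by default
        if (header.getInt() != MAGIC) {
            throw new IllegalArgumentException("missing 0xCAFEBABE magic number");
        }
        header.getShort();                 // minor version, unused here
        return header.getShort() & 0xFFFF; // major version as unsigned 16-bit
    }

    /** Maps a major version to its Java release number (50 -> 6, 51 -> 7, 52 -> 8). */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a .class file, e.g. one extracted from a jar.
        for (String path : args) {
            int major = majorVersion(Files.readAllBytes(Paths.get(path)));
            System.out.println(path + ": major " + major + " (Java " + javaRelease(major) + ")");
        }
    }
}
```

Run against class files extracted from a published jar, this prints one line
per file. The sketch itself uses java.nio.file, so it needs Java 7 or later to
run. A check like this only detects the bytecode level; catching accidental use
of JDK7-only APIs still needs a real JDK6 compile, as in the nightly Jenkins
job suggested in the thread.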

Re: Moving to JDK7, JDK8 and new major releases

Posted by Karthik Kambatla <ka...@cloudera.com>.
+1 to making 2.6 the last JDK6 release.

If we want, 2.7 could be a parallel release or one soon after 2.6. We could
upgrade other dependencies that require JDK7 as well.


On Fri, Jun 27, 2014 at 3:01 PM, Arun C. Murthy <ac...@hortonworks.com> wrote:

> Thanks everyone for the discussion. Looks like we have come to a pragmatic
> and progressive conclusion.
>
> In terms of execution of the consensus plan, I think a little bit of
> caution is in order.
>
> Let's give downstream projects more of a runway.
>
> I propose we inform HBase, Pig, Hive etc. that we are considering making
> 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they
> are comfortable we can pull the trigger in 2.7.
>
> thanks,
> Arun
>
>
> > On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com>
> wrote:
> >
> > As someone else already mentioned, we should announce one future release
> > (may be, 2.5) as the last JDK6-based release before making the move to
> JDK7.
> >
> > I am comfortable calling 2.5 the last JDK6 release.
> >
> >
> > On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> > wrote:
> >
> >> Hi all, responding to multiple messages here,
> >>
> >> Arun, thanks for the clarification regarding MR classpaths. It sounds like
> >> the story there is improved and still improving.
> >>
> >> However, I think we still suffer from this at least on the HDFS side. We
> >> have a single JAR for all of HDFS, and our clients need to have all the fun
> >> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
> >> the front of the classpath and the HDFS client still works okay, but this
> >> is more happy coincidence than anything else. While we're leaking deps,
> >> we're in a scary situation.
> >>
> >> API compat to me means that an app should be able to run on a new minor
> >> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> >> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> >> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> >> have nothing break. If we muck with the classpath, my understanding is that
> >> this could break.
> >>
> >> Owen, bumping the minimum JDK version in a minor release like this should
> >> be a one-time exception as Tucu stated. A number of people have pointed out
> >> how painful a forced JDK upgrade is for end users, and it's not something
> >> we should be springing on them in a minor release unless we're *very*
> >> confident like in this case.
> >>
> >> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> >> JDK7 across the CDH stack, so I think that's an indication that most
> >> ecosystem projects are ready to make the jump. Is that sufficient in your
> >> mind?
> >>
> >> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
> >> 2.5? I'll offer to help out with some of the mechanics.
> >>
> >> Thanks,
> >> Andrew
> >>
> >> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cnauroth@hortonworks.com>
> >> wrote:
> >>
> >>> I understood the plan for avoiding JDK7-specific features in our code, and
> >>> your suggestion to add an extra Jenkins job is a great way to guard against
> >>> that.  The thing I haven't seen discussed yet is how downstream projects
> >>> will continue to consume our built artifacts.  If a downstream project
> >>> upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
> >>> their project is still building with 1.6, then it would be a nasty
> >>> surprise.
> >>>
> >>> These are the options I see:
> >>>
> >>> 1. Make sure all other projects upgrade first.  This doesn't sound
> >>> feasible, unless all other ecosystem projects have moved to JDK7 already.
> >>> If not, then waiting on a single long pole project would hold up our
> >>> migration indefinitely.
> >>>
> >>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> >>> ecosystem upgrades.  I find this undesirable, because in a certain sense,
> >>> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> >>> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
> >>>
> >>> 3. Just declare a clean break on some version (your earlier email said 2.5)
> >>> and start publishing artifacts built with JDK7 and no -target option.
> >>> Overall, this is my preferred option.  However, as a side effect, this
> >>> sets us up for longer-term maintenance and patch releases off of the 2.4
> >>> branch if a downstream project that's still on 1.6 needs to pick up a
> >>> critical bug fix.
> >>>
> >>> Of course, this is all a moot point if all the downstream ecosystem
> >>> projects have already made the switch to JDK7.  I don't know the status of
> >>> that off the top of my head.  Maybe someone else out there knows?  If not,
> >>> then I expect I can free up enough in a few weeks to volunteer for tracking
> >>> down that information.
> >>>
> >>> Chris Nauroth
> >>> Hortonworks
> >>> http://hortonworks.com/
> >>>
> >>>
> >>>
> >>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tucu@cloudera.com>
> >>> wrote:
> >>>
> >>>> Chris,
> >>>>
> >>>> Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you are
> >>>> still using jdk7 libraries and you could use new APIs, thus breaking jdk6
> >>>> both at compile and runtime.
> >>>>
> >>>> you need to compile with jdk6 to ensure you are not running into that
> >>>> scenario. that is why i was suggesting the nightly jdk6 build/test jenkins
> >>>> job.
> >>>>
> >>>>
> >>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
> >>>> wrote:
> >>>>
> >>>>> I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> >>>>> proposals and catching up on the discussion in this thread.
> >>>>>
> >>>>> Has anyone yet considered how to coordinate this change with downstream
> >>>>> projects?  Would we request downstream projects to upgrade to JDK7 first
> >>>>> before we make the move?  Would we switch to JDK7, but run javac -target
> >>>>> 1.6 to maintain compatibility for downstream projects during an interim
> >>>>> period?
> >>>>>
> >>>>> Chris Nauroth
> >>>>> Hortonworks
> >>>>> http://hortonworks.com/
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
> >>>>> wrote:
> >>>>>
> >>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> After reading this thread and thinking a bit about it, I think it
> >>>>>>> should be OK to make such a move up to JDK7 in Hadoop
> >>>>>>
> >>>>>>
> >>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible
> >>>>>> change and is fine in the 2 branch. (Although I think it would *not*
> >>>>>> be appropriate for a patch release.) Of course we need to do it with
> >>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed,
> >>>>>> is a good thing. Moving to Java 8 as a minimum seems much too
> >>>>>> aggressive and I would push back on that.
> >>>>>>
> >>>>>> I also think that we need to let the dust settle on the Hadoop 2
> >>>>>> line for a while before we talk about Hadoop 3. It seems that it has
> >>>>>> only been in the last 6 months that Hadoop 2 adoption has reached
> >>>>>> the mainstream users. Our user community needs time to digest the
> >>>>>> changes in Hadoop 2.x before we fracture the community by starting
> >>>>>> to discuss Hadoop 3 releases.
> >>>>>>
> >>>>>> .. Owen
> >>>>>
> >>>>> --
> >>>>> CONFIDENTIALITY NOTICE
> >>>>> NOTICE: This message is intended for the use of the individual or
> >>> entity
> >>>> to
> >>>>> which it is addressed and may contain information that is
> >> confidential,
> >>>>> privileged and exempt from disclosure under applicable law. If the
> >>> reader
> >>>>> of this message is not the intended recipient, you are hereby
> >> notified
> >>>> that
> >>>>> any printing, copying, dissemination, distribution, disclosure or
> >>>>> forwarding of this communication is strictly prohibited. If you have
> >>>>> received this communication in error, please contact the sender
> >>>> immediately
> >>>>> and delete it from your system. Thank You.
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Alejandro
> >>>
> >>
>
>
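
The concern quoted above, that a jar may switch to 1.7 class files while a
downstream project is still on 1.6, can be checked mechanically. The following
is a minimal sketch and not part of the Hadoop build: it reads the class-file
header and reports the major version (50 corresponds to Java 6, 51 to Java 7,
52 to Java 8). The class name ClassFileVersion and its helper methods are
hypothetical names chosen for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassFileVersion {

    /** Reads the class-file header and returns the major version number. */
    public static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        // Every class file starts with the 4-byte magic number 0xCAFEBABE.
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();        // minor version, ignored here
        return data.readUnsignedShort(); // major version
    }

    /** Maps a major version to a Java release name, e.g. 50 to "6". */
    public static String javaRelease(int major) {
        // Major 45 is JDK 1.1; each later release adds one.
        int release = major - 44;
        return release >= 5 ? Integer.toString(release) : "1." + release;
    }

    public static void main(String[] args) throws IOException {
        // Inspect this class's own bytecode as a simple demonstration.
        try (InputStream in =
                 ClassFileVersion.class.getResourceAsStream("ClassFileVersion.class")) {
            if (in == null) {
                throw new IOException("class file not found on classpath");
            }
            int major = majorVersion(in);
            System.out.println("major=" + major + " (Java " + javaRelease(major) + ")");
        }
    }
}
```

A JDK6 runtime rejects classes whose major version is 51 or higher with an
UnsupportedClassVersionError, so running a check like this over the entries of
a published jar would show whether a project still on 1.6 could load them.
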

Re: Moving to JDK7, JDK8 and new major releases

Posted by "Arun C. Murthy" <ac...@hortonworks.com>.
Thanks everyone for the discussion. Looks like we have come to a pragmatic and progressive conclusion.

In terms of execution of the consensus plan, I think a little bit of caution is in order.

Let's give downstream projects more of a runway.

I propose we inform HBase, Pig, Hive etc. that we are considering making 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they are comfortable we can pull the trigger in 2.7.

thanks,
Arun


> On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com> wrote:
> 
> As someone else already mentioned, we should announce one future release
> (maybe 2.5) as the last JDK6-based release before making the move to JDK7.
> 
> I am comfortable calling 2.5 the last JDK6 release.
> 
> 
> On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> 
>> Hi all, responding to multiple messages here,
>> 
>> Arun, thanks for the clarification regarding MR classpaths. It sounds like
>> the story there is improved and still improving.
>> 
>> However, I think we still suffer from this at least on the HDFS side. We
>> have a single JAR for all of HDFS, and our clients need to have all the fun
>> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
>> the front of the classpath and the HDFS client still works okay, but this
>> is more happy coincidence than anything else. While we're leaking deps,
>> we're in a scary situation.
>> 
>> API compat to me means that an app should be able to run on a new minor
>> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
>> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
>> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
>> have nothing break. If we muck with the classpath, my understanding is that
>> this could break.
>> 
>> Owen, bumping the minimum JDK version in a minor release like this should
>> be a one-time exception as Tucu stated. A number of people have pointed out
>> how painful a forced JDK upgrade is for end users, and it's not something
>> we should be springing on them in a minor release unless we're *very*
>> confident like in this case.
>> 
>> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
>> JDK7 across the CDH stack, so I think that's an indication that most
>> ecosystem projects are ready to make the jump. Is that sufficient in your
>> mind?
>> 
>> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
>> 2.5? I'll offer to help out with some of the mechanics.
>> 
>> Thanks,
>> Andrew
>> 
>> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
>> wrote:
>> 
>>> I understood the plan for avoiding JDK7-specific features in our code, and
>>> your suggestion to add an extra Jenkins job is a great way to guard against
>>> that.  The thing I haven't seen discussed yet is how downstream projects
>>> will continue to consume our built artifacts.  If a downstream project
>>> upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
>>> their project is still building with 1.6, then it would be a nasty
>>> surprise.
>>> 
>>> These are the options I see:
>>> 
>>> 1. Make sure all other projects upgrade first.  This doesn't sound
>>> feasible, unless all other ecosystem projects have moved to JDK7 already.
>>> If not, then waiting on a single long pole project would hold up our
>>> migration indefinitely.
>>> 
>>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
>>> ecosystem upgrades.  I find this undesirable, because in a certain sense,
>>> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
>>> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
>>> 
>>> 3. Just declare a clean break on some version (your earlier email said 2.5)
>>> and start publishing artifacts built with JDK7 and no -target option.
>>> Overall, this is my preferred option.  However, as a side effect, this
>>> sets us up for longer-term maintenance and patch releases off of the 2.4
>>> branch if a downstream project that's still on 1.6 needs to pick up a
>>> critical bug fix.
>>> 
>>> Of course, this is all a moot point if all the downstream ecosystem
>>> projects have already made the switch to JDK7.  I don't know the status of
>>> that off the top of my head.  Maybe someone else out there knows?  If not,
>>> then I expect I can free up enough in a few weeks to volunteer for tracking
>>> down that information.
>>> 
>>> Chris Nauroth
>>> Hortonworks
>>> http://hortonworks.com/
>>> 
>>> 
>>> 
>>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
>>> wrote:
>>> 
>>>> Chris,
>>>> 
>>>> Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you are
>>>> still using jdk7 libraries and you could use new APIs, thus breaking jdk6
>>>> both at compile and runtime.
>>>>
>>>> you need to compile with jdk6 to ensure you are not running into that
>>>> scenario. that is why i was suggesting the nightly jdk6 build/test jenkins
>>>> job.
>>>> 
>>>> 
>>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
>>>> wrote:
>>>>
>>>>> I'm also +1 for getting us to JDK7 within the 2.x line after reading the
>>>>> proposals and catching up on the discussion in this thread.
>>>>>
>>>>> Has anyone yet considered how to coordinate this change with downstream
>>>>> projects?  Would we request downstream projects to upgrade to JDK7 first
>>>>> before we make the move?  Would we switch to JDK7, but run javac -target
>>>>> 1.6 to maintain compatibility for downstream projects during an interim
>>>>> period?
>>>>> 
>>>>> Chris Nauroth
>>>>> Hortonworks
>>>>> http://hortonworks.com/
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> After reading this thread and thinking a bit about it, I think it
>>>>>>> should be OK to make such a move up to JDK7 in Hadoop
>>>>>>
>>>>>>
>>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible
>>>>>> change and is fine in the 2 branch. (Although I think it would *not*
>>>>>> be appropriate for a patch release.) Of course we need to do it with
>>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed,
>>>>>> is a good thing. Moving to Java 8 as a minimum seems much too
>>>>>> aggressive and I would push back on that.
>>>>>>
>>>>>> I also think that we need to let the dust settle on the Hadoop 2
>>>>>> line for a while before we talk about Hadoop 3. It seems that it has
>>>>>> only been in the last 6 months that Hadoop 2 adoption has reached
>>>>>> the mainstream users. Our user community needs time to digest the
>>>>>> changes in Hadoop 2.x before we fracture the community by starting
>>>>>> to discuss Hadoop 3 releases.
>>>>>> 
>>>>>> .. Owen
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Alejandro
>>> 
>> 

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.
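
The point made in the quoted thread about javac -target 1.6 is that the flag
only controls the class-file format; it does not stop code from referencing
library classes that first appeared in JDK7. The following is a minimal
sketch, assuming nothing about the Hadoop code base, that probes at runtime
for a few classes added in Java 7 (java.nio.file.Files, java.util.Objects,
java.lang.AutoCloseable). The class name Jdk7ApiProbe and its isPresent
helper are hypothetical names chosen for illustration.

```java
public class Jdk7ApiProbe {

    /** Returns true if the named class can be loaded by the running JVM. */
    public static boolean isPresent(String className) {
        try {
            // 'false' avoids running static initializers; we only test presence.
            Class.forName(className, false, Jdk7ApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Library classes that were introduced in Java 7.
        String[] addedInJava7 = {
            "java.nio.file.Files",
            "java.util.Objects",
            "java.lang.AutoCloseable",
        };
        for (String name : addedInJava7) {
            System.out.println(name + " present: " + isPresent(name));
        }
    }
}
```

On a JDK6 runtime each probe would report false. Code compiled on JDK7 with
-target 1.6 that references such classes would load, then fail with
NoClassDefFoundError when the reference is resolved. Compiling against the
JDK6 class library, as in the nightly JDK6 job suggested in the thread,
catches this at build time instead.
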

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
FYI I also just updated the wiki page with a Proposal D, aka Tucu plan,
which I think is essentially Proposal C but tabling JDK8 plans for now.

https://wiki.apache.org/hadoop/MovingToJdk7and8

Karthik, thanks for chiming in re: 2.5. I guess there's nothing urgently
required; the Jenkins stuff just needs to happen before 2.6. Still, I'm
happy to help with anything.

Thanks,
Andrew


On Fri, Jun 27, 2014 at 11:34 AM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> As someone else already mentioned, we should announce one future release
> (maybe 2.5) as the last JDK6-based release before making the move to JDK7.
>
> I am comfortable calling 2.5 the last JDK6 release.
>
>
> On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all, responding to multiple messages here,
> >
> > Arun, thanks for the clarification regarding MR classpaths. It sounds like
> > the story there is improved and still improving.
> >
> > However, I think we still suffer from this at least on the HDFS side. We
> > have a single JAR for all of HDFS, and our clients need to have all the fun
> > deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
> > the front of the classpath and the HDFS client still works okay, but this
> > is more happy coincidence than anything else. While we're leaking deps,
> > we're in a scary situation.
> >
> > API compat to me means that an app should be able to run on a new minor
> > version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> > it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> > should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> > have nothing break. If we muck with the classpath, my understanding is that
> > this could break.
> >
> > Owen, bumping the minimum JDK version in a minor release like this should
> > be a one-time exception as Tucu stated. A number of people have pointed out
> > how painful a forced JDK upgrade is for end users, and it's not something
> > we should be springing on them in a minor release unless we're *very*
> > confident like in this case.
> >
> > Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> > JDK7 across the CDH stack, so I think that's an indication that most
> > ecosystem projects are ready to make the jump. Is that sufficient in your
> > mind?
> >
> > For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
> > 2.5? I'll offer to help out with some of the mechanics.
> >
> > Thanks,
> > Andrew
> >
> > On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > wrote:
> >
> > > I understood the plan for avoiding JDK7-specific features in our code, and
> > > your suggestion to add an extra Jenkins job is a great way to guard
> > > against that.  The thing I haven't seen discussed yet is how downstream
> > > projects will continue to consume our built artifacts.  If a downstream
> > > project upgrades to pick up a bug fix, and the jar switches to 1.7 class
> > > files, but their project is still building with 1.6, then it would be a
> > > nasty surprise.
> > >
> > > These are the options I see:
> > >
> > > 1. Make sure all other projects upgrade first.  This doesn't sound
> > > feasible, unless all other ecosystem projects have moved to JDK7 already.
> > > If not, then waiting on a single long pole project would hold up our
> > > migration indefinitely.
> > >
> > > 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > > ecosystem upgrades.  I find this undesirable, because in a certain sense,
> > > it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> > > end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
> > >
> > > 3. Just declare a clean break on some version (your earlier email said
> > > 2.5) and start publishing artifacts built with JDK7 and no -target option.
> > > Overall, this is my preferred option.  However, as a side effect, this
> > > sets us up for longer-term maintenance and patch releases off of the 2.4
> > > branch if a downstream project that's still on 1.6 needs to pick up a
> > > critical bug fix.
> > >
> > > Of course, this is all a moot point if all the downstream ecosystem
> > > projects have already made the switch to JDK7.  I don't know the status
> > > of that off the top of my head.  Maybe someone else out there knows?  If
> > > not, then I expect I can free up enough in a few weeks to volunteer for
> > > tracking down that information.
> > >
> > > Chris Nauroth
> > > Hortonworks
> > > http://hortonworks.com/
> > >
> > >
> > >
> > > On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > wrote:
> > >
> > > > Chris,
> > > >
> > > > Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you
> > > > are still using jdk7 libraries and you could use new APIs, thus breaking
> > > > jdk6 both at compile and runtime.
> > > >
> > > > you need to compile with jdk6 to ensure you are not running into that
> > > > scenario. that is why i was suggesting the nightly jdk6 build/test
> > > > jenkins job.
> > > >
> > > >
> > > > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > > > wrote:
> > > >
> > > > > I'm also +1 for getting us to JDK7 within the 2.x line after reading
> > > > > the proposals and catching up on the discussion in this thread.
> > > > >
> > > > > Has anyone yet considered how to coordinate this change with
> > > > > downstream projects?  Would we request downstream projects to upgrade
> > > > > to JDK7 first before we make the move?  Would we switch to JDK7, but
> > > > > run javac -target 1.6 to maintain compatibility for downstream
> > > > > projects during an interim period?
> > > > >
> > > > > Chris Nauroth
> > > > > Hortonworks
> > > > > http://hortonworks.com/
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <omalley@apache.org
> >
> > > > wrote:
> > > > >
> > > > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <
> > > tucu@cloudera.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > After reading this thread and thinking a bit about it, I think
> > > > > > > it should be OK to make such a move up to JDK7 in Hadoop
> > > > > >
> > > > > >
> > > > > > I agree with Alejandro. Changing minimum JDKs is not an
> > > > > > incompatible change and is fine in the 2 branch. (Although I think
> > > > > > it would *not* be appropriate for a patch release.) Of course we
> > > > > > need to do it with forethought and testing, but moving off of JDK 6,
> > > > > > which is EOL'ed, is a good thing. Moving to Java 8 as a minimum
> > > > > > seems much too aggressive and I would push back on that.
> > > > > >
> > > > > > I also think that we need to let the dust settle on the Hadoop 2
> > > > > > line for a while before we talk about Hadoop 3. It seems that it
> > > > > > has only been in the last 6 months that Hadoop 2 adoption has
> > > > > > reached the mainstream users. Our user community needs time to
> > > > > > digest the changes in Hadoop 2.x before we fracture the community by
> > > > > > starting to discuss Hadoop 3 releases.
> > > > > >
> > > > > > .. Owen
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Alejandro
> > > >
> > >
> >
>
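
Alejandro's point quoted above (javac -target 1.6 under JDK7 is not enough)
can be illustrated with a small sketch. The class name Jdk7ApiLeak and its
helper methods are hypothetical, chosen only for this illustration. The source
uses nothing but Java 6 syntax, so a JDK7 javac run with -source 1.6
-target 1.6 would be expected to accept it, yet it calls java.util.Objects,
which was only added in Java 7 and is therefore missing on a Java 6 runtime.

```java
import java.util.Objects;

public class Jdk7ApiLeak {

    // Only Java 6 syntax is used here, but java.util.Objects was added in
    // Java 7, so this method cannot link on a Java 6 runtime.
    public static String describe(Object value) {
        return Objects.toString(value, "<null>");
    }

    // Reports whether the named class is available to the running JVM.
    public static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.util.Objects present: "
                + isPresent("java.util.Objects"));
        System.out.println(describe(null));
    }
}
```

Building with a real JDK6 (for example in the nightly Jenkins job suggested in
the thread) would be expected to reject the Objects reference at compile time,
which is the guard the -target flag alone does not provide.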

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
FYI I also just updated the wiki page with a Proposal D, aka Tucu plan,
which I think is essentially Proposal C but tabling JDK8 plans for now.

https://wiki.apache.org/hadoop/MovingToJdk7and8

Karthik, thanks for chiming in re: 2.5. I guess there's nothing urgently
required, the Jenkins stuff just needs to happen before 2.6. Still, I'm
happy to help with anything.

Thanks,
Andrew


On Fri, Jun 27, 2014 at 11:34 AM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> As someone else already mentioned, we should announce one future release
> (maybe 2.5) as the last JDK6-based release before making the move to
> JDK7.
>
> I am comfortable calling 2.5 the last JDK6 release.
>
>
> On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all, responding to multiple messages here,
> >
> > Arun, thanks for the clarification regarding MR classpaths. It sounds
> > like the story there is improved and still improving.
> >
> > However, I think we still suffer from this at least on the HDFS side. We
> > have a single JAR for all of HDFS, and our clients need to have all the
> > fun deps like Guava on the classpath. I'm told Spark sticks a newer Guava
> > at the front of the classpath and the HDFS client still works okay, but
> > this is more happy coincidence than anything else. While we're leaking
> > deps, we're in a scary situation.
> >
> > API compat to me means that an app should be able to run on a new minor
> > version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> > it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> > should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> > have nothing break. If we muck with the classpath, my understanding is
> > that this could break.
> >
> > Owen, bumping the minimum JDK version in a minor release like this should
> > be a one-time exception as Tucu stated. A number of people have pointed
> > out how painful a forced JDK upgrade is for end users, and it's not
> > something we should be springing on them in a minor release unless we're
> > *very* confident like in this case.
> >
> > Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> > JDK7 across the CDH stack, so I think that's an indication that most
> > ecosystem projects are ready to make the jump. Is that sufficient in your
> > mind?
> >
> > For the record, I'm also +1 on the Tucu plan. Is it too late to do this
> > for 2.5? I'll offer to help out with some of the mechanics.
> >
> > Thanks,
> > Andrew
> >
> > On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > wrote:
> >
> > > I understood the plan for avoiding JDK7-specific features in our code,
> > > and your suggestion to add an extra Jenkins job is a great way to guard
> > > against that.  The thing I haven't seen discussed yet is how downstream
> > > projects will continue to consume our built artifacts.  If a downstream
> > > project upgrades to pick up a bug fix, and the jar switches to 1.7 class
> > > files, but their project is still building with 1.6, then it would be a
> > > nasty surprise.
> > >
> > > These are the options I see:
> > >
> > > 1. Make sure all other projects upgrade first.  This doesn't sound
> > > feasible, unless all other ecosystem projects have moved to JDK7
> > > already.  If not, then waiting on a single long pole project would hold
> > > up our migration indefinitely.
> > >
> > > 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > > ecosystem upgrades.  I find this undesirable, because in a certain
> > > sense, it still leaves a bit of 1.6 lingering in the project.  (I'll
> > > assume that end-of-life for JDK6 also means end-of-life for the 1.6
> > > bytecode format.)
> > >
> > > 3. Just declare a clean break on some version (your earlier email said
> > > 2.5) and start publishing artifacts built with JDK7 and no -target
> > > option.  Overall, this is my preferred option.  However, as a side
> > > effect, this sets us up for longer-term maintenance and patch releases
> > > off of the 2.4 branch if a downstream project that's still on 1.6 needs
> > > to pick up a critical bug fix.
> > >
> > > Of course, this is all a moot point if all the downstream ecosystem
> > > projects have already made the switch to JDK7.  I don't know the status
> > > of that off the top of my head.  Maybe someone else out there knows?  If
> > > not, then I expect I can free up enough time in a few weeks to volunteer
> > > for tracking down that information.
> > >
> > > Chris Nauroth
> > > Hortonworks
> > > http://hortonworks.com/
> > >
> > >
> > >
> > > On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > wrote:
> > >
> > > > Chris,
> > > >
> > > > Compiling with JDK7 and passing -target 1.6 to javac is not sufficient:
> > > > you are still using the JDK7 libraries and could use new APIs, thus
> > > > breaking JDK6 at both compile time and runtime.
> > > >
> > > > You need to compile with JDK6 to ensure you are not running into that
> > > > scenario. That is why I was suggesting the nightly JDK6 build/test
> > > > Jenkins job.
> > > >
> > > >
> > > > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth
> > > > <cnauroth@hortonworks.com> wrote:
> > > >
> > > > > I'm also +1 for getting us to JDK7 within the 2.x line after reading
> > > > > the proposals and catching up on the discussion in this thread.
> > > > >
> > > > > Has anyone yet considered how to coordinate this change with
> > > > > downstream projects?  Would we request downstream projects to upgrade
> > > > > to JDK7 first before we make the move?  Would we switch to JDK7, but
> > > > > run javac -target 1.6 to maintain compatibility for downstream
> > > > > projects during an interim period?
> > > > >
> > > > > Chris Nauroth
> > > > > Hortonworks
> > > > > http://hortonworks.com/
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <omalley@apache.org>
> > > > > wrote:
> > > > >
> > > > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur
> > > > > > <tucu@cloudera.com> wrote:
> > > > > >
> > > > > > > After reading this thread and thinking a bit about it, I think
> > > > > > > it should be OK to make such a move up to JDK7 in Hadoop
> > > > > >
> > > > > >
> > > > > > I agree with Alejandro. Changing minimum JDKs is not an
> > > > > > incompatible change and is fine in the 2 branch. (Although I think
> > > > > > it would *not* be appropriate for a patch release.) Of course we
> > > > > > need to do it with forethought and testing, but moving off of JDK 6,
> > > > > > which is EOL'ed, is a good thing. Moving to Java 8 as a minimum
> > > > > > seems much too aggressive and I would push back on that.
> > > > > >
> > > > > > I also think that we need to let the dust settle on the Hadoop 2
> > > > > > line for a while before we talk about Hadoop 3. It seems that it
> > > > > > has only been in the last 6 months that Hadoop 2 adoption has
> > > > > > reached the mainstream users. Our user community needs time to
> > > > > > digest the changes in Hadoop 2.x before we fracture the community by
> > > > > > starting to discuss Hadoop 3 releases.
> > > > > >
> > > > > > .. Owen
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Alejandro
> > > >
> > >
> >
>
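
Chris's concern quoted above, that a jar might silently switch to 1.7 class
files, can be checked mechanically. What follows is a minimal sketch, not
anything from the Hadoop build: the class name ClassFileVersion and its methods
are hypothetical. It relies only on the class-file layout (a 0xCAFEBABE magic
number followed by a two-byte minor and a two-byte major version) and on the
fact that major versions 50, 51 and 52 correspond to Java 6, 7 and 8.

```java
import java.nio.ByteBuffer;

public class ClassFileVersion {

    // Reads the major version from the first eight bytes of a class file.
    public static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("truncated class file header");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a Java class file");
        }
        buf.getShort();                 // minor_version, ignored here
        return buf.getShort() & 0xFFFF; // major_version as an unsigned value
    }

    // A Java 6 JVM accepts class files up to major version 50.
    public static boolean loadableOnJava6(int major) {
        return major <= 50;
    }

    public static void main(String[] args) {
        // Header of a class compiled for Java 7 (major version 51).
        byte[] java7Header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 51
        };
        int major = majorVersion(java7Header);
        System.out.println("major version " + major
                + ", loadable on Java 6: " + loadableOnJava6(major));
    }
}
```

A downstream project still building with 1.6 could run a check along these
lines over the entries of a new Hadoop jar before upgrading, rather than
finding out at class-load time.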

Re: Moving to JDK7, JDK8 and new major releases

Posted by "Arun C. Murthy" <ac...@hortonworks.com>.
Thanks everyone for the discussion. Looks like we have come to a pragmatic and progressive conclusion.

In terms of execution of the consensus plan, I think a little bit of caution is in order.

Let's give downstream projects more of a runway.

I propose we inform HBase, Pig, Hive etc. that we are considering making 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they are comfortable we can pull the trigger in 2.7.

thanks,
Arun


> On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com> wrote:
> 
> As someone else already mentioned, we should announce one future release
> (maybe 2.5) as the last JDK6-based release before making the move to JDK7.
> 
> I am comfortable calling 2.5 the last JDK6 release.
> 
> 
> On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> 
>> Hi all, responding to multiple messages here,
>> 
>> Arun, thanks for the clarification regarding MR classpaths. It sounds like
>> the story there is improved and still improving.
>> 
>> However, I think we still suffer from this at least on the HDFS side. We
>> have a single JAR for all of HDFS, and our clients need to have all the fun
>> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
>> the front of the classpath and the HDFS client still works okay, but this
>> is more happy coincidence than anything else. While we're leaking deps,
>> we're in a scary situation.
>> 
>> API compat to me means that an app should be able to run on a new minor
>> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
>> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
>> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
>> have nothing break. If we muck with the classpath, my understanding is that
>> this could break.
>> 
>> Owen, bumping the minimum JDK version in a minor release like this should
>> be a one-time exception as Tucu stated. A number of people have pointed out
>> how painful a forced JDK upgrade is for end users, and it's not something
>> we should be springing on them in a minor release unless we're *very*
>> confident like in this case.
>> 
>> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
>> JDK7 across the CDH stack, so I think that's an indication that most
>> ecosystem projects are ready to make the jump. Is that sufficient in your
>> mind?
>> 
>> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
>> 2.5? I'll offer to help out with some of the mechanics.
>> 
>> Thanks,
>> Andrew
>> 
>> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
>> wrote:
>> 
>>> I understood the plan for avoiding JDK7-specific features in our code,
>>> and your suggestion to add an extra Jenkins job is a great way to guard
>>> against that.  The thing I haven't seen discussed yet is how downstream
>>> projects will continue to consume our built artifacts.  If a downstream
>>> project upgrades to pick up a bug fix, and the jar switches to 1.7 class
>>> files, but their project is still building with 1.6, then it would be a
>>> nasty surprise.
>>> 
>>> These are the options I see:
>>> 
>>> 1. Make sure all other projects upgrade first.  This doesn't sound
>>> feasible, unless all other ecosystem projects have moved to JDK7 already.
>>> If not, then waiting on a single long pole project would hold up our
>>> migration indefinitely.
>>> 
>>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
>>> ecosystem upgrades.  I find this undesirable, because in a certain sense,
>>> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
>>> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
>>> 
>>> 3. Just declare a clean break on some version (your earlier email said
>>> 2.5) and start publishing artifacts built with JDK7 and no -target
>>> option.  Overall, this is my preferred option.  However, as a side
>>> effect, this sets us up for longer-term maintenance and patch releases
>>> off of the 2.4 branch if a downstream project that's still on 1.6 needs
>>> to pick up a critical bug fix.
>>> 
>>> Of course, this is all a moot point if all the downstream ecosystem
>>> projects have already made the switch to JDK7.  I don't know the status
>>> of that off the top of my head.  Maybe someone else out there knows?  If
>>> not, then I expect I can free up enough time in a few weeks to volunteer
>>> for tracking down that information.
>>> 
>>> Chris Nauroth
>>> Hortonworks
>>> http://hortonworks.com/
>>> 
>>> 
>>> 
>>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
>>> wrote:
>>> 
>>>> Chris,
>>>> 
>>>> Compiling with JDK7 and passing -target 1.6 to javac is not sufficient:
>>>> you are still using the JDK7 libraries and could use new APIs, thus
>>>> breaking JDK6 at both compile time and runtime.
>>>>
>>>> You need to compile with JDK6 to ensure you are not running into that
>>>> scenario. That is why I was suggesting the nightly JDK6 build/test
>>>> Jenkins job.
>>>> 
>>>> 
>>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
>>>> wrote:
>>>> 
>>>>> I'm also +1 for getting us to JDK7 within the 2.x line after reading
>>>>> the proposals and catching up on the discussion in this thread.
>>>>>
>>>>> Has anyone yet considered how to coordinate this change with downstream
>>>>> projects?  Would we request downstream projects to upgrade to JDK7
>>>>> first before we make the move?  Would we switch to JDK7, but run javac
>>>>> -target 1.6 to maintain compatibility for downstream projects during an
>>>>> interim period?
>>>>> 
>>>>> Chris Nauroth
>>>>> Hortonworks
>>>>> http://hortonworks.com/
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
>>>>> wrote:
>>>>> 
>>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> After reading this thread and thinking a bit about it, I think it
>>>>>>> should be OK to make such a move up to JDK7 in Hadoop
>>>>>> 
>>>>>> 
>>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible
>>>>>> change and is fine in the 2 branch. (Although I think it would *not* be
>>>>>> appropriate for a patch release.) Of course we need to do it with
>>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed, is a
>>>>>> good thing. Moving to Java 8 as a minimum seems much too aggressive and
>>>>>> I would push back on that.
>>>>>> 
>>>>>> I also think that we need to let the dust settle on the Hadoop 2 line
>>>>>> for a while before we talk about Hadoop 3. It seems that it has only
>>>>>> been in the last 6 months that Hadoop 2 adoption has reached the
>>>>>> mainstream users. Our user community needs time to digest the changes
>>>>>> in Hadoop 2.x before we fracture the community by starting to discuss
>>>>>> Hadoop 3 releases.
>>>>>> 
>>>>>> .. Owen
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Alejandro
>>> 
>> 

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.
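
On the leaked-dependency point raised in the quoted thread (a newer Guava at
the front of the classpath shadowing the one Hadoop ships), it can help to ask
the JVM which jar a class actually came from. The following is a rough sketch
under that assumption; the class name ClasspathOrigin and the placeholder
string are hypothetical, while getProtectionDomain() and getCodeSource() are
standard JDK calls.

```java
import java.security.CodeSource;

public class ClasspathOrigin {

    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM reports no code source (as for core JDK classes).
    public static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "<no code source>";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name such as com.google.common.base.Preconditions to
        // see which Guava jar won on the current classpath.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + originOf(Class.forName(name)));
    }
}
```

Running this inside an application that bundles its own Guava alongside the
Hadoop client jars would show which copy the class loader picked, which is the
"happy coincidence" case described in the thread.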

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
FYI I also just updated the wiki page with a Proposal D, aka Tucu plan,
which I think is essentially Proposal C but tabling JDK8 plans for now.

https://wiki.apache.org/hadoop/MovingToJdk7and8

Karthik, thanks for chiming in re: 2.5. I guess there's nothing urgently
required, the Jenkins stuff just needs to happen before 2.6. Still, I'm
happy to help with anything.

Thanks,
Andrew


On Fri, Jun 27, 2014 at 11:34 AM, Karthik Kambatla <ka...@cloudera.com>
wrote:

> As someone else already mentioned, we should announce one future release
> (maybe 2.5) as the last JDK6-based release before making the move to
> JDK7.
>
> I am comfortable calling 2.5 the last JDK6 release.
>
>
> On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all, responding to multiple messages here,
> >
> > Arun, thanks for the clarification regarding MR classpaths. It sounds
> > like the story there is improved and still improving.
> >
> > However, I think we still suffer from this at least on the HDFS side. We
> > have a single JAR for all of HDFS, and our clients need to have all the
> > fun deps like Guava on the classpath. I'm told Spark sticks a newer Guava
> > at the front of the classpath and the HDFS client still works okay, but
> > this is more happy coincidence than anything else. While we're leaking
> > deps, we're in a scary situation.
> >
> > API compat to me means that an app should be able to run on a new minor
> > version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> > it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> > should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> > have nothing break. If we muck with the classpath, my understanding is
> > that this could break.
> >
> > Owen, bumping the minimum JDK version in a minor release like this should
> > be a one-time exception as Tucu stated. A number of people have pointed
> > out how painful a forced JDK upgrade is for end users, and it's not
> > something we should be springing on them in a minor release unless we're
> > *very* confident like in this case.
> >
> > Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> > JDK7 across the CDH stack, so I think that's an indication that most
> > ecosystem projects are ready to make the jump. Is that sufficient in your
> > mind?
> >
> > For the record, I'm also +1 on the Tucu plan. Is it too late to do this
> for
> > 2.5? I'll offer to help out with some of the mechanics.
> >
> > Thanks,
> > Andrew
> >
> > On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > wrote:
> >
> > > I understood the plan for avoiding JDK7-specific features in our code, and
> > > your suggestion to add an extra Jenkins job is a great way to guard against
> > > that.  The thing I haven't seen discussed yet is how downstream projects
> > > will continue to consume our built artifacts.  If a downstream project
> > > upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
> > > their project is still building with 1.6, then it would be a nasty
> > > surprise.
> > >
> > > These are the options I see:
> > >
> > > 1. Make sure all other projects upgrade first.  This doesn't sound
> > > feasible, unless all other ecosystem projects have moved to JDK7 already.
> > > If not, then waiting on a single long pole project would hold up our
> > > migration indefinitely.
> > >
> > > 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > > ecosystem upgrades.  I find this undesirable, because in a certain sense,
> > > it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> > > end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
> > >
> > > 3. Just declare a clean break on some version (your earlier email said 2.5)
> > > and start publishing artifacts built with JDK7 and no -target option.
> > > Overall, this is my preferred option.  However, as a side effect, this
> > > sets us up for longer-term maintenance and patch releases off of the 2.4
> > > branch if a downstream project that's still on 1.6 needs to pick up a
> > > critical bug fix.
> > >
> > > Of course, this is all a moot point if all the downstream ecosystem
> > > projects have already made the switch to JDK7.  I don't know the status of
> > > that off the top of my head.  Maybe someone else out there knows?  If not,
> > > then I expect I can free up enough in a few weeks to volunteer for tracking
> > > down that information.
> > >
> > > Chris Nauroth
> > > Hortonworks
> > > http://hortonworks.com/
> > >
> > >
> > >
> > > On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > wrote:
> > >
> > > > Chris,
> > > >
> > > > Compiling with jdk7 and doing javac -target 1.6 is not sufficient: you are
> > > > still using jdk7 libraries and you could use new APIs, thus breaking jdk6
> > > > both at compile and runtime.
> > > >
> > > > You need to compile with jdk6 to ensure you are not running into that
> > > > scenario. That is why I was suggesting the nightly jdk6 build/test jenkins
> > > > job.
> > > >
> > > >
> > > > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > > > wrote:
> > > >
> > > > > I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> > > > > proposals and catching up on the discussion in this thread.
> > > > >
> > > > > Has anyone yet considered how to coordinate this change with downstream
> > > > > projects?  Would we request downstream projects to upgrade to JDK7 first
> > > > > before we make the move?  Would we switch to JDK7, but run javac -target
> > > > > 1.6 to maintain compatibility for downstream projects during an interim
> > > > > period?
> > > > >
> > > > > Chris Nauroth
> > > > > Hortonworks
> > > > > http://hortonworks.com/
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <omalley@apache.org>
> > > > > wrote:
> > > > >
> > > > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > > > > wrote:
> > > > > >
> > > > > > > After reading this thread and thinking a bit about it, I think it should
> > > > > > > be OK to make such a move up to JDK7 in Hadoop
> > > > > >
> > > > > >
> > > > > > I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > > > > > and is fine in the 2 branch. (Although I think it would *not* be
> > > > > > appropriate for a patch release.) Of course we need to do it with
> > > > > > forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> > > > > > thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> > > > > > push back on that.
> > > > > >
> > > > > > I also think that we need to let the dust settle on the Hadoop 2 line for
> > > > > > a while before we talk about Hadoop 3. It seems that it has only been in
> > > > > > the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > > > > > Our user community needs time to digest the changes in Hadoop 2.x before we
> > > > > > fracture the community by starting to discuss Hadoop 3 releases.
> > > > > >
> > > > > > .. Owen
> > > > > >
> > > > >
> > > > > --
> > > > > CONFIDENTIALITY NOTICE
> > > > > NOTICE: This message is intended for the use of the individual or entity to
> > > > > which it is addressed and may contain information that is confidential,
> > > > > privileged and exempt from disclosure under applicable law. If the reader
> > > > > of this message is not the intended recipient, you are hereby notified that
> > > > > any printing, copying, dissemination, distribution, disclosure or
> > > > > forwarding of this communication is strictly prohibited. If you have
> > > > > received this communication in error, please contact the sender immediately
> > > > > and delete it from your system. Thank You.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Alejandro
> > > >
> > >
> > >
> >
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by "Arun C. Murthy" <ac...@hortonworks.com>.
Thanks everyone for the discussion. Looks like we have come to a pragmatic and progressive conclusion.

In terms of execution of the consensus plan, I think a little bit of caution is in order.

Let's give downstream projects more of a runway.

I propose we inform HBase, Pig, Hive etc. that we are considering making 2.6 (not 2.5) the last JDK6 release and solicit their feedback. Once they are comfortable we can pull the trigger in 2.7.

thanks,
Arun


> On Jun 27, 2014, at 11:34 AM, Karthik Kambatla <ka...@cloudera.com> wrote:
> 
> As someone else already mentioned, we should announce one future release
> (maybe 2.5) as the last JDK6-based release before making the move to JDK7.
> 
> I am comfortable calling 2.5 the last JDK6 release.
> 
> 
> On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> 
>> Hi all, responding to multiple messages here,
>> 
>> Arun, thanks for the clarification regarding MR classpaths. It sounds like
>> the story there is improved and still improving.
>> 
>> However, I think we still suffer from this at least on the HDFS side. We
>> have a single JAR for all of HDFS, and our clients need to have all the fun
>> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
>> the front of the classpath and the HDFS client still works okay, but this
>> is more happy coincidence than anything else. While we're leaking deps,
>> we're in a scary situation.
>> 
>> API compat to me means that an app should be able to run on a new minor
>> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
>> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
>> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
>> have nothing break. If we muck with the classpath, my understanding is that
>> this could break.
>> 
>> Owen, bumping the minimum JDK version in a minor release like this should
>> be a one-time exception as Tucu stated. A number of people have pointed out
>> how painful a forced JDK upgrade is for end users, and it's not something
>> we should be springing on them in a minor release unless we're *very*
>> confident like in this case.
>> 
>> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
>> JDK7 across the CDH stack, so I think that's an indication that most
>> ecosystem projects are ready to make the jump. Is that sufficient in your
>> mind?
>> 
>> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
>> 2.5? I'll offer to help out with some of the mechanics.
>> 
>> Thanks,
>> Andrew
>> 
>> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
>> wrote:
>> 
>>> I understood the plan for avoiding JDK7-specific features in our code,
>> and
>>> your suggestion to add an extra Jenkins job is a great way to guard
>> against
>>> that.  The thing I haven't seen discussed yet is how downstream projects
>>> will continue to consume our built artifacts.  If a downstream project
>>> upgrades to pick up a bug fix, and the jar switches to 1.7 class files,
>> but
>>> their project is still building with 1.6, then it would be a nasty
>>> surprise.
>>> 
>>> These are the options I see:
>>> 
>>> 1. Make sure all other projects upgrade first.  This doesn't sound
>>> feasible, unless all other ecosystem projects have moved to JDK7 already.
>>> If not, then waiting on a single long pole project would hold up our
>>> migration indefinitely.
>>> 
>>> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
>>> ecosystem upgrades.  I find this undesirable, because in a certain sense,
>>> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
>>> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
>>> 
>>> 3. Just declare a clean break on some version (your earlier email said
>> 2.5)
>>> and start publishing artifacts built with JDK7 and no -target option.
>>> Overall, this is my preferred option.  However, as a side effect, this
>>> sets us up for longer-term maintenance and patch releases off of the 2.4
>>> branch if a downstream project that's still on 1.6 needs to pick up a
>>> critical bug fix.
>>> 
>>> Of course, this is all a moot point if all the downstream ecosystem
>>> projects have already made the switch to JDK7.  I don't know the status
>> of
>>> that off the top of my head.  Maybe someone else out there knows?  If
>> not,
>>> then I expect I can free up enough in a few weeks to volunteer for
>> tracking
>>> down that information.
>>> 
>>> Chris Nauroth
>>> Hortonworks
>>> http://hortonworks.com/
>>> 
>>> 
>>> 
>>> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
>>> wrote:
>>> 
>>>> Chris,
>>>> 
>>>> Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you
>>> are
>>>> still using jdk7 libraries and you could use new APIs, thus breaking
>> jdk6
>>>> both at compile and runtime.
>>>> 
>>>> you need to compile with jdk6 to ensure you are not running into that
>>>> scenario. that is why i was suggesting the nightly jdk6 build/test
>>> jenkins
>>>> job.
>>>> 
>>>> 
>>>> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <
>> cnauroth@hortonworks.com
>>>> 
>>>> wrote:
>>>> 
>>>>> I'm also +1 for getting us to JDK7 within the 2.x line after reading
>>> the
>>>>> proposals and catching up on the discussion in this thread.
>>>>> 
>>>>> Has anyone yet considered how to coordinate this change with
>> downstream
>>>>> projects?  Would we request downstream projects to upgrade to JDK7
>>> first
>>>>> before we make the move?  Would we switch to JDK7, but run javac
>>> -target
>>>>> 1.6 to maintain compatibility for downstream projects during an
>> interim
>>>>> period?
>>>>> 
>>>>> Chris Nauroth
>>>>> Hortonworks
>>>>> http://hortonworks.com/
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
>>>>> wrote:
>>>>> 
>>>>>> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
>>>>>> wrote:
>>>>>>
>>>>>>> After reading this thread and thinking a bit about it, I think it should
>>>>>>> be OK to make such a move up to JDK7 in Hadoop
>>>>>>
>>>>>>
>>>>>> I agree with Alejandro. Changing minimum JDKs is not an incompatible change
>>>>>> and is fine in the 2 branch. (Although I think it would *not* be
>>>>>> appropriate for a patch release.) Of course we need to do it with
>>>>>> forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
>>>>>> thing. Moving to Java 8 as a minimum seems much too aggressive and I would
>>>>>> push back on that.
>>>>>>
>>>>>> I also think that we need to let the dust settle on the Hadoop 2 line for
>>>>>> a while before we talk about Hadoop 3. It seems that it has only been in
>>>>>> the last 6 months that Hadoop 2 adoption has reached the mainstream users.
>>>>>> Our user community needs time to digest the changes in Hadoop 2.x before we
>>>>>> fracture the community by starting to discuss Hadoop 3 releases.
>>>>>> 
>>>>>> .. Owen
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Alejandro
>>> 
>> 

Re: Moving to JDK7, JDK8 and new major releases

Posted by Karthik Kambatla <ka...@cloudera.com>.
As someone else already mentioned, we should announce one future release
(maybe 2.5) as the last JDK6-based release before making the move to JDK7.

I am comfortable calling 2.5 the last JDK6 release.


On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all, responding to multiple messages here,
>
> Arun, thanks for the clarification regarding MR classpaths. It sounds like
> the story there is improved and still improving.
>
> However, I think we still suffer from this at least on the HDFS side. We
> have a single JAR for all of HDFS, and our clients need to have all the fun
> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
> the front of the classpath and the HDFS client still works okay, but this
> is more happy coincidence than anything else. While we're leaking deps,
> we're in a scary situation.
>
> API compat to me means that an app should be able to run on a new minor
> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> have nothing break. If we muck with the classpath, my understanding is that
> this could break.
>
> Owen, bumping the minimum JDK version in a minor release like this should
> be a one-time exception as Tucu stated. A number of people have pointed out
> how painful a forced JDK upgrade is for end users, and it's not something
> we should be springing on them in a minor release unless we're *very*
> confident like in this case.
>
> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> JDK7 across the CDH stack, so I think that's an indication that most
> ecosystem projects are ready to make the jump. Is that sufficient in your
> mind?
>
> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
> 2.5? I'll offer to help out with some of the mechanics.
>
> Thanks,
> Andrew
>
> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
> wrote:
>
> > I understood the plan for avoiding JDK7-specific features in our code,
> and
> > your suggestion to add an extra Jenkins job is a great way to guard
> against
> > that.  The thing I haven't seen discussed yet is how downstream projects
> > will continue to consume our built artifacts.  If a downstream project
> > upgrades to pick up a bug fix, and the jar switches to 1.7 class files,
> but
> > their project is still building with 1.6, then it would be a nasty
> > surprise.
> >
> > These are the options I see:
> >
> > 1. Make sure all other projects upgrade first.  This doesn't sound
> > feasible, unless all other ecosystem projects have moved to JDK7 already.
> >  If not, then waiting on a single long pole project would hold up our
> > migration indefinitely.
> >
> > 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > ecosystem upgrades.  I find this undesirable, because in a certain sense,
> > it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> > end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
> >
> > 3. Just declare a clean break on some version (your earlier email said
> 2.5)
> > and start publishing artifacts built with JDK7 and no -target option.
> >  Overall, this is my preferred option.  However, as a side effect, this
> > sets us up for longer-term maintenance and patch releases off of the 2.4
> > branch if a downstream project that's still on 1.6 needs to pick up a
> > critical bug fix.
> >
> > Of course, this is all a moot point if all the downstream ecosystem
> > projects have already made the switch to JDK7.  I don't know the status
> of
> > that off the top of my head.  Maybe someone else out there knows?  If
> not,
> > then I expect I can free up enough in a few weeks to volunteer for
> tracking
> > down that information.
> >
> > Chris Nauroth
> > Hortonworks
> > http://hortonworks.com/
> >
> >
> >
> > On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
> > wrote:
> >
> > > Chris,
> > >
> > > Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you
> > are
> > > still using jdk7 libraries and you could use new APIs, thus breaking
> jdk6
> > > both at compile and runtime.
> > >
> > > you need to compile with jdk6 to ensure you are not running into that
> > > scenario. that is why i was suggesting the nightly jdk6 build/test
> > jenkins
> > > job.
> > >
> > >
> > > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <
> cnauroth@hortonworks.com
> > >
> > > wrote:
> > >
> > > > I'm also +1 for getting us to JDK7 within the 2.x line after reading
> > the
> > > > proposals and catching up on the discussion in this thread.
> > > >
> > > > Has anyone yet considered how to coordinate this change with
> downstream
> > > > projects?  Would we request downstream projects to upgrade to JDK7
> > first
> > > > before we make the move?  Would we switch to JDK7, but run javac
> > -target
> > > > 1.6 to maintain compatibility for downstream projects during an
> interim
> > > > period?
> > > >
> > > > Chris Nauroth
> > > > Hortonworks
> > > > http://hortonworks.com/
> > > >
> > > >
> > > >
> > > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
> > > wrote:
> > > >
> > > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > > > wrote:
> > > > >
> > > > > > After reading this thread and thinking a bit about it, I think it should
> > > > > > be OK to make such a move up to JDK7 in Hadoop
> > > > >
> > > > >
> > > > > I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > > > > and is fine in the 2 branch. (Although I think it would *not* be
> > > > > appropriate for a patch release.) Of course we need to do it with
> > > > > forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> > > > > thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> > > > > push back on that.
> > > > >
> > > > > I also think that we need to let the dust settle on the Hadoop 2 line for
> > > > > a while before we talk about Hadoop 3. It seems that it has only been in
> > > > > the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > > > > Our user community needs time to digest the changes in Hadoop 2.x before we
> > > > > fracture the community by starting to discuss Hadoop 3 releases.
> > > > >
> > > > > .. Owen
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Alejandro
> > >
> >
> >
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
Guava is a separate problem and I think we should have a separate
discussion "what can we do about guava"? That's more traumatic than a JDK
update, I fear, as the guava releases care a lot less about compatibility.
I don't worry about JDK updates removing classes like "StringBuffer"
because "StringBuilder" is better.


On 27 June 2014 19:26, Andrew Wang <an...@cloudera.com> wrote:

> Hi all, responding to multiple messages here,
>
> Arun, thanks for the clarification regarding MR classpaths. It sounds like
> the story there is improved and still improving.
>
> However, I think we still suffer from this at least on the HDFS side. We
> have a single JAR for all of HDFS, and our clients need to have all the fun
> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
> the front of the classpath and the HDFS client still works okay, but this
> is more happy coincidence than anything else. While we're leaking deps,
> we're in a scary situation.
>

very good point.


>
> API compat to me means that an app should be able to run on a new minor
> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> have nothing break. If we muck with the classpath, my understanding is that
> this could break.
>
>
I think this is possible by having the app upload all the JARs...I need to
experiment here myself.

>
>
> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> JDK7 across the CDH stack, so I think that's an indication that most
> ecosystem projects are ready to make the jump. Is that sufficient in your
> mind?
>
>
+1, we've had no complaints about things not working on Java 7. It's been
out a long time. If you look at our own code, the main things that broke
were tests, due to JUnit test case ordering, and not much else.
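
On the test-ordering point: the Javadoc for Class.getDeclaredMethods() says
the returned methods are in no particular order, and a test runner that
discovers tests by reflection inherits that. Below is a minimal sketch of one
way to get a stable order. SampleTests and MethodOrderDemo are hypothetical
names for illustration, not Hadoop classes.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MethodOrderDemo {
    // Hypothetical stand-in for a test class; not taken from Hadoop.
    static class SampleTests {
        public void testCreate() {}
        public void testRead() {}
        public void testDelete() {}
    }

    // Collects the test method names and sorts them, so the result no longer
    // depends on whatever order reflection happens to return.
    static List<String> sortedTestNames(Class<?> testClass) {
        List<String> names = new ArrayList<String>();
        for (Method m : testClass.getDeclaredMethods()) {
            if (m.getName().startsWith("test")) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedTestNames(SampleTests.class));
        // prints [testCreate, testDelete, testRead]
    }
}
```

JUnit 4.11 and later offer @FixMethodOrder for the same purpose, though
making each test independent of the others is the more robust fix.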



> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
> 2.5? I'll offer to help out with some of the mechanics.
>
>


Re: Moving to JDK7, JDK8 and new major releases

Posted by Karthik Kambatla <ka...@cloudera.com>.
As someone else already mentioned, we should announce one future release
(maybe 2.5) as the last JDK6-based release before making the move to JDK7.

I am comfortable calling 2.5 the last JDK6 release.
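
For downstream projects, the practical effect of such a cut-over shows up in
the class-file version of the published jars: Java 6 class files carry major
version 50 and Java 7 class files carry 51, and a JVM rejects class files
newer than it supports. Below is a minimal sketch for checking this.
ClassFileVersion is a hypothetical helper name, not part of Hadoop.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassFileVersion {
    // Reads the class-file header of the given class and returns its major
    // version (50 = Java 6, 51 = Java 7, 52 = Java 8).
    static int majorVersion(Class<?> c) {
        String resource = "/" + c.getName().replace('.', '/') + ".class";
        InputStream raw = c.getResourceAsStream(resource);
        if (raw == null) {
            throw new IllegalStateException("class file not found: " + resource);
        }
        DataInputStream in = new DataInputStream(raw);
        try {
            try {
                if (in.readInt() != 0xCAFEBABE) {      // magic number
                    throw new IllegalStateException("not a class file: " + resource);
                }
                in.readUnsignedShort();                // minor version, ignored
                return in.readUnsignedShort();         // major version
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("cannot read " + resource, e);
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, or default to java.lang.String.
        Class<?> c = args.length > 0 ? Class.forName(args[0]) : String.class;
        System.out.println(c.getName() + " has class-file major version " + majorVersion(c));
    }
}
```

Running it with a Hadoop jar on the classpath and a Hadoop class name as the
argument would show whether that artifact is still loadable on a JDK6 runtime.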


On Fri, Jun 27, 2014 at 11:26 AM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all, responding to multiple messages here,
>
> Arun, thanks for the clarification regarding MR classpaths. It sounds like
> the story there is improved and still improving.
>
> However, I think we still suffer from this at least on the HDFS side. We
> have a single JAR for all of HDFS, and our clients need to have all the fun
> deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
> the front of the classpath and the HDFS client still works okay, but this
> is more happy coincidence than anything else. While we're leaking deps,
> we're in a scary situation.
>
> API compat to me means that an app should be able to run on a new minor
> version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
> it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
> should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
> have nothing break. If we muck with the classpath, my understanding is that
> this could break.
>
> Owen, bumping the minimum JDK version in a minor release like this should
> be a one-time exception as Tucu stated. A number of people have pointed out
> how painful a forced JDK upgrade is for end users, and it's not something
> we should be springing on them in a minor release unless we're *very*
> confident like in this case.
>
> Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
> JDK7 across the CDH stack, so I think that's an indication that most
> ecosystem projects are ready to make the jump. Is that sufficient in your
> mind?
>
> For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
> 2.5? I'll offer to help out with some of the mechanics.
>
> Thanks,
> Andrew
>
> On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
> wrote:
>
> > I understood the plan for avoiding JDK7-specific features in our code,
> > and your suggestion to add an extra Jenkins job is a great way to guard
> > against that.  The thing I haven't seen discussed yet is how downstream
> > projects will continue to consume our built artifacts.  If a downstream
> > project upgrades to pick up a bug fix, and the jar switches to 1.7 class
> > files, but their project is still building with 1.6, then it would be a
> > nasty surprise.
> >
> > These are the options I see:
> >
> > 1. Make sure all other projects upgrade first.  This doesn't sound
> > feasible, unless all other ecosystem projects have moved to JDK7 already.
> >  If not, then waiting on a single long pole project would hold up our
> > migration indefinitely.
> >
> > 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> > ecosystem upgrades.  I find this undesirable, because in a certain sense,
> > it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> > end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
> >
> > 3. Just declare a clean break on some version (your earlier email said
> > 2.5) and start publishing artifacts built with JDK7 and no -target
> > option.  Overall, this is my preferred option.  However, as a side
> > effect, this sets us up for longer-term maintenance and patch releases
> > off of the 2.4 branch if a downstream project that's still on 1.6 needs
> > to pick up a critical bug fix.
> >
> > Of course, this is all a moot point if all the downstream ecosystem
> > projects have already made the switch to JDK7.  I don't know the status
> > of that off the top of my head.  Maybe someone else out there knows?  If
> > not, then I expect I can free up enough in a few weeks to volunteer for
> > tracking down that information.
> >
> > Chris Nauroth
> > Hortonworks
> > http://hortonworks.com/
> >
> >
> >
> > On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
> > wrote:
> >
> > > Chris,
> > >
> > > Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you
> > > are still using jdk7 libraries and you could use new APIs, thus
> > > breaking jdk6 both at compile and runtime.
> > >
> > > you need to compile with jdk6 to ensure you are not running into that
> > > scenario. that is why i was suggesting the nightly jdk6 build/test
> > > jenkins job.
> > >
> > >
> > > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <
> cnauroth@hortonworks.com
> > >
> > > wrote:
> > >
> > > > I'm also +1 for getting us to JDK7 within the 2.x line after reading
> > > > the proposals and catching up on the discussion in this thread.
> > > >
> > > > Has anyone yet considered how to coordinate this change with
> > > > downstream projects?  Would we request downstream projects to upgrade
> > > > to JDK7 first before we make the move?  Would we switch to JDK7, but
> > > > run javac -target 1.6 to maintain compatibility for downstream
> > > > projects during an interim period?
> > > >
> > > > Chris Nauroth
> > > > Hortonworks
> > > > http://hortonworks.com/
> > > >
> > > >
> > > >
> > > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
> > > wrote:
> > > >
> > > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <
> > tucu@cloudera.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > After reading this thread and thinking a bit about it, I think it
> > > > should
> > > > > be
> > > > > > OK such move up to JDK7 in Hadoop
> > > > >
> > > > >
> > > > > I agree with Alejandro. Changing minimum JDKs is not an incompatible
> > > > > change and is fine in the 2 branch. (Although I think it would *not*
> > > > > be appropriate for a patch release.) Of course we need to do it with
> > > > > forethought and testing, but moving off of JDK 6, which is EOL'ed, is
> > > > > a good thing. Moving to Java 8 as a minimum seems much too aggressive
> > > > > and I would push back on that.
> > > > >
> > > > > I also think that we need to let the dust settle on the Hadoop 2 line
> > > > > for a while before we talk about Hadoop 3. It seems that it has only
> > > > > been in the last 6 months that Hadoop 2 adoption has reached
> > > > > mainstream users. Our user community needs time to digest the changes
> > > > > in Hadoop 2.x before we fracture the community by starting to discuss
> > > > > Hadoop 3 releases.
> > > > >
> > > > > .. Owen
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Alejandro
> > >
> >
> >
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
Hi all, responding to multiple messages here,

Arun, thanks for the clarification regarding MR classpaths. It sounds like
the story there is improved and still improving.

However, I think we still suffer from this at least on the HDFS side. We
have a single JAR for all of HDFS, and our clients need to have all the fun
deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
the front of the classpath and the HDFS client still works okay, but this
is more happy coincidence than anything else. While we're leaking deps,
we're in a scary situation.
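
To see which copy of a dependency actually wins on a given classpath, one
option is to ask the JVM where a class was loaded from. This is a minimal
sketch; WhichJar is a hypothetical helper name, and the Guava class name in
the usage note is only an example argument.

```java
import java.security.CodeSource;

class WhichJar {
    // Returns the location (usually a jar or directory URL) a class was
    // loaded from, or a placeholder when the JVM does not report one.
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, or default to this class.
        Class<?> c = args.length > 0 ? Class.forName(args[0]) : WhichJar.class;
        System.out.println(c.getName() + " loaded from " + locationOf(c));
    }
}
```

For example, running it with the application's real classpath and the
argument com.google.common.collect.ImmutableList would show which Guava jar
the HDFS client ends up using.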

API compat to me means that an app should be able to run on a new minor
version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
have nothing break. If we muck with the classpath, my understanding is that
this could break.

Owen, bumping the minimum JDK version in a minor release like this should
be a one-time exception as Tucu stated. A number of people have pointed out
how painful a forced JDK upgrade is for end users, and it's not something
we should be springing on them in a minor release unless we're *very*
confident like in this case.

Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
JDK7 across the CDH stack, so I think that's an indication that most
ecosystem projects are ready to make the jump. Is that sufficient in your
mind?

For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
2.5? I'll offer to help out with some of the mechanics.

Thanks,
Andrew

On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
wrote:

> I understood the plan for avoiding JDK7-specific features in our code, and
> your suggestion to add an extra Jenkins job is a great way to guard against
> that.  The thing I haven't seen discussed yet is how downstream projects
> will continue to consume our built artifacts.  If a downstream project
> upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
> their project is still building with 1.6, then it would be a nasty
> surprise.
>
> These are the options I see:
>
> 1. Make sure all other projects upgrade first.  This doesn't sound
> feasible, unless all other ecosystem projects have moved to JDK7 already.
>  If not, then waiting on a single long pole project would hold up our
> migration indefinitely.
>
> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> ecosystem upgrades.  I find this undesirable, because in a certain sense,
> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
>
> 3. Just declare a clean break on some version (your earlier email said 2.5)
> and start publishing artifacts built with JDK7 and no -target option.
>  Overall, this is my preferred option.  However, as a side effect, this
> sets us up for longer-term maintenance and patch releases off of the 2.4
> branch if a downstream project that's still on 1.6 needs to pick up a
> critical bug fix.
>
> Of course, this is all a moot point if all the downstream ecosystem
> projects have already made the switch to JDK7.  I don't know the status of
> that off the top of my head.  Maybe someone else out there knows?  If not,
> then I expect I can free up enough in a few weeks to volunteer for tracking
> down that information.
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
> wrote:
>
> > Chris,
> >
> > Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you
> > are still using jdk7 libraries and you could use new APIs, thus breaking
> > jdk6 both at compile and runtime.
> >
> > you need to compile with jdk6 to ensure you are not running into that
> > scenario. that is why i was suggesting the nightly jdk6 build/test
> > jenkins job.
> >
> >
> > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com
> >
> > wrote:
> >
> > > I'm also +1 for getting us to JDK7 within the 2.x line after reading
> > > the proposals and catching up on the discussion in this thread.
> > >
> > > Has anyone yet considered how to coordinate this change with downstream
> > > projects?  Would we request downstream projects to upgrade to JDK7
> > > first before we make the move?  Would we switch to JDK7, but run javac
> > > -target 1.6 to maintain compatibility for downstream projects during an
> > > interim period?
> > >
> > > Chris Nauroth
> > > Hortonworks
> > > http://hortonworks.com/
> > >
> > >
> > >
> > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org>
> > wrote:
> > >
> > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <
> tucu@cloudera.com
> > >
> > > > wrote:
> > > >
> > > > > After reading this thread and thinking a bit about it, I think it
> > > should
> > > > be
> > > > > OK such move up to JDK7 in Hadoop
> > > >
> > > >
> > > > I agree with Alejandro. Changing minimum JDKs is not an incompatible
> > > change
> > > > and is fine in the 2 branch. (Although I think it is would *not* be
> > > > appropriate for a patch release.) Of course we need to do it with
> > > > forethought and testing, but moving off of JDK 6, which is EOL'ed is
> a
> > > good
> > > > thing. Moving to Java 8 as a minimum seems much too aggressive and I
> > > would
> > > > push back on that.
> > > >
> > > > I'm also think that we need to let the dust settle on the Hadoop 2
> line
> > > for
> > > > a while before we talk about Hadoop 3. It seems that it has only been
> > in
> > > > the last 6 months that Hadoop 2 adoption has reached the main stream
> > > users.
> > > > Our user community needs time to digest the changes in Hadoop 2.x
> > before
> > > we
> > > > fracture the community by starting to discuss Hadoop 3 releases.
> > > >
> > > > .. Owen
> > > >
> > >
> > >
> >
> >
> >
> > --
> > Alejandro
> >
>
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
Hi all, responding to multiple messages here,

Arun, thanks for the clarification regarding MR classpaths. It sounds like
the story there is improved and still improving.

However, I think we still suffer from this at least on the HDFS side. We
have a single JAR for all of HDFS, and our clients need to have all the fun
deps like Guava on the classpath. I'm told Spark sticks a newer Guava at
the front of the classpath and the HDFS client still works okay, but this
is more happy coincidence than anything else. While we're leaking deps,
we're in a scary situation.
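
To make the "leaking deps" concern concrete, here is a minimal Java sketch (not part of the original thread) that reports which JAR or directory a class was actually loaded from. The class name `WhichJar` and the default probe of Guava's `com.google.common.base.Preconditions` are illustrative assumptions; the sketch relies only on the standard `ProtectionDomain` and `CodeSource` APIs.

```java
import java.security.CodeSource;

// Hypothetical helper, for illustration only.
class WhichJar {
    /** Returns the location a class was loaded from, or a marker if the JVM reports none. */
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        // Classes from the JVM's own runtime library typically have no code source.
        if (src == null || src.getLocation() == null) {
            return "(bootstrap/unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Assumed default: a Guava class that a Hadoop client classpath would normally contain.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
        try {
            // Load without initializing; we only want to know where it came from.
            Class<?> c = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " -> " + locationOf(c));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run with the same classpath as an HDFS client application, this would show whether the Guava copy that wins is the one shipped with Hadoop or the one the application put first.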

API compat to me means that an app should be able to run on a new minor
version of Hadoop and not have anything break. MAPREDUCE-4421 sounds like
it allows you to run e.g. 2.3 MR jobs on a 2.4 YARN cluster, but what
should also be possible is running an HDFS 2.3 app with HDFS 2.4 JARs and
have nothing break. If we muck with the classpath, my understanding is that
this could break.

Owen, bumping the minimum JDK version in a minor release like this should
be a one-time exception as Tucu stated. A number of people have pointed out
how painful a forced JDK upgrade is for end users, and it's not something
we should be springing on them in a minor release unless we're *very*
confident like in this case.

Chris, thanks for bringing up the ecosystem. For CDH5, we standardized on
JDK7 across the CDH stack, so I think that's an indication that most
ecosystem projects are ready to make the jump. Is that sufficient in your
mind?

For the record, I'm also +1 on the Tucu plan. Is it too late to do this for
2.5? I'll offer to help out with some of the mechanics.

Thanks,
Andrew

On Wed, Jun 25, 2014 at 4:18 PM, Chris Nauroth <cn...@hortonworks.com>
wrote:

> I understood the plan for avoiding JDK7-specific features in our code, and
> your suggestion to add an extra Jenkins job is a great way to guard against
> that.  The thing I haven't seen discussed yet is how downstream projects
> will continue to consume our built artifacts.  If a downstream project
> upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
> their project is still building with 1.6, then it would be a nasty
> surprise.
>
> These are the options I see:
>
> 1. Make sure all other projects upgrade first.  This doesn't sound
> feasible, unless all other ecosystem projects have moved to JDK7 already.
>  If not, then waiting on a single long pole project would hold up our
> migration indefinitely.
>
> 2. We switch to JDK7, but run javac with -target 1.6 until the whole
> ecosystem upgrades.  I find this undesirable, because in a certain sense,
> it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
> end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)
>
> 3. Just declare a clean break on some version (your earlier email said 2.5)
> and start publishing artifacts built with JDK7 and no -target option.
>  Overall, this is my preferred option.  However, as a side effect, this
> sets us up for longer-term maintenance and patch releases off of the 2.4
> branch if a downstream project that's still on 1.6 needs to pick up a
> critical bug fix.
>
> Of course, this is all a moot point if all the downstream ecosystem
> projects have already made the switch to JDK7.  I don't know the status of
> that off the top of my head.  Maybe someone else out there knows?  If not,
> then I expect I can free up enough in a few weeks to volunteer for tracking
> down that information.
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
> wrote:
>
> > Chris,
> >
> > Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you are
> > still using jdk7 libraries and you could use new APIs, thus breaking jdk6
> > both at compile and runtime.
> >
> > you need to compile with jdk6 to ensure you are not running into that
> > scenario. that is why i was suggesting the nightly jdk6 build/test jenkins
> > job.
> >
> >
> > On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cnauroth@hortonworks.com>
> > wrote:
> >
> > > I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> > > proposals and catching up on the discussion in this thread.
> > >
> > > Has anyone yet considered how to coordinate this change with downstream
> > > projects?  Would we request downstream projects to upgrade to JDK7 first
> > > before we make the move?  Would we switch to JDK7, but run javac -target
> > > 1.6 to maintain compatibility for downstream projects during an interim
> > > period?
> > >
> > > Chris Nauroth
> > > Hortonworks
> > > http://hortonworks.com/
> > >
> > >
> > >
> > > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:
> > >
> > > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > > wrote:
> > > >
> > > > > After reading this thread and thinking a bit about it, I think it should
> > > > > be OK to make such a move up to JDK7 in Hadoop
> > > >
> > > >
> > > > I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > > > and is fine in the 2 branch. (Although I think it would *not* be
> > > > appropriate for a patch release.) Of course we need to do it with
> > > > forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> > > > thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> > > > push back on that.
> > > >
> > > > I also think that we need to let the dust settle on the Hadoop 2 line for
> > > > a while before we talk about Hadoop 3. It seems that it has only been in
> > > > the last 6 months that Hadoop 2 adoption has reached mainstream users.
> > > > Our user community needs time to digest the changes in Hadoop 2.x before we
> > > > fracture the community by starting to discuss Hadoop 3 releases.
> > > >
> > > > .. Owen
> > > >
> > >
> > >
> >
> >
> >
> > --
> > Alejandro
> >
>
>

Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
I understood the plan for avoiding JDK7-specific features in our code, and
your suggestion to add an extra Jenkins job is a great way to guard against
that.  The thing I haven't seen discussed yet is how downstream projects
will continue to consume our built artifacts.  If a downstream project
upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
their project is still building with 1.6, then it would be a nasty surprise.
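
The surprise described above comes from the class file version number: javac from JDK7 emits major version 51 by default, while a Java 6 JVM accepts at most major version 50 and rejects anything newer with UnsupportedClassVersionError. As a rough sketch (not from the thread; the class name `ClassFileVersion` and its helper methods are hypothetical), a downstream project could inspect the classes extracted from a published jar like this:

```java
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
class ClassFileVersion {
    /** Reads the major version from the 8-byte class file header. */
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default, as the format requires
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor_version, ignored here
        return buf.getShort() & 0xFFFF; // major_version as an unsigned 16-bit value
    }

    /** Major version 50 is the highest a Java 6 JVM will load. */
    static boolean loadableOnJava6(int major) {
        return major <= 50;
    }

    public static void main(String[] args) throws Exception {
        // Each argument is a path to a .class file extracted from a jar.
        for (String path : args) {
            int major = majorVersion(Files.readAllBytes(Paths.get(path)));
            System.out.println(path + ": major version " + major
                + (loadableOnJava6(major) ? " (loadable on Java 6)" : " (needs a newer JVM than Java 6)"));
        }
    }
}
```

The sketch itself uses `java.nio.file`, so it needs Java 7 or later to run; it is meant for a build machine, not for the Java 6 runtime being checked.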

These are the options I see:

1. Make sure all other projects upgrade first.  This doesn't sound
feasible, unless all other ecosystem projects have moved to JDK7 already.
 If not, then waiting on a single long pole project would hold up our
migration indefinitely.

2. We switch to JDK7, but run javac with -target 1.6 until the whole
ecosystem upgrades.  I find this undesirable, because in a certain sense,
it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)

3. Just declare a clean break on some version (your earlier email said 2.5)
and start publishing artifacts built with JDK7 and no -target option.
 Overall, this is my preferred option.  However, as a side effect, this
sets us up for longer-term maintenance and patch releases off of the 2.4
branch if a downstream project that's still on 1.6 needs to pick up a
critical bug fix.
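
On option 2, note that `-target 1.6` only controls the bytecode format. Unless javac is also given a Java 6 `-bootclasspath`, the source is still compiled against the JDK7 class library, so a call to a Java 7-only API compiles cleanly and then fails at runtime on a Java 6 JVM. The following is a hedged sketch (the class name `ApiProbe` is hypothetical, and the probed classes are just well-known examples) that checks the running JVM for a few such classes:

```java
// Hypothetical helper, for illustration only.
class ApiProbe {
    /** Returns true if the named class can be found by this JVM, without initializing it. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class exists but cannot be linked in this runtime; treat it as unusable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        // java.util.Objects and java.nio.file.Files were added in Java 7,
        // java.util.Optional in Java 8.
        String[] probes = {"java.util.Objects", "java.nio.file.Files", "java.util.Optional"};
        for (String name : probes) {
            System.out.println(name + ": " + (isPresent(name) ? "present" : "missing"));
        }
    }
}
```

On a Java 6 runtime all three probes would report missing, which is the gap a nightly JDK6 build and test job is meant to catch before release.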

Of course, this is all a moot point if all the downstream ecosystem
projects have already made the switch to JDK7.  I don't know the status of
that off the top of my head.  Maybe someone else out there knows?  If not,
then I expect I can free up enough in a few weeks to volunteer for tracking
down that information.

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> Chris,
>
> Compiling with jdk7 and doing javac -target 1.6 is not sufficient, you are
> still using jdk7 libraries and you could use new APIs, thus breaking jdk6
> both at compile and runtime.
>
> you need to compile with jdk6 to ensure you are not running into that
> scenario. that is why i was suggesting the nightly jdk6 build/test jenkins
> job.
>
>
> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cn...@hortonworks.com>
> wrote:
>
> > I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> > proposals and catching up on the discussion in this thread.
> >
> > Has anyone yet considered how to coordinate this change with downstream
> > projects?  Would we request downstream projects to upgrade to JDK7 first
> > before we make the move?  Would we switch to JDK7, but run javac -target
> > 1.6 to maintain compatibility for downstream projects during an interim
> > period?
> >
> > Chris Nauroth
> > Hortonworks
> > http://hortonworks.com/
> >
> >
> >
> > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:
> >
> > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > wrote:
> > >
> > > > After reading this thread and thinking a bit about it, I think it should
> > > > be OK to make such a move up to JDK7 in Hadoop
> > >
> > >
> > > I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > > and is fine in the 2 branch. (Although I think it would *not* be
> > > appropriate for a patch release.) Of course we need to do it with
> > > forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> > > thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> > > push back on that.
> > >
> > > I also think that we need to let the dust settle on the Hadoop 2 line for
> > > a while before we talk about Hadoop 3. It seems that it has only been in
> > > the last 6 months that Hadoop 2 adoption has reached mainstream users.
> > > Our user community needs time to digest the changes in Hadoop 2.x before we
> > > fracture the community by starting to discuss Hadoop 3 releases.
> > >
> > > .. Owen
> > >
> >
> >
>
>
>
> --
> Alejandro
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
I understood the plan for avoiding JDK7-specific features in our code, and
your suggestion to add an extra Jenkins job is a great way to guard against
that.  The thing I haven't seen discussed yet is how downstream projects
will continue to consume our built artifacts.  If a downstream project
upgrades to pick up a bug fix, and the jar switches to 1.7 class files, but
their project is still building with 1.6, then it would be a nasty surprise.

These are the options I see:

1. Make sure all other projects upgrade first.  This doesn't sound
feasible, unless all other ecosystem projects have moved to JDK7 already.
 If not, then waiting on a single long pole project would hold up our
migration indefinitely.

2. We switch to JDK7, but run javac with -target 1.6 until the whole
ecosystem upgrades.  I find this undesirable, because in a certain sense,
it still leaves a bit of 1.6 lingering in the project.  (I'll assume that
end-of-life for JDK6 also means end-of-life for the 1.6 bytecode format.)

3. Just declare a clean break on some version (your earlier email said 2.5)
and start publishing artifacts built with JDK7 and no -target option.
 Overall, this is my preferred option.  However, as a side effect, this
sets us up for longer-term maintenance and patch releases off of the 2.4
branch if a downstream project that's still on 1.6 needs to pick up a
critical bug fix.

Of course, this is all a moot point if all the downstream ecosystem
projects have already made the switch to JDK7.  I don't know the status of
that off the top of my head.  Maybe someone else out there knows?  If not,
then I expect I can free up enough time in a few weeks to volunteer for
tracking down that information.

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Wed, Jun 25, 2014 at 3:12 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> Chris,
>
> Compiling with JDK7 and passing -target 1.6 to javac is not sufficient: you
> are still compiling against the JDK7 libraries, so you could use new APIs and
> break JDK6 at both compile time and runtime.
>
> You need to compile with JDK6 to ensure you are not running into that
> scenario. That is why I was suggesting the nightly JDK6 build/test Jenkins
> job.
>
>
> On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cn...@hortonworks.com>
> wrote:
>
> > I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> > proposals and catching up on the discussion in this thread.
> >
> > Has anyone yet considered how to coordinate this change with downstream
> > projects?  Would we request downstream projects to upgrade to JDK7 first
> > before we make the move?  Would we switch to JDK7, but run javac -target
> > 1.6 to maintain compatibility for downstream projects during an interim
> > period?
> >
> > Chris Nauroth
> > Hortonworks
> > http://hortonworks.com/
> >
> >
> >
> > On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:
> >
> > > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tucu@cloudera.com>
> > > wrote:
> > >
> > > > After reading this thread and thinking a bit about it, I think it should
> > > > be OK to make such a move up to JDK7 in Hadoop
> > >
> > >
> > > I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > > and is fine in the 2 branch. (Although I think it would *not* be
> > > appropriate for a patch release.) Of course we need to do it with
> > > forethought and testing, but moving off of JDK 6, which is EOL'ed, is a
> > > good thing. Moving to Java 8 as a minimum seems much too aggressive and I
> > > would push back on that.
> > >
> > > I also think that we need to let the dust settle on the Hadoop 2 line for
> > > a while before we talk about Hadoop 3. It seems that it has only been in
> > > the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > > Our user community needs time to digest the changes in Hadoop 2.x before
> > > we fracture the community by starting to discuss Hadoop 3 releases.
> > >
> > > .. Owen
> > >
> >
>
>
>
> --
> Alejandro
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Chris,

Compiling with JDK7 and passing -target 1.6 to javac is not sufficient: you are
still compiling against the JDK7 libraries, so you could use new APIs and break
JDK6 at both compile time and runtime.

You need to compile with JDK6 to ensure you are not running into that
scenario. That is why I was suggesting the nightly JDK6 build/test Jenkins
job.
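
To illustrate the point: -target only controls the class file format, not which library classes the compiler resolves against. Classes such as java.util.Objects and java.nio.file.Files were added in Java 7, so code that references them can be compiled to 1.6 bytecode on JDK7 and then fail on a JDK6 runtime with NoClassDefFoundError. Besides building with JDK6 itself, javac also supports cross-compilation by pointing -bootclasspath at the older JDK's class library. The sketch below is only an illustration of the runtime side. The class name Jdk7ApiProbe is made up, and the two class names it probes are just examples of Java 7 additions.

```java
public class Jdk7ApiProbe {
    /** Returns true if the named class can be found by the current runtime. */
    public static boolean isPresent(String className) {
        try {
            // 'false' means: locate the class without running its static initializers.
            Class.forName(className, false, Jdk7ApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both classes were introduced in Java 7 and do not exist in a JDK6 runtime.
        String[] java7Additions = {"java.util.Objects", "java.nio.file.Files"};
        for (String name : java7Additions) {
            System.out.println(name + " present: " + isPresent(name));
        }
    }
}
```

On a Java 7 or later runtime both lines report true. On a JDK6 runtime they would report false, which is the situation a nightly JDK6 build/test job would surface at compile time instead.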


On Wed, Jun 25, 2014 at 2:04 PM, Chris Nauroth <cn...@hortonworks.com>
wrote:

> I'm also +1 for getting us to JDK7 within the 2.x line after reading the
> proposals and catching up on the discussion in this thread.
>
> Has anyone yet considered how to coordinate this change with downstream
> projects?  Would we request downstream projects to upgrade to JDK7 first
> before we make the move?  Would we switch to JDK7, but run javac -target
> 1.6 to maintain compatibility for downstream projects during an interim
> period?
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:
>
> > On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
> > wrote:
> >
> > > After reading this thread and thinking a bit about it, I think it should
> > > be OK to make such a move up to JDK7 in Hadoop
> >
> >
> > I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> > and is fine in the 2 branch. (Although I think it would *not* be
> > appropriate for a patch release.) Of course we need to do it with
> > forethought and testing, but moving off of JDK 6, which is EOL'ed, is a
> > good thing. Moving to Java 8 as a minimum seems much too aggressive and I
> > would push back on that.
> >
> > I also think that we need to let the dust settle on the Hadoop 2 line for
> > a while before we talk about Hadoop 3. It seems that it has only been in
> > the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> > Our user community needs time to digest the changes in Hadoop 2.x before
> > we fracture the community by starting to discuss Hadoop 3 releases.
> >
> > .. Owen
> >
>



-- 
Alejandro

Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
I'm also +1 for getting us to JDK7 within the 2.x line after reading the
proposals and catching up on the discussion in this thread.

Has anyone yet considered how to coordinate this change with downstream
projects?  Would we request downstream projects to upgrade to JDK7 first
before we make the move?  Would we switch to JDK7, but run javac -target
1.6 to maintain compatibility for downstream projects during an interim
period?

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:

> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
> wrote:
>
> > After reading this thread and thinking a bit about it, I think it should
> > be OK to make such a move up to JDK7 in Hadoop
>
>
> I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> and is fine in the 2 branch. (Although I think it would *not* be
> appropriate for a patch release.) Of course we need to do it with
> forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
> thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> push back on that.
>
> I also think that we need to let the dust settle on the Hadoop 2 line for
> a while before we talk about Hadoop 3. It seems that it has only been in
> the last 6 months that Hadoop 2 adoption has reached the mainstream users.
> Our user community needs time to digest the changes in Hadoop 2.x before we
> fracture the community by starting to discuss Hadoop 3 releases.
>
> .. Owen
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
I'm also +1 for getting us to JDK7 within the 2.x line after reading the
proposals and catching up on the discussion in this thread.

Has anyone yet considered how to coordinate this change with downstream
projects?  Would we request downstream projects to upgrade to JDK7 first
before we make the move?  Would we switch to JDK7, but run javac -target
1.6 to maintain compatibility for downstream projects during an interim
period?

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:

> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
> wrote:
>
> > After reading this thread and thinking a bit about it, I think it should
> be
> > OK such move up to JDK7 in Hadoop
>
>
> I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> and is fine in the 2 branch. (Although I think it is would *not* be
> appropriate for a patch release.) Of course we need to do it with
> forethought and testing, but moving off of JDK 6, which is EOL'ed is a good
> thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> push back on that.
>
> I'm also think that we need to let the dust settle on the Hadoop 2 line for
> a while before we talk about Hadoop 3. It seems that it has only been in
> the last 6 months that Hadoop 2 adoption has reached the main stream users.
> Our user community needs time to digest the changes in Hadoop 2.x before we
> fracture the community by starting to discuss Hadoop 3 releases.
>
> .. Owen
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Chris Nauroth <cn...@hortonworks.com>.
I'm also +1 for getting us to JDK7 within the 2.x line after reading the
proposals and catching up on the discussion in this thread.

Has anyone yet considered how to coordinate this change with downstream
projects?  Would we request downstream projects to upgrade to JDK7 first
before we make the move?  Would we switch to JDK7, but run javac -target
1.6 to maintain compatibility for downstream projects during an interim
period?

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Wed, Jun 25, 2014 at 9:48 AM, Owen O'Malley <om...@apache.org> wrote:

> On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
> wrote:
>
> > After reading this thread and thinking a bit about it, I think it should
> be
> > OK such move up to JDK7 in Hadoop
>
>
> I agree with Alejandro. Changing minimum JDKs is not an incompatible change
> and is fine in the 2 branch. (Although I think it is would *not* be
> appropriate for a patch release.) Of course we need to do it with
> forethought and testing, but moving off of JDK 6, which is EOL'ed is a good
> thing. Moving to Java 8 as a minimum seems much too aggressive and I would
> push back on that.
>
> I'm also think that we need to let the dust settle on the Hadoop 2 line for
> a while before we talk about Hadoop 3. It seems that it has only been in
> the last 6 months that Hadoop 2 adoption has reached the main stream users.
> Our user community needs time to digest the changes in Hadoop 2.x before we
> fracture the community by starting to discuss Hadoop 3 releases.
>
> .. Owen
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Owen O'Malley <om...@apache.org>.
On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK to make such a move up to JDK7 in Hadoop


I agree with Alejandro. Changing minimum JDKs is not an incompatible change
and is fine in the 2 branch. (Although I think it would *not* be
appropriate for a patch release.) Of course we need to do it with
forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
thing. Moving to Java 8 as a minimum seems much too aggressive and I would
push back on that.

I also think that we need to let the dust settle on the Hadoop 2 line for
a while before we talk about Hadoop 3. It seems that it has only been in
the last 6 months that Hadoop 2 adoption has reached the mainstream users.
Our user community needs time to digest the changes in Hadoop 2.x before we
fracture the community by starting to discuss Hadoop 3 releases.

.. Owen

Re: Moving to JDK7, JDK8 and new major releases

Posted by Akira AJISAKA <aj...@oss.nttdata.co.jp>.
+1 (non-binding) for 2.5 being the last release to support JDK6.

 >>> My higher-level goal though is to avoid going through this same pain
 >>> again when JDK7 goes EOL. I'd like to do a JDK8-based release
 >>> before then for this reason. This is why I suggested skipping an
 >>> intermediate 2.x+JDK7 release and leapfrogging to 3.0+JDK8.

I think skipping an intermediate release and leapfrogging to 3.0
would make it difficult to maintain branch-2. It has only been about half
a year since 2.2 GA, so we should keep maintaining branch-2 and create
bug-fix releases for the long term even if 3.0+JDK8 is released.

Thanks,
Akira

(2014/06/24 17:56), Steve Loughran wrote:
> +1, though I think 2.5 may be premature if we want to send a warning note
> "last ever". That's an issue for followon "when in branch 2".
>
> Guava and protobuf.jar are two things we have to leave alone, with the
> first being unfortunate, but their attitude to updates is pretty dramatic.
> The latter? We all know how traumatic that can be.
>
> -Steve
>
>
> On 24 June 2014 16:44, Alejandro Abdelnur <tu...@cloudera.com> wrote:
>
>> After reading this thread and thinking a bit about it, I think it should be
>> OK to make such a move up to JDK7 in Hadoop 2 for the following reasons:
>>
>> * Existing Hadoop 2 releases and related projects are running
>>    on JDK7 in production.
>> * Commercial vendors of Hadoop have already done a lot of
>>    work to ensure Hadoop on JDK7 works while keeping Hadoop
>>    on JDK6 working.
>> * Different from many of the 3rd party libraries used by Hadoop,
>>    JDK is much stricter on backwards compatibility.
>>
>> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
>> party dependencies and for moving from JDK7 to JDK8 (though it could be OK
>> for the latter if we end up in the same state of affairs)
>>
>> Even for Hadoop 2.5, I think we could do the move:
>>
>> * Create the Hadoop 2.5 release branch.
>> * Have one nightly Jenkins job that builds the Hadoop 2.5 branch
>>    with JDK6 to ensure no JDK7 language/API feature creeps
>>    into Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
>> * Sanity tests for the Hadoop 2.5.x releases should be done
>>    with JDK7.
>> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
>> * Move all Apache Jenkins jobs to build/test using JDK7.
>> * Starting from Hadoop 2.6 we support JDK7 language/API
>>    features.
>>
>> Effectively, what we are ensuring is that Hadoop 2.5.x builds and tests with
>> JDK6 & JDK7 and that all tests towards the release
>> are done with JDK7.
>>
>> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or,
>> if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
>> (which would be quite unlikely), they can reactively upgrade to JDK7.
>>
>> Thoughts?
>>
>>
>> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> On dependencies, we've bumped library versions when we think it's safe and
>>> the APIs in the new version are compatible, or when the library is not
>>> leaked to the app classpath (e.g. the JUnit version bump). I think the JIRAs
>>> Arun mentioned fall into one of those categories. Steve can do a better job
>>> of explaining this than me, but we haven't bumped things like Jetty or Guava
>>> because they are on the classpath and their new versions are not compatible.
>>> There is this line in the compat guidelines:
>>>
>>>     - Existing MapReduce, YARN & HDFS applications and frameworks should
>>>     work unmodified within a major release i.e. Apache Hadoop ABI is
>>>     supported.
>>>
>>> Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
>>> is effectively part of our API. I'm sure there are user apps out there that
>>> will break if we make incompatible changes to the classpath. I haven't read
>>> up on the MR JIRA Arun mentioned, but MR isn't the only YARN app out there.
>>>
>>> Sticking to the theme of "work unmodified", let's think about the user
>>> effort required to upgrade their JDK. This can be a very expensive task. It
>>> might need approval up and down the org, meaning lots of certification,
>>> testing, and signoff. Considering the amount of user effort involved here,
>>> it really seems like dropping a JDK is something that should only happen in
>>> a major release. Else, there's the potential for nasty surprises in a
>>> supposedly "minor" release.
>>>
>>> That said, we are in an unhappy place right now regarding JDK6, and it's
>>> true that almost everyone's moved off of JDK6 at this point. So, I'd be
>>> okay with an intermediate 2.x release that drops JDK6 support (but no
>>> incompatible changes to the classpath like Guava). This is basically free,
>>> and we could start using JDK7 idioms like multi-catch and new NIO stuff in
>>> Hadoop code (a minor draw I guess).
>>>
>>> My higher-level goal though is to avoid going through this same pain again
>>> when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
>>> this reason. This is why I suggested skipping an intermediate 2.x+JDK7
>>> release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
>>> the future, and it seems like a better place to focus our efforts. I was
>>> also hoping it'd be realistic to fix our classpath leakage by then, since
>>> then we'd have a nice, tight, future-proofed new major release.
>>>
>>> Thanks,
>>> Andrew
>>>
>>>
>>>
>>>
>>> On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
>>> wrote:
>>>
>>>> Andrew,
>>>>
>>>>   Thanks for starting this thread. I'll edit the wiki to provide more
>>>> context around rolling-upgrades etc. which, as I pointed out in the
>>>> original thread, are key IMHO.
>>>>
>>>> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
>>>> wrote:
>>>>> https://wiki.apache.org/hadoop/MovingToJdk7and8
>>>>>
>>>>> I think based on our current compatibility guidelines, Proposal A is the
>>>>> most attractive. We're pretty hamstrung by the requirement to keep the
>>>>> classpath the same, which would be solved by either OSGI or shading our
>>>>> deps (but that's a different discussion).
>>>>
>>>> I don't see that anywhere in our current compatibility guidelines.
>>>>
>>>> As you can see from
>>>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> we do not have such a policy (pasted here for convenience):
>>>>
>>>> Java Classpath
>>>>
>>>> User applications built against Hadoop might add all Hadoop jars
>>>> (including Hadoop's library dependencies) to the application's classpath.
>>>> Adding new dependencies or updating the version of existing dependencies
>>>> may interfere with those in applications' classpaths.
>>>>
>>>> Policy
>>>>
>>>> Currently, there is NO policy on when Hadoop's dependencies can change.
>>>>
>>>> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
>>>> as I pointed out in the previous thread, here is the precedent:
>>>>
>>>> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>>>>
>>>>> Also, this is something we already have done i.e. we updated some of our
>>>>> software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
>>>>> dramatic as JDK. Here are some examples:
>>>>> https://issues.apache.org/jira/browse/HADOOP-9991
>>>>> https://issues.apache.org/jira/browse/HADOOP-10102
>>>>> https://issues.apache.org/jira/browse/HADOOP-10103
>>>>> https://issues.apache.org/jira/browse/HADOOP-10104
>>>>> https://issues.apache.org/jira/browse/HADOOP-10503
>>>>
>>>> thanks,
>>>> Arun
>>>>
>>>
>>
>>
>>
>> --
>> Alejandro
>>
>
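
For readers unfamiliar with the "JDK7 idioms like multi-catch and new NIO stuff" mentioned in the quoted discussion above, here is a minimal sketch of what such code looks like. It is not taken from Hadoop: the class Jdk7Idioms and its methods are hypothetical and only show language and library features that a JDK6 compiler would reject (try-with-resources, multi-catch, java.nio.file.Files, StandardCharsets).

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration of JDK7-only idioms; none of this compiles on JDK6.
public class Jdk7Idioms {

    /** Writes one line to a temp file and reads it back using java.nio.file. */
    static List<String> roundTrip(String line) throws IOException {
        Path tmp = Files.createTempFile("jdk7-idioms", ".txt");
        // try-with-resources closes the writer even if write() throws.
        try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            out.write(line);
            out.newLine();
        }
        try {
            return Files.readAllLines(tmp, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    /** Multi-catch: one handler for two unrelated exception types. */
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return fallback;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello from JDK7"));
        System.out.println(parseOrDefault(" 42 ", -1));
        System.out.println(parseOrDefault(null, -1));
    }
}
```

A nightly JDK6 build of a release branch, as proposed in the quoted plan, would fail on code like this, which is how it keeps such features from creeping in before the minimum JDK is raised.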


Re: Moving to JDK7, JDK8 and new major releases

Posted by Akira AJISAKA <aj...@oss.nttdata.co.jp>.
+1 (non-binding) for 2.5 to be the last release to ensure JDK6.

 >>> My higher-level goal though is to avoid going through this same pain
 >>> again when JDK7 goes EOL. I'd like to do a JDK8-based release
 >>> before then for this reason. This is why I suggested skipping an
 >>> intermediate 2.x+JDK7 release and leapfrogging to 3.0+JDK8.

I'm thinking skipping an intermediate release and leapfrogging to 3.0 
makes it difficult to maintain branch-2. It's only about a half year 
from 2.2 GA, so we should maintain branch-2 and create bug-fix releases 
for long-term even if 3.0+JDK8 is released.

Thanks,
Akira

(2014/06/24 17:56), Steve Loughran wrote:
> +1, though I think 2.5 may be premature if we want to send a warning note
> "last ever". That's an issue for followon "when in branch 2".
>
> Guava and protobuf.jar are two things we have to leave alone, with the
> first being unfortunate, but their attitude to updates is pretty dramatic.
> The latter? We all know how traumatic that can be.
>
> -Steve
>
>
> On 24 June 2014 16:44, Alejandro Abdelnur <tu...@cloudera.com> wrote:
>
>> After reading this thread and thinking a bit about it, I think it should be
>> OK such move up to JDK7 in Hadoop 2 for the following reasons:
>>
>> * Existing Hadoop 2 releases and related projects are running
>>    on JDK7 in production.
>> * Commercial vendors of Hadoop have already done lot of
>>    work to ensure Hadoop on JDK7 works while keeping Hadoop
>>    on JDK6 working.
>> * Different from many of the 3rd party libraries used by Hadoop,
>>    JDK is much stricter on backwards compatibility.
>>
>> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
>> party dependencies and for moving from JDK7 to JDK8 (though it could OK for
>> the later if we end up in the same state of affairs)
>>
>> Even for Hadoop 2.5, I think we could do the move:
>>
>> * Create the Hadoop 2.5 release branch.
>> * Have one nightly Jenkins job that builds Hadoop 2.5 branch
>>    with JDK6 to ensure not JDK7 language/API  feature creeps
>>    out in Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
>> * Sanity tests for the Hadoop 2.5.x releases should be done
>>    with JDK7.
>> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
>> * Move all Apache Jenkins jobs to build/test using JDK7.
>> * Starting from Hadoop 2.6 we support JDK7 language/API
>>    features.
>>
>> Effectively, what we are ensuring is that Hadoop 2.5.x builds and tests with
>> JDK6 & JDK7 and that all tests towards the release
>> are done with JDK7.
>>
>> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
>> if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
>> (which would be quite unlikely) they can reactively upgrade to JDK7.
>>
>> Thoughts?
>>
>>
>> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> On dependencies, we've bumped library versions when we think it's safe
>> and
>>> the APIs in the new version are compatible. Or, it's not leaked to the
>> app
>>> classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
>>> fall into one of those categories. Steve can do a better job explaining
>>> this to me, but we haven't bumped things like Jetty or Guava because they
>>> are on the classpath and are not compatible. There is this line in the
>>> compat guidelines:
>>>
>>>     - Existing MapReduce, YARN & HDFS applications and frameworks should
>>>     work unmodified within a major release i.e. Apache Hadoop ABI is
>>> supported.
>>>
>>> Since Hadoop apps can and do depend on the Hadoop classpath, the
>> classpath
>>> is effectively part of our API. I'm sure there are user apps out there
>> that
>>> will break if we make incompatible changes to the classpath. I haven't
>> read
>>> up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
>> out
>>> there.
>>>
>>> Sticking to the theme of "work unmodified", let's think about the user
>>> effort required to upgrade their JDK. This can be a very expensive task.
>> It
>>> might need approval up and down the org, meaning lots of certification,
>>> testing, and signoff. Considering the amount of user effort involved
>> here,
>>> it really seems like dropping a JDK is something that should only happen
>> in
>>> a major release. Else, there's the potential for nasty surprises in a
>>> supposedly "minor" release.
>>>
>>> That said, we are in an unhappy place right now regarding JDK6, and it's
>>> true that almost everyone's moved off of JDK6 at this point. So, I'd be
>>> okay with an intermediate 2.x release that drops JDK6 support (but no
>>> incompatible changes to the classpath like Guava). This is basically
>> free,
>>> and we could start using JDK7 idioms like multi-catch and new NIO stuff
>> in
>>> Hadoop code (a minor draw I guess).
>>>
>>> My higher-level goal though is to avoid going through this same pain
>> again
>>> when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
>>> this reason. This is why I suggested skipping an intermediate 2.x+JDK7
>>> release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
>>> the future, and it seems like a better place to focus our efforts. I was
>>> also hoping it'd be realistic to fix our classpath leakage by then, since
>>> then we'd have a nice, tight, future-proofed new major release.
>>>
>>> Thanks,
>>> Andrew
>>>
>>>
>>>
>>>
>>> On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
>>> wrote:
>>>
>>>> Andrew,
>>>>
>>>>   Thanks for starting this thread. I'll edit the wiki to provide more
>>>> context around rolling-upgrades etc. which, as I pointed out in the
>>>> original thread, are key IMHO.
>>>>
>>>> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
>>>> wrote:
>>>>> https://wiki.apache.org/hadoop/MovingToJdk7and8
>>>>>
>>>>> I think based on our current compatibility guidelines, Proposal A is
>>> the
>>>>> most attractive. We're pretty hamstrung by the requirement to keep
>> the
>>>>> classpath the same, which would be solved by either OSGI or shading
>> our
>>>>> deps (but that's a different discussion).
>>>>
>>>> I don't see that anywhere in our current compatibility guidelines.
>>>>
>>>> As you can see from
>>>>
>>>
>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> we do not have such a policy (pasted here for convenience):
>>>>
>>>> Java Classpath
>>>>
>>>> User applications built against Hadoop might add all Hadoop jars
>>>> (including Hadoop's library dependencies) to the application's
>> classpath.
>>>> Adding new dependencies or updating the version of existing
>> dependencies
>>>> may interfere with those in applications' classpaths.
>>>>
>>>> Policy
>>>>
>>>> Currently, there is NO policy on when Hadoop's dependencies can change.
>>>>
>>>> Furthermore, we have *already* changed our classpath in hadoop-2.x.
>>> Again,
>>>> as I pointed out in the previous thread, here is the precedent:
>>>>
>>>> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
>> wrote:
>>>>
>>>>> Also, this is something we already have done i.e. we updated some of
>>> our
>>>> software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
>>>> dramatic as JDK. Here are some examples:
>>>>> https://issues.apache.org/jira/browse/HADOOP-9991
>>>>> https://issues.apache.org/jira/browse/HADOOP-10102
>>>>> https://issues.apache.org/jira/browse/HADOOP-10103
>>>>> https://issues.apache.org/jira/browse/HADOOP-10104
>>>>> https://issues.apache.org/jira/browse/HADOOP-10503
>>>>
>>>> thanks,
>>>> Arun
>>>> --
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or
>> entity
>>> to
>>>> which it is addressed and may contain information that is confidential,
>>>> privileged and exempt from disclosure under applicable law. If the
>> reader
>>>> of this message is not the intended recipient, you are hereby notified
>>> that
>>>> any printing, copying, dissemination, distribution, disclosure or
>>>> forwarding of this communication is strictly prohibited. If you have
>>>> received this communication in error, please contact the sender
>>> immediately
>>>> and delete it from your system. Thank You.
>>>>
>>>
>>
>>
>>
>> --
>> Alejandro
>>
>


Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
+1, though I think 2.5 may be premature if we first want to send out a
"last ever" warning note. That's a question for a follow-on thread: "when in
branch 2?".

Guava and protobuf.jar are two things we have to leave alone. Leaving the
first alone is unfortunate, but Guava's attitude to updates is pretty dramatic.
The latter? We all know how traumatic that can be.
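
To illustrate why jars like these are sensitive: an application that shares Hadoop's classpath gets each class from whichever jar is found first, so it helps to see where a class was actually loaded from. The sketch below is only a diagnostic aid built on standard JDK calls. The class name WhichJar and its default argument are made up for this example and are not part of Hadoop.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

final class WhichJar {
    /** Returns the location a class was loaded from, or a note when the JVM exposes none. */
    static String locationOf(Class<?> type) {
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = domain == null ? null : domain.getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes from the JDK's bootstrap loader carry no code source.
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Hypothetical usage: pass a fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + locationOf(Class.forName(name)));
    }
}
```

Run with a class name such as com.google.protobuf.Message (assuming protobuf is on the classpath) to see which jar supplied it.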

-Steve


On 24 June 2014 16:44, Alejandro Abdelnur <tu...@cloudera.com> wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK to make such a move up to JDK7 in Hadoop 2, for the following reasons:
>
> * Existing Hadoop 2 releases and related projects are running
>   on JDK7 in production.
> * Commercial vendors of Hadoop have already done a lot of
>   work to ensure Hadoop on JDK7 works while keeping Hadoop
>   on JDK6 working.
> * Unlike many of the 3rd party libraries used by Hadoop,
>   the JDK is much stricter about backwards compatibility.
>
> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
> party dependencies and for moving from JDK7 to JDK8 (though it could be OK
> for the latter if we end up in the same state of affairs).
>
> Even for Hadoop 2.5, I think we could do the move:
>
> * Create the Hadoop 2.5 release branch.
> * Have one nightly Jenkins job that builds the Hadoop 2.5 branch
>   with JDK6 to ensure no JDK7 language/API feature creeps
>   into Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
> * Sanity tests for the Hadoop 2.5.x releases should be done
>   with JDK7.
> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
> * Move all Apache Jenkins jobs to build/test using JDK7.
> * Starting from Hadoop 2.6 we support JDK7 language/API
>   features.
>
> Effectively, what we are ensuring is that Hadoop 2.5.x builds and tests with
> JDK6 & JDK7 and that all tests towards the release
> are done with JDK7.
>
> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
> if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
> (which would be quite unlikely) they can reactively upgrade to JDK7.
>
> Thoughts?
>
>
> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On dependencies, we've bumped library versions when we think it's safe
> and
> > the APIs in the new version are compatible. Or, it's not leaked to the
> app
> > classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
> > fall into one of those categories. Steve can do a better job explaining
> > this to me, but we haven't bumped things like Jetty or Guava because they
> > are on the classpath and are not compatible. There is this line in the
> > compat guidelines:
> >
> >    - Existing MapReduce, YARN & HDFS applications and frameworks should
> >    work unmodified within a major release i.e. Apache Hadoop ABI is
> > supported.
> >
> > Since Hadoop apps can and do depend on the Hadoop classpath, the
> classpath
> > is effectively part of our API. I'm sure there are user apps out there
> that
> > will break if we make incompatible changes to the classpath. I haven't
> read
> > up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
> out
> > there.
> >
> > Sticking to the theme of "work unmodified", let's think about the user
> > effort required to upgrade their JDK. This can be a very expensive task.
> It
> > might need approval up and down the org, meaning lots of certification,
> > testing, and signoff. Considering the amount of user effort involved
> here,
> > it really seems like dropping a JDK is something that should only happen
> in
> > a major release. Else, there's the potential for nasty surprises in a
> > supposedly "minor" release.
> >
> > That said, we are in an unhappy place right now regarding JDK6, and it's
> > true that almost everyone's moved off of JDK6 at this point. So, I'd be
> > okay with an intermediate 2.x release that drops JDK6 support (but no
> > incompatible changes to the classpath like Guava). This is basically
> free,
> > and we could start using JDK7 idioms like multi-catch and new NIO stuff
> in
> > Hadoop code (a minor draw I guess).
> >
> > My higher-level goal though is to avoid going through this same pain
> again
> > when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> > this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> > release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> > the future, and it seems like a better place to focus our efforts. I was
> > also hoping it'd be realistic to fix our classpath leakage by then, since
> > then we'd have a nice, tight, future-proofed new major release.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> >
> > On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> > wrote:
> >
> > > Andrew,
> > >
> > >  Thanks for starting this thread. I'll edit the wiki to provide more
> > > context around rolling-upgrades etc. which, as I pointed out in the
> > > original thread, are key IMHO.
> > >
> > > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > > wrote:
> > > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > > >
> > > > I think based on our current compatibility guidelines, Proposal A is
> > the
> > > > most attractive. We're pretty hamstrung by the requirement to keep
> the
> > > > classpath the same, which would be solved by either OSGI or shading
> our
> > > > deps (but that's a different discussion).
> > >
> > > I don't see that anywhere in our current compatibility guidelines.
> > >
> > > As you can see from
> > >
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > > we do not have such a policy (pasted here for convenience):
> > >
> > > Java Classpath
> > >
> > > User applications built against Hadoop might add all Hadoop jars
> > > (including Hadoop's library dependencies) to the application's
> classpath.
> > > Adding new dependencies or updating the version of existing
> dependencies
> > > may interfere with those in applications' classpaths.
> > >
> > > Policy
> > >
> > > Currently, there is NO policy on when Hadoop's dependencies can change.
> > >
> > > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> > Again,
> > > as I pointed out in the previous thread, here is the precedent:
> > >
> > > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
> > >
> > > > Also, this is something we already have done i.e. we updated some of
> > our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > > https://issues.apache.org/jira/browse/HADOOP-10503
> > >
> > > thanks,
> > > Arun
> > > --
> > > CONFIDENTIALITY NOTICE
> > > NOTICE: This message is intended for the use of the individual or
> entity
> > to
> > > which it is addressed and may contain information that is confidential,
> > > privileged and exempt from disclosure under applicable law. If the
> reader
> > > of this message is not the intended recipient, you are hereby notified
> > that
> > > any printing, copying, dissemination, distribution, disclosure or
> > > forwarding of this communication is strictly prohibited. If you have
> > > received this communication in error, please contact the sender
> > immediately
> > > and delete it from your system. Thank You.
> > >
> >
>
>
>
> --
> Alejandro
>
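
For readers unfamiliar with the JDK7 idioms named in the quoted message (multi-catch and the new NIO file API), here is a minimal sketch. It is not taken from Hadoop code: the class name Jdk7Idioms, the readPort helper, and the value 8020 are arbitrary choices for illustration. It needs JDK7 or later to compile.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Jdk7Idioms {
    /** Reads an integer from the first line of a file, returning fallback on any failure. */
    static int readPort(Path file, int fallback) {
        // try-with-resources (JDK7) closes the reader automatically.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line = reader.readLine();
            return Integer.parseInt(line == null ? "" : line.trim());
        } catch (IOException | NumberFormatException e) {
            // Multi-catch (JDK7): one handler for two unrelated exception types.
            return fallback;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("port", ".txt"); // java.nio.file, new in JDK7
        try {
            Files.write(file, "8020\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(readPort(file, -1));
        } finally {
            Files.deleteIfExists(file);
        }
        // The file is gone now, so the fallback value is returned.
        System.out.println(readPort(file, -1));
    }
}
```

On JDK6 the same method would need a nested try/finally to close the reader and two separate catch blocks.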


Re: Moving to JDK7, JDK8 and new major releases

Posted by Arun Murthy <ac...@hortonworks.com>.
Alejandro,


On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK to make such a move up to JDK7 in Hadoop 2, for the following reasons:
>
> * Existing Hadoop 2 releases and related projects are running
>   on JDK7 in production.
> * Commercial vendors of Hadoop have already done a lot of
>   work to ensure Hadoop on JDK7 works while keeping Hadoop
>   on JDK6 working.
> * Unlike many of the 3rd party libraries used by Hadoop,
>   the JDK is much stricter about backwards compatibility.
>

+1 - I think we are all on the same page here. Fully agree.


>
> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
> party dependencies and for moving from JDK7 to JDK8 (though it could be OK
> for the latter if we end up in the same state of affairs).
>

+1. Agree again - let's just wait/watch.

From the thread I've become more convinced (as you've noted before)
that since we are at the bottom of the stack, we need to be more
conservative.

From http://www.oracle.com/technetwork/java/eol-135779.html, it looks like
April 2015 is the *earliest* Java7 will EOL. Java6 EOL was Feb 2011 and we
are still debating whether we can stop supporting it. So, my guess is that
we will support Java7 for at least a year after its EOL, i.e. until sometime
in 2016. It's just practical.

Net - We really don't have a good idea when a significant portion of users
will actually migrate to Java 8. W.r.t Java7 this took nearly 3 years after
Java6 EOL. So for now, let's just wait & see how things develop in the
field.
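
The timeline arithmetic above can be checked with a small sketch. The dates are the ones stated in this message plus the date of this thread (June 2014). The class and method names are hypothetical, and java.time requires Java 8, so this is for illustration only.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

final class EolTimeline {
    /** Whole months elapsed between two year-month points. */
    static long monthsBetween(YearMonth from, YearMonth to) {
        return ChronoUnit.MONTHS.between(from, to);
    }

    /** The month until which a JDK stays supported, given a grace period after its EOL. */
    static YearMonth supportedUntil(YearMonth eol, int graceYears) {
        return eol.plusYears(graceYears);
    }

    public static void main(String[] args) {
        YearMonth java6Eol = YearMonth.of(2011, 2);  // Java6 EOL, per the message
        YearMonth thread = YearMonth.of(2014, 6);    // date of this discussion
        YearMonth java7Eol = YearMonth.of(2015, 4);  // earliest Java7 EOL, per the message

        System.out.println("Months since Java6 EOL: " + monthsBetween(java6Eol, thread));
        System.out.println("Java7 supported until at least: " + supportedUntil(java7Eol, 1));
    }
}
```

Java6 was already 40 months past EOL while this debate was still open, and one year of grace after the earliest Java7 EOL lands in April 2016.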


> Even for Hadoop 2.5, I think we could do the move:
>
> * Create the Hadoop 2.5 release branch.
> * Have one nightly Jenkins job that builds the Hadoop 2.5 branch
>   with JDK6 to ensure no JDK7 language/API feature creeps
>   into Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
> * Sanity tests for the Hadoop 2.5.x releases should be done
>   with JDK7.
> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
> * Move all Apache Jenkins jobs to build/test using JDK7.
> * Starting from Hadoop 2.6 we support JDK7 language/API
>   features.
>
>
The mechanics make perfect sense to me. We should probably
think a bit more about whether we drop support for JDK6 in hadoop-2.6 or
hadoop-2.7.
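
The nightly JDK6 build in the quoted plan catches JDK7 language and API use at compile time. A complementary check, offered here only as an assumption and not as part of the proposal, is to inspect the class-file major version of built artifacts: 50 means Java 6 and 51 means Java 7. The names ClassVersionCheck, majorVersion and loadableOnJava6 are hypothetical. The sketch is itself written in JDK6-compatible style.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ClassVersionCheck {
    /** Class-file major version produced when targeting Java 6. */
    static final int JAVA6_MAJOR = 50;

    /** Reads the class-file header and returns its major version number. */
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        data.readUnsignedShort(); // minor version, not needed here
        return data.readUnsignedShort();
    }

    /** True when a Java 6 JVM would accept this class-file version. */
    static boolean loadableOnJava6(InputStream in) throws IOException {
        return majorVersion(in) <= JAVA6_MAJOR;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: pass paths to .class files as arguments.
        for (String path : args) {
            InputStream in = new FileInputStream(path);
            try {
                int major = majorVersion(in);
                System.out.println(path + ": major " + major
                        + (major <= JAVA6_MAJOR ? " (ok for Java 6)" : " (needs a newer JVM)"));
            } finally {
                in.close();
            }
        }
    }
}
```

This only checks the bytecode target. Code compiled with a Java 6 target can still call JDK7-only APIs, which is why actually building with JDK6 remains the stronger guard.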

I'd like to add one more:
* Sometime soon (within a release or two) after we actually drop support
for Java6 and move branch-2 to JDK7, let's also start testing on Java8.

This way we will be ready for Java8 early regardless of when we stop
support for Java7. Dropping Java7 is a bridge we can cross when we come to
it.


thanks,
Arun


> Effectively, what we are ensuring is that Hadoop 2.5.x builds and tests with
> JDK6 & JDK7 and that all tests towards the release
> are done with JDK7.
>
> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
> if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
> (which would be quite unlikely) they can reactively upgrade to JDK7.
>
> Thoughts?
>
>
> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On dependencies, we've bumped library versions when we think it's safe
> and
> > the APIs in the new version are compatible. Or, it's not leaked to the
> app
> > classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
> > fall into one of those categories. Steve can do a better job explaining
> > this to me, but we haven't bumped things like Jetty or Guava because they
> > are on the classpath and are not compatible. There is this line in the
> > compat guidelines:
> >
> >    - Existing MapReduce, YARN & HDFS applications and frameworks should
> >    work unmodified within a major release i.e. Apache Hadoop ABI is
> > supported.
> >
> > Since Hadoop apps can and do depend on the Hadoop classpath, the
> classpath
> > is effectively part of our API. I'm sure there are user apps out there
> that
> > will break if we make incompatible changes to the classpath. I haven't
> read
> > up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
> out
> > there.
> >
> > Sticking to the theme of "work unmodified", let's think about the user
> > effort required to upgrade their JDK. This can be a very expensive task.
> It
> > might need approval up and down the org, meaning lots of certification,
> > testing, and signoff. Considering the amount of user effort involved
> here,
> > it really seems like dropping a JDK is something that should only happen
> in
> > a major release. Else, there's the potential for nasty surprises in a
> > supposedly "minor" release.
> >
> > That said, we are in an unhappy place right now regarding JDK6, and it's
> > true that almost everyone's moved off of JDK6 at this point. So, I'd be
> > okay with an intermediate 2.x release that drops JDK6 support (but no
> > incompatible changes to the classpath like Guava). This is basically
> free,
> > and we could start using JDK7 idioms like multi-catch and new NIO stuff
> in
> > Hadoop code (a minor draw I guess).
> >
> > My higher-level goal though is to avoid going through this same pain
> again
> > when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> > this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> > release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> > the future, and it seems like a better place to focus our efforts. I was
> > also hoping it'd be realistic to fix our classpath leakage by then, since
> > then we'd have a nice, tight, future-proofed new major release.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> >
> > On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> > wrote:
> >
> > > Andrew,
> > >
> > >  Thanks for starting this thread. I'll edit the wiki to provide more
> > > context around rolling-upgrades etc. which, as I pointed out in the
> > > original thread, are key IMHO.
> > >
> > > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > > wrote:
> > > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > > >
> > > > I think based on our current compatibility guidelines, Proposal A is
> > the
> > > > most attractive. We're pretty hamstrung by the requirement to keep
> the
> > > > classpath the same, which would be solved by either OSGI or shading
> our
> > > > deps (but that's a different discussion).
> > >
> > > I don't see that anywhere in our current compatibility guidelines.
> > >
> > > As you can see from
> > >
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > > we do not have such a policy (pasted here for convenience):
> > >
> > > Java Classpath
> > >
> > > User applications built against Hadoop might add all Hadoop jars
> > > (including Hadoop's library dependencies) to the application's
> classpath.
> > > Adding new dependencies or updating the version of existing
> dependencies
> > > may interfere with those in applications' classpaths.
> > >
> > > Policy
> > >
> > > Currently, there is NO policy on when Hadoop's dependencies can change.
> > >
> > > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> > Again,
> > > as I pointed out in the previous thread, here is the precedent:
> > >
> > > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
> > >
> > > > Also, this is something we already have done i.e. we updated some of
> > our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > > https://issues.apache.org/jira/browse/HADOOP-10503
> > >
> > > thanks,
> > > Arun
> > > --
> > > CONFIDENTIALITY NOTICE
> > > NOTICE: This message is intended for the use of the individual or
> entity
> > to
> > > which it is addressed and may contain information that is confidential,
> > > privileged and exempt from disclosure under applicable law. If the
> reader
> > > of this message is not the intended recipient, you are hereby notified
> > that
> > > any printing, copying, dissemination, distribution, disclosure or
> > > forwarding of this communication is strictly prohibited. If you have
> > > received this communication in error, please contact the sender
> > immediately
> > > and delete it from your system. Thank You.
> > >
> >
>
>
>
> --
> Alejandro
>



--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


Re: Moving to JDK7, JDK8 and new major releases

Posted by Arun Murthy <ac...@hortonworks.com>.
Alejandro,


On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK such move up to JDK7 in Hadoop 2 for the following reasons:
>
> * Existing Hadoop 2 releases and related projects are running
>   on JDK7 in production.
> * Commercial vendors of Hadoop have already done lot of
>   work to ensure Hadoop on JDK7 works while keeping Hadoop
>   on JDK6 working.
> * Different from many of the 3rd party libraries used by Hadoop,
>   JDK is much stricter on backwards compatibility.
>

+1 - I think we are all on the same page here. Fully agree.


>
> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
> party dependencies and for moving from JDK7 to JDK8 (though it could OK for
> the later if we end up in the same state of affairs)
>

+1. Agree again - let's just wait/watch.

From the thread I've become more convinced that (as you've noted before),
since we are at the bottom of the stack, we need to be more
conservative.

From http://www.oracle.com/technetwork/java/eol-135779.html, it looks like
April 2015 is the *earliest* Java7 will EOL. Java6 EOL was Feb 2011 and we
are still debating whether we can stop supporting it. So, my guess is that
we will support Java7 for at least a year after its EOL, i.e. till sometime
in early 2016. It's just practical.

Net - We really don't have a good idea when a significant portion of users
will actually migrate to Java 8. W.r.t Java7 this took nearly 3 years after
Java6 EOL. So for now, let's just wait & see how things develop in the
field.


> Even for Hadoop 2.5, I think we could do the move:
>
> * Create the Hadoop 2.5 release branch.
> * Have one nightly Jenkins job that builds Hadoop 2.5 branch
>   with JDK6 to ensure not JDK7 language/API  feature creeps
>   out in Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
> * Sanity tests for the Hadoop 2.5.x releases should be done
>   with JDK7.
> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
> * Move all Apache Jenkins jobs to build/test using JDK7.
> * Starting from Hadoop 2.6 we support JDK7 language/API
>   features.
>
>
The mechanics make perfect sense to me. We should probably think a bit
more about whether we drop support for JDK6 in hadoop-2.6 or
hadoop-2.7.
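
To make the "feature creep" concern concrete, here is a minimal sketch of the kind of code a JDK6 nightly build would reject. It is only an illustration: the class and method names (Jdk7Idioms, countLines, parseOrDefault) are made up for this example and are not Hadoop code. It uses three things that arrived with Java 7: try-with-resources, multi-catch, and the java.nio.file API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; not part of Hadoop.
final class Jdk7Idioms {

    // try-with-resources and java.nio.file.Files are both Java 7 additions,
    // so a JDK6 compiler fails on this method.
    static int countLines(Path file) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    // Multi-catch is Java 7 syntax: one handler for two exception types.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return fallback;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("jdk7-idioms", ".txt");
        try {
            Files.write(tmp, "a\nb\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countLines(tmp));
            System.out.println(parseOrDefault(null, -1));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

If something like this landed on the 2.5 branch, the JDK7 jobs would stay green and only the JDK6 job would fail, which is the reason for keeping that one job around for the 2.5.x line.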

I'd like to add one more:
* Sometime soon (within a release or two) after we actually drop support
for Java6 and move branch-2 to JDK7, let's also start testing on Java8.

This way we will be ready for Java8 early regardless of when we stop
support for Java7. Dropping Java7 is a bridge we can cross when we come to
it.


thanks,
Arun


> Effectively what we are ensuring that Hadoop 2.5.x builds and test with
> JDK6 & JDK7 and that all tests towards the release
> are done with JDK7.
>
> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
> if upgrade to Hadoop 2.5.x and they run into any issue because of JDK6
> (which it would be quite unlikely) they can reactively upgrade to JDK7.
>
> Thoughts?
>
>
> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On dependencies, we've bumped library versions when we think it's safe
> and
> > the APIs in the new version are compatible. Or, it's not leaked to the
> app
> > classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
> > fall into one of those categories. Steve can do a better job explaining
> > this to me, but we haven't bumped things like Jetty or Guava because they
> > are on the classpath and are not compatible. There is this line in the
> > compat guidelines:
> >
> >    - Existing MapReduce, YARN & HDFS applications and frameworks should
> >    work unmodified within a major release i.e. Apache Hadoop ABI is
> > supported.
> >
> > Since Hadoop apps can and do depend on the Hadoop classpath, the
> classpath
> > is effectively part of our API. I'm sure there are user apps out there
> that
> > will break if we make incompatible changes to the classpath. I haven't
> read
> > up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
> out
> > there.
> >
> > Sticking to the theme of "work unmodified", let's think about the user
> > effort required to upgrade their JDK. This can be a very expensive task.
> It
> > might need approval up and down the org, meaning lots of certification,
> > testing, and signoff. Considering the amount of user effort involved
> here,
> > it really seems like dropping a JDK is something that should only happen
> in
> > a major release. Else, there's the potential for nasty surprises in a
> > supposedly "minor" release.
> >
> > That said, we are in an unhappy place right now regarding JDK6, and it's
> > true that almost everyone's moved off of JDK6 at this point. So, I'd be
> > okay with an intermediate 2.x release that drops JDK6 support (but no
> > incompatible changes to the classpath like Guava). This is basically
> free,
> > and we could start using JDK7 idioms like multi-catch and new NIO stuff
> in
> > Hadoop code (a minor draw I guess).
> >
> > My higher-level goal though is to avoid going through this same pain
> again
> > when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> > this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> > release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> > the future, and it seems like a better place to focus our efforts. I was
> > also hoping it'd be realistic to fix our classpath leakage by then, since
> > then we'd have a nice, tight, future-proofed new major release.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> >
> > On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> > wrote:
> >
> > > Andrew,
> > >
> > >  Thanks for starting this thread. I'll edit the wiki to provide more
> > > context around rolling-upgrades etc. which, as I pointed out in the
> > > original thread, are key IMHO.
> > >
> > > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > > wrote:
> > > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > > >
> > > > I think based on our current compatibility guidelines, Proposal A is
> > the
> > > > most attractive. We're pretty hamstrung by the requirement to keep
> the
> > > > classpath the same, which would be solved by either OSGI or shading
> our
> > > > deps (but that's a different discussion).
> > >
> > > I don't see that anywhere in our current compatibility guidelines.
> > >
> > > As you can see from
> > >
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > > we do not have such a policy (pasted here for convenience):
> > >
> > > Java Classpath
> > >
> > > User applications built against Hadoop might add all Hadoop jars
> > > (including Hadoop's library dependencies) to the application's
> classpath.
> > > Adding new dependencies or updating the version of existing
> dependencies
> > > may interfere with those in applications' classpaths.
> > >
> > > Policy
> > >
> > > Currently, there is NO policy on when Hadoop's dependencies can change.
> > >
> > > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> > Again,
> > > as I pointed out in the previous thread, here is the precedent:
> > >
> > > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
> > >
> > > > Also, this is something we already have done i.e. we updated some of
> > our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > > https://issues.apache.org/jira/browse/HADOOP-10503
> > >
> > > thanks,
> > > Arun
> > >
> >
>
>
>
> --
> Alejandro
>



-- 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
+1, though I think 2.5 may be premature if we want to send a warning note
"last ever". That's an issue for follow-on "when in branch 2".

Guava and protobuf.jar are two things we have to leave alone, with the
first being unfortunate, but their attitude to updates is pretty dramatic.
The latter? We all know how traumatic that can be.

-Steve


On 24 June 2014 16:44, Alejandro Abdelnur <tu...@cloudera.com> wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK such move up to JDK7 in Hadoop 2 for the following reasons:
>
> * Existing Hadoop 2 releases and related projects are running
>   on JDK7 in production.
> * Commercial vendors of Hadoop have already done lot of
>   work to ensure Hadoop on JDK7 works while keeping Hadoop
>   on JDK6 working.
> * Different from many of the 3rd party libraries used by Hadoop,
>   JDK is much stricter on backwards compatibility.
>
> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
> party dependencies and for moving from JDK7 to JDK8 (though it could OK for
> the later if we end up in the same state of affairs)
>
> Even for Hadoop 2.5, I think we could do the move:
>
> * Create the Hadoop 2.5 release branch.
> * Have one nightly Jenkins job that builds Hadoop 2.5 branch
>   with JDK6 to ensure not JDK7 language/API  feature creeps
>   out in Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
> * Sanity tests for the Hadoop 2.5.x releases should be done
>   with JDK7.
> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
> * Move all Apache Jenkins jobs to build/test using JDK7.
> * Starting from Hadoop 2.6 we support JDK7 language/API
>   features.
>
> Effectively what we are ensuring that Hadoop 2.5.x builds and test with
> JDK6 & JDK7 and that all tests towards the release
> are done with JDK7.
>
> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
> if upgrade to Hadoop 2.5.x and they run into any issue because of JDK6
> (which it would be quite unlikely) they can reactively upgrade to JDK7.
>
> Thoughts?
>
>
> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On dependencies, we've bumped library versions when we think it's safe
> and
> > the APIs in the new version are compatible. Or, it's not leaked to the
> app
> > classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
> > fall into one of those categories. Steve can do a better job explaining
> > this to me, but we haven't bumped things like Jetty or Guava because they
> > are on the classpath and are not compatible. There is this line in the
> > compat guidelines:
> >
> >    - Existing MapReduce, YARN & HDFS applications and frameworks should
> >    work unmodified within a major release i.e. Apache Hadoop ABI is
> > supported.
> >
> > Since Hadoop apps can and do depend on the Hadoop classpath, the
> classpath
> > is effectively part of our API. I'm sure there are user apps out there
> that
> > will break if we make incompatible changes to the classpath. I haven't
> read
> > up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
> out
> > there.
> >
> > Sticking to the theme of "work unmodified", let's think about the user
> > effort required to upgrade their JDK. This can be a very expensive task.
> It
> > might need approval up and down the org, meaning lots of certification,
> > testing, and signoff. Considering the amount of user effort involved
> here,
> > it really seems like dropping a JDK is something that should only happen
> in
> > a major release. Else, there's the potential for nasty surprises in a
> > supposedly "minor" release.
> >
> > That said, we are in an unhappy place right now regarding JDK6, and it's
> > true that almost everyone's moved off of JDK6 at this point. So, I'd be
> > okay with an intermediate 2.x release that drops JDK6 support (but no
> > incompatible changes to the classpath like Guava). This is basically
> free,
> > and we could start using JDK7 idioms like multi-catch and new NIO stuff
> in
> > Hadoop code (a minor draw I guess).
> >
> > My higher-level goal though is to avoid going through this same pain
> again
> > when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> > this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> > release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> > the future, and it seems like a better place to focus our efforts. I was
> > also hoping it'd be realistic to fix our classpath leakage by then, since
> > then we'd have a nice, tight, future-proofed new major release.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> >
> > On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> > wrote:
> >
> > > Andrew,
> > >
> > >  Thanks for starting this thread. I'll edit the wiki to provide more
> > > context around rolling-upgrades etc. which, as I pointed out in the
> > > original thread, are key IMHO.
> > >
> > > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > > wrote:
> > > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > > >
> > > > I think based on our current compatibility guidelines, Proposal A is
> > the
> > > > most attractive. We're pretty hamstrung by the requirement to keep
> the
> > > > classpath the same, which would be solved by either OSGI or shading
> our
> > > > deps (but that's a different discussion).
> > >
> > > I don't see that anywhere in our current compatibility guidelines.
> > >
> > > As you can see from
> > >
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > > we do not have such a policy (pasted here for convenience):
> > >
> > > Java Classpath
> > >
> > > User applications built against Hadoop might add all Hadoop jars
> > > (including Hadoop's library dependencies) to the application's
> classpath.
> > > Adding new dependencies or updating the version of existing
> dependencies
> > > may interfere with those in applications' classpaths.
> > >
> > > Policy
> > >
> > > Currently, there is NO policy on when Hadoop's dependencies can change.
> > >
> > > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> > Again,
> > > as I pointed out in the previous thread, here is the precedent:
> > >
> > > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
> > >
> > > > Also, this is something we already have done i.e. we updated some of
> > our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > > https://issues.apache.org/jira/browse/HADOOP-10503
> > >
> > > thanks,
> > > Arun
> > >
> >
>
>
>
> --
> Alejandro
>


Re: Moving to JDK7, JDK8 and new major releases

Posted by Owen O'Malley <om...@apache.org>.
On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK such move up to JDK7 in Hadoop


I agree with Alejandro. Changing the minimum JDK is not an incompatible change
and is fine in the 2 branch. (Although I think it would *not* be
appropriate for a patch release.) Of course we need to do it with
forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
thing. Moving to Java 8 as a minimum seems much too aggressive and I would
push back on that.
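
For anyone wondering what a "minimum JDK" check could look like at runtime, here is a rough sketch. It is an assumption-laden illustration, not Steve's patch and not anything in the Hadoop tree: the class MinimumJavaVersion and its methods are hypothetical. The only real API it relies on is the standard java.specification.version system property, which reports "1.6", "1.7" or "1.8" on the releases discussed here and a plain number such as "9" or "11" on later ones.

```java
// Hypothetical helper; not part of Hadoop.
final class MinimumJavaVersion {

    // Converts a java.specification.version value into a major version number.
    // Legacy releases report "1.6", "1.7", "1.8"; Java 9 and later report "9", "11", ...
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // Fails fast when the running JVM is older than the required major version.
    static void requireAtLeast(int requiredMajor) {
        String spec = System.getProperty("java.specification.version");
        int actual = majorVersion(spec);
        if (actual < requiredMajor) {
            throw new IllegalStateException(
                "Java " + requiredMajor + " or later is required, but this JVM reports " + spec);
        }
    }

    public static void main(String[] args) {
        requireAtLeast(7);
        System.out.println("Running on Java "
            + majorVersion(System.getProperty("java.specification.version")));
    }
}
```

A check like this only reports the problem earlier and more clearly; it does not change whether raising the minimum is acceptable in a minor release, which is the actual question in this thread.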

I also think that we need to let the dust settle on the Hadoop 2 line for
a while before we talk about Hadoop 3. It seems that it has only been in
the last 6 months that Hadoop 2 adoption has reached mainstream users.
Our user community needs time to digest the changes in Hadoop 2.x before we
fracture the community by starting to discuss Hadoop 3 releases.

.. Owen

Re: Moving to JDK7, JDK8 and new major releases

Posted by Arun Murthy <ac...@hortonworks.com>.
Alejandro,


On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK such move up to JDK7 in Hadoop 2 for the following reasons:
>
> * Existing Hadoop 2 releases and related projects are running
>   on JDK7 in production.
> * Commercial vendors of Hadoop have already done lot of
>   work to ensure Hadoop on JDK7 works while keeping Hadoop
>   on JDK6 working.
> * Different from many of the 3rd party libraries used by Hadoop,
>   JDK is much stricter on backwards compatibility.
>

+1 - I think we are all on the same page here. Fully agree.


>
> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
> party dependencies and for moving from JDK7 to JDK8 (though it could OK for
> the later if we end up in the same state of affairs)
>

+1. Agree again - let's just wait/watch.

From the thread I've become more convinced that (as you've noted before),
since we are at the bottom of the stack, we need to be more
conservative.

From http://www.oracle.com/technetwork/java/eol-135779.html, it looks like
April 2015 is the *earliest* Java7 will EOL. Java6 EOL was Feb 2011 and we
are still debating whether we can stop supporting it. So, my guess is that
we will support Java7 for at least a year after its EOL, i.e. till sometime
in early 2016. It's just practical.

Net - We really don't have a good idea when a significant portion of users
will actually migrate to Java 8. W.r.t Java7 this took nearly 3 years after
Java6 EOL. So for now, let's just wait & see how things develop in the
field.


> Even for Hadoop 2.5, I think we could do the move:
>
> * Create the Hadoop 2.5 release branch.
> * Have one nightly Jenkins job that builds Hadoop 2.5 branch
>   with JDK6 to ensure not JDK7 language/API  feature creeps
>   out in Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
> * Sanity tests for the Hadoop 2.5.x releases should be done
>   with JDK7.
> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
> * Move all Apache Jenkins jobs to build/test using JDK7.
> * Starting from Hadoop 2.6 we support JDK7 language/API
>   features.
>
>
The mechanics make perfect sense to me. We should probably think a bit
more about whether we drop support for JDK6 in hadoop-2.6 or
hadoop-2.7.

I'd like to add one more:
* Sometime soon (within a release or two) after we actually drop support
for Java6 and move branch-2 to JDK7, let's also start testing on Java8.

This way we will be ready for Java8 early regardless of when we stop
support for Java7. Dropping Java7 is a bridge we can cross when we come to
it.


thanks,
Arun


> Effectively what we are ensuring that Hadoop 2.5.x builds and test with
> JDK6 & JDK7 and that all tests towards the release
> are done with JDK7.
>
> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
> if upgrade to Hadoop 2.5.x and they run into any issue because of JDK6
> (which it would be quite unlikely) they can reactively upgrade to JDK7.
>
> Thoughts?
>
>
> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On dependencies, we've bumped library versions when we think it's safe
> and
> > the APIs in the new version are compatible. Or, it's not leaked to the
> app
> > classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
> > fall into one of those categories. Steve can do a better job explaining
> > this to me, but we haven't bumped things like Jetty or Guava because they
> > are on the classpath and are not compatible. There is this line in the
> > compat guidelines:
> >
> >    - Existing MapReduce, YARN & HDFS applications and frameworks should
> >    work unmodified within a major release i.e. Apache Hadoop ABI is
> > supported.
> >
> > Since Hadoop apps can and do depend on the Hadoop classpath, the
> classpath
> > is effectively part of our API. I'm sure there are user apps out there
> that
> > will break if we make incompatible changes to the classpath. I haven't
> read
> > up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
> out
> > there.
> >
> > Sticking to the theme of "work unmodified", let's think about the user
> > effort required to upgrade their JDK. This can be a very expensive task.
> It
> > might need approval up and down the org, meaning lots of certification,
> > testing, and signoff. Considering the amount of user effort involved
> here,
> > it really seems like dropping a JDK is something that should only happen
> in
> > a major release. Else, there's the potential for nasty surprises in a
> > supposedly "minor" release.
> >
> > That said, we are in an unhappy place right now regarding JDK6, and it's
> > true that almost everyone's moved off of JDK6 at this point. So, I'd be
> > okay with an intermediate 2.x release that drops JDK6 support (but no
> > incompatible changes to the classpath like Guava). This is basically
> free,
> > and we could start using JDK7 idioms like multi-catch and new NIO stuff
> in
> > Hadoop code (a minor draw I guess).
> >
> > My higher-level goal though is to avoid going through this same pain
> again
> > when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> > this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> > release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> > the future, and it seems like a better place to focus our efforts. I was
> > also hoping it'd be realistic to fix our classpath leakage by then, since
> > then we'd have a nice, tight, future-proofed new major release.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> >
> > On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> > wrote:
> >
> > > Andrew,
> > >
> > >  Thanks for starting this thread. I'll edit the wiki to provide more
> > > context around rolling-upgrades etc. which, as I pointed out in the
> > > original thread, are key IMHO.
> > >
> > > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > > wrote:
> > > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > > >
> > > > I think based on our current compatibility guidelines, Proposal A is
> > the
> > > > most attractive. We're pretty hamstrung by the requirement to keep
> the
> > > > classpath the same, which would be solved by either OSGI or shading
> our
> > > > deps (but that's a different discussion).
> > >
> > > I don't see that anywhere in our current compatibility guidelines.
> > >
> > > As you can see from
> > >
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > > we do not have such a policy (pasted here for convenience):
> > >
> > > Java Classpath
> > >
> > > User applications built against Hadoop might add all Hadoop jars
> > > (including Hadoop's library dependencies) to the application's
> classpath.
> > > Adding new dependencies or updating the version of existing
> dependencies
> > > may interfere with those in applications' classpaths.
> > >
> > > Policy
> > >
> > > Currently, there is NO policy on when Hadoop's dependencies can change.
> > >
> > > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> > Again,
> > > as I pointed out in the previous thread, here is the precedent:
> > >
> > > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
> > >
> > > > Also, this is something we already have done i.e. we updated some of
> > our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > > https://issues.apache.org/jira/browse/HADOOP-10503
> > >
> > > thanks,
> > > Arun
> > >
> >
>
>
>
> --
> Alejandro
>



-- 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Moving to JDK7, JDK8 and new major releases

Posted by Owen O'Malley <om...@apache.org>.
On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK such move up to JDK7 in Hadoop


I agree with Alejandro. Changing minimum JDKs is not an incompatible change
and is fine in the 2 branch. (Although I think it would *not* be
appropriate for a patch release.) Of course we need to do it with
forethought and testing, but moving off of JDK 6, which is EOL'ed, is a good
thing. Moving to Java 8 as a minimum seems much too aggressive and I would
push back on that.

I also think that we need to let the dust settle on the Hadoop 2 line for
a while before we talk about Hadoop 3. It seems that it has only been in
the last 6 months that Hadoop 2 adoption has reached mainstream users.
Our user community needs time to digest the changes in Hadoop 2.x before we
fracture the community by starting to discuss Hadoop 3 releases.

.. Owen

Re: Moving to JDK7, JDK8 and new major releases

Posted by Arun Murthy <ac...@hortonworks.com>.
Alejandro,


On Tue, Jun 24, 2014 at 4:44 PM, Alejandro Abdelnur <tu...@cloudera.com>
wrote:

> After reading this thread and thinking a bit about it, I think it should be
> OK such move up to JDK7 in Hadoop 2 for the following reasons:
>
> * Existing Hadoop 2 releases and related projects are running
>   on JDK7 in production.
> * Commercial vendors of Hadoop have already done lot of
>   work to ensure Hadoop on JDK7 works while keeping Hadoop
>   on JDK6 working.
> * Different from many of the 3rd party libraries used by Hadoop,
>   JDK is much stricter on backwards compatibility.
>

+1 - I think we are all on the same page here. Fully agree.


>
> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
> party dependencies and for moving from JDK7 to JDK8 (though it could OK for
> the later if we end up in the same state of affairs)
>

+1. Agree again - let's just wait/watch.

From the thread I've become more convinced (as you've noted before)
that since we are at the bottom of the stack, we need to be more
conservative.

From http://www.oracle.com/technetwork/java/eol-135779.html, it looks like
April 2015 is the *earliest* Java7 will EOL. Java6 EOL was Feb 2011 and we
are still debating whether we can stop supporting it. So, my guess is that
we will support Java7 for at least a year after its EOL, i.e. till sometime
in early 2016. It's just practical.

Net - We really don't have a good idea when a significant portion of users
will actually migrate to Java 8. W.r.t. Java7 this took nearly 3 years after
Java6 EOL. So for now, let's just wait & see how things develop in the
field.


> Even for Hadoop 2.5, I think we could do the move:
>
> * Create the Hadoop 2.5 release branch.
> * Have one nightly Jenkins job that builds Hadoop 2.5 branch
>   with JDK6 to ensure not JDK7 language/API  feature creeps
>   out in Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
> * Sanity tests for the Hadoop 2.5.x releases should be done
>   with JDK7.
> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
> * Move all Apache Jenkins jobs to build/test using JDK7.
> * Starting from Hadoop 2.6 we support JDK7 language/API
>   features.
>
>
The mechanics make perfect sense to me. We should probably
think a bit more about whether we drop support for JDK6 in hadoop-2.6 or
hadoop-2.7.

I'd like to add one more:
* Sometime soon (within a release or two) after we actually drop support
for Java6 and move branch-2 to JDK7, let's also start testing on Java8.

This way we will be ready for Java8 early regardless of when we stop
support for Java7. Dropping Java7 is a bridge we can cross when we come to
it.


thanks,
Arun


> Effectively what we are ensuring is that Hadoop 2.5.x builds and tests with
> JDK6 & JDK7 and that all tests towards the release
> are done with JDK7.
>
> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
> if upgrade to Hadoop 2.5.x and they run into any issue because of JDK6
> (which it would be quite unlikely) they can reactively upgrade to JDK7.
>
> Thoughts?
>
>
> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> > Hi all,
> >
> > On dependencies, we've bumped library versions when we think it's safe
> and
> > the APIs in the new version are compatible. Or, it's not leaked to the
> app
> > classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
> > fall into one of those categories. Steve can do a better job explaining
> > this to me, but we haven't bumped things like Jetty or Guava because they
> > are on the classpath and are not compatible. There is this line in the
> > compat guidelines:
> >
> >    - Existing MapReduce, YARN & HDFS applications and frameworks should
> >    work unmodified within a major release i.e. Apache Hadoop ABI is
> > supported.
> >
> > Since Hadoop apps can and do depend on the Hadoop classpath, the
> classpath
> > is effectively part of our API. I'm sure there are user apps out there
> that
> > will break if we make incompatible changes to the classpath. I haven't
> read
> > up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app
> out
> > there.
> >
> > Sticking to the theme of "work unmodified", let's think about the user
> > effort required to upgrade their JDK. This can be a very expensive task.
> It
> > might need approval up and down the org, meaning lots of certification,
> > testing, and signoff. Considering the amount of user effort involved
> here,
> > it really seems like dropping a JDK is something that should only happen
> in
> > a major release. Else, there's the potential for nasty surprises in a
> > supposedly "minor" release.
> >
> > That said, we are in an unhappy place right now regarding JDK6, and it's
> > true that almost everyone's moved off of JDK6 at this point. So, I'd be
> > okay with an intermediate 2.x release that drops JDK6 support (but no
> > incompatible changes to the classpath like Guava). This is basically
> free,
> > and we could start using JDK7 idioms like multi-catch and new NIO stuff
> in
> > Hadoop code (a minor draw I guess).
> >
> > My higher-level goal though is to avoid going through this same pain
> again
> > when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> > this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> > release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> > the future, and it seems like a better place to focus our efforts. I was
> > also hoping it'd be realistic to fix our classpath leakage by then, since
> > then we'd have a nice, tight, future-proofed new major release.
> >
> > Thanks,
> > Andrew
> >
> >
> >
> >
> > On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> > wrote:
> >
> > > Andrew,
> > >
> > >  Thanks for starting this thread. I'll edit the wiki to provide more
> > > context around rolling-upgrades etc. which, as I pointed out in the
> > > original thread, are key IMHO.
> > >
> > > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > > wrote:
> > > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > > >
> > > > I think based on our current compatibility guidelines, Proposal A is
> > the
> > > > most attractive. We're pretty hamstrung by the requirement to keep
> the
> > > > classpath the same, which would be solved by either OSGI or shading
> our
> > > > deps (but that's a different discussion).
> > >
> > > I don't see that anywhere in our current compatibility guidelines.
> > >
> > > As you can see from
> > >
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > > we do not have such a policy (pasted here for convenience):
> > >
> > > Java Classpath
> > >
> > > User applications built against Hadoop might add all Hadoop jars
> > > (including Hadoop's library dependencies) to the application's
> classpath.
> > > Adding new dependencies or updating the version of existing
> dependencies
> > > may interfere with those in applications' classpaths.
> > >
> > > Policy
> > >
> > > Currently, there is NO policy on when Hadoop's dependencies can change.
> > >
> > > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> > Again,
> > > as I pointed out in the previous thread, here is the precedent:
> > >
> > > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
> > >
> > > > Also, this is something we already have done i.e. we updated some of
> > our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > > https://issues.apache.org/jira/browse/HADOOP-10503
> > >
> > > thanks,
> > > Arun
> > >
> >
>
>
>
> --
> Alejandro
>



-- 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


Re: Moving to JDK7, JDK8 and new major releases

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
After reading this thread and thinking a bit about it, I think such a move
up to JDK7 should be OK in Hadoop 2, for the following reasons:

* Existing Hadoop 2 releases and related projects are running
  on JDK7 in production.
* Commercial vendors of Hadoop have already done a lot of
  work to ensure Hadoop on JDK7 works while keeping Hadoop
  on JDK6 working.
* Unlike many of the 3rd party libraries used by Hadoop,
  the JDK is much stricter about backwards compatibility.

IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
party dependencies or for moving from JDK7 to JDK8 (though it could be OK for
the latter if we end up in the same state of affairs).

Even for Hadoop 2.5, I think we could do the move:

* Create the Hadoop 2.5 release branch.
* Have one nightly Jenkins job that builds the Hadoop 2.5 branch
  with JDK6 to ensure no JDK7 language/API feature creeps
  into Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
* Sanity tests for the Hadoop 2.5.x releases should be done
  with JDK7.
* Apply Steve’s patch to require JDK7 on trunk and branch-2.
* Move all Apache Jenkins jobs to build/test using JDK7.
* Starting from Hadoop 2.6 we support JDK7 language/API
  features.
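
As an illustration only, here is a rough sketch of the kind of JDK7-only
code the nightly JDK6 build is meant to catch on the 2.5 branch, and that
would become acceptable from 2.6 onwards. The class and method names are
hypothetical and nothing here is taken from the Hadoop tree. It uses
try-with-resources, multi-catch and java.nio.file, none of which compile
under JDK6.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; not part of Hadoop.
public class Jdk7Idioms {
    // try-with-resources plus java.nio.file.Files: both arrived in JDK7,
    // so a JDK6 compiler rejects this method.
    public static String firstLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    // Multi-catch (also JDK7): one handler for two unrelated exception types.
    public static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return fallback;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("jdk7-idioms", ".txt");
        try {
            Files.write(tmp, "42\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(parseOrDefault(firstLine(tmp), -1));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A JDK6 build of a branch containing code like this fails at compile time,
which is the signal the nightly job would give.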

Effectively, what we are ensuring is that Hadoop 2.5.x builds and tests with
JDK6 & JDK7, and that all tests towards the release
are done with JDK7.

Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or,
if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
(which would be quite unlikely), they can reactively upgrade to JDK7.

Thoughts?


On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all,
>
> On dependencies, we've bumped library versions when we think it's safe and
> the APIs in the new version are compatible. Or, it's not leaked to the app
> classpath (e.g. the JUnit version bump). I think the JIRAs Arun mentioned
> fall into one of those categories. Steve can do a better job explaining
> this than me, but we haven't bumped things like Jetty or Guava because they
> are on the classpath and are not compatible. There is this line in the
> compat guidelines:
>
>    - Existing MapReduce, YARN & HDFS applications and frameworks should
>    work unmodified within a major release i.e. Apache Hadoop ABI is
> supported.
>
> Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
> is effectively part of our API. I'm sure there are user apps out there that
> will break if we make incompatible changes to the classpath. I haven't read
> up on the MR JIRA Arun mentioned, but MR isn't the only YARN app out
> there.
>
> Sticking to the theme of "work unmodified", let's think about the user
> effort required to upgrade their JDK. This can be a very expensive task. It
> might need approval up and down the org, meaning lots of certification,
> testing, and signoff. Considering the amount of user effort involved here,
> it really seems like dropping a JDK is something that should only happen in
> a major release. Else, there's the potential for nasty surprises in a
> supposedly "minor" release.
>
> That said, we are in an unhappy place right now regarding JDK6, and it's
> true that almost everyone's moved off of JDK6 at this point. So, I'd be
> okay with an intermediate 2.x release that drops JDK6 support (but no
> incompatible changes to the classpath like Guava). This is basically free,
> and we could start using JDK7 idioms like multi-catch and new NIO stuff in
> Hadoop code (a minor draw I guess).
>
> My higher-level goal though is to avoid going through this same pain again
> when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> the future, and it seems like a better place to focus our efforts. I was
> also hoping it'd be realistic to fix our classpath leakage by then, since
> then we'd have a nice, tight, future-proofed new major release.
>
> Thanks,
> Andrew
>
>
>
>
> On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
>
> > Andrew,
> >
> >  Thanks for starting this thread. I'll edit the wiki to provide more
> > context around rolling-upgrades etc. which, as I pointed out in the
> > original thread, are key IMHO.
> >
> > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > wrote:
> > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > >
> > > I think based on our current compatibility guidelines, Proposal A is
> the
> > > most attractive. We're pretty hamstrung by the requirement to keep the
> > > classpath the same, which would be solved by either OSGI or shading our
> > > deps (but that's a different discussion).
> >
> > I don't see that anywhere in our current compatibility guidelines.
> >
> > As you can see from
> >
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > we do not have such a policy (pasted here for convenience):
> >
> > Java Classpath
> >
> > User applications built against Hadoop might add all Hadoop jars
> > (including Hadoop's library dependencies) to the application's classpath.
> > Adding new dependencies or updating the version of existing dependencies
> > may interfere with those in applications' classpaths.
> >
> > Policy
> >
> > Currently, there is NO policy on when Hadoop's dependencies can change.
> >
> > Furthermore, we have *already* changed our classpath in hadoop-2.x.
> Again,
> > as I pointed out in the previous thread, here is the precedent:
> >
> > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >
> > > Also, this is something we already have done i.e. we updated some of
> our
> > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > dramatic as JDK. Here are some examples:
> > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > https://issues.apache.org/jira/browse/HADOOP-10503
> >
> > thanks,
> > Arun
> >
>



-- 
Alejandro

Re: Moving to JDK7, JDK8 and new major releases

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Jun 24, 2014, at 4:22 PM, Andrew Wang <an...@cloudera.com> wrote:


> Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
> is effectively part of our API. I'm sure there are user apps out there that
> will break if we make incompatible changes to the classpath. I haven't read
> up on the MR JIRA Arun mentioned, but MR isn't the only YARN app out
> there.

I think there is some confusion/misunderstanding here.

With hadoop-2 the user is completely in control of their own classpath (we had a similar, but limited, capability in hadoop-1 w/ https://issues.apache.org/jira/browse/MAPREDUCE-1938).

Furthermore, it's probably not well known that in hadoop-2 the user application (MR or otherwise) can also pick the JDK version by setting the JAVA_HOME env for the container. So, in effect, MR applications can continue to use java6 while YARN is running java7 - this hasn't been tested extensively though. This capability did not exist in hadoop-1. We've also made some progress with https://issues.apache.org/jira/browse/MAPREDUCE-1700 to defuse conflicts between user jar-deps and MR system jars. https://issues.apache.org/jira/browse/MAPREDUCE-4421 also helps by ensuring MR applications can pick the exact version of the MR jars they were compiled against, and not rely on cluster installs.
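
To make the JAVA_HOME point concrete, here is a minimal sketch in plain Java of the idea, under stated assumptions: the helper names, the JVM install paths and the main class are all hypothetical, and the code deliberately uses an ordinary Map rather than the YARN client API. It only shows that the container's environment, not the daemon's, decides which JVM the task is launched with.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the YARN API.
public class ContainerJdkSketch {
    // Returns a copy of the cluster-default environment with JAVA_HOME
    // overridden for a single container.
    public static Map<String, String> withJavaHome(Map<String, String> clusterEnv, String javaHome) {
        Map<String, String> env = new LinkedHashMap<String, String>(clusterEnv);
        env.put("JAVA_HOME", javaHome);
        return Collections.unmodifiableMap(env);
    }

    // Builds the launch command from the container's own JAVA_HOME rather
    // than from whatever JVM the cluster daemons run on.
    public static String launchCommand(Map<String, String> env, String mainClass) {
        return env.get("JAVA_HOME") + "/bin/java " + mainClass;
    }

    public static void main(String[] args) {
        Map<String, String> clusterEnv = new LinkedHashMap<String, String>();
        clusterEnv.put("JAVA_HOME", "/usr/lib/jvm/java-7"); // assumed path of the daemons' JDK
        Map<String, String> containerEnv =
            withJavaHome(clusterEnv, "/usr/lib/jvm/java-6"); // assumed path of the app's JDK
        System.out.println(launchCommand(containerEnv, "com.example.MyTask"));
    }
}
```

In a real application the environment map would be handed to the container launch request instead of being printed; as noted above, mixing JDK versions this way has not been tested extensively.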

Hope that helps somewhat.

thanks,
Arun



Re: Moving to JDK7, JDK8 and new major releases

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
After reading this thread and thinking a bit about it, I think such a move
up to JDK7 should be OK in Hadoop 2, for the following reasons:

* Existing Hadoop 2 releases and related projects are running
  on JDK7 in production.
* Commercial vendors of Hadoop have already done a lot of
  work to ensure Hadoop on JDK7 works while keeping Hadoop
  on JDK6 working.
* Unlike many of the 3rd party libraries used by Hadoop,
  the JDK is much stricter about backwards compatibility.

IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
party dependencies or for moving from JDK7 to JDK8 (though it could be OK for
the latter if we end up in the same state of affairs).

Even for Hadoop 2.5, I think we could do the move:

* Create the Hadoop 2.5 release branch.
* Have one nightly Jenkins job that builds the Hadoop 2.5 branch
  with JDK6 to ensure no JDK7 language/API feature creeps
  into Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
* Sanity tests for the Hadoop 2.5.x releases should be done
  with JDK7.
* Apply Steve’s patch to require JDK7 on trunk and branch-2.
* Move all Apache Jenkins jobs to build/test using JDK7.
* Starting from Hadoop 2.6 we support JDK7 language/API
  features.

Effectively, what we are ensuring is that Hadoop 2.5.x builds and tests with
JDK6 & JDK7, and that all testing towards the release
is done with JDK7.
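
As a cheap complement to the nightly JDK6 build, one could also inspect the class-file version of the built artifacts. The sketch below is only an illustration (ClassFileVersion is a made-up helper, not part of Hadoop or of Steve's patch). It relies on the class-file format: major version 50 is Java 6 and 51 is Java 7. It only detects the bytecode target level; use of JDK7-only library APIs would still slip through, which is why actually compiling with JDK6 remains the real check.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Illustrative helper: reads the major version from a class-file header. */
final class ClassFileVersion {

    /** Class-file major version emitted for Java 6 (Java 7 is 51). */
    static final int JAVA6_MAJOR = 50;

    /** Returns the major version stored in bytes 6-7 of a class file. */
    static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("bad magic number");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // major version as unsigned 16-bit
    }

    /** True if a JVM of the Java 6 level would accept this class-file version. */
    static boolean loadableOnJava6(int major) {
        return major <= JAVA6_MAJOR;
    }

    public static void main(String[] args) throws IOException {
        // Inspect this class's own bytes as a stand-in for a jar entry.
        String resource = "/" + ClassFileVersion.class.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassFileVersion.class.getResourceAsStream(resource)) {
            if (in == null) {
                System.out.println("class bytes not found on the classpath");
                return;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            int major = majorVersion(out.toByteArray());
            System.out.println("major=" + major + " loadableOnJava6=" + loadableOnJava6(major));
        }
    }
}
```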

Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or,
if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
(which would be quite unlikely), they can reactively upgrade to JDK7.

Thoughts?


On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <an...@cloudera.com>
wrote:

> Hi all,
>
> On dependencies, we've bumped library versions when we think it's safe and
> the APIs in the new version are compatible. Or, it's not leaked to the app
> classpath (e.g. the JUnit version bump). I think the JIRAs Arun mentioned
> fall into one of those categories. Steve can do a better job explaining
> this than me, but we haven't bumped things like Jetty or Guava because they
> are on the classpath and are not compatible. There is this line in the
> compat guidelines:
>
>    - Existing MapReduce, YARN & HDFS applications and frameworks should
>    work unmodified within a major release i.e. Apache Hadoop ABI is supported.
>
> Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
> is effectively part of our API. I'm sure there are user apps out there that
> will break if we make incompatible changes to the classpath. I haven't read
> up on the MR JIRA Arun mentioned, but MR isn't the only YARN app out
> there.
>
> Sticking to the theme of "work unmodified", let's think about the user
> effort required to upgrade their JDK. This can be a very expensive task. It
> might need approval up and down the org, meaning lots of certification,
> testing, and signoff. Considering the amount of user effort involved here,
> it really seems like dropping a JDK is something that should only happen in
> a major release. Else, there's the potential for nasty surprises in a
> supposedly "minor" release.
>
> That said, we are in an unhappy place right now regarding JDK6, and it's
> true that almost everyone's moved off of JDK6 at this point. So, I'd be
> okay with an intermediate 2.x release that drops JDK6 support (but no
> incompatible changes to the classpath like Guava). This is basically free,
> and we could start using JDK7 idioms like multi-catch and new NIO stuff in
> Hadoop code (a minor draw I guess).
>
> My higher-level goal though is to avoid going through this same pain again
> when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
> this reason. This is why I suggested skipping an intermediate 2.x+JDK7
> release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
> the future, and it seems like a better place to focus our efforts. I was
> also hoping it'd be realistic to fix our classpath leakage by then, since
> then we'd have a nice, tight, future-proofed new major release.
>
> Thanks,
> Andrew
>
>
>
>
> On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
>
> > Andrew,
> >
> >  Thanks for starting this thread. I'll edit the wiki to provide more
> > context around rolling-upgrades etc. which, as I pointed out in the
> > original thread, are key IMHO.
> >
> > On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> > wrote:
> > > https://wiki.apache.org/hadoop/MovingToJdk7and8
> > >
> > > I think based on our current compatibility guidelines, Proposal A is the
> > > most attractive. We're pretty hamstrung by the requirement to keep the
> > > classpath the same, which would be solved by either OSGI or shading our
> > > deps (but that's a different discussion).
> >
> > I don't see that anywhere in our current compatibility guidelines.
> >
> > As you can see from
> > http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> > we do not have such a policy (pasted here for convenience):
> >
> > Java Classpath
> >
> > User applications built against Hadoop might add all Hadoop jars
> > (including Hadoop's library dependencies) to the application's classpath.
> > Adding new dependencies or updating the version of existing dependencies
> > may interfere with those in applications' classpaths.
> >
> > Policy
> >
> > Currently, there is NO policy on when Hadoop's dependencies can change.
> >
> > Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
> > as I pointed out in the previous thread, here is the precedent:
> >
> > On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >
> > > Also, this is something we already have done i.e. we updated some of our
> > > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > > dramatic as JDK. Here are some examples:
> > > https://issues.apache.org/jira/browse/HADOOP-9991
> > > https://issues.apache.org/jira/browse/HADOOP-10102
> > > https://issues.apache.org/jira/browse/HADOOP-10103
> > > https://issues.apache.org/jira/browse/HADOOP-10104
> > > https://issues.apache.org/jira/browse/HADOOP-10503
> >
> > thanks,
> > Arun



-- 
Alejandro

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
Hi all,

On dependencies, we've bumped library versions when we think it's safe and
the APIs in the new version are compatible. Or, it's not leaked to the app
classpath (e.g. the JUnit version bump). I think the JIRAs Arun mentioned
fall into one of those categories. Steve can do a better job explaining
this than me, but we haven't bumped things like Jetty or Guava because they
are on the classpath and are not compatible. There is this line in the
compat guidelines:

   - Existing MapReduce, YARN & HDFS applications and frameworks should
   work unmodified within a major release i.e. Apache Hadoop ABI is supported.

Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
is effectively part of our API. I'm sure there are user apps out there that
will break if we make incompatible changes to the classpath. I haven't read
up on the MR JIRA Arun mentioned, but MR isn't the only YARN app out
there.
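
To make the classpath dependence concrete: an application can ask the JVM which jar a given class actually came from, which is a quick way to see whether it is picking up Hadoop's copy of a library or its own. A minimal sketch follows; WhichJar is a hypothetical helper, and for classes loaded by the bootstrap loader the JVM typically reports no code source at all.

```java
import java.security.CodeSource;

/** Illustrative helper: reports where the JVM loaded a class from. */
final class WhichJar {

    /** Returns the jar or directory URL of the class, or a placeholder if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // In a real diagnosis one would pass a library class here,
        // e.g. a Guava or Jetty class that is on both classpaths.
        System.out.println("WhichJar loaded from: " + locationOf(WhichJar.class));
        System.out.println("String loaded from:   " + locationOf(String.class));
    }
}
```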

Sticking to the theme of "work unmodified", let's think about the user
effort required to upgrade their JDK. This can be a very expensive task. It
might need approval up and down the org, meaning lots of certification,
testing, and signoff. Considering the amount of user effort involved here,
it really seems like dropping a JDK is something that should only happen in
a major release. Else, there's the potential for nasty surprises in a
supposedly "minor" release.

That said, we are in an unhappy place right now regarding JDK6, and it's
true that almost everyone's moved off of JDK6 at this point. So, I'd be
okay with an intermediate 2.x release that drops JDK6 support (but no
incompatible changes to the classpath like Guava). This is basically free,
and we could start using JDK7 idioms like multi-catch and new NIO stuff in
Hadoop code (a minor draw I guess).
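
For readers who have not used them yet, here is a small sketch of the two JDK7 features mentioned above: multi-catch and the java.nio.file API. The class and method names are made up for illustration, none of this is Hadoop code, and it will not compile with JDK6.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Illustrative only: two idioms that need at least JDK7. */
final class Jdk7Idioms {

    /** Multi-catch: one handler for two unrelated exception types. */
    static int parsePort(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return -1; // not a usable port
        }
    }

    /** java.nio.file: write lines to a temp file and read them back. */
    static List<String> roundTrip(List<String> lines) throws IOException {
        Path tmp = Files.createTempFile("jdk7-idioms", ".txt");
        try {
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            return Files.readAllLines(tmp, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parsePort(" 8020 "));
        System.out.println(parsePort("not-a-port"));
        System.out.println(roundTrip(java.util.Arrays.asList("namenode", "datanode")));
    }
}
```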

My higher-level goal though is to avoid going through this same pain again
when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
this reason. This is why I suggested skipping an intermediate 2.x+JDK7
release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
the future, and it seems like a better place to focus our efforts. I was
also hoping it'd be realistic to fix our classpath leakage by then, since
then we'd have a nice, tight, future-proofed new major release.

Thanks,
Andrew




On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Andrew,
>
>  Thanks for starting this thread. I'll edit the wiki to provide more
> context around rolling-upgrades etc. which, as I pointed out in the
> original thread, are key IMHO.
>
> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> > https://wiki.apache.org/hadoop/MovingToJdk7and8
> >
> > I think based on our current compatibility guidelines, Proposal A is the
> > most attractive. We're pretty hamstrung by the requirement to keep the
> > classpath the same, which would be solved by either OSGI or shading our
> > deps (but that's a different discussion).
>
> I don't see that anywhere in our current compatibility guidelines.
>
> As you can see from
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> we do not have such a policy (pasted here for convenience):
>
> Java Classpath
>
> User applications built against Hadoop might add all Hadoop jars
> (including Hadoop's library dependencies) to the application's classpath.
> Adding new dependencies or updating the version of existing dependencies
> may interfere with those in applications' classpaths.
>
> Policy
>
> Currently, there is NO policy on when Hadoop's dependencies can change.
>
> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
> as I pointed out in the previous thread, here is the precedent:
>
> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > Also, this is something we already have done i.e. we updated some of our
> > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > dramatic as JDK. Here are some examples:
> > https://issues.apache.org/jira/browse/HADOOP-9991
> > https://issues.apache.org/jira/browse/HADOOP-10102
> > https://issues.apache.org/jira/browse/HADOOP-10103
> > https://issues.apache.org/jira/browse/HADOOP-10104
> > https://issues.apache.org/jira/browse/HADOOP-10503
>
> thanks,
> Arun

Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
That classpath policy was explicitly added because we can't lock down our
dependencies for security/bug fix reasons, and also because if we do update
something explicitly, its transitive dependencies can change, beyond our
control.

https://issues.apache.org/jira/browse/HADOOP-9555 is an example of this: an
update of ZK explicitly to fix an HA problem. Are there changes in its
dependencies? I don't know. But we had no choice but to update if we
wanted NN & RM failover to work reliably, so we have to take any other
changes that went in.

JDK upgrades can be viewed as an extension of this: we are changing the
base platform that Hadoop runs on. More precisely, for the Java 6 -> Java 7
update, we are reflecting the fact that nobody is running in production on
Java 6.

Do you realise we actually moved to Java 6 in 2008?
https://issues.apache.org/jira/browse/HADOOP-2325 . That was six years ago;
half the names on that list are not active on the project any more.

What we did there was issue a warning in 0.18 that it would be the last
Java 5 version; 0.19 moved up. We can do the same for a Hadoop 2.x release
at some point this year.



On 24 June 2014 11:43, Arun C Murthy <ac...@hortonworks.com> wrote:

> Andrew,
>
>  Thanks for starting this thread. I'll edit the wiki to provide more
> context around rolling-upgrades etc. which, as I pointed out in the
> original thread, are key IMHO.
>
> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> > https://wiki.apache.org/hadoop/MovingToJdk7and8
> >
> > I think based on our current compatibility guidelines, Proposal A is the
> > most attractive. We're pretty hamstrung by the requirement to keep the
> > classpath the same, which would be solved by either OSGI or shading our
> > deps (but that's a different discussion).
>
> I don't see that anywhere in our current compatibility guidelines.
>
> As you can see from
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> we do not have such a policy (pasted here for convenience):
>
> Java Classpath
>
> User applications built against Hadoop might add all Hadoop jars
> (including Hadoop's library dependencies) to the application's classpath.
> Adding new dependencies or updating the version of existing dependencies
> may interfere with those in applications' classpaths.
>
> Policy
>
> Currently, there is NO policy on when Hadoop's dependencies can change.
>
> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
> as I pointed out in the previous thread, here is the precedent:
>
> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > Also, this is something we already have done i.e. we updated some of our
> > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > dramatic as JDK. Here are some examples:
> > https://issues.apache.org/jira/browse/HADOOP-9991
> > https://issues.apache.org/jira/browse/HADOOP-10102
> > https://issues.apache.org/jira/browse/HADOOP-10103
> > https://issues.apache.org/jira/browse/HADOOP-10104
> > https://issues.apache.org/jira/browse/HADOOP-10503
>
> thanks,
> Arun

Re: Moving to JDK7, JDK8 and new major releases

Posted by Andrew Wang <an...@cloudera.com>.
Hi all,

On dependencies, we've bumped library versions when we think it's safe and
the APIs in the new version are compatible. Or, it's not leaked to the app
classpath (e.g the JUnit version bump). I think the JIRAs Arun mentioned
fall into one of those categories. Steve can do a better job explaining
this to me, but we haven't bumped things like Jetty or Guava because they
are on the classpath and are not compatible. There is this line in the
compat guidelines:

   - Existing MapReduce, YARN & HDFS applications and frameworks should
   work unmodified within a major release i.e. Apache Hadoop ABI is supported.

Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
is effectively part of our API. I'm sure there are user apps out there that
will break if we make incompatible changes to the classpath. I haven't read
up on the MR JIRA Arun mentioned, but there MR isn't the only YARN app out
there.

Sticking to the theme of "work unmodified", let's think about the user
effort required to upgrade their JDK. This can be a very expensive task. It
might need approval up and down the org, meaning lots of certification,
testing, and signoff. Considering the amount of user effort involved here,
it really seems like dropping a JDK is something that should only happen in
a major release. Else, there's the potential for nasty surprises in a
supposedly "minor" release.

That said, we are in an unhappy place right now regarding JDK6, and it's
true that almost everyone's moved off of JDK6 at this point. So, I'd be
okay with an intermediate 2.x release that drops JDK6 support (but no
incompatible changes to the classpath like Guava). This is basically free,
and we could start using JDK7 idioms like multi-catch and new NIO stuff in
Hadoop code (a minor draw I guess).

My higher-level goal though is to avoid going through this same pain again
when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
this reason. This is why I suggested skipping an intermediate 2.x+JDK7
release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
the future, and it seems like a better place to focus our efforts. I was
also hoping it'd be realistic to fix our classpath leakage by then, since
then we'd have a nice, tight, future-proofed new major release.

Thanks,
Andrew




On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Andrew,
>
>  Thanks for starting this thread. I'll edit the wiki to provide more
> context around rolling-upgrades etc. which, as I pointed out in the
> original thread, are key IMHO.
>
> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> > https://wiki.apache.org/hadoop/MovingToJdk7and8
> >
> > I think based on our current compatibility guidelines, Proposal A is the
> > most attractive. We're pretty hamstrung by the requirement to keep the
> > classpath the same, which would be solved by either OSGI or shading our
> > deps (but that's a different discussion).
>
> I don't see that anywhere in our current compatibility guidelines.
>
> As you can see from
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> we do not have such a policy (pasted here for convenience):
>
> Java Classpath
>
> User applications built against Hadoop might add all Hadoop jars
> (including Hadoop's library dependencies) to the application's classpath.
> Adding new dependencies or updating the version of existing dependencies
> may interfere with those in applications' classpaths.
>
> Policy
>
> Currently, there is NO policy on when Hadoop's dependencies can change.
>
> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
> as I pointed out in the previous thread, here is the precedent:
>
> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > Also, this is something we already have done i.e. we updated some of our
> > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > dramatic as JDK. Here are some examples:
> > https://issues.apache.org/jira/browse/HADOOP-9991
> > https://issues.apache.org/jira/browse/HADOOP-10102
> > https://issues.apache.org/jira/browse/HADOOP-10103
> > https://issues.apache.org/jira/browse/HADOOP-10104
> > https://issues.apache.org/jira/browse/HADOOP-10503
>
> thanks,
> Arun

Re: Moving to JDK7, JDK8 and new major releases

Posted by Steve Loughran <st...@hortonworks.com>.
That classpath policy was explicitly added because we can't lock down our
dependencies for security/bug fix reasons, and also because if we do update
something explicitly, its transitive dependencies can change, beyond our
control.

https://issues.apache.org/jira/browse/HADOOP-9555 is an example of this: an
explicit update of ZK to fix an HA problem. Are there changes in its
dependencies? I don't know. But we had no choice but to update if we
wanted NN & RM failover to work reliably, so we have to take any other
changes that went in.

JDK upgrades can be viewed as an extension of this: we are changing the
base platform that Hadoop runs on. More precisely, for the Java 6 -> Java 7
update, we are reflecting the fact that nobody is running in production on
Java 6.

Do you realise we actually moved to Java 6 in 2008?
https://issues.apache.org/jira/browse/HADOOP-2325 . That was six years ago;
half the names on that list are not active on the project any more.

What we did there was issue a warning in 0.18 that it would be the last
Java 5 version; 0.19 moved up. We can do the same for a Hadoop 2.x release
at some point this year.
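
As a rough sketch of what such a warning could look like (this is not the code that shipped in 0.18, and the class and method names are hypothetical), a daemon could check the standard java.specification.version system property at startup and log a notice when it is running on the JDK that is about to be dropped:

```java
// Hypothetical startup check, for illustration only.
class JavaVersionCheck {
    // Maps a java.specification.version value to a major version number:
    // "1.6" -> 6, "1.7" -> 7, "1.8" -> 8, "9" -> 9, "11" -> 11.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    // Returns a warning if the running Java version is at or below the
    // version being dropped, or null if no warning is needed.
    static String deprecationWarning(String specVersion, int deprecatedMajor) {
        int major = majorVersion(specVersion);
        if (major > deprecatedMajor) {
            return null;
        }
        return "WARNING: running on Java " + major
            + "; this is the last release line to support it.";
    }

    public static void main(String[] args) {
        String warning = deprecationWarning(
            System.getProperty("java.specification.version"), 6);
        if (warning != null) {
            System.err.println(warning);
        }
    }
}
```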



On 24 June 2014 11:43, Arun C Murthy <ac...@hortonworks.com> wrote:

> Andrew,
>
>  Thanks for starting this thread. I'll edit the wiki to provide more
> context around rolling-upgrades etc. which, as I pointed out in the
> original thread, are key IMHO.
>
> On Jun 24, 2014, at 11:17 AM, Andrew Wang <an...@cloudera.com>
> wrote:
> > https://wiki.apache.org/hadoop/MovingToJdk7and8
> >
> > I think based on our current compatibility guidelines, Proposal A is the
> > most attractive. We're pretty hamstrung by the requirement to keep the
> > classpath the same, which would be solved by either OSGI or shading our
> > deps (but that's a different discussion).
>
> I don't see that anywhere in our current compatibility guidelines.
>
> As you can see from
> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
> we do not have such a policy (pasted here for convenience):
>
> Java Classpath
>
> User applications built against Hadoop might add all Hadoop jars
> (including Hadoop's library dependencies) to the application's classpath.
> Adding new dependencies or updating the version of existing dependencies
> may interfere with those in applications' classpaths.
>
> Policy
>
> Currently, there is NO policy on when Hadoop's dependencies can change.
>
> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
> as I pointed out in the previous thread, here is the precedent:
>
> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > Also, this is something we already have done i.e. we updated some of our
> > software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
> > dramatic as JDK. Here are some examples:
> > https://issues.apache.org/jira/browse/HADOOP-9991
> > https://issues.apache.org/jira/browse/HADOOP-10102
> > https://issues.apache.org/jira/browse/HADOOP-10103
> > https://issues.apache.org/jira/browse/HADOOP-10104
> > https://issues.apache.org/jira/browse/HADOOP-10503
>
> thanks,
> Arun