Posted to mapreduce-dev@hadoop.apache.org by Arun C Murthy <ac...@hortonworks.com> on 2014/11/10 01:42:41 UTC

Guava

… has been a constant pain w.r.t compatibility etc.

Should we consider adopting a policy to not use Guava in Common/HDFS/YARN?

MR doesn't matter too much since it's an application-side issue. It does hurt end-users, though, since they might still want a newer Guava version, but at least they can modify MR.

Thoughts?

thanks,
Arun


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Guava

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
I'm usually an advocate for getting rid of unnecessary dependencies
(cough, jetty, cough), but a lot of the things in Guava are really
useful.

Immutable collections, BiMap, Multisets, Arrays#asList, the stuff for
writing hashCode() and equals(), Joiner, the list goes on.  We
particularly use the Cache/CacheBuilder stuff a lot in HDFS to get
maps with LRU eviction without writing a lot of boilerplate.  The QJM
stuff uses ListenableFuture a lot, although perhaps we could come up
with our own equivalent for that.

On Mon, Nov 10, 2014 at 9:26 AM, Alejandro Abdelnur <tu...@gmail.com> wrote:
> IMO we should:
>
> 1* have a clean and thin client API JAR (which does not drag any 3rd party
> dependencies, or a well defined small set -i.e. slf4j & log4j-)
> 2* have a client implementation that uses a classloader to isolate client
> impl 3rd party deps from app dependencies.
>
> #2 can be done using a stock URLClassLoader (i would just subclass it to
> forbid packages in the API JAR and exposed 3rd parties to be loaded from
> the app JAR)
>
> #1 is the tricky thing as our current API modules don't have a clean
> API/impl separation.
>
> thx
> PS: If folks are interested in pursuing this, I can put together a prototype
> of how #2 would work (I don't think it will be more than 200 lines of code)
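
One possible reading of the classloader idea in #2 above, as a minimal sketch rather than the prototype offered in the thread: a URLClassLoader subclass that loads implementation classes and their third-party dependencies from its own JARs first, while always delegating a configured set of shared packages (the API and any exposed third parties) to the parent. The class name IsolatingClassLoader, the shared-prefix list, and the org.example.api. prefix are all hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch: child-first loading for implementation classes,
 * parent-first loading for the shared API packages so that both sides
 * of the boundary see the same API types.
 */
class IsolatingClassLoader extends URLClassLoader {
    private final List<String> sharedPrefixes;

    IsolatingClassLoader(URL[] implJars, ClassLoader parent, String... sharedPrefixes) {
        super(implJars, parent);
        this.sharedPrefixes = Arrays.asList(sharedPrefixes);
    }

    /** JDK classes and the configured shared packages never come from the impl JARs. */
    private boolean isShared(String className) {
        if (className.startsWith("java.") || className.startsWith("javax.")) {
            return true;
        }
        for (String prefix : sharedPrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            if (isShared(name)) {
                // Standard parent-first delegation for shared types.
                return super.loadClass(name, resolve);
            }
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Child-first: prefer the copy bundled with the implementation.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not bundled, so fall back to the normal delegation chain.
                    c = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // No implementation JARs are supplied here, so every lookup ends at the parent.
        try (IsolatingClassLoader loader = new IsolatingClassLoader(
                new URL[0], IsolatingClassLoader.class.getClassLoader(), "org.example.api.")) {
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

With real implementation JARs on the URL list, a library such as Guava bundled there would be resolved from those JARs, regardless of which version the application carries on its own classpath.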

Absolutely, I agree that we should not be using Guava types in public
APIs.  Guava has not been very responsible with backwards
compatibility, that much is clear.

A client / server jar separation is an interesting idea.  But then we
still have to get rid of Guava and other library deps in the client
jars.  I think it would be more work than it seems.  For example, the
HDFS client uses Guava Cache a lot, so we'd have to write our own
version of this.
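
To give a rough sense of what "our own version" might involve at its simplest, here is a JDK-only sketch of a map with LRU eviction built on LinkedHashMap. The name LruMap and its factory method are hypothetical, and this covers only size-bounded LRU eviction; it makes no attempt to match the other features of Guava's Cache, such as expiry or loading.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical JDK-only LRU map; not a drop-in replacement for Guava's Cache. */
class LruMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruMap(int maxEntries) {
        // accessOrder = true makes iteration order least-recently-accessed first.
        super(16, 0.75f, true);
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    /** LinkedHashMap is not thread-safe, so shared use needs external synchronization. */
    static <K, V> Map<K, V> newSynchronized(int maxEntries) {
        return Collections.synchronizedMap(new LruMap<K, V>(maxEntries));
    }

    public static void main(String[] args) {
        LruMap<String, Integer> cache = new LruMap<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the eldest entry
        cache.put("c", 3); // exceeds the bound and evicts "b"
        System.out.println(cache.keySet());
    }
}
```

Even this small version shows where the extra work would go: concurrency, time-based expiry, and eviction callbacks would all have to be written and maintained separately.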

Can't we just shade this stuff?  Has anyone tried shading Hadoop's Guava?

best,
Colin


>
>
> On Mon, Nov 10, 2014 at 5:18 AM, Steve Loughran <st...@hortonworks.com>
> wrote:
>
>> Yes, Guava is a constant pain; there are lots of open JIRAs related to it,
>> as it's the one dependency we can't seamlessly upgrade. Not unless we do
>> our own fork and reinsert the missing classes.
>>
>> The most common uses in the code are
>>
>> @VisibleForTesting (easily replicated)
>> and the Preconditions.check*() operations
>>
>> The latter is also easily swapped out, and we could even add the check they
>> forgot:
>> Preconditions.checkArgNotNull(argname, arg)
>>
>>
>> These are easy; it's the more complex data structures that matter more.
>>
>> I think for Hadoop 2.7 & Java 7 we need to look at this problem and do
>> something. Even if we continue to ship Guava 11 so that the HBase team
>> don't send any (more) death threats, we can/should rework Hadoop to build
>> and run against Guava 16+ too. That's needed to fix some of the recent Java
>> 7/8+ changes.
>>
>> - Everything in v11 dropped from v16 MUST be reimplemented with our own
>> versions.
>> - Anything tagged as deprecated in 11+ SHOULD be replaced by newer stuff,
>> wherever possible.
>>
>> I think for 2.7+ we should add some new profiles to the POM, for Java 8 and
>> 9 alongside the new baseline Java 7. For those later versions we could
>> perhaps mandate Guava 16.
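
The two "easy" replacements mentioned in the quoted message, the @VisibleForTesting marker and the Preconditions checks, could be sketched without Guava roughly as follows. Everything here is an assumption for illustration: the class and annotation names simply mirror the ones in the message, the choice of IllegalArgumentException for checkArgNotNull is a guess at the intended behaviour, and the helper methods are a small subset.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical in-house marker: the member is more visible than needed, for tests only. */
@Documented
@Retention(RetentionPolicy.CLASS)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR})
@interface VisibleForTesting {
}

/** Hypothetical in-house argument and state checks. */
final class Preconditions {
    private Preconditions() {
    }

    /** Returns arg unchanged, or fails with a message that names the offending argument. */
    static <T> T checkArgNotNull(String argName, T arg) {
        if (arg == null) {
            throw new IllegalArgumentException("'" + argName + "' must not be null");
        }
        return arg;
    }

    static void checkArgument(boolean expression, String message) {
        if (!expression) {
            throw new IllegalArgumentException(message);
        }
    }

    static void checkState(boolean expression, String message) {
        if (!expression) {
            throw new IllegalStateException(message);
        }
    }
}

class PreconditionsDemo {
    @VisibleForTesting
    static String normalize(String path) {
        return Preconditions.checkArgNotNull("path", path).trim();
    }

    public static void main(String[] args) {
        System.out.println(normalize("  /user/example  "));
    }
}
```

Because call sites would only import the in-house types, the Guava version on the classpath would no longer matter for these particular uses.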
>>
>>
>>
>> On 10 November 2014 00:42, Arun C Murthy <ac...@hortonworks.com> wrote:
>>
>> > … has been a constant pain w.r.t compatibility etc.
>> >
>> > Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
>> >
>> > MR doesn't matter too much since it's application-side issue, it does hurt
>> > end-users though since they still might want a newer guava-version, but at
>> > least they can modify MR.
>> >
>> > Thoughts?
>> >
>> > thanks,
>> > Arun

Re: Guava

Posted by Sangjin Lee <sj...@apache.org>.
FYI, we have an existing ApplicationClassLoader implementation that is used
to isolate client/task classes from the rest. If we're going down the route
of classloader isolation on this, it would be good to come up with a
coherent strategy regarding both of these.

As a more practical step, I like the idea of isolating usage of guava that
breaks with guava 16 and later. I assume (but I haven't looked into it)
that it's fairly straightforward to isolate them and fix them. That work
could be done at any time without any version upgrades or impacting users.
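
As a sketch of what isolating one such usage could look like, assume purely for illustration that an affected call site only needs simple elapsed-time measurement. The call sites could then depend on a small in-house class instead of a library type, so a later library upgrade would not touch them. The name ElapsedTimer and its methods are hypothetical, and nothing here is meant to identify which Guava usages actually break.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical in-house timer that call sites can use instead of a library type. */
final class ElapsedTimer {
    private final LongSupplier nanoClock;
    private long startNanos;
    private long accumulatedNanos;
    private boolean running;

    ElapsedTimer() {
        this(System::nanoTime);
    }

    /** The clock is injectable so that tests can control time. */
    ElapsedTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    ElapsedTimer start() {
        if (running) {
            throw new IllegalStateException("timer is already running");
        }
        running = true;
        startNanos = nanoClock.getAsLong();
        return this;
    }

    ElapsedTimer stop() {
        if (!running) {
            throw new IllegalStateException("timer is not running");
        }
        accumulatedNanos += nanoClock.getAsLong() - startNanos;
        running = false;
        return this;
    }

    /** Total measured time, including the current run if the timer is still running. */
    long elapsed(TimeUnit unit) {
        long nanos = accumulatedNanos;
        if (running) {
            nanos += nanoClock.getAsLong() - startNanos;
        }
        return unit.convert(nanos, TimeUnit.NANOSECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ElapsedTimer timer = new ElapsedTimer().start();
        Thread.sleep(10);
        timer.stop();
        System.out.println("elapsed ms: " + timer.elapsed(TimeUnit.MILLISECONDS));
    }
}
```

The same pattern, one narrow in-house type per affected usage, is what would let this work proceed independently of any version upgrade.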

On Mon, Nov 10, 2014 at 9:26 AM, Alejandro Abdelnur <tu...@gmail.com>
wrote:

> IMO we should:
>
> 1* have a clean and thin client API JAR (which does not drag any 3rd party
> dependencies, or a well defined small set -i.e. slf4j & log4j-)
> 2* have a client implementation that uses a classloader to isolate client
> impl 3rd party deps from app dependencies.
>
> #2 can be done using a stock URLClassLoader (i would just subclass it to
> forbid packages in the API JAR and exposed 3rd parties to be loaded from
> the app JAR)
>
> #1 is the tricky thing as our current API modules don't have a clean
> API/impl separation.
>
> thx
> PS: If folks are interested in pursuing this, I can put together a prototype
> of how #2 would work (I don't think it will be more than 200 lines of
> code)
>
>
> On Mon, Nov 10, 2014 at 5:18 AM, Steve Loughran <st...@hortonworks.com>
> wrote:
>
> > Yes, Guava is a constant pain; there are lots of open JIRAs related to it,
> > as it's the one dependency we can't seamlessly upgrade. Not unless we do
> > our own fork and reinsert the missing classes.
> >
> > The most common uses in the code are
> >
> > @VisibleForTesting (easily replicated)
> > and the Preconditions.check*() operations
> >
> > The latter is also easily swapped out, and we could even add the check
> > they forgot:
> > Preconditions.checkArgNotNull(argname, arg)
> >
> >
> > These are easy; it's the more complex data structures that matter more.
> >
> > I think for Hadoop 2.7 & Java 7 we need to look at this problem and do
> > something. Even if we continue to ship Guava 11 so that the HBase team
> > don't send any (more) death threats, we can/should rework Hadoop to build
> > and run against Guava 16+ too. That's needed to fix some of the recent
> > Java 7/8+ changes.
> >
> > - Everything in v11 dropped from v16 MUST be reimplemented with our own
> > versions.
> > - Anything tagged as deprecated in 11+ SHOULD be replaced by newer stuff,
> > wherever possible.
> >
> > I think for 2.7+ we should add some new profiles to the POM, for Java 8
> > and 9 alongside the new baseline Java 7. For those later versions we could
> > perhaps mandate Guava 16.
> >
> >
> >
> > On 10 November 2014 00:42, Arun C Murthy <ac...@hortonworks.com> wrote:
> >
> > > ... has been a constant pain w.r.t compatibility etc.
> > >
> > > Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
> > >
> > > MR doesn't matter too much since it's application-side issue, it does hurt
> > > end-users though since they still might want a newer guava-version, but at
> > > least they can modify MR.
> > >
> > > Thoughts?
> > >
> > > thanks,
> > > Arun

Re: Guava

Posted by Colin McCabe <cm...@alumni.cmu.edu>.
I'm usually an advocate for getting rid of unnecessary dependencies
(cough, jetty, cough), but a lot of the things in Guava are really
useful.

Immutable collections, BiMap, Multisets, Arrays#asList, the stuff for
writing hashCode() and equals(), String#Joiner, the list goes on.  We
particularly use the Cache/CacheBuilder stuff a lot in HDFS to get
maps with LRU eviction without writing a lot of boilerplate.  The QJM
stuff uses ListenableFuture a lot, although perhaps we could come up
with our own equivalent for that.

On Mon, Nov 10, 2014 at 9:26 AM, Alejandro Abdelnur <tu...@gmail.com> wrote:
> IMO we should:
>
> 1* have a clean and thin client API JAR (which does not drag any 3rd party
> dependencies, or a well defined small set -i.e. slf4j & log4j-)
> 2* have a client implementation that uses a classloader to isolate client
> impl 3rd party deps from app dependencies.
>
> #2 can be done using a stock URLClassLoader (i would just subclass it to
> forbid packages in the API JAR and exposed 3rd parties to be loaded from
> the app JAR)
>
> #1 is the tricky thing as our current API modules don't have a clean
> API/impl separation.
>
> thx
> PS: If folks are interested in pursing this, I can put together a prototype
> of how  #2 would work (I don't think it will be more than 200 lines of code)

Absolutely, I agree that we should not be using Guava types in public
APIs.  Guava has not been very responsible with backwards
compatibility, that much is clear.

A client / server jar separation is an interesting idea.  But then we
still have to get rid of Guava and other library deps in the client
jars.  I think it would be more work than it seems.  For example, the
HDFS client uses Guava Cache a lot, so we'd have to write our own
version of this.

Can't we just shade this stuff?  Has anyone tried shading Hadoop's Guava?

best,
Colin


>
>
> On Mon, Nov 10, 2014 at 5:18 AM, Steve Loughran <st...@hortonworks.com>
> wrote:
>
>> Yes, Guava is a constant pain; there's lots of open JIRAs related to it, as
>> its the one we can't seamlessly upgrade. Not unless we do our own fork and
>> reinsert the missing classes.
>>
>> The most common uses in the code are
>>
>> @VisibleForTesting (easily replicated)
>> and the Precondition.check() operations
>>
>> The latter is also easily swapped out, and we could even add the check they
>> forgot:
>> Preconditions.checkArgNotNull(argname, arg)
>>
>>
>> These are easy; its the more complex data structures that matter more.
>>
>> I think for Hadoop 2.7 & java 7 we need to look at this problem and do
>> something. Even if we continue to ship Guava 11 so that the HBase team
>> don't send any (more) death threats, we can/should rework Hadoop to build
>> and run against Guava 16+ too. That's needed to fix some of the recent java
>> 7/8+ changes.
>>
>> -Everything in v11 dropped from v16 MUST be implemented with our own
>> versions.
>> -anything tagged as deprecated in 11+ SHOULD be replaced by newer stuff,
>> wherever possible.
>>
>> I think for 2.7+ we should add some new profiles to the POM, for Java 8 and
>> 9 alongside the new baseline java 7. For those later versions we could
>> perhaps mandate Guava 16.
>>
>>
>>
>> On 10 November 2014 00:42, Arun C Murthy <ac...@hortonworks.com> wrote:
>>
>> > … has been a constant pain w.r.t compatibility etc.
>> >
>> > Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
>> >
>> > MR doesn't matter too much since it's application-side issue, it does hurt
>> > end-users though since they still might want a newer guava-version, but at
>> > least they can modify MR.
>> >
>> > Thoughts?
>> >
>> > thanks,
>> > Arun
>> >

Re: Guava

Posted by Alejandro Abdelnur <tu...@gmail.com>.
IMO we should:

1* have a clean and thin client API JAR (which does not drag any 3rd party
dependencies, or a well defined small set -i.e. slf4j & log4j-)
2* have a client implementation that uses a classloader to isolate client
impl 3rd party deps from app dependencies.

#2 can be done using a stock URLClassLoader (I would just subclass it to
forbid packages in the API JAR and exposed 3rd parties from being loaded
from the app JAR)

#1 is the tricky thing as our current API modules don't have a clean
API/impl separation.

thx
PS: If folks are interested in pursuing this, I can put together a prototype
of how #2 would work (I don't think it will be more than 200 lines of code)
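
A rough sketch of what such a URLClassLoader subclass might look like, under
one reading of #2: classes are loaded child-first from the client impl JARs,
except for a list of shared package prefixes (the API packages and the exposed
3rd parties), which always come from the parent loader. The names
IsolatingClassLoader and sharedPrefixes are hypothetical, and this is only an
illustration of the technique, not the prototype mentioned above.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

/** Hypothetical sketch: child-first loader with a parent-first list of shared packages. */
class IsolatingClassLoader extends URLClassLoader {
  private final List<String> sharedPrefixes;

  IsolatingClassLoader(URL[] implJars, ClassLoader parent, List<String> sharedPrefixes) {
    super(implJars, parent);
    this.sharedPrefixes = sharedPrefixes;
  }

  /** JDK classes and the configured shared packages are never loaded from the impl JARs. */
  private boolean isShared(String name) {
    if (name.startsWith("java.")) {
      return true;
    }
    for (String prefix : sharedPrefixes) {
      if (name.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> c = findLoadedClass(name);
      if (c == null) {
        if (isShared(name)) {
          // Shared packages: normal parent-first delegation.
          c = super.loadClass(name, false);
        } else {
          try {
            // Everything else: look in the impl JARs first.
            c = findClass(name);
          } catch (ClassNotFoundException e) {
            // Not in the impl JARs, so fall back to the parent chain.
            c = super.loadClass(name, false);
          }
        }
      }
      if (resolve) {
        resolveClass(c);
      }
      return c;
    }
  }
}
```

With this arrangement the impl JARs could carry their own Guava without it
being visible to, or replaced by, whatever Guava version the application has
on its classpath.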


On Mon, Nov 10, 2014 at 5:18 AM, Steve Loughran <st...@hortonworks.com>
wrote:

> Yes, Guava is a constant pain; there are lots of open JIRAs related to it, as
> it's the one we can't seamlessly upgrade. Not unless we do our own fork and
> reinsert the missing classes.
>
> The most common uses in the code are
>
> @VisibleForTesting (easily replicated)
> and the Preconditions.check*() operations
>
> The latter is also easily swapped out, and we could even add the check they
> forgot:
> Preconditions.checkArgNotNull(argname, arg)
>
>
> These are easy; it's the more complex data structures that matter more.
>
> I think for Hadoop 2.7 & java 7 we need to look at this problem and do
> something. Even if we continue to ship Guava 11 so that the HBase team
> don't send any (more) death threats, we can/should rework Hadoop to build
> and run against Guava 16+ too. That's needed to fix some of the recent java
> 7/8+ changes.
>
> -Everything in v11 dropped from v16 MUST be implemented with our own
> versions.
> -anything tagged as deprecated in 11+ SHOULD be replaced by newer stuff,
> wherever possible.
>
> I think for 2.7+ we should add some new profiles to the POM, for Java 8 and
> 9 alongside the new baseline java 7. For those later versions we could
> perhaps mandate Guava 16.
>
>
>
> On 10 November 2014 00:42, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > … has been a constant pain w.r.t compatibility etc.
> >
> > Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
> >
> > MR doesn't matter too much since it's application-side issue, it does hurt
> > end-users though since they still might want a newer guava-version, but at
> > least they can modify MR.
> >
> > Thoughts?
> >
> > thanks,
> > Arun
> >

Re: Guava

Posted by Steve Loughran <st...@hortonworks.com>.
Yes, Guava is a constant pain; there are lots of open JIRAs related to it, as
it's the one we can't seamlessly upgrade. Not unless we do our own fork and
reinsert the missing classes.

The most common uses in the code are

@VisibleForTesting (easily replicated)
and the Preconditions.check*() operations

The latter is also easily swapped out, and we could even add the check they
forgot:
Preconditions.checkArgNotNull(argname, arg)
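
As an illustration of how small these replacements could be, here is a sketch
of a local marker annotation and a few check helpers, including the named null
check suggested above. The class name Checks, the message text and the choice
of annotation targets are all hypothetical; only the checkArgNotNull(argname,
arg) shape comes from this mail.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical local stand-in for Guava's @VisibleForTesting marker. */
@Documented
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR})
@interface VisibleForTesting {}

/** Hypothetical local stand-in for the commonly used Preconditions checks. */
final class Checks {
  private Checks() {}

  /** Rejects a bad argument with an IllegalArgumentException. */
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  /** Rejects a bad object state with an IllegalStateException. */
  static void checkState(boolean expression, String message) {
    if (!expression) {
      throw new IllegalStateException(message);
    }
  }

  /** Null check that names the offending argument; returns it for chaining. */
  static <T> T checkArgNotNull(String argName, T arg) {
    if (arg == null) {
      throw new NullPointerException(argName + " must not be null");
    }
    return arg;
  }
}
```

A call site would then read something like
this.path = Checks.checkArgNotNull("path", path);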


These are easy; it's the more complex data structures that matter more.

I think for Hadoop 2.7 & java 7 we need to look at this problem and do
something. Even if we continue to ship Guava 11 so that the HBase team
don't send any (more) death threats, we can/should rework Hadoop to build
and run against Guava 16+ too. That's needed to fix some of the recent java
7/8+ changes.

-Everything in v11 dropped from v16 MUST be implemented with our own
versions.
-anything tagged as deprecated in 11+ SHOULD be replaced by newer stuff,
wherever possible.

I think for 2.7+ we should add some new profiles to the POM, for Java 8 and
9 alongside the new baseline java 7. For those later versions we could
perhaps mandate Guava 16.



On 10 November 2014 00:42, Arun C Murthy <ac...@hortonworks.com> wrote:

> … has been a constant pain w.r.t compatibility etc.
>
> Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
>
> MR doesn't matter too much since it's application-side issue, it does hurt
> end-users though since they still might want a newer guava-version, but at
> least they can modify MR.
>
> Thoughts?
>
> thanks,
> Arun
>

Re: Guava

Posted by Vinayakumar B <vi...@apache.org>.
As Haohui Mai said, removing the dependency on Guava may not be a good
idea.

Instead, can we use a fixed Guava version in Hadoop, one that is stable as
of now, under a shaded package structure? That way it will not break an
application's dependency on another version of Guava, and inside Hadoop we
can always use the shaded Guava package.
I think a similar idea has been proposed in some JIRA, but I don't remember
the exact JIRA number.

Regards,
Vinay

On Mon, Nov 10, 2014 at 7:13 AM, Haohui Mai <hm...@hortonworks.com> wrote:

> Guava did make life easier for Hadoop developers in many cases. What
> I've been consistently hearing is that the version of Guava used in Hadoop
> is so old that it is starting to hurt application developers.
>
> I appreciate the value of Guava -- things like CacheMap are fairly
> difficult to implement efficiently and correctly.
>
> I think that creating separate client libraries for Hadoop can largely
> alleviate the problem -- obviously these libraries cannot use Guava, but it
> allows us to use Guava's help on the server side. For example, HDFS-6200 is
> one of the initiatives.
>
> Just my two cents.
>
> Regards,
> Haohui
>
> On Sun, Nov 9, 2014 at 4:42 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> > … has been a constant pain w.r.t compatibility etc.
> >
> > Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
> >
> > MR doesn't matter too much since it's application-side issue, it does hurt
> > end-users though since they still might want a newer guava-version, but at
> > least they can modify MR.
> >
> > Thoughts?
> >
> > thanks,
> > Arun
> >

Re: Guava

Posted by Haohui Mai <hm...@hortonworks.com>.
Guava did make life easier for Hadoop developers in many cases. What
I've been consistently hearing is that the version of Guava used in Hadoop
is so old that it is starting to hurt application developers.

I appreciate the value of Guava -- things like CacheMap are fairly
difficult to implement efficiently and correctly.

I think that creating separate client libraries for Hadoop can largely
alleviate the problem -- obviously these libraries cannot use Guava, but it
allows us to use Guava's help on the server side. For example, HDFS-6200 is
one of the initiatives.

Just my two cents.

Regards,
Haohui

On Sun, Nov 9, 2014 at 4:42 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> … has been a constant pain w.r.t compatibility etc.
>
> Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
>
> MR doesn't matter too much since it's application-side issue, it does hurt
> end-users though since they still might want a newer guava-version, but at
> least they can modify MR.
>
> Thoughts?
>
> thanks,
> Arun
>

Re: Guava

Posted by Steve Loughran <st...@hortonworks.com>.
Yes, Guava is a constant pain; there's lots of open JIRAs related to it, as
its the one we can't seamlessly upgrade. Not unless we do our own fork and
reinsert the missing classes.

The most common uses in the code are

@VisibleForTesting (easily replicated)
and the Precondition.check() operations

The latter is also easily swapped out, and we could even add the check they
forgot:
Preconditions.checkArgNotNull(argname, arg)


These are easy; its the more complex data structures that matter more.

I think for Hadoop 2.7 & java 7 we need to look at this problem and do
something. Even if we continue to ship Guava 11 so that the HBase team
don't send any (more) death threats, we can/should rework Hadoop to build
and run against Guava 16+ too. That's needed to fix some of the recent java
7/8+ changes.

- Everything in v11 that was dropped by v16 MUST be reimplemented with our
own versions.
- Anything tagged as deprecated in 11+ SHOULD be replaced by newer stuff,
wherever possible.

I think for 2.7+ we should add some new profiles to the POM, for Java 8 and
9 alongside the new baseline Java 7. For those later versions we could
perhaps mandate Guava 16.



On 10 November 2014 00:42, Arun C Murthy <ac...@hortonworks.com> wrote:

> … has been a constant pain w.r.t compatibility etc.
>
> Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
>
> MR doesn't matter too much since it's application-side issue, it does hurt
> end-users though since they still might want a newer guava-version, but at
> least they can modify MR.
>
> Thoughts?
>
> thanks,
> Arun
>
>


Re: Guava

Posted by Haohui Mai <hm...@hortonworks.com>.
Guava has made Hadoop development easier in many cases. What I've been
consistently hearing is that the version of Guava used in Hadoop is so old
that it is starting to hurt application developers.

I appreciate the value of Guava -- things like its caches
(Cache/CacheBuilder) are fairly difficult to implement efficiently and
correctly.
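
To make the trade-off concrete, here is a minimal sketch of the naive
JDK-only alternative: an LRU map built on LinkedHashMap. The class name
LruMap is hypothetical. It is short, but it is not thread-safe and has no
expiry, loading, or statistics, which is roughly the part that is hard to
get right by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical single-threaded LRU map; not a drop-in for Guava's Cache. */
final class LruMap<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;

  private final int maxEntries;

  LruMap(int maxEntries) {
    // accessOrder = true: entries are ordered from least to most
    // recently accessed, so the "eldest" entry is the LRU one.
    super(16, 0.75f, true);
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently used entry.
    return size() > maxEntries;
  }
}
```

Any concurrent use would need external synchronization (for example
Collections.synchronizedMap), which serializes every read as well as every
write.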

I think that creating separate client libraries for Hadoop can largely
alleviate the problem. Obviously these libraries cannot use Guava, but
having them allows us to keep using Guava on the server side. HDFS-6200 is
one of the initiatives in this direction.

Just my two cents.

Regards,
Haohui

On Sun, Nov 9, 2014 at 4:42 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> … has been a constant pain w.r.t compatibility etc.
>
> Should we consider adopting a policy to not use guava in Common/HDFS/YARN?
>
> MR doesn't matter too much since it's application-side issue, it does hurt
> end-users though since they still might want a newer guava-version, but at
> least they can modify MR.
>
> Thoughts?
>
> thanks,
> Arun
>
>
