Posted to dev@bigtop.apache.org by Andrew Purtell <ap...@apache.org> on 2012/06/08 23:54:08 UTC

when dependencies in common to more than one project appear on the classpath

This is a HBase shell nit specifically, and the warning from SLF4J is
purely a warning:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in
[jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in
[jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for
an explanation.

but there are other cases where dependencies in common to more than
one project will appear on the classpath, sometimes with differing
versions. Jackson comes to mind.
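
To see which JARs trigger the warning, one could scan the JAR directories for the binder class named in the log. The following is only a minimal Python sketch: the function name `find_slf4j_bindings` is hypothetical, and it assumes a binding is identified by the `org/slf4j/impl/StaticLoggerBinder.class` entry shown in the warning, which matches the 1.6.1 JARs above but may not hold for other SLF4J versions.

```python
import zipfile
from pathlib import Path

# Class entry named in the SLF4J warning above; assumed to mark a binding JAR.
BINDER = "org/slf4j/impl/StaticLoggerBinder.class"


def find_slf4j_bindings(jar_dirs):
    """Return paths of JARs in jar_dirs that contain BINDER, in directory order."""
    hits = []
    for d in jar_dirs:
        for jar in sorted(Path(d).glob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as zf:
                    if BINDER in zf.namelist():
                        hits.append(str(jar))
            except zipfile.BadZipFile:
                # Not a readable archive; skip it rather than abort the scan.
                continue
    return hits
```

On a system laid out as in the warning, calling `find_slf4j_bindings(["/usr/lib/hadoop/lib", "/usr/lib/zookeeper/lib"])` would be expected to report both copies of the binding JAR.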

How this would be handled at the OS level is each common dependency
would be factored out into a library package.

Is there something under consideration for Bigtop for issues like this?

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: when dependencies in common to more than one project appear on the classpath

Posted by Andrew Purtell <ap...@apache.org>.
On Fri, Jun 8, 2012 at 3:29 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> Still, I really think we should address this. Here's a JIRA that
> tracks the effort:
>    https://issues.apache.org/jira/browse/BIGTOP-276
> this is clearly post Bigtop 0.4.0 but it could very well make it into
> Bigtop 0.5.0.
>
> Of course, this is half of the problem -- the other half is making sure
> that we can harmonize the versions of all the dependencies between
> the projects. We do that in CDH (via patching poms, etc) but haven't
> done anything like that in Bigtop yet.
>
>> How this would be handled at the OS level is each common dependency
>> would be factored out into a library package.
>
> I'm keeping an eye on what Linux vendors do to address this issue. Here's
> a blog post that is well worth a read. I think whatever we end up implementing
> in Bigtop needs to be aligned (to the best possible extent) with Linux vendors:
>    http://duncan.mac-vicar.com/2012/01/26/on-java-maven-jpp-and-rpm/
>    http://fedoraproject.org/wiki/Packaging:Java#build-classpath
>
> Of course, if anybody has any concrete proposal I'd love for them to
> be articulated on the JIRA or mailing list -- do let me know!

Thanks for the pointers Roman.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: when dependencies in common to more than one project appear on the classpath

Posted by Konstantin Boudnik <co...@apache.org>.
In this case the integration project might have to take a bold step and deal
with transitive dependencies explicitly by excluding them and validating the new
dependency tree through integration testing.

Sure, this puts an extreme burden on the integration project, but there's
hope of eventually pushing this function down to the component projects.

Cos

On Mon, Jun 11, 2012 at 03:34PM, Andrew Purtell wrote:
> On Mon, Jun 11, 2012 at 3:25 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> > If a project dependent on Hadoop uses a non-compat version of Guava (as per
> > your example), you are just playing Russian roulette with your classpath.
> 
> True but the choices and priorities represented by a BOM may not be
> aligned with those of some component project. For example, some are
> making a greater emphasis on testing with and adapting to Hadoop 2 than
> others.
> 
> > IMO, the different projects have to harmonize their component versions,
> > especially because none of them uses implementation-dependencies isolation
> > (i.e. by classloader means). It seems to me a much easier goal to harmonize
> > than achieve implementation-dependencies isolation. It shouldn't be that
> > difficult, all the projects are cross-pollinated with developers working in
> > multiple of them, but this is just my take.
> 
> Sure maybe it will all work out.
> 
> Best regards,
> 
>    - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)

Re: when dependencies in common to more than one project appear on the classpath

Posted by Andrew Purtell <ap...@apache.org>.
On Mon, Jun 11, 2012 at 3:25 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> If a project dependent on Hadoop uses a non-compat version of Guava (as per
> your example), you are just playing Russian roulette with your classpath.

True but the choices and priorities represented by a BOM may not be
aligned with those of some component project. For example, some are
making a greater emphasis on testing with and adapting to Hadoop 2 than
others.

> IMO, the different projects have to harmonize their component versions,
> especially because none of them uses implementation-dependencies isolation
> (i.e. by classloader means). It seems to me a much easier goal to harmonize
> than achieve implementation-dependencies isolation. It shouldn't be that
> difficult, all the projects are cross-pollinated with developers working in
> multiple of them, but this is just my take.

Sure maybe it will all work out.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: when dependencies in common to more than one project appear on the classpath

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
On Mon, Jun 11, 2012 at 2:21 PM, Andrew Purtell <ap...@apache.org> wrote:

> On Mon, Jun 11, 2012 at 10:13 AM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> > On Mon, Jun 11, 2012 at 10:10 AM, Andrew Purtell <apurtell@apache.org> wrote:
> >> In the SLF4J case it's not enough just to harmonize the version, there
> >> can only be one instance of it on the classpath. Even one case like
> >> this requires the ability to refactor the classpath?
> >
> > IMO a solution for this would be:
> >
> > 1* harmonize all dep versions
> > 2* assuming #1, have a classpath constructor script that dedups all JARs in
> > the classpath given a list of JAR dirs
> > 3* use the classpath from #2 to run the app
>
> Assuming #1 in all cases is probably not a viable strategy. Guava is
> an example of how a version change can require nontrivial source
> changes, and different projects will have different priorities.



If a project dependent on Hadoop uses a non-compat version of Guava (as per
your example), you are just playing Russian roulette with your classpath.
IMO, the different projects have to harmonize their component versions,
especially because none of them uses implementation-dependencies isolation
(i.e. by classloader means). It seems to me a much easier goal to harmonize
than achieve implementation-dependencies isolation. It shouldn't be that
difficult, all the projects are cross-pollinated with developers working in
multiple of them, but this is just my take.

thx

> A number of projects use LimitedPrivate interfaces. Security in general
> has an evolving API. And that's before the more fun aspects of Hadoop
> "cooptition" might become involved.
>
> Given simple dedup is not enough, the classpath constructor script
> would need some kind of dependency tracking and manifest? That should
> be tied in to the capabilities of the OS package manager? Consider
> from http://duncan.mac-vicar.com/2012/01/26/on-java-maven-jpp-and-rpm/:
>
> >>>
> /usr/share/java/foo1.jar
> /usr/share/java/foo2.jar
> /usr/share/java/org.bar/1.0/foo.jar -> /usr/share/java/foo1.jar
> /usr/share/java/org.bar/2.0/foo.jar -> /usr/share/java/foo2.jar
> /usr/share/java/org.bar/1.0/foo.pom
> /usr/share/java/org.bar/2.0/foo.pom
>
> /usr/share/java/foo.jar -> /etc/alternatives/foo.jar
> <<<
>
> A path / symlink based layout like this would be supportable with apt,
> yum, etc.
>
> As a newcomer I'm probably rehashing an old discussion and for that I
> apologize. Just trying to figure out how one might manage an evolving
> deployment given a 3-5 year timeframe.
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)
>



-- 
Alejandro

Re: when dependencies in common to more than one project appear on the classpath

Posted by Andrew Purtell <ap...@apache.org>.
On Mon, Jun 11, 2012 at 10:13 AM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
> On Mon, Jun 11, 2012 at 10:10 AM, Andrew Purtell <ap...@apache.org> wrote:
>> In the SLF4J case it's not enough just to harmonize the version, there
>> can only be one instance of it on the classpath. Even one case like
>> this requires the ability to refactor the classpath?
>
> IMO a solution for this would be:
>
> 1* harmonize all dep versions
> 2* assuming #1, have a classpath constructor script that dedups all JARs in
> the classpath given a list of JAR dirs
> 3* use the classpath from #2 to run the app

Assuming #1 in all cases is probably not a viable strategy. Guava is
an example of how a version change can require nontrivial source
changes, and different projects will have different priorities. A
number of projects use LimitedPrivate interfaces. Security in general
has an evolving API. And that's before the more fun aspects of Hadoop
"cooptition" might become involved.

Given simple dedup is not enough, the classpath constructor script
would need some kind of dependency tracking and manifest? That should
be tied in to the capabilities of the OS package manager? Consider
from http://duncan.mac-vicar.com/2012/01/26/on-java-maven-jpp-and-rpm/:

>>>
/usr/share/java/foo1.jar
/usr/share/java/foo2.jar
/usr/share/java/org.bar/1.0/foo.jar -> /usr/share/java/foo1.jar
/usr/share/java/org.bar/2.0/foo.jar -> /usr/share/java/foo2.jar
/usr/share/java/org.bar/1.0/foo.pom
/usr/share/java/org.bar/2.0/foo.pom

/usr/share/java/foo.jar -> /etc/alternatives/foo.jar
<<<

A path / symlink based layout like this would be supportable with apt, yum, etc.
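
One way a classpath constructor could use such a layout is to resolve a per-application manifest of required artifacts against it. The following is only a sketch under stated assumptions: the `<root>/<group>/<version>/<artifact>.jar` pattern is read off the listing above, while the function name `resolve_manifest` and the manifest format (a list of `(group, version, artifact)` tuples) are hypothetical.

```python
from pathlib import Path


def resolve_manifest(root, requirements):
    """Map each (group, version, artifact) requirement to a JAR path under root.

    Assumes the <root>/<group>/<version>/<artifact>.jar layout from the
    listing above. Raises FileNotFoundError if a requirement is not installed.
    """
    paths = []
    for group, version, artifact in requirements:
        jar = Path(root) / group / version / f"{artifact}.jar"
        if not jar.exists():  # follows symlinks, so a dangling link also fails
            raise FileNotFoundError(
                f"{group}:{artifact}:{version} not found at {jar}"
            )
        paths.append(str(jar))
    return paths
```

For example, `resolve_manifest("/usr/share/java", [("org.bar", "2.0", "foo")])` would pick the 2.0 symlink even when 1.0 is installed alongside it, which is the property simple dedup cannot provide.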

As a newcomer I'm probably rehashing an old discussion and for that I
apologize. Just trying to figure out how one might manage an evolving
deployment given a 3-5 year timeframe.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: when dependencies in common to more than one project appear on the classpath

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
IMO a solution for this would be:

1* harmonize all dep versions
2* assuming #1, have a classpath constructor script that dedups all JARs in
the classpath given a list of JAR dirs
3* use the classpath from #2 to run the app
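
Steps 2 and 3 could look roughly like the Python sketch below. It is not an existing Bigtop script: the function name `build_classpath` is hypothetical, and it assumes step 1 holds, so that two JARs with the same file name are interchangeable and the first one found can win.

```python
import os
from pathlib import Path


def build_classpath(jar_dirs):
    """Join all JARs found in jar_dirs into one classpath string.

    Only the first JAR seen for each file name is kept, so earlier
    directories take precedence over later ones.
    """
    seen = set()
    entries = []
    for d in jar_dirs:
        for jar in sorted(Path(d).glob("*.jar")):
            if jar.name in seen:
                continue  # same file name is already on the classpath
            seen.add(jar.name)
            entries.append(str(jar))
    return os.pathsep.join(entries)
```

The result could then be passed to the JVM, e.g. via `java -cp`. Note that deduplicating by file name only works when versions are harmonized; two differently versioned copies of the same library would both stay on the classpath.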

thx

On Mon, Jun 11, 2012 at 10:10 AM, Andrew Purtell <ap...@apache.org> wrote:

> On Mon, Jun 11, 2012 at 9:10 AM, Tom White <to...@cloudera.com> wrote:
> > I think it would be valuable for Bigtop to publish a table of
> > component dependencies
> > (https://issues.apache.org/jira/browse/BIGTOP-375). Over time projects
> > might start using such a table to help harmonize version numbers of
> > their dependencies.
>
> I'd agree.
>
> In the SLF4J case it's not enough just to harmonize the version, there
> can only be one instance of it on the classpath. Even one case like
> this requires the ability to refactor the classpath?
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)
>



-- 
Alejandro

Re: when dependencies in common to more than one project appear on the classpath

Posted by Andrew Purtell <ap...@apache.org>.
On Mon, Jun 11, 2012 at 9:10 AM, Tom White <to...@cloudera.com> wrote:
> I think it would be valuable for Bigtop to publish a table of
> component dependencies
> (https://issues.apache.org/jira/browse/BIGTOP-375). Over time projects
> might start using such a table to help harmonize version numbers of
> their dependencies.

I'd agree.

In the SLF4J case it's not enough just to harmonize the version, there
can only be one instance of it on the classpath. Even one case like
this requires the ability to refactor the classpath?

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: when dependencies in common to more than one project appear on the classpath

Posted by Tom White <to...@cloudera.com>.
On Fri, Jun 8, 2012 at 5:29 PM, Roman Shaposhnik <ro...@shaposhnik.org> wrote:
> Of course, this is half of the problem -- the other half is making sure
> that we can harmonize the versions of all the dependencies between
> the projects. We do that in CDH (via patching poms, etc) but haven't
> done anything like that in Bigtop yet.
>

I think it would be valuable for Bigtop to publish a table of
component dependencies
(https://issues.apache.org/jira/browse/BIGTOP-375). Over time projects
might start using such a table to help harmonize version numbers of
their dependencies.
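
As a rough illustration of what such a table could be generated from, here is a minimal Python sketch that reads versions off JAR file names in each component's lib directory. The function names, the `component_dirs` mapping, and the `<artifact>-<version>.jar` naming convention are all assumptions; JARs that do not follow that convention are silently skipped.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention: <artifact>-<version>.jar, version starting with a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def dependency_table(component_dirs):
    """Build {artifact: {component: version}} from a {component: lib_dir} mapping."""
    table = defaultdict(dict)
    for component, lib_dir in component_dirs.items():
        for jar in sorted(Path(lib_dir).glob("*.jar")):
            m = JAR_NAME.match(jar.name)
            if m:
                table[m["artifact"]][component] = m["version"]
    return dict(table)


def conflicts(table):
    """Return only the artifacts that appear with more than one version."""
    return {a: v for a, v in table.items() if len(set(v.values())) > 1}
```

Running `conflicts(dependency_table({...}))` over the installed components would list the artifacts whose versions still need to be harmonized.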

Cheers,
Tom

Re: when dependencies in common to more than one project appear on the classpath

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
+David (with whom we just spoke about this very subject):

On Fri, Jun 8, 2012 at 2:54 PM, Andrew Purtell <ap...@apache.org> wrote:
> This is a HBase shell nit specifically, and the warning from SLF4J is
> purely a warning:
>
>    SLF4J: Class path contains multiple SLF4J bindings.
>    SLF4J: Found binding in
> [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>    SLF4J: Found binding in
> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for
> an explanation.
>
> but there are other cases where dependencies in common to more than
> one project will appear on the classpath, sometimes with differing
> versions. Jackson comes to mind.

This is very much a problem for Bigtop (and frankly any kind of Hadoop
distribution on the market) although it hasn't caused us much trouble
yet. Still, I really think we should address this. Here's a JIRA that
tracks the effort:
    https://issues.apache.org/jira/browse/BIGTOP-276
this is clearly post Bigtop 0.4.0 but it could very well make it into
Bigtop 0.5.0.

Of course, this is half of the problem -- the other half is making sure
that we can harmonize the versions of all the dependencies between
the projects. We do that in CDH (via patching poms, etc) but haven't
done anything like that in Bigtop yet.

> How this would be handled at the OS level is each common dependency
> would be factored out into a library package.

I'm keeping an eye on what Linux vendors do to address this issue. Here's
a blog post that is well worth a read. I think whatever we end up implementing
in Bigtop needs to be aligned (to the best possible extent) with Linux vendors:
    http://duncan.mac-vicar.com/2012/01/26/on-java-maven-jpp-and-rpm/
    http://fedoraproject.org/wiki/Packaging:Java#build-classpath

Of course, if anybody has any concrete proposal I'd love for them to
be articulated on the JIRA or mailing list -- do let me know!

Thanks,
Roman.