Posted to dev@avro.apache.org by Scott Carey <sc...@richrelevance.com> on 2010/09/21 05:53:23 UTC

Java Packaging - 'shade' jars and licensing implications

I think it would be a great idea for Avro if we took our more critical core dependencies and included their classes, relocated into our own namespace, in our jar file so that we don't have to worry about other libraries' versions conflicting with ours.

I'm currently working on a Maven build for Avro Java, and one thing that we can easily leverage there is the 'shade' plugin.  This basically just relocates a dependency's packages into your own namespace at build time.  For example, see the "Using Paranamer in your application without depending on Paranamer's jar" portion of http://paranamer.codehaus.org/
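
To make the relocation idea concrete, here is a minimal Java sketch of the renaming that such a build step applies to class names.  The prefix org.apache.avro.shaded and the helper relocate() are hypothetical, chosen only for illustration; the shade plugin itself does this by rewriting bytecode inside the jar, not by running code like this.

```java
public class RelocationSketch {
    // Hypothetical prefix; a real build would use whatever it is configured with.
    static final String SHADED_PREFIX = "org.apache.avro.shaded.";

    /**
     * Returns the relocated name if className lives in pkg or one of its
     * subpackages; otherwise returns className unchanged.
     */
    static String relocate(String className, String pkg) {
        if (className.startsWith(pkg + ".")) {
            return SHADED_PREFIX + className;
        }
        return className;
    }

    public static void main(String[] args) {
        // A Jackson 1.x class is moved under the hypothetical Avro prefix.
        System.out.println(relocate("org.codehaus.jackson.JsonNode", "org.codehaus.jackson"));
        // Classes outside the relocated package are left alone.
        System.out.println(relocate("java.lang.String", "org.codehaus.jackson"));
    }
}
```

Because the relocated classes have different fully qualified names, another Jackson version on the classpath would no longer collide with the copy inside the Avro jar.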

A simple example where this would help:  Avro is currently 'incompatible' with Hadoop 0.20+ because Hadoop includes Jackson 1.0.1 and has a very unsophisticated dependency management system.  If we packaged our version of Jackson into Avro, or provided a variation that did, then this would not be an issue for users.

For avro-core, the libraries of interest are Paranamer and Jackson.   Paranamer only adds up to 35K; Jackson is a bit larger at 500K but is absolutely critical for Avro.  Jackson-core is only 150K however -- it might be possible to drop jackson-mapper.


Paranamer is BSD-licensed, and Jackson is Apache-licensed.  I'm not sure of the implications of repackaging these into Avro.  I'd assume that BSD, as usual, doesn't care as long as the use is documented and attributed.  Since we're Apache, I'm guessing it won't be too hard to deal with Jackson either.  But I'm not sure if there is anything extra we have to declare.

Thoughts?

-Scott

Re: Java Packaging - 'shade' jars and licensing implications

Posted by Doug Cutting <cu...@apache.org>.
On 09/21/2010 11:20 AM, Scott Carey wrote:
> My understanding is that paranamer-generator is only used by
> paranamer-ant and paranamer-maven-plugin and not the runtime
> dependency.  Is that correct?  I'm only concerned with shade jars for
> the runtime dependency and avro-core.

Yes.  Paranamer-generator must be used at compile-time by applications 
that use reflection-based RPC.  Avro does not call paranamer-generator 
at runtime.  It calls only paranamer.  But the code that calls paranamer 
(reflect RPC) is useless if the application has not run 
paranamer-generator at compile time.

Doug

Re: Java Packaging - 'shade' jars and licensing implications

Posted by Scott Carey <sc...@richrelevance.com>.
On Sep 21, 2010, at 11:00 AM, Doug Cutting wrote:

> On 09/21/2010 10:25 AM, Scott Carey wrote:
>> Paranamer has no transitive dependencies.
> 
> paranamer-generator requires qdox and asm.  It's required at build-time 
> by applications that use reflection-based RPC to add parameter names to 
> byte code.
> 
> Doug

My understanding is that paranamer-generator is only used by paranamer-ant and paranamer-maven-plugin and not the runtime dependency.  Is that correct?  I'm only concerned with shade jars for the runtime dependency and avro-core.

Re: Java Packaging - 'shade' jars and licensing implications

Posted by Doug Cutting <cu...@apache.org>.
On 09/21/2010 10:25 AM, Scott Carey wrote:
> Paranamer has no transitive dependencies.

paranamer-generator requires qdox and asm.  It's required at build-time 
by applications that use reflection-based RPC to add parameter names to 
byte code.

Doug

Re: Java Packaging - 'shade' jars and licensing implications

Posted by Scott Carey <sc...@richrelevance.com>.
On Sep 21, 2010, at 10:07 AM, Doug Cutting wrote:

> On 09/20/2010 08:53 PM, Scott Carey wrote:
>> For avro-core, the libraries of interest are Paranamer and Jackson.
>> Paranamer only adds up to 35K; Jackson is a bit larger at 500K but is
>> absolutely critical for Avro.  Jackson-core is only 150K however --
>> it might be possible to drop jackson-mapper.
> 
> That's an interesting feature and could be useful.
> 
> I think it only works when we don't expose shaded classes as a part of 
> our public APIs.  We currently do expose Jackson in a few places: 
> Schema#Field has a constructor that accepts a JsonNode as a 
> representation of the default value; and JsonEncoder has a constructor 
> that accepts a JsonGenerator as output.  The former could probably be 
> easily fixed, but the latter might be harder to replace without losing 
> potentially useful end-user functionality.

Good point, we definitely have to entirely encapsulate such a dependency.  This might be possible with Jackson with some work or some wrappers.

There are two kinds of shading -- one that remaps the packages so that collisions won't happen, and another that just makes uber-jars and encapsulates.  The latter can be used if we expose features in our API, at the risk of colliding.  That's probably not worth it though; the benefit to us is encapsulation and preventing collisions.

> 
> I'd be much more comfortable with shading if we had automated tests that 
> checked that shaded classes are not referenced in our public APIs.
> 

Hmm, I wonder if there is a good tool for that out there.  Something that shows the packages that your public facing APIs expose.
The shade plugin has another interesting feature:  multiple artifacts.  We could have avro-core.jar and avro-core-shaded.jar.  The 

> Tatu (Jackson's primary developer) has expressed remorse that 1.0.1 has 
> incompatibilities with subsequent 1.x releases.  So, instead of being
> defensive, we could be optimistic, and hope that this won't occur
> again in any Jackson 1.x releases.
> 
> Paranamer is only required for reflection-based RPC.  Its classes do
> not appear in any public Avro APIs.  The paranamer jar itself is small, 
> but its transitive dependencies are large, I think.
> 

Paranamer has no transitive dependencies.   Paranamer-ant does, but I think that is a false runtime dependency of ours.  In my maven-build experiment I removed it and all tests pass.   The build uses the Paranamer Maven plugin to decorate test classes just like the ant tasks in the build.  The code only has Paranamer in its classpath and not paranamer-ant and works.   I don't fully understand the RPC use case however -- when at runtime does it have access to parameter names (source code) to decorate the classes?


>> Paranamer is BSD, and Jackson is Apache.  I'm not sure of the
>> implications of repackaging these into Avro.
> 
> We could include these in Avro without legal issues.
> 
> Doug

That's good to hear.


Re: Java Packaging - 'shade' jars and licensing implications

Posted by Doug Cutting <cu...@apache.org>.
On 09/20/2010 08:53 PM, Scott Carey wrote:
> For avro-core, the libraries of interest are Paranamer and Jackson.
> Paranamer only adds up to 35K; Jackson is a bit larger at 500K but is
> absolutely critical for Avro.  Jackson-core is only 150K however --
> it might be possible to drop jackson-mapper.

That's an interesting feature and could be useful.

I think it only works when we don't expose shaded classes as a part of 
our public APIs.  We currently do expose Jackson in a few places: 
Schema#Field has a constructor that accepts a JsonNode as a 
representation of the default value; and JsonEncoder has a constructor 
that accepts a JsonGenerator as output.  The former could probably be 
easily fixed, but the latter might be harder to replace without losing 
potentially useful end-user functionality.

I'd be much more comfortable with shading if we had automated tests that 
checked that shaded classes are not referenced in our public APIs.
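
A test along those lines could be sketched with plain reflection.  The class PublicApiLeakCheck and its findLeaks() method are hypothetical names, not an existing tool, and the sketch only looks at erased parameter, return, and field types of public members; generic type arguments, thrown exceptions, and supertypes are left out.  JDK classes stand in for Avro and Jackson so that the example runs on its own.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class PublicApiLeakCheck {

    // True if the type (or the element type of an array) is in pkg or a subpackage.
    static boolean isForbidden(Class<?> type, String pkg) {
        while (type.isArray()) {
            type = type.getComponentType();
        }
        return type.getName().startsWith(pkg + ".");
    }

    static boolean anyForbidden(Class<?>[] types, String pkg) {
        for (Class<?> t : types) {
            if (isForbidden(t, pkg)) {
                return true;
            }
        }
        return false;
    }

    // Collects the public constructors, methods, and fields of api whose
    // erased signatures mention a type from pkg.
    static List<String> findLeaks(Class<?> api, String pkg) {
        List<String> leaks = new ArrayList<>();
        for (Constructor<?> c : api.getConstructors()) {
            if (anyForbidden(c.getParameterTypes(), pkg)) {
                leaks.add(c.toString());
            }
        }
        for (Method m : api.getMethods()) {
            if (isForbidden(m.getReturnType(), pkg)
                    || anyForbidden(m.getParameterTypes(), pkg)) {
                leaks.add(m.toString());
            }
        }
        for (Field f : api.getFields()) {
            if (isForbidden(f.getType(), pkg)) {
                leaks.add(f.toString());
            }
        }
        return leaks;
    }

    public static void main(String[] args) {
        // String exposes java.util.Locale, e.g. in toLowerCase(Locale),
        // so leaks are reported for the java.util package.
        for (String leak : findLeaks(String.class, "java.util")) {
            System.out.println("leak: " + leak);
        }
        // Object's public methods mention no java.util types.
        System.out.println("Object leaks: " + findLeaks(Object.class, "java.util"));
    }
}
```

In an Avro test, the same check would presumably be run over each public Avro class with the Jackson 1.x package org.codehaus.jackson as the forbidden package, failing the build if the returned list is not empty.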

Tatu (Jackson's primary developer) has expressed remorse that 1.0.1 has 
incompatibilities with subsequent 1.x releases.  So, instead of being
defensive, we could be optimistic, and hope that this won't occur
again in any Jackson 1.x releases.

Paranamer is only required for reflection-based RPC.  Its classes do
not appear in any public Avro APIs.  The paranamer jar itself is small, 
but its transitive dependencies are large, I think.

> Paranamer is BSD, and Jackson is Apache.  I'm not sure of the
> implications of repackaging these into Avro.

We could include these in Avro without legal issues.

Doug