Posted to dev@mesos.apache.org by "Bernardo Gomez Palacio (JIRA)" <ji...@apache.org> on 2014/04/22 03:42:19 UTC

[jira] [Comment Edited] (MESOS-1203) Shade protobuf dependency in Mesos Java library

    [ https://issues.apache.org/jira/browse/MESOS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976283#comment-13976283 ] 

Bernardo Gomez Palacio edited comment on MESOS-1203 at 4/22/14 1:40 AM:
------------------------------------------------------------------------

To your point [~benjaminhindman], if we _shade_ the protobufs, people will need to update their code, and applications will depend on such a _shaded_ jar, either as two Mesos jars or as a single dependency that includes the _com.google.protobuf_ code. In my opinion this is already a given for Java applications, and personally I'd rather have that than put users through a path that might lead to confusion when they deploy Mesos alongside other systems, e.g. Hadoop, which sadly externalizes its own version of Protobufs.

From experience: a few months back, it was not trivial to understand why I was getting serialization and protobuf errors on an already deployed Mesos 0.16.0 cluster that had been running Spark applications, after I upgraded from Hadoop 1.0.4 to 2.0.3.

*On Objectives and Principles Around Protobufs*
1. Why do we use Protobufs? I'm not saying that we shouldn't; I'm just trying to understand our principles around exposing the Protos and the protobuf version we use. This applies not only to Java but to other languages as well.
2. We should always simplify the development of frameworks; that said, it is even more important to simplify the operation of the cluster when upgrading to newer versions of the diverse subsystems that might be interacting. Think of the case mentioned above: e.g. what if Hadoop 2.6 uses protobufs 2.5.1 and I want to run it alongside a Mesos 0.18.0 cluster?
3. If this is a problem in Java, it could happen in other languages. This ticket flags a problem that is currently occurring with Apache Spark but might just as well happen with other frameworks and applications. Remember that the problem in Spark occurs due to its dependency on Hadoop, which in version 1.0.4 pulls protobufs 2.4.1 while in versions 2.2+ it pulls protobufs 2.5.0. It will be hard for Spark to force a version of Protobuf that complies with both Mesos and Hadoop when the two evolve at their own pace.

*On Shading in Java*
My understanding is that the Maven Shade plugin can be used to remove the awkwardness of "org.apache.mesos.com.google.protobuf" and use an "org.apache.mesos.io.protobuf" namespace instead. This should make the use of our _shaded_ _protobuf_ library less jarring and make it evident that the `ByteString` the `ExecutorInfo` builder requires in `setData` comes from "org.apache.mesos.io.protobuf".
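For illustration, a relocation along these lines could be sketched with the Maven Shade plugin. This is only a hypothetical fragment for the Mesos Java module's pom.xml, not the project's actual build configuration; the "org.apache.mesos.io.protobuf" target package is the rename proposed above.

```xml
<!-- Hypothetical sketch: relocate the bundled protobuf classes so the shaded
     Mesos jar cannot clash with the protobuf version Hadoop puts on the
     classpath. The shadedPattern below is the namespace proposed above. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>org.apache.mesos.io.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With such a relocation, framework code would import `org.apache.mesos.io.protobuf.ByteString` explicitly, and Hadoop's own `com.google.protobuf` classes would be untouched.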



> Shade protobuf dependency in Mesos Java library
> -----------------------------------------------
>
>                 Key: MESOS-1203
>                 URL: https://issues.apache.org/jira/browse/MESOS-1203
>             Project: Mesos
>          Issue Type: Improvement
>          Components: build
>            Reporter: Patrick Wendell
>
> Mesos's Java library uses the protobuf library, which is also used by Hadoop. Unfortunately, the protobuf library does not provide binary compatibility between minor versions (code compiled against 2.4.1 and code compiled against 2.5.0 cannot run together in a single JVM classloader).
> This makes use of Mesos via its Java API, something that is required for Spark and, I'm assuming, other frameworks, fundamentally incompatible with certain Hadoop versions.
> Mesos could shade this jar using the maven shade plug-in. Take a look at the Parquet project for an example of shading:
> https://github.com/Parquet/parquet-format/blob/master/pom.xml#L140
> Without this fix, Java users won't be able to use Mesos (< 0.17) with newer versions of Hadoop, or Mesos 0.17+ with older versions of Hadoop.



--
This message was sent by Atlassian JIRA
(v6.2#6252)