Posted to dev@spark.apache.org by jimfcarroll <ji...@gmail.com> on 2015/06/24 20:21:34 UTC

Problem with version compatibility

Hello all,

I have a strange problem. I have a Mesos Spark cluster with Spark
1.4.0/Hadoop 2.4.0 installed, and a client application that uses Maven to
include the same versions.

However, I'm getting a serialVersionUID mismatch:

ERROR Remoting -
org.apache.spark.storage.BlockManagerMessages$RegisterBlockManager; local
class incompatible: stream classdesc serialVersionUID = 3833981923223309323,
local class serialVersionUID = -1833407448843930116
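
For reference, here is a minimal, self-contained sketch of how to check which
serialVersionUID a given classpath resolves for that class (run it once with
the Maven jar on the classpath and once with the assembly; the class name is
taken from the error above):

    import java.io.ObjectStreamClass

    // Prints the serialVersionUID that the current classpath resolves for the
    // message class named in the error, so it can be compared with the value
    // reported by the remote end.
    object PrintSerialVersionUID {
      def main(args: Array[String]): Unit = {
        val cls = Class.forName(
          "org.apache.spark.storage.BlockManagerMessages$RegisterBlockManager")
        val desc = ObjectStreamClass.lookup(cls) // null if not Serializable
        if (desc == null) println(s"${cls.getName} is not Serializable")
        else println(s"${cls.getName} serialVersionUID = ${desc.getSerialVersionUID}")
      }
    }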

When I look inside the jar file of the Spark dependency in my Maven repo, I see:

spark-core_2.10-1.4.0.jar contains the line:
2917  10-Jun-2015  12:20:48  org/apache/spark/storage/BlockManagerMessages$RegisterBlockManager$.class

However, on my Mesos cluster the jar looks like this:

spark-assembly-1.4.0-hadoop2.4.0.jar contains the line:
3786   2-Jun-2015  18:23:00  org/apache/spark/storage/BlockManagerMessages$RegisterBlockManager.class

Notice the classes aren't the same (different sizes), but I'm getting them
both from hosted repositories. One is from Maven Central and the other is
from the download page, which points here:
http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz
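
In case anyone wants to reproduce the comparison, here is a rough,
self-contained sketch that lists the BlockManagerMessages entries and their
sizes in whichever jar you point it at (pass the jar path as the first
argument):

    import java.util.jar.JarFile

    // Lists the size and name of every BlockManagerMessages entry in the given
    // jar, so the Maven artifact and the distribution's assembly can be
    // compared side by side.
    object ListBlockManagerEntries {
      def main(args: Array[String]): Unit = {
        val jar = new JarFile(args(0))
        try {
          val entries = jar.entries()
          while (entries.hasMoreElements) {
            val e = entries.nextElement()
            if (e.getName.startsWith("org/apache/spark/storage/BlockManagerMessages"))
              println(f"${e.getSize}%8d  ${e.getName}")
          }
        } finally {
          jar.close()
        }
      }
    }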

Thanks
Jim






Re: Problem with version compatibility

Posted by jimfcarroll <ji...@gmail.com>.
Yana and Sean,

Thanks for the feedback. I can get it to work a number of ways; I'm just
wondering whether there's a preferred approach.

One last question: is there a reason the deployed Spark install doesn't
contain the same versions of several classes as the Maven dependency? Is this
intentional?

Thanks again.
Jim






Re: Problem with version compatibility

Posted by Yana Kadiyska <ya...@gmail.com>.
Jim, I do something similar to what you describe. I mark all the Spark
dependencies as provided and then make sure to drop the same version of the
spark-assembly jar into my war as I have on the executors. I don't remember
whether dropping it in server/lib works; I think I ran into an issue with
that. I would love to know the "best practices" when it comes to Tomcat and
Spark.

On Thu, Jun 25, 2015 at 11:23 AM, Sean Owen <so...@cloudera.com> wrote:

> Try putting that same Mesos assembly on the classpath of your client,
> then, to emulate what spark-submit does. You don't merely want to put it
> on the classpath; you also want to make sure nothing else from Spark is
> coming from your app.
>
> In 1.4 there is the 'launcher' API, which makes programmatic access a lot
> more feasible, but it still kinda needs you to get Spark code to your
> driver program, and if it's not the same as what's on your cluster you'd
> still risk some incompatibilities.
>
> On Thu, Jun 25, 2015 at 6:05 PM, jimfcarroll <ji...@gmail.com>
> wrote:
> > Ah. I've avoided using spark-submit primarily because we use Spark as
> > part of an analytics library that's meant to be embedded in other
> > applications with their own lifecycle management.
> >
> > One of those applications is a REST app running in Tomcat, which will
> > make using spark-submit difficult (if not impossible).
> >
> > Also, we're trying to avoid sending jars over the wire per job, so we
> > install our library (minus the Spark dependencies) on the Mesos workers
> > and refer to it in the Spark configuration using
> > spark.executor.extraClassPath. If I'm reading SparkSubmit.scala
> > correctly, it looks like the user's assembly ends up being sent to the
> > cluster (at least in the case of YARN), though I could be wrong on this.
> >
> > Is there a standard way of running an app that's in control of its own
> > runtime lifecycle without spark-submit?
> >
> > Thanks again.
> > Jim
> >
> >
> >
> >

Re: Problem with version compatibility

Posted by Sean Owen <so...@cloudera.com>.
Try putting that same Mesos assembly on the classpath of your client, then,
to emulate what spark-submit does. You don't merely want to put it on the
classpath; you also want to make sure nothing else from Spark is coming from
your app.

In 1.4 there is the 'launcher' API, which makes programmatic access a lot
more feasible, but it still kinda needs you to get Spark code to your driver
program, and if it's not the same as what's on your cluster you'd still risk
some incompatibilities.
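
For illustration, a minimal sketch of that launcher API (the Spark home,
master URL, jar path, and main class below are made-up placeholders, not
anything from this thread):

    import org.apache.spark.launcher.SparkLauncher

    // Spawns spark-submit as a child process, so the driver runs against the
    // installed distribution's jars rather than jars bundled with this app.
    object LaunchViaSparkLauncher {
      def main(args: Array[String]): Unit = {
        val app: Process = new SparkLauncher()
          .setSparkHome("/opt/spark-1.4.0-bin-hadoop2.4")     // installed distribution
          .setMaster("mesos://zk://mesos-master:2181/mesos")  // placeholder master URL
          .setAppResource("/path/to/my-analytics-app.jar")    // placeholder app jar
          .setMainClass("com.example.MyDriver")               // placeholder main class
          .launch()
        app.waitFor()
      }
    }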

On Thu, Jun 25, 2015 at 6:05 PM, jimfcarroll <ji...@gmail.com> wrote:
> Ah. I've avoided using spark-submit primarily because we use Spark as part
> of an analytics library that's meant to be embedded in other applications
> with their own lifecycle management.
>
> One of those applications is a REST app running in Tomcat, which will make
> using spark-submit difficult (if not impossible).
>
> Also, we're trying to avoid sending jars over the wire per job, so we
> install our library (minus the Spark dependencies) on the Mesos workers and
> refer to it in the Spark configuration using spark.executor.extraClassPath.
> If I'm reading SparkSubmit.scala correctly, it looks like the user's
> assembly ends up being sent to the cluster (at least in the case of YARN),
> though I could be wrong on this.
>
> Is there a standard way of running an app that's in control of its own
> runtime lifecycle without spark-submit?
>
> Thanks again.
> Jim
>
>
>
>


Re: Problem with version compatibility

Posted by jimfcarroll <ji...@gmail.com>.
Ah. I've avoided using spark-submit primarily because we use Spark as part
of an analytics library that's meant to be embedded in other applications
with their own lifecycle management.

One of those applications is a REST app running in Tomcat, which will make
using spark-submit difficult (if not impossible).

Also, we're trying to avoid sending jars over the wire per job, so we
install our library (minus the Spark dependencies) on the Mesos workers and
refer to it in the Spark configuration using spark.executor.extraClassPath.
If I'm reading SparkSubmit.scala correctly, it looks like the user's
assembly ends up being sent to the cluster (at least in the case of YARN),
though I could be wrong on this.
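
Concretely, this is roughly the shape of the driver-side configuration (a
sketch only; the app name, master URL, and library path are made-up
placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    // The executors pick up our pre-installed library from their local disk via
    // spark.executor.extraClassPath, so nothing is shipped over the wire per job.
    object EmbeddedDriverSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("embedded-analytics")                  // placeholder
          .setMaster("mesos://zk://mesos-master:2181/mesos") // placeholder
          .set("spark.executor.extraClassPath",
               "/opt/our-analytics-lib/analytics.jar")       // placeholder
        val sc = new SparkContext(conf)
        try {
          println(sc.parallelize(1 to 10).sum()) // trivial sanity check
        } finally {
          sc.stop()
        }
      }
    }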

Is there a standard way of running an app that's in control of its own
runtime lifecycle without spark-submit?

Thanks again.
Jim






Re: Problem with version compatibility

Posted by Sean Owen <so...@cloudera.com>.
Yes, spark-submit adds all of this for you; you don't bring Spark classes in
your app.

On Thu, Jun 25, 2015, 4:01 PM jimfcarroll <ji...@gmail.com> wrote:

> Hi Sean,
>
> I'm packaging Spark with my (standalone) driver app using Maven. Any
> assemblies that are used on the Mesos workers through extending the
> classpath or providing the jars in the driver (via the SparkConf) aren't
> packaged with Spark (it seems obvious that would be a mistake).
>
> I need, for example, "RDD" on my classpath in order for my driver app to
> run. Are you saying I need to mark Spark as provided in Maven and include
> an installed distribution's lib directory jars on my classpath?
>
> I'm not using anything but the jar files from a Spark install in my driver,
> so that seemed superfluous (and slightly more difficult to manage the
> deployment). Also, even if that's the case, I don't understand why the
> Maven dependency of the same version of a deployable distribution would
> have different versions of classes in it than the deployable version
> itself.
>
> Thanks for your patience.
> Jim
>
>
>
>

Re: Problem with version compatibility

Posted by jimfcarroll <ji...@gmail.com>.
Hi Sean,

I'm packaging Spark with my (standalone) driver app using Maven. Any
assemblies that are used on the Mesos workers through extending the
classpath or providing the jars in the driver (via the SparkConf) aren't
packaged with Spark (it seems obvious that would be a mistake).

I need, for example, "RDD" on my classpath in order for my driver app to
run. Are you saying I need to mark Spark as provided in Maven and include an
installed distribution's lib directory jars on my classpath?

I'm not using anything but the jar files from a Spark install in my driver,
so that seemed superfluous (and slightly more difficult to manage the
deployment). Also, even if that's the case, I don't understand why the Maven
dependency of the same version of a deployable distribution would have
different versions of classes in it than the deployable version itself.

Thanks for your patience.
Jim






Re: Problem with version compatibility

Posted by Sean Owen <so...@cloudera.com>.
-dev +user

That all sounds fine, except: are you packaging Spark classes with your
app? That's the bit I'm wondering about. You would mark it as a
'provided' dependency in Maven.
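
For example, the dependency would look something like this (a sketch of the
idea; the artifact and version match the 1.4.0 / Scala 2.10 build discussed
in this thread):

    <!-- Compile against Spark, but rely on the cluster's installed
         distribution (or spark-submit's classpath) to supply it at runtime. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.4.0</version>
      <scope>provided</scope>
    </dependency>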

On Thu, Jun 25, 2015 at 5:12 AM, jimfcarroll <ji...@gmail.com> wrote:
> Hi Sean,
>
> I'm running a Mesos cluster. My driver app is built using Maven against the
> 1.4.0 Maven dependency.
>
> The Mesos slave machines have the Spark distribution installed from the
> distribution link.
>
> I have a hard time understanding how this isn't a standard app deployment,
> but maybe I'm missing something.
>
> If you build a driver app against 1.4.0 using Maven and run it against a
> Mesos cluster that has the 1.4.0 binary distribution installed, your driver
> won't run correctly.
>
> I meant to publish this question on the user list, so my apologies if it's
> in the wrong place.
>
> Jim
>
>
>
>

Re: Problem with version compatibility

Posted by jimfcarroll <ji...@gmail.com>.
Hi Sean,

I'm running a Mesos cluster. My driver app is built using Maven against the
1.4.0 Maven dependency.

The Mesos slave machines have the Spark distribution installed from the
distribution link.

I have a hard time understanding how this isn't a standard app deployment,
but maybe I'm missing something.

If you build a driver app against 1.4.0 using Maven and run it against a
Mesos cluster that has the 1.4.0 binary distribution installed, your driver
won't run correctly.

I meant to publish this question on the user list, so my apologies if it's in
the wrong place.

Jim






Re: Problem with version compatibility

Posted by Sean Owen <so...@cloudera.com>.
They are different classes, even. Your problem isn't class-not-found, though;
you're also really comparing different builds. You should not be including
Spark code in your app.

On Wed, Jun 24, 2015, 9:48 PM jimfcarroll <ji...@gmail.com> wrote:

> These jars are simply incompatible. You can see this by looking at that
> class in both the Maven repo for 1.4.0 here:
>
>
> http://central.maven.org/maven2/org/apache/spark/spark-core_2.10/1.4.0/spark-core_2.10-1.4.0.jar
>
> as well as the spark-assembly jar inside the .tgz file you can get from the
> official download here:
>
> http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz
>
> Am I missing something?
>
> Thanks
> Jim
>
>
>
>

Re: Problem with version compatibility

Posted by jimfcarroll <ji...@gmail.com>.
These jars are simply incompatible. You can see this by looking at that class
in both the Maven repo for 1.4.0 here:

http://central.maven.org/maven2/org/apache/spark/spark-core_2.10/1.4.0/spark-core_2.10-1.4.0.jar

as well as the spark-assembly jar inside the .tgz file you can get from the
official download here:

http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz

Am I missing something?

Thanks
Jim



