Posted to user@spark.apache.org by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/27 09:04:40 UTC

Dependency while creating jar duplicate file.

Hello Spark Community,

I'm trying to convert my project into a single JAR. I used the *sbt
assembly* utility to do so.

This is the error I got,

[error] (*:assembly) deduplicate: different file contents found in the
following:
[error]
/usr/local/spark/ram_examples/rm/lib_managed/jars/org.mortbay.jetty/servlet-api/servlet-api-2.5-20081211.jar:javax/servlet/SingleThreadModel.class
[error]
/usr/local/spark/ram_examples/rm/lib_managed/orbits/org.eclipse.jetty.orbit/javax.servlet/javax.servlet-2.5.0.v201103041518.jar:javax/servlet/SingleThreadModel.class

So when I fix that by adding a merge strategy per
https://github.com/sbt/sbt-assembly#merge-strategy (see the sketch
below), I run into another conflict:

java.lang.RuntimeException: deduplicate: different file contents found in
the following:
/usr/local/spark/ram_examples/rm/lib_managed/jars/commons-beanutils/commons-beanutils-core/commons-beanutils-core-1.8.0.jar:org/apache/commons/beanutils/converters/FloatArrayConverter.class
/usr/local/spark/ram_examples/rm/lib_managed/jars/commons-beanutils/commons-beanutils/commons-beanutils-1.7.0.jar:org/apache/commons/beanutils/converters/FloatArrayConverter.class

This is my build file: http://pastebin.com/5W9f1g1e
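For reference, the merge-strategy fix I added was along these lines - a
minimal sketch using the sbt-assembly 0.9.x syntax from its README; the
exact settings are in the build file linked above:

import AssemblyKeys._

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    // Both the mortbay servlet-api jar and the eclipse orbit jar ship
    // javax/servlet classes; keep whichever copy is seen first.
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
    // Fall back to the plugin's default strategy for everything else.
    case x => old(x)
  }
}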

While this seems like an error that has already been discussed and
solved<https://groups.google.com/forum/#!searchin/spark-users/java.lang.RuntimeException$3A$20deduplicate$3A$20different$20file$20contents$20found$20in$20the$20following%7Csort:relevance/spark-users/SJRyiK-kDgo/-LeAr5M1ghwJ>,
and discussed here <https://spark-project.atlassian.net/browse/SPARK-395>, I
still get it in the latest Spark [version 0.8.0]. I'm just curious because
this is a clean build of Spark that I'm using, and it seems to work fine
(with sbt run / sbt package), but when I use *sbt assembly* I get this
error. Am I missing something while creating the JAR? Any help would be
appreciated. Thanks!


Regards,
Ram

Re: Dependency while creating jar duplicate file.

Posted by Ramkumar Chokkalingam <ra...@gmail.com>.
Yeah, will be noted. I'm adding *retrieveManaged := true*, which copies
all the managed dependency jars into lib_managed. I will add the
"provided" argument as well (a sketch of both changes is below).




Regards,

Ramkumar Chokkalingam,
University of Washington.
LinkedIn <http://www.linkedin.com/in/mynameisram>






Re: Dependency while creating jar duplicate file.

Posted by Patrick Wendell <pw...@gmail.com>.
Hey Ram,

One other important thing. I'm not sure exactly what your use case is,
but if you are just planning to make a job that you submit to a Spark
cluster, you can avoid bundling Spark in your assembly jar in the
first place.

You should list the dependency as provided. See here:

http://spark.incubator.apache.org/docs/latest/quick-start.html#including-your-dependencies

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating" % "provided"

The caveat here is that you'll need to launch your job itself with
"sbt/sbt run"... or otherwise include Spark in the classpath when
launching it.

- Patrick


Re: Dependency while creating jar duplicate file.

Posted by Patrick Wendell <pw...@gmail.com>.
Forgot to include user list...

On Sun, Oct 27, 2013 at 1:16 PM, Patrick Wendell <pw...@gmail.com> wrote:
> When you are creating an assembly jar you need to deal with all merge
> conflicts, including those that arise as a result of transitive
> dependencies. Unfortunately there is no way for us to "publish" our
> merge strategy. Though this might be something we should mention in
> the docs... when people are making an assembly jar including Spark,
> they need to merge things correctly themselves.
>
> - Patrick

Re: Dependency while creating jar duplicate file.

Posted by Ramkumar Chokkalingam <ra...@gmail.com>.
Hey Patrick,

Thanks for the mail. So, I did solve the issue by using the
*MergeStrategy*. I was more concerned because Spark was the only major
dependency - since it was a prebuilt version and failed, I wanted to
check with you.


Regards,

Ramkumar Chokkalingam,
University of Washington.
LinkedIn <http://www.linkedin.com/in/mynameisram>






Re: Dependency while creating jar duplicate file.

Posted by Patrick Wendell <pw...@gmail.com>.
Hey Ram,

When you create the assembly Jar for your own project, you'll need to
deal with all possible conflicts. And this includes various conflicts
inside of Spark's dependencies.

I've noticed the Apache Commons libraries often have conflicts. One
thing you could do is set MergeStrategy.first for all class files with
the path org/apache/commons/*.

You can also add a catch-all default strategy, like the one in the
Spark build. In fact, it might make sense to just copy over the policy
from the Spark build as a starting point:

https://github.com/apache/incubator-spark/blob/master/project/SparkBuild.scala#L332
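
A sketch of what that could look like, loosely modeled on the linked
SparkBuild policy (the exact cases there may differ, so adjust to taste):

mergeStrategy in assembly := {
  // Keep the first copy of conflicting Apache Commons classes.
  case PathList("org", "apache", "commons", xs @ _*) => MergeStrategy.first
  // Jar manifests and signature files always conflict; drop them.
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
  // Concatenate Typesafe config defaults instead of picking one copy.
  case "reference.conf" => MergeStrategy.concat
  // Catch-all: take the first occurrence of anything else.
  case _ => MergeStrategy.first
}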

- Patrick

