Posted to user@spark.apache.org by Sachin Mittal <sj...@gmail.com> on 2016/07/19 14:09:58 UTC

Building standalone spark application via sbt

Hi,
Can someone please guide me on which jars I need to place in the lib folder
of my project to build a standalone Scala application via sbt?

Note that I need to provide static dependencies and cannot download the jars
using libraryDependencies, so I need to provide all the jars upfront.
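For reference, here is a rough sketch of the project layout I have in mind; as
far as I understand, sbt treats any jar placed under lib/ as an unmanaged
dependency, and the jar names below are just placeholders:

my-project/
    build.sbt                     // just name, version and scalaVersion; no libraryDependencies
    lib/
        spark-core_2.10-<version>.jar
        <whatever other jars are required - this is the part I am unsure about>
    src/main/scala/MyApp.scala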

So far I found that we need:
spark-core_<version>.jar

Do we also need
spark-sql_<version>.jar
and
hadoop-core-<version>.jar

Is there any jar from the Spark side that I may be missing? I found that
spark-core needs the hadoop-core classes, and if I don't add them then sbt
gives me this error:
[error] bad symbolic reference. A signature in SparkContext.class refers to
term hadoop
[error] in package org.apache which is not available.

So I am just confused about the library dependency part when building an
application via sbt. Any inputs here would be helpful.

Thanks
Sachin

Re: Building standalone spark application via sbt

Posted by Sachin Mittal <sj...@gmail.com>.
I got the error at run time. It was for the mongo-spark-connector class
files.
My build.sbt is like this:

name := "Test Advice Project"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
    "org.mongodb.spark" %% "mongo-spark-connector" % "1.0.0",
    "org.apache.spark" %% "spark-core" % "1.6.1" % "provided",
    "org.apache.spark" %% "spark-sql" % "1.6.1" % "provided"
)

assemblyMergeStrategy in assembly := {
   // several dependency jars ship overlapping META-INF entries; discard them to avoid duplicate errors
   case PathList("META-INF", xs @ _*) => MergeStrategy.discard
   // for everything else, just keep the first copy found
   case x => MergeStrategy.first
}

and I create the fat jar using sbt assembly.
I have also added addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
in ~/.sbt/0.13/plugins/plugins.sbt for sbt assembly to work.
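I believe the same line could also go in a project-local file instead of the
global sbt directory; if I am not mistaken, something like this works too:

// <project-root>/project/assembly.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")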

I think when you mark a dependency as provided, its jars are not included in
your fat jar.
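So, the way I understand it, I only need to pass the assembly jar to
spark-submit: the Spark classes come from the cluster at run time, and
mongo-spark-connector is already packed inside the fat jar. Roughly (the main
class and the jar name below are just placeholders for whatever sbt assembly
produces for my project):

spark-submit --class com.example.TestAdviceApp \
  --master <master-url> \
  target/scala-2.10/<project>-assembly-1.0.jar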

I am using Spark 1.6.

Thanks
Sachin




Re: Building standalone spark application via sbt

Posted by Marco Mistroni <mm...@gmail.com>.
 that will work but ideally you should not include any of the
spark-related jars, as they are provided to you by the Spark environment
whenever you launch your app via spark-submit (this will prevent unexpected
errors, e.g. when you kick off your app using a different version of Spark
where some of the classes have been renamed or moved around - tbh I don't think
this is a case that happens often)
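i.e., in build.sbt mark the Spark artifacts as provided, along these lines (the
version string is just whatever matches your cluster):

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.1" % "provided"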

Btw did you get the NoClassDefFoundException at compile time or run time? If
at run time, what is your Spark version, and what version of the Spark
libraries did you use in your sbt?
Are you using a Spark version pre 1.4?

kr
 marco







Re: Building standalone spark application via sbt

Posted by Sachin Mittal <sj...@gmail.com>.
The NoClassDefFound error was for Spark classes like, say, SparkContext.
When running the standalone Spark application I was not passing the external
jars using the --jars option.
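(For reference, that would have been something along the lines of
spark-submit --jars <dep1.jar,dep2.jar> myApp.jar, as Marco describes below;
the jar names here are just placeholders.)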

However I have fixed this by making a fat jar using sbt assembly plugin.

Now all the dependencies are included in that jar, and I use that jar with
spark-submit.

Thanks
Sachin



Re: Building standalone spark application via sbt

Posted by Marco Mistroni <mm...@gmail.com>.
Hello Sachin
  pls paste the NoClassDefFound exception so we can see what's failing,
also please advise how you are running your Spark app.
For an extremely simple case, let's assume you have your MyFirstSparkApp
packaged in myFirstSparkApp.jar.
Then all you need to do is kick off

spark-submit --class MyFirstSparkApp   myFirstSparkApp.jar

if you have any external dependencies (not Spark; let's assume you are
using common-utils.jar) then you should be able to kick it off via

spark-submit --class MyFirstSparkApp --jars common-utils.jar
myFirstSparkApp.jar
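(if you have more than one external jar, --jars takes a comma-separated list,
e.g. --jars common-utils.jar,other-dep.jar - the second jar name is just an
example)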

I paste below the build.sbt I am using for my SparkExamples apps; hope this
helps.
kr
 marco

name := "SparkExamples"

version := "1.0"

scalaVersion := "2.10.5"


// Add a single dependency
libraryDependencies += "junit" % "junit" % "4.8" % "test"
libraryDependencies += "org.mockito" % "mockito-core" % "1.9.5"
libraryDependencies ++= Seq("org.slf4j" % "slf4j-api" % "1.7.5",
                            "org.slf4j" % "slf4j-simple" % "1.7.5",
                            "org.clapper" %% "grizzled-slf4j" % "1.0.2")
libraryDependencies += "org.powermock" % "powermock-mockito-release-full" %
"1.5.4" % "test"
libraryDependencies += "org.apache.spark" %% "spark-core"   % "1.6.1" %
"provided"
libraryDependencies += "org.apache.spark" %% "spark-streaming"   % "1.6.1"
% "provided"
libraryDependencies += "org.apache.spark" %% "spark-mllib"   % "1.6.1"  %
"provided"
libraryDependencies += "org.apache.spark" %% "spark-streaming-flume"   %
"1.3.0"  % "provided"
resolvers += "softprops-maven" at "
http://dl.bintray.com/content/softprops/maven"









Re: Building standalone spark application via sbt

Posted by Mich Talebzadeh <mi...@gmail.com>.
you need an uber jar file.

Have you actually set up the dependencies and the project sub-directory build?

check this.

http://stackoverflow.com/questions/28459333/how-to-build-an-uber-jar-fat-jar-using-sbt-within-intellij-idea

Of the three answers there, the top one is the relevant one.

I started reading the official SBT tutorial
(http://www.scala-sbt.org/0.13/tutorial/).
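In short, the sbt-assembly plugin is the usual route to the uber jar; a minimal
sketch (the plugin version is just the one mentioned elsewhere in this thread):

// <project-root>/project/assembly.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// then, from the project root:
//   sbt assembly
// which writes the uber jar under target/scala-<version>/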

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




Re: Building standalone spark application via sbt

Posted by Sachin Mittal <sj...@gmail.com>.
Hi,
I am following the example under
https://spark.apache.org/docs/latest/quick-start.html
for a standalone Scala application.

I added all my dependencies via build.sbt (one dependency is under lib
folder).

When I run sbt package I see the jar created under
target/scala-2.10/

So compiling seems to work fine. However, when I inspect that jar, it only
contains my Scala classes.
Unlike a Java application, where we build a standalone jar that contains all
the dependencies inside it, here all the dependencies are missing.
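This is how I am checking the contents; the jar path below is just the default
one sbt produces for my project name and Scala version:

jar tf target/scala-2.10/<project>_2.10-1.0.jar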

So as expected when I run the application via spark-submit I get the
NoClassDefFoundError.

Here is my build.sbt

name := "Test Advice Project"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.1",
    "org.apache.spark" %% "spark-sql" % "1.6.1"
)

Can anyone please guide me as to what is going wrong, and why sbt package is
not including all the dependency classes in the new jar.

Thanks
Sachin



Re: Building standalone spark application via sbt

Posted by Andrew Ehrlich <an...@aehrlich.com>.
Yes, spark-core will depend on Hadoop and several other jars. Here’s the list of dependencies: https://github.com/apache/spark/blob/master/core/pom.xml#L35

Whether you need spark-sql depends on whether you will use the DataFrame API. Without spark-sql, you will just have the RDD API.
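For example (a rough Spark 1.6-era sketch; the file paths are just
placeholders): with only spark-core on the classpath you get the RDD API
through SparkContext, while SQLContext and DataFrames need spark-sql as well:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("example"))
// RDD API - spark-core only
val counts = sc.textFile("input.txt").flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

// DataFrame API - needs spark-sql on the classpath
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("people.json")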
