Posted to user@spark.apache.org by Serega Sheypak <se...@gmail.com> on 2016/05/17 12:33:21 UTC

Why can't Spark 1.6.0 use jar files stored on HDFS?

hi, I'm trying to:
1. upload my app jar files to HDFS
2. run spark-submit with:
2.1. --master yarn --deploy-mode cluster
or
2.2. --master yarn --deploy-mode client

specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar

When the Spark job is submitted, the SparkSubmit client outputs:
Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
...

and then the Spark application's main class fails with a ClassNotFoundException.
Is there any workaround?
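For context: in YARN cluster mode the driver runs inside the cluster and hdfs:// jars listed in --jars can be localized through the YARN distributed cache, but in client mode the driver starts on the submitting machine and SparkSubmit skips remote jars when building the driver classpath (which is what the warning above indicates). A minimal client-mode workaround sketch, with placeholder local paths and app jar name, is to pull the jars down first:

```shell
# Workaround sketch for --deploy-mode client: copy the HDFS-hosted jars to
# the local machine, then pass local paths to --jars so the driver can
# actually load them. The /tmp paths and my-app.jar are placeholders.
hdfs dfs -get hdfs:///my/home/commons.jar /tmp/commons.jar
hdfs dfs -get hdfs:///my/home/super.jar /tmp/super.jar

spark-submit \
  --master yarn \
  --deploy-mode client \
  --jars /tmp/commons.jar,/tmp/super.jar \
  /tmp/my-app.jar
```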

Re: Why can't Spark 1.6.0 use jar files stored on HDFS?

Posted by Serega Sheypak <se...@gmail.com>.
Hi, I know about that approach.
I don't want to run a mess of classes from a single jar; I want to use the
distributed cache and ship the application jar and its dependent jars
explicitly.
--deploy-mode client unfortunately copies and redistributes all jars
for every Spark job started from the driver class...
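A cluster-mode submit is the case where the distributed-cache behaviour described above should apply, since the driver itself runs in a YARN container. A sketch, assuming a placeholder main class and app jar path:

```shell
# Cluster-mode sketch: with --deploy-mode cluster, hdfs:// jars in --jars
# can be distributed to containers via the YARN distributed cache instead
# of being re-uploaded from the client for every job. The class name and
# jar paths here are illustrative.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MySuperSparkJob \
  --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar \
  /home/hadoop/my-app.jar
```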

2016-05-17 15:41 GMT+02:00 <sp...@yahoo.com>:

> Hi Serega,
>
> Create a jar including all the dependencies and execute it like below
> through shell script
>
> # spark-submit location, class name, and jar path below are placeholders
> /usr/local/spark/bin/spark-submit \
>   --class classname \
>   --master yarn \
>   --deploy-mode cluster \
>   /home/hadoop/SparkSampleProgram.jar
>
> Thanks
> Raj
>
>
>
>
>
> On Tuesday, May 17, 2016 6:03 PM, Serega Sheypak <se...@gmail.com>
> wrote:
>
>
> hi, I'm trying to:
> 1. upload my app jar files to HDFS
> 2. run spark-submit with:
> 2.1. --master yarn --deploy-mode cluster
> or
> 2.2. --master yarn --deploy-mode client
>
> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>
> When spark job is submitted, SparkSubmit client outputs:
> Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
> ...
>
> and then spark application main class fails with class not found exception.
> Is there any workaround?
>
>
>

Re: Why can't Spark 1.6.0 use jar files stored on HDFS?

Posted by sp...@yahoo.com.INVALID.
Hi Serega,
Create a jar including all the dependencies and execute it like below through a shell script

# spark-submit location, class name, and jar path below are placeholders
/usr/local/spark/bin/spark-submit \
  --class classname \
  --master yarn \
  --deploy-mode cluster \
  /home/hadoop/SparkSampleProgram.jar

Thanks
Raj
 


    On Tuesday, May 17, 2016 6:03 PM, Serega Sheypak <se...@gmail.com> wrote:
 

hi, I'm trying to:
1. upload my app jar files to HDFS
2. run spark-submit with:
2.1. --master yarn --deploy-mode cluster
or
2.2. --master yarn --deploy-mode client

specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar

When spark job is submitted, SparkSubmit client outputs:
Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
...

and then spark application main class fails with class not found exception.
Is there any workaround?

Re: Why can't Spark 1.6.0 use jar files stored on HDFS?

Posted by Serega Sheypak <se...@gmail.com>.
spark-submit \
  --conf "spark.driver.userClassPathFirst=true" \
  --class com.MyClass \
  --master yarn \
  --deploy-mode client \
  --jars hdfs:///my-lib.jar,hdfs:///my-second-lib.jar \
  jar-with-com-MyClass.jar job_params



2016-05-17 15:41 GMT+02:00 Serega Sheypak <se...@gmail.com>:

> https://issues.apache.org/jira/browse/SPARK-10643
>
> Looks like it's the reason...
>
> 2016-05-17 15:31 GMT+02:00 Serega Sheypak <se...@gmail.com>:
>
>> No, and it looks like a problem.
>>
>> 2.2. --master yarn --deploy-mode client
>> means:
>> 1. submit spark as yarn app, but spark-driver is started on local
>> machine.
>> 2. I upload all dependent jars to HDFS and specify jar HDFS paths in
>> --jars arg.
>> 3. Driver runs my Spark Application main class named "MySuperSparkJob"
>> and MySuperSparkJob fails because it doesn't get jars; they are all in
>> HDFS and not accessible from local machine...
>>
>>
>> 2016-05-17 15:18 GMT+02:00 Jeff Zhang <zj...@gmail.com>:
>>
>>> Do you put your app jar on hdfs ? The app jar must be on your local
>>> machine.
>>>
>>> On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <
>>> serega.sheypak@gmail.com> wrote:
>>>
>>>> hi, I'm trying to:
>>>> 1. upload my app jar files to HDFS
>>>> 2. run spark-submit with:
>>>> 2.1. --master yarn --deploy-mode cluster
>>>> or
>>>> 2.2. --master yarn --deploy-mode client
>>>>
>>>> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>>>>
>>>> When spark job is submitted, SparkSubmit client outputs:
>>>> Warning: Skip remote jar
>>>> hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar ...
>>>>
>>>> and then spark application main class fails with class not found
>>>> exception.
>>>> Is there any workaround?
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>>
>

Re: Why can't Spark 1.6.0 use jar files stored on HDFS?

Posted by Serega Sheypak <se...@gmail.com>.
https://issues.apache.org/jira/browse/SPARK-10643

Looks like it's the reason...

2016-05-17 15:31 GMT+02:00 Serega Sheypak <se...@gmail.com>:

> No, and it looks like a problem.
>
> 2.2. --master yarn --deploy-mode client
> means:
> 1. submit spark as yarn app, but spark-driver is started on local machine.
> 2. I upload all dependent jars to HDFS and specify jar HDFS paths in
> --jars arg.
> 3. Driver runs my Spark Application main class named "MySuperSparkJob" and MySuperSparkJob
> fails because it doesn't get jars; they are all in HDFS and not accessible
> from local machine...
>
>
> 2016-05-17 15:18 GMT+02:00 Jeff Zhang <zj...@gmail.com>:
>
>> Do you put your app jar on hdfs ? The app jar must be on your local
>> machine.
>>
>> On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <serega.sheypak@gmail.com
>> > wrote:
>>
>>> hi, I'm trying to:
>>> 1. upload my app jar files to HDFS
>>> 2. run spark-submit with:
>>> 2.1. --master yarn --deploy-mode cluster
>>> or
>>> 2.2. --master yarn --deploy-mode client
>>>
>>> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>>>
>>> When spark job is submitted, SparkSubmit client outputs:
>>> Warning: Skip remote jar
>>> hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar ...
>>>
>>> and then spark application main class fails with class not found
>>> exception.
>>> Is there any workaround?
>>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>

Re: Why can't Spark 1.6.0 use jar files stored on HDFS?

Posted by Serega Sheypak <se...@gmail.com>.
No, and it looks like a problem.

2.2. --master yarn --deploy-mode client
means:
1. Submit Spark as a YARN app, but the Spark driver is started on the local
machine.
2. I upload all dependent jars to HDFS and specify the jar HDFS paths in the
--jars arg.
3. The driver runs my Spark application main class named "MySuperSparkJob",
and MySuperSparkJob fails because it doesn't get the jars; they are all in
HDFS and not accessible from the local machine...


2016-05-17 15:18 GMT+02:00 Jeff Zhang <zj...@gmail.com>:

> Do you put your app jar on hdfs ? The app jar must be on your local
> machine.
>
> On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <se...@gmail.com>
> wrote:
>
>> hi, I'm trying to:
>> 1. upload my app jar files to HDFS
>> 2. run spark-submit with:
>> 2.1. --master yarn --deploy-mode cluster
>> or
>> 2.2. --master yarn --deploy-mode client
>>
>> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>>
>> When spark job is submitted, SparkSubmit client outputs:
>> Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
>> ...
>>
>> and then spark application main class fails with class not found
>> exception.
>> Is there any workaround?
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>

Re: Why can't Spark 1.6.0 use jar files stored on HDFS?

Posted by Jeff Zhang <zj...@gmail.com>.
Did you put your app jar on HDFS? The app jar must be on your local
machine.

On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <se...@gmail.com>
wrote:

> hi, I'm trying to:
> 1. upload my app jar files to HDFS
> 2. run spark-submit with:
> 2.1. --master yarn --deploy-mode cluster
> or
> 2.2. --master yarn --deploy-mode client
>
> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>
> When spark job is submitted, SparkSubmit client outputs:
> Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
> ...
>
> and then spark application main class fails with class not found exception.
> Is there any workaround?
>



-- 
Best Regards

Jeff Zhang
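Jeff's point (the application jar itself must live on the submitting machine's local filesystem) can be sketched as below; the class name and paths are placeholders:

```shell
# Sketch: application jar on the local filesystem, dependency jars on HDFS.
# In cluster mode the HDFS jars can be localized by YARN, while the app jar
# is uploaded by spark-submit from the local path. Names are illustrative.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MySuperSparkJob \
  --jars hdfs:///user/baba/lib/commons.jar \
  /home/hadoop/my-app.jar
```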