Posted to user@spark.apache.org by Serega Sheypak <se...@gmail.com> on 2016/05/17 12:33:21 UTC
Why can't Spark 1.6.0 use jar files stored on HDFS?
hi, I'm trying to:
1. upload my app jar files to HDFS
2. run spark-submit with:
2.1. --master yarn --deploy-mode cluster
or
2.2. --master yarn --deploy-mode client
specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
When the Spark job is submitted, the SparkSubmit client outputs:
Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
...
and then the Spark application's main class fails with a ClassNotFoundException.
Is there any workaround?
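One workaround discussed below is to keep the jars in HDFS but copy them to local disk just before submitting, since the client-mode driver can only load jars from the local filesystem. A minimal sketch, assuming hypothetical directory and jar names (the `hdfs dfs -get` lines are shown as comments so the sketch stays self-contained):

```shell
# Hedged sketch of a workaround: fetch the dependency jars out of HDFS
# into a local directory, then hand local paths to --jars.
LIB_DIR="$(mktemp -d)"

# In a real run this step would be, for example:
#   hdfs dfs -get hdfs:///my/home/commons.jar "$LIB_DIR/"
#   hdfs dfs -get hdfs:///my/home/super.jar   "$LIB_DIR/"
# Here we create empty placeholders so the sketch runs anywhere.
touch "$LIB_DIR/commons.jar" "$LIB_DIR/super.jar"

# Build the comma-separated list that --jars expects.
JARS="$(ls "$LIB_DIR"/*.jar | paste -sd, -)"
echo "$JARS"

# The actual submission would then use only local paths:
#   spark-submit --master yarn --deploy-mode client \
#     --jars "$JARS" my-app.jar
```

This trades the distributed-cache behaviour for correctness: the jars are fetched once per submitting host instead of being resolved from HDFS by the driver.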
Re: Why can't Spark 1.6.0 use jar files stored on HDFS?
Posted by Serega Sheypak <se...@gmail.com>.
Hi, I know about that approach.
I don't want to run a mess of classes from a single jar; I want to use the
distributed cache and ship the application jar and its dependency jars
explicitly.
--deploy-mode client unfortunately copies and distributes all the jars
again for every Spark job started from the driver class...
2016-05-17 15:41 GMT+02:00 <sp...@yahoo.com>:
> Hi Serega,
>
> Create a jar including all the dependencies and execute it through a
> shell script like the one below:
>
> # paths below are examples: the spark-submit binary, your main class,
> # and your application jar
> /usr/local/spark/bin/spark-submit \
>   --class classname \
>   --master yarn \
>   --deploy-mode cluster \
>   /home/hadoop/SparkSampleProgram.jar
>
> Thanks
> Raj
>
>
>
>
>
> On Tuesday, May 17, 2016 6:03 PM, Serega Sheypak <se...@gmail.com>
> wrote:
>
>
> hi, I'm trying to:
> 1. upload my app jar files to HDFS
> 2. run spark-submit with:
> 2.1. --master yarn --deploy-mode cluster
> or
> 2.2. --master yarn --deploy-mode client
>
> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>
> When spark job is submitted, SparkSubmit client outputs:
> Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
> ...
>
> and then spark application main class fails with class not found exception.
> Is there any workaround?
>
>
>
Re: Why can't Spark 1.6.0 use jar files stored on HDFS?
Posted by sp...@yahoo.com.INVALID.
Hi Serega,
Create a jar including all the dependencies and execute it through a shell
script like the one below:

# paths below are examples: the spark-submit binary, your main class,
# and your application jar
/usr/local/spark/bin/spark-submit \
  --class classname \
  --master yarn \
  --deploy-mode cluster \
  /home/hadoop/SparkSampleProgram.jar
Thanks
Raj
On Tuesday, May 17, 2016 6:03 PM, Serega Sheypak <se...@gmail.com> wrote:
hi, I'm trying to:
1. upload my app jar files to HDFS
2. run spark-submit with:
2.1. --master yarn --deploy-mode cluster
or
2.2. --master yarn --deploy-mode client
specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
When the Spark job is submitted, the SparkSubmit client outputs:
Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
...
and then the Spark application's main class fails with a ClassNotFoundException.
Is there any workaround?
Re: Why can't Spark 1.6.0 use jar files stored on HDFS?
Posted by Serega Sheypak <se...@gmail.com>.
spark-submit \
  --conf "spark.driver.userClassPathFirst=true" \
  --class com.MyClass \
  --master yarn \
  --deploy-mode client \
  --jars hdfs:///my-lib.jar,hdfs:///my-second-lib.jar \
  jar-with-com-MyClass.jar job_params
2016-05-17 15:41 GMT+02:00 Serega Sheypak <se...@gmail.com>:
> https://issues.apache.org/jira/browse/SPARK-10643
>
> Looks like it's the reason...
>
> 2016-05-17 15:31 GMT+02:00 Serega Sheypak <se...@gmail.com>:
>
>> No, and it looks like a problem.
>>
>> 2.2. --master yarn --deploy-mode client
>> means:
>> 1. submit spark as yarn app, but spark-driver is started on local
>> machine.
>> 2. I upload all dependent jars to HDFS and specify the jar HDFS paths in
>> the --jars arg.
>> 3. The driver runs my Spark application main class named "MySuperSparkJob",
>> and MySuperSparkJob fails because it doesn't get the jars; they are all in
>> HDFS and not accessible from the local machine...
>>
>>
>> 2016-05-17 15:18 GMT+02:00 Jeff Zhang <zj...@gmail.com>:
>>
>>> Do you put your app jar on HDFS? The app jar must be on your local
>>> machine.
>>>
>>> On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <
>>> serega.sheypak@gmail.com> wrote:
>>>
>>>> hi, I'm trying to:
>>>> 1. upload my app jar files to HDFS
>>>> 2. run spark-submit with:
>>>> 2.1. --master yarn --deploy-mode cluster
>>>> or
>>>> 2.2. --master yarn --deploy-mode client
>>>>
>>>> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>>>>
>>>> When spark job is submitted, SparkSubmit client outputs:
>>>> Warning: Skip remote jar
>>>> hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar ...
>>>>
>>>> and then spark application main class fails with class not found
>>>> exception.
>>>> Is there any workaround?
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>>
>
Re: Why can't Spark 1.6.0 use jar files stored on HDFS?
Posted by Serega Sheypak <se...@gmail.com>.
https://issues.apache.org/jira/browse/SPARK-10643
Looks like it's the reason...
2016-05-17 15:31 GMT+02:00 Serega Sheypak <se...@gmail.com>:
> No, and it looks like a problem.
>
> 2.2. --master yarn --deploy-mode client
> means:
> 1. submit spark as yarn app, but spark-driver is started on local machine.
> 2. I upload all dependent jars to HDFS and specify the jar HDFS paths in
> the --jars arg.
> 3. The driver runs my Spark application main class named "MySuperSparkJob",
> and MySuperSparkJob fails because it doesn't get the jars; they are all in
> HDFS and not accessible from the local machine...
>
>
> 2016-05-17 15:18 GMT+02:00 Jeff Zhang <zj...@gmail.com>:
>
>> Do you put your app jar on HDFS? The app jar must be on your local
>> machine.
>>
>> On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <serega.sheypak@gmail.com
>> > wrote:
>>
>>> hi, I'm trying to:
>>> 1. upload my app jar files to HDFS
>>> 2. run spark-submit with:
>>> 2.1. --master yarn --deploy-mode cluster
>>> or
>>> 2.2. --master yarn --deploy-mode client
>>>
>>> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>>>
>>> When spark job is submitted, SparkSubmit client outputs:
>>> Warning: Skip remote jar
>>> hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar ...
>>>
>>> and then spark application main class fails with class not found
>>> exception.
>>> Is there any workaround?
>>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
Re: Why can't Spark 1.6.0 use jar files stored on HDFS?
Posted by Serega Sheypak <se...@gmail.com>.
No, and it looks like a problem.
2.2. --master yarn --deploy-mode client
means:
1. submit Spark as a YARN app, but the spark-driver is started on the local machine.
2. I upload all dependent jars to HDFS and specify the jar HDFS paths in the
--jars arg.
3. The driver runs my Spark application main class named "MySuperSparkJob",
and MySuperSparkJob fails because it doesn't get the jars; they are all in
HDFS and not accessible from the local machine...
2016-05-17 15:18 GMT+02:00 Jeff Zhang <zj...@gmail.com>:
> Do you put your app jar on HDFS? The app jar must be on your local
> machine.
>
> On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <se...@gmail.com>
> wrote:
>
>> hi, I'm trying to:
>> 1. upload my app jar files to HDFS
>> 2. run spark-submit with:
>> 2.1. --master yarn --deploy-mode cluster
>> or
>> 2.2. --master yarn --deploy-mode client
>>
>> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>>
>> When spark job is submitted, SparkSubmit client outputs:
>> Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
>> ...
>>
>> and then spark application main class fails with class not found
>> exception.
>> Is there any workaround?
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
Re: Why can't Spark 1.6.0 use jar files stored on HDFS?
Posted by Jeff Zhang <zj...@gmail.com>.
Do you put your app jar on HDFS? The app jar must be on your local
machine.
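The "Skip remote jar" warning quoted in this thread reflects that rule: in client mode spark-submit only puts jars with local paths onto the driver classpath and skips anything with a remote scheme, so the driver later fails with ClassNotFoundException. A small illustrative sketch of that classification (my own rendering of the behaviour, not Spark's actual code; the HDFS path is the one from this thread):

```shell
# Illustrative sketch: classify each --jars entry the way client mode
# effectively treats it. Remote schemes are skipped from the driver
# classpath (producing "Warning: Skip remote jar ..."); local paths are kept.
classify() {
  case "$1" in
    hdfs://*|s3://*|http://*|https://*) echo "skip remote jar $1" ;;
    *) echo "add local jar $1" ;;
  esac
}

classify hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
classify /home/hadoop/SparkSampleProgram.jar
```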
On Tue, May 17, 2016 at 8:33 PM, Serega Sheypak <se...@gmail.com>
wrote:
> hi, I'm trying to:
> 1. upload my app jar files to HDFS
> 2. run spark-submit with:
> 2.1. --master yarn --deploy-mode cluster
> or
> 2.2. --master yarn --deploy-mode client
>
> specifying --jars hdfs:///my/home/commons.jar,hdfs:///my/home/super.jar
>
> When spark job is submitted, SparkSubmit client outputs:
> Warning: Skip remote jar hdfs:///user/baba/lib/akka-slf4j_2.11-2.3.11.jar
> ...
>
> and then spark application main class fails with class not found exception.
> Is there any workaround?
>
--
Best Regards
Jeff Zhang