Posted to user@spark.apache.org by Michal Haris <mi...@visualdna.com> on 2015/07/21 16:04:20 UTC

1.4.0 classpath issue with spark-submit

I have a Spark program that uses DataFrames to query Hive. I run it both
interactively in spark-shell for exploration and through a runner class that
executes some tasks with spark-submit. I used to run against 1.4.0-SNAPSHOT.
Since then 1.4.0 and 1.4.1 have been released, so I tried to switch to the
official release. Now the program works when I run it in the shell, but
when I run it with spark-submit it fails with this error:

Exception in thread "main" java.lang.ClassNotFoundException:
java.lang.NoClassDefFoundError:
org/apache/hadoop/hive/ql/session/SessionState when creating Hive client
using classpath: file:/home/mharis/dxp-spark.jar
Please make sure that jars for your version of hive and hadoop are included
in the paths passed to spark.sql.hive.metastore.jars.

What is suspicious is the 'using classpath: ...' part, where the only jar
listed is my program, i.e. the paths passed with the --driver-class-path
option are missing. When I switch back to the older 1.4.0-SNAPSHOT on the
driver, everything works. I observe the issue with 1.4.1.

Are there any known changes to how spark-submit handles configuration that
I have missed?

-- 
Michal Haris
Technical Architect
direct line: +44 (0) 207 749 0229
www.visualdna.com | t: +44 (0) 207 734 7033
31 Old Nichol Street
London
E2 7HR

Re: 1.4.0 classpath issue with spark-submit

Posted by Michal Haris <mi...@visualdna.com>.
The thing is that the class it is complaining about is part of the Spark
assembly jar, not my extra jar. The assembly jar was compiled with -Phive,
which is confirmed by the fact that the same SPARK_HOME works when run as a
shell.
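
One way to narrow this down is to check, from inside the driver JVM, whether
the class is visible at all, and to run the same check under both spark-shell
and spark-submit. The following is only a rough sketch: ClassVisibilityCheck
and isVisible are hypothetical helper names, not part of Spark, and the check
only covers the loaders of the JVM it runs in, which may differ from the
loader the Hive client is created with.

```java
public class ClassVisibilityCheck {

    /**
     * Returns true if the given loader can locate the named class.
     * The class is not initialized, so no static initializers run.
     */
    public static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, as seen in the report above.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the error message; another name can be passed as an argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hive.ql.session.SessionState";

        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader system = ClassLoader.getSystemClassLoader();

        // Print the JVM classpath so the two launch modes can be compared side by side.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println(name + " visible via context loader: " + isVisible(name, context));
        System.out.println(name + " visible via system loader:  " + isVisible(name, system));
    }
}
```

If the two launch modes print different results for the same SPARK_HOME, that
would point at how the classpath is assembled rather than at the assembly jar
itself.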

On 23 July 2015 at 17:33, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> You can try adding that jar to SPARK_CLASSPATH (it's deprecated, though) in
> the spark-env.sh file.
>
> Thanks
> Best Regards



Re: 1.4.0 classpath issue with spark-submit

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
You can try adding that jar to SPARK_CLASSPATH (it's deprecated, though) in
the spark-env.sh file.

Thanks
Best Regards
