Posted to user@spark.apache.org by lminer <lm...@hotmail.com> on 2016/11/18 19:31:28 UTC

Run spark with hadoop snapshot

I'm trying to figure out how to run Spark with a snapshot of Hadoop 2.8 that
I built myself. I'm unclear on the configuration needed to get Spark to work
with the snapshot.

I'm running Spark on Mesos. Per the Spark documentation, I run spark-submit
as follows using the `spark-2.0.2-bin-without-hadoop` distribution, but Spark
doesn't appear to be finding Hadoop 2.8.

    export SPARK_DIST_CLASSPATH=$(/path/to/hadoop2.8/bin/hadoop classpath)
    spark-submit --verbose --master mesos://$MASTER_HOST/mesos
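
(Editor's note, not from the thread: for the "without-hadoop" builds, `SPARK_DIST_CLASSPATH` has to be visible to every Spark launch script, not just the shell where you exported it. The usual place to set it is `conf/spark-env.sh`. A minimal sketch, assuming a hypothetical install path:

    # conf/spark-env.sh -- sourced by Spark's launch scripts on this machine.
    # /opt/hadoop-2.8.0-SNAPSHOT is a hypothetical path; point it at your own build.
    export SPARK_DIST_CLASSPATH=$(/opt/hadoop-2.8.0-SNAPSHOT/bin/hadoop classpath)

On Mesos, the same setting must be in place on the agent hosts, or the executors will hit the same missing-class error even when the driver starts.)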

I get the error:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
        at org.apache.spark.deploy.SparkSubmitArguments.handle(SparkSubmitArguments.scala:403)
        at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:163)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:98)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:117)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 5 more
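
(Editor's note, not from the thread: the error comes from the spark-submit launcher JVM itself, before anything reaches Mesos, which suggests `SPARK_DIST_CLASSPATH` was not visible when the launcher built its classpath. Conceptually, `bin/spark-class` in a "without-hadoop" build just appends the variable to the launcher classpath. A simplified simulation of that merge, with invented paths:

```shell
# Simulated output of `hadoop classpath`: colon-separated entries, possibly wildcards.
HADOOP_CP='/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/*'
export SPARK_DIST_CLASSPATH="$HADOOP_CP"

# bin/spark-class effectively does this before launching the JVM:
LAUNCH_CLASSPATH="/opt/spark/jars/*:${SPARK_DIST_CLASSPATH}"
echo "$LAUNCH_CLASSPATH"
```

If the variable is empty in the environment spark-submit actually runs in, the Hadoop jars never make it onto that classpath and `FSDataInputStream` cannot be found.)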

Any ideas on the proper configuration?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Run-spark-with-hadoop-snapshot-tp28105.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Run spark with hadoop snapshot

Posted by Luke Miner <lm...@gmail.com>.
Thanks! Should I do it from the Spark build environment?

On Sat, Nov 19, 2016 at 4:48 AM, Steve Loughran <st...@hortonworks.com>
wrote:

> I'd recommend you build a full Spark release with the new Hadoop version;
> you should have built that locally earlier the same day (so that Ivy/Maven
> picks up the snapshot).
>
> dev/make-distribution.sh -Pyarn,hadoop-2.7,hive -Dhadoop.version=2.9.0-SNAPSHOT
>

Re: Run spark with hadoop snapshot

Posted by Steve Loughran <st...@hortonworks.com>.
I'd recommend you build a full Spark release with the new Hadoop version; you should have built that locally earlier the same day (so that Ivy/Maven picks up the snapshot).


    dev/make-distribution.sh -Pyarn,hadoop-2.7,hive -Dhadoop.version=2.9.0-SNAPSHOT
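
(Editor's note, not from the thread: "built that locally" means installing the Hadoop snapshot artifacts into the local Maven repository first, so Spark's build can resolve the SNAPSHOT version. A sketch of the full sequence; the source directories are placeholders, and the version is switched to 2.8.0-SNAPSHOT to match the poster's build rather than Steve's 2.9.0 example:

    # Install the Hadoop snapshot jars into ~/.m2 so Spark's build resolves them
    cd ~/src/hadoop
    mvn install -DskipTests

    # Then build a full Spark distribution against that snapshot version
    cd ~/src/spark
    ./dev/make-distribution.sh --tgz -Pyarn,hadoop-2.7,hive -Dhadoop.version=2.8.0-SNAPSHOT

The resulting tarball bundles the snapshot Hadoop jars, so no `SPARK_DIST_CLASSPATH` juggling is needed at submit time.)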


