Posted to user@spark.apache.org by Dan Dong <do...@gmail.com> on 2015/07/22 23:25:57 UTC

spark-submit and spark-shell behavior mismatch.

Hi,

  I have a simple test Spark program, shown below. The strange thing is that
it runs fine under spark-shell, but under spark-submit it fails at runtime
with

java.lang.NoSuchMethodError:

pointing to the line

val maps2=maps.collect.toMap

as the problem. But why does it compile with no problem and run fine under
spark-shell (==> maps2: scala.collection.immutable.Map[Int,String] =
Map(269953 -> once, 97 -> a, 451002 -> upon, 117481 -> was, 226916 ->
there, 414413 -> time, 146327 -> king))? Thanks!

// Only HashingTF is actually used below; the duplicate and unused imports
// are dropped. `sc` is the SparkContext that spark-shell provides.
import org.apache.spark.mllib.feature.HashingTF

// Two toy "documents", each an array of terms.
val docs = sc.parallelize(Array(
  Array("once", "upon", "a", "time"),
  Array("there", "was", "a", "king")))

val hashingTF = new HashingTF()

// Pair each term with its hashed feature index.
val maps = docs.flatMap(doc => doc.map(term => (hashingTF.indexOf(term), term)))

// Collect to the driver and build an immutable Map -- this is the line
// that fails under spark-submit.
val maps2 = maps.collect.toMap
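
For spark-submit I wrap the same logic in a standalone object, roughly like
the sketch below (the object name and app name here are illustrative, and
the jar must be built against the same Scala and Spark versions the cluster
runs):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.feature.HashingTF

// Illustrative standalone version of the shell snippet above.
object HashingTFTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HashingTFTest")
    val sc = new SparkContext(conf)

    val docs = sc.parallelize(Array(
      Array("once", "upon", "a", "time"),
      Array("there", "was", "a", "king")))

    val hashingTF = new HashingTF()
    val maps = docs.flatMap(doc => doc.map(term => (hashingTF.indexOf(term), term)))

    // Same driver-side call that triggers the NoSuchMethodError when submitted.
    val maps2 = maps.collect.toMap
    println(maps2)

    sc.stop()
  }
}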


Cheers,

Dan

Re: spark-submit and spark-shell behavior mismatch.

Posted by Yana Kadiyska <ya...@gmail.com>.
That is pretty odd -- toMap not being there would come from Scala... but
what is even weirder is that toMap is executed on the driver machine, which
is the same machine whether you run spark-shell or spark-submit... does it
work if you run with --master local[*]?

Also, you can try putting a set -x in bin/spark-class right before the
RUNNER gets invoked -- this will show you the exact java command being run,
classpath and all.
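
If you'd rather check from inside the job, something like this sketch
(using the standard java.class.path JVM property) prints the classpath the
driver and an executor JVM actually see, so you can diff the two runs:

// Classpath on the driver.
println("driver   : " + System.getProperty("java.class.path"))

// Classpath inside one executor JVM.
sc.parallelize(Seq(1), 1)
  .map(_ => System.getProperty("java.class.path"))
  .collect()
  .foreach(cp => println("executor : " + cp))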


Re: spark-submit and spark-shell behavior mismatch.

Posted by Dan Dong <do...@gmail.com>.
The problem seems to be "toMap", as I tested that "val maps2=maps.collect"
runs OK. When I run spark-shell, I pass the same "--master
mesos://cluster-1:5050" parameter that I use with "spark-submit".
Confused here.
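
As a possible workaround sketch (untested on my side): since maps is an RDD
of (Int, String) pairs, collectAsMap() from PairRDDFunctions builds the map
inside Spark itself instead of going through the Scala-library toMap on the
driver:

// Returns scala.collection.Map[Int,String] rather than an immutable.Map.
val maps2 = maps.collectAsMap()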




Re: spark-submit and spark-shell behavior mismatch.

Posted by Yana Kadiyska <ya...@gmail.com>.
Is it complaining about "collect" or "toMap"? In either case this error is
usually indicative of an old version -- any chance you somehow have an old
installation of Spark? Or of Scala? You can try running spark-submit with
--verbose. Also, when you say it runs with spark-shell, do you run
spark-shell in local mode or with --master? I'd try with --master <whatever
master you use for spark-submit>.

Also, if you're using standalone mode, I believe the worker log contains the
launch command for the executor -- you probably want to examine that
classpath carefully.
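
To make the version check concrete, a quick sketch like the one below (run
under both spark-shell and spark-submit, with sc created as in your
submitted app) prints the Spark version plus the Scala version the driver
and an executor actually see:

// Versions visible on the driver.
println("spark         : " + sc.version)
println("driver scala  : " + scala.util.Properties.versionString)

// Scala version inside one executor JVM.
val executorScala = sc.parallelize(Seq(1), 1)
  .map(_ => scala.util.Properties.versionString)
  .collect()
  .head
println("executor scala: " + executorScala)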
