Posted to dev@spark.apache.org by Debasish Das <de...@gmail.com> on 2014/11/05 01:42:04 UTC

Issues with AbstractParams

Hi,

I built master today and was testing IR statistics on the MovieLens
dataset (I will open a PR shortly)...

Right now in master, examples.MovieLensALS declares case class Params
extends AbstractParams[Params].
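
For reference, the declaration in question looks roughly like this (a
trimmed sketch covering only the options that show up in this thread; the
real example defines more fields):

case class Params(
    input: String = null,
    kryo: Boolean = false,
    lambda: Double = 1.0,
    implicitPrefs: Boolean = false) extends AbstractParams[Params]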

On my local standalone Spark cluster, the following run fails:

./bin/spark-submit \
  --master spark://tusca09lmlvt00c.uswin.ad.vzwcorp.com:7077 \
  --jars /Users/v606014/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  ./examples/target/spark-examples_2.10-1.2.0-SNAPSHOT.jar \
  --kryo --lambda 0.065 hdfs://localhost:8020/sandbox/movielens/

2014-11-04 16:00:18.691 java[1811:1903] Unable to load realm mapping info from SCDynamicStore

14/11/04 16:00:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

14/11/04 16:00:21 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, tusca09lmlvt00c.uswin.ad.vzwcorp.com): java.io.InvalidClassException: org.apache.spark.examples.mllib.MovieLensALS$Params; no valid constructor
        java.io.ObjectStreamClass$ExceptionInfo.newInvalidClassException(ObjectStreamClass.java:150)
        java.io.ObjectStreamClass.checkDeserialize(ObjectStreamClass.java:768)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1772)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
        org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
        org.apache.spark.scheduler.Task.run(Task.scala:56)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)

If I remove AbstractParams from examples.MovieLensALS and recompile, the
code runs fine:

./bin/spark-submit \
  --master spark://tusca09lmlvt00c.uswin.ad.vzwcorp.com:7077 \
  --jars /Users/v606014/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar \
  --total-executor-cores 4 --executor-memory 4g --driver-memory 1g \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  ./examples/target/spark-examples_2.10-1.2.0-SNAPSHOT.jar \
  --kryo --lambda 0.065 hdfs://localhost:8020/sandbox/movielens/

2014-11-04 16:26:25.359 java[2892:1903] Unable to load realm mapping info from SCDynamicStore

14/11/04 16:26:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Got 1000209 ratings from 6040 users on 3706 movies.

Training: 800650, test: 199559.

Test RMSE = 0.8525220763317215.

14/11/04 16:26:41 ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(tusca09lmlvt00c.uswin.ad.vzwcorp.com,50749) not found

14/11/04 16:26:42 WARN ConnectionManager: All connections not cleaned up

Is this a known issue for the MLlib examples? If there is no open JIRA for
this, I can open one...

Thanks.
Deb

Re: Issues with AbstractParams

Posted by Joseph Bradley <jo...@databricks.com>.
Hi Deb,
Thanks for pointing it out!  I don't know of a JIRA for it now, so it would
be great if you could open one.  I'm looking into the bug...
Joseph


Re: Issues with AbstractParams

Posted by Joseph Bradley <jo...@databricks.com>.
I'm making a JIRA and will then do a quick PR for it.  (Thanks both for
pointing out the bug & fix!)
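
For context, one way such a fix can look (this is only a hedged sketch,
not necessarily what the PR will do): the "no valid constructor" check
applies to the nearest non-serializable superclass, so letting the abstract
base class itself extend Serializable sidesteps it.

import scala.reflect.ClassTag

// Hypothetical sketch: if the abstract base class is itself Serializable,
// Java deserialization no longer needs a no-arg constructor from it, because
// the nearest non-serializable superclass becomes java.lang.Object.
abstract class AbstractParams[T: ClassTag] extends Serializable {
  // The real class builds a pretty toString from the case class fields;
  // that body is elided here.
}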


Re: Issues with AbstractParams

Posted by Sean Owen <so...@cloudera.com>.
I don't think it's anything to do with AbstractParams. The problem is
MovieLensALS$Params, which is a case class without a default (no-arg)
constructor, so Java serialization cannot deserialize it.
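
For what it's worth, the "no valid constructor" message comes from a Java
serialization rule: the nearest non-serializable superclass of a
Serializable class must have a no-arg constructor that deserialization can
call. A minimal, self-contained sketch of that failure (names hypothetical,
nothing Spark-specific):

import java.io._

// Not Serializable and no zero-arg constructor: the shape that triggers
// "no valid constructor" in any Serializable subclass.
class Base(val label: String)

// Case classes are Serializable, so writing succeeds...
case class Child(x: Int) extends Base("base")

object NoValidConstructorRepro {
  def main(args: Array[String]): Unit = {
    val buf = new ByteArrayOutputStream()
    new ObjectOutputStream(buf).writeObject(Child(1))       // write succeeds
    new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
      .readObject()  // throws java.io.InvalidClassException: no valid constructor
  }
}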

However, you can see it gets used in an RDD function:

val ratings = sc.textFile(params.input).map { line =>
  val fields = line.split("::")
  if (params.implicitPrefs) {

It is just a matter of rejiggering the code to not pass params into the
closure. Have at it; I'm happy to do it too.
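
A minimal sketch of that rejiggering, assuming the surrounding example code
(the Rating construction is elided): copy the fields the closure needs into
local vals, so the task closure captures a Boolean rather than the whole
Params instance.

// Hoist the only Params field used inside the closure; params.input is
// evaluated on the driver and is not captured by the task closure.
val implicitPrefs = params.implicitPrefs

val ratings = sc.textFile(params.input).map { line =>
  val fields = line.split("::")
  if (implicitPrefs) {
    // ... build the implicit-feedback Rating from fields ...
  } else {
    // ... build the explicit Rating from fields ...
  }
}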

