Posted to issues@spark.apache.org by "Velu nambi (JIRA)" <ji...@apache.org> on 2015/08/27 19:26:46 UTC

[jira] [Created] (SPARK-10319) ALS training using PySpark throws a StackOverflowError

Velu nambi created SPARK-10319:
----------------------------------

             Summary: ALS training using PySpark throws a StackOverflowError
                 Key: SPARK-10319
                 URL: https://issues.apache.org/jira/browse/SPARK-10319
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.4.1
         Environment: Windows 10, Spark 1.4.1
            Reporter: Velu nambi


When attempting to train a machine learning model using ALS in Spark's MLlib (1.4) on Windows, PySpark always terminates with a StackOverflowError. I tried adding checkpointing as described in http://stackoverflow.com/a/31484461/36130, but it does not seem to help.

Here's the training code and stack trace:

{code:none}
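import itertools

from pyspark.mllib.recommendation import ALS

# training, validation, test, numValidation, numTest and computeRmse
# are defined earlier in the script (not shown here).

# Hyperparameter grid: 2 x 2 x 2 = 8 combinations.
ranks = [8, 12]
lambdas = [0.1, 10.0]
numIters = [10, 20]

# Best model found so far and the parameters that produced it.
bestModel = None
bestValidationRmse = float("inf")
bestRank = 0
bestLambda = -1.0
bestNumIter = -1

for rank, lmbda, numIter in itertools.product(ranks, lambdas, numIters):
    # Checkpoint attempt based on the linked Stack Overflow answer.
    ALS.checkpointInterval = 2
    model = ALS.train(training, rank, numIter, lmbda)
    validationRmse = computeRmse(model, validation, numValidation)

    # Keep the model with the lowest validation RMSE.
    if validationRmse < bestValidationRmse:
        bestModel = model
        bestValidationRmse = validationRmse
        bestRank = rank
        bestLambda = lmbda
        bestNumIter = numIter

# Evaluate the selected model on the held-out test set.
testRmse = computeRmse(bestModel, test, numTest)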
{code}
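
For readers who want to follow the control flow of the loop above without a Spark installation, here is a minimal sketch of the same grid search in plain Python. It assumes a hypothetical stand-in function, fake_rmse, in place of ALS.train and computeRmse. Its formula is arbitrary and is not meant to reflect real ALS behaviour. The sketch only shows how many training runs the grid produces and how the best combination is tracked.

```python
import itertools

ranks = [8, 12]
lambdas = [0.1, 10.0]
numIters = [10, 20]


def fake_rmse(rank, lmbda, numIter):
    # Hypothetical stand-in for "train with ALS, then compute validation RMSE".
    # The formula is arbitrary and only makes the selection deterministic.
    return abs(rank - 12) * 0.01 + lmbda * 0.1 + 1.0 / numIter


# Every (rank, lambda, numIter) combination corresponds to one training run.
grid = list(itertools.product(ranks, lambdas, numIters))

best = None
bestValidationRmse = float("inf")
for rank, lmbda, numIter in grid:
    rmse = fake_rmse(rank, lmbda, numIter)
    # Same rule as the original loop: keep the lowest score seen so far.
    if rmse < bestValidationRmse:
        best = (rank, lmbda, numIter)
        bestValidationRmse = rmse

print("training runs:", len(grid))
print("best (rank, lambda, numIter):", best)
```

With the grid from the report this gives eight training runs, half of them with 20 iterations.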

Stack trace:

15/08/27 02:02:58 ERROR Executor: Exception in task 3.0 in stage 56.0 (TID 127)
java.lang.StackOverflowError
    at java.io.ObjectInputStream$BlockDataInputStream.readInt(Unknown Source)
    at java.io.ObjectInputStream.readHandle(Unknown Source)
    at java.io.ObjectInputStream.readClassDesc(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)


