Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2014/06/19 05:03:26 UTC
[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036891#comment-14036891 ]
Xiangrui Meng commented on SPARK-2138:
--------------------------------------
Backend doesn't read spark.akka.frameSize.
> The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger
> -----------------------------------------------------------------------------------------------
>
> Key: SPARK-2138
> URL: https://issues.apache.org/jira/browse/SPARK-2138
> Project: Spark
> Issue Type: Bug
> Components: MLlib
> Affects Versions: 0.9.0, 0.9.1
> Reporter: DjvuLee
> Assignee: Xiangrui Meng
>
> When the algorithm reaches a certain stage and runs the reduceByKey() function, it can lead to executor loss and task loss; after this happens several times, the application exits.
> When this error occurs, the serialized task size is bigger than 10 MB, and it grows larger as the iterations increase.
> the data generation file: https://gist.github.com/djvulee/7e3b2c9eb33ff0037622
> the running code: https://gist.github.com/djvulee/6bf00e60885215e3bfd5
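The growth described in the report can be sketched outside Spark. The following is an illustration only (plain Python, no Spark APIs, and not the MLlib implementation): if each task captures the full, accumulating set of cluster centers, the serialized payload shipped with every task grows iteration by iteration.

```python
import pickle

def nearest_center(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])))

sizes = []
centers = []
for iteration in range(3):
    # Each iteration accumulates more captured state (e.g. centers kept
    # for multiple runs), so the pickled payload grows every iteration,
    # mirroring the ever-larger serialized tasks reported above.
    centers.extend([[float(iteration)] * 10 for _ in range(1000)])
    sizes.append(len(pickle.dumps(centers)))

print(sizes)  # strictly increasing serialized sizes
```

In Spark, large read-only state such as cluster centers is typically shared with tasks via broadcast variables rather than closure capture, which keeps the per-task serialized size roughly constant across iterations.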
--
This message was sent by Atlassian JIRA
(v6.2#6252)