Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2014/06/17 08:28:02 UTC

[jira] [Commented] (SPARK-2121) Not fully cached when there is enough memory

    [ https://issues.apache.org/jira/browse/SPARK-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033481#comment-14033481 ] 

Xiangrui Meng commented on SPARK-2121:
--------------------------------------

Could you explain how the tasks fail? Even if the RDDs are not fully cached, the tasks should not fail.
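For context, a minimal sketch of why partial caching alone should not fail tasks (assuming the Spark 1.0-era RDD API; the object name CacheCheck and the toy data are illustrative): with the default MEMORY_ONLY storage level, partitions that do not fit in memory are dropped and recomputed from lineage on the next access. The fraction actually cached can be read back from the block manager, mirroring the "Storage" tab in the web UI:

    import org.apache.spark.SparkContext
    import org.apache.spark.storage.StorageLevel

    object CacheCheck {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[*]", "cache-check")
        val rdd = sc.parallelize(1 to 1000000, 8)

        // Default MEMORY_ONLY: partitions that do not fit in memory are
        // dropped and recomputed from lineage when accessed again, so a
        // partially cached RDD should degrade performance, not fail tasks.
        rdd.persist(StorageLevel.MEMORY_ONLY)
        rdd.count() // materialize the cache

        // Report how many partitions were actually cached (this mirrors
        // what the "Storage" tab of the web UI displays).
        sc.getRDDStorageInfo.find(_.id == rdd.id).foreach { info =>
          println(s"cached ${info.numCachedPartitions}/${info.numPartitions} partitions")
        }

        sc.stop()
      }
    }

If partitions must not be dropped at all, MEMORY_AND_DISK spills them to disk instead of evicting them.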

> Not fully cached when there is enough memory
> --------------------------------------------
>
>                 Key: SPARK-2121
>                 URL: https://issues.apache.org/jira/browse/SPARK-2121
>             Project: Spark
>          Issue Type: Bug
>          Components: Block Manager, MLlib, Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Shuo Xiang
>
> While factorizing a large matrix with the latest Alternating Least Squares (ALS) in MLlib, the Spark UI shows that Spark fails to cache all the partitions of some RDDs even though memory is sufficient. Please see [this post](http://apache-spark-user-list.1001560.n3.nabble.com/Not-fully-cached-when-there-is-enough-memory-tt7429.html) for screenshots. This may cause subsequent job failures while executing `userOut.count()` or `productsOut.count()`.
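
For anyone trying to reproduce the reported scenario, a rough sketch under stated assumptions: it uses MLlib 1.0's ALS.train API; the input path, rating format, and ALS parameters are hypothetical placeholders; and since userOut/productsOut are RDDs internal to ALS itself, the sketch persists and counts the resulting user/product factor RDDs instead:

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.recommendation.{ALS, Rating}
    import org.apache.spark.storage.StorageLevel

    object AlsCacheRepro {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[*]", "als-cache-repro")

        // Hypothetical input: "user,product,rating" lines.
        val ratings = sc.textFile("hdfs:///path/to/ratings.csv").map { line =>
          val Array(user, product, rating) = line.split(',')
          Rating(user.toInt, product.toInt, rating.toDouble)
        }

        val model = ALS.train(ratings, 10, 10, 0.01) // rank, iterations, lambda

        // Persist the factor RDDs so the counts below reuse cached blocks;
        // MEMORY_AND_DISK spills partitions that do not fit in memory
        // instead of dropping them.
        model.userFeatures.persist(StorageLevel.MEMORY_AND_DISK)
        model.productFeatures.persist(StorageLevel.MEMORY_AND_DISK)

        println(model.userFeatures.count())
        println(model.productFeatures.count())
        sc.stop()
      }
    }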



--
This message was sent by Atlassian JIRA
(v6.2#6252)