Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2014/11/12 03:25:33 UTC

[jira] [Commented] (SPARK-4351) Record cacheable RDD reads and display RDD miss rates

    [ https://issues.apache.org/jira/browse/SPARK-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207566#comment-14207566 ] 

Apache Spark commented on SPARK-4351:
-------------------------------------

User 'woggle' has created a pull request for this issue:
https://github.com/apache/spark/pull/3218

> Record cacheable RDD reads and display RDD miss rates
> -----------------------------------------------------
>
>                 Key: SPARK-4351
>                 URL: https://issues.apache.org/jira/browse/SPARK-4351
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Charles Reiss
>            Priority: Minor
>
> Currently, when Spark fails to keep an RDD cached, the user has little visibility into the failure (beyond its performance effects), especially if the user is not reading executor logs. We could expose this information in the Web UI and the event log, as we do for RDD storage information, by reporting RDD reads and their results with task metrics.
> From this, live computation of RDD miss rates is straightforward, and information in the event log would enable more complicated post-hoc analyses.
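
As a rough illustration of the miss-rate computation described in the issue, here is a minimal Python sketch. It assumes each cacheable RDD read is reported as a small record that says which RDD was read and whether the read was served from cache. The `RddRead` record and the `miss_rates` helper are hypothetical names invented for this example; they are not part of Spark's API or of the linked pull request.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RddRead:
    """Hypothetical record of one cacheable RDD partition read (not a Spark API)."""
    rdd_id: int
    partition: int
    hit: bool  # True if served from cache, False if the partition was recomputed


def miss_rates(reads: Iterable[RddRead]) -> Dict[int, float]:
    """Return, for each RDD id, the fraction of recorded reads that missed the cache."""
    totals: Dict[int, int] = defaultdict(int)
    misses: Dict[int, int] = defaultdict(int)
    for read in reads:
        totals[read.rdd_id] += 1
        if not read.hit:
            misses[read.rdd_id] += 1
    # Every key in totals has at least one read, so the division is safe.
    return {rdd_id: misses[rdd_id] / totals[rdd_id] for rdd_id in totals}


if __name__ == "__main__":
    # Made-up sample data: RDD 1 misses 1 of 4 reads, RDD 2 misses both reads.
    sample = [
        RddRead(rdd_id=1, partition=0, hit=True),
        RddRead(rdd_id=1, partition=1, hit=True),
        RddRead(rdd_id=1, partition=2, hit=False),
        RddRead(rdd_id=1, partition=3, hit=True),
        RddRead(rdd_id=2, partition=0, hit=False),
        RddRead(rdd_id=2, partition=1, hit=False),
    ]
    for rdd_id, rate in sorted(miss_rates(sample).items()):
        print(f"RDD {rdd_id}: miss rate {rate:.2f}")
```

The same aggregation could in principle run live over incoming task metrics or after the fact over records replayed from an event log; the record layout above is only an assumption made for the sketch.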



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
