Posted to reviews@spark.apache.org by sarutak <gi...@git.apache.org> on 2014/09/16 17:04:56 UTC
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
GitHub user sarutak opened a pull request:
https://github.com/apache/spark/pull/2411
[SPARK-3548] [WebUI] Display cache hit ratio on WebUI
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sarutak/spark cache-hit-ratio-feature
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2411.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2411
----
commit 1d1b18d80bd0ff2f2546821ad637eb07c3df59f2
Author: Kousuke Saruta <sa...@oss.nttdata.co.jp>
Date: 2014-09-16T11:44:04Z
Added Cache Hit Count and Cache Miss Count metrics
commit 678c676004f180f4087807cac6a472338d796b3e
Author: Kousuke Saruta <sa...@oss.nttdata.co.jp>
Date: 2014-09-16T13:09:07Z
Modified StagePage.scala
commit 05724f84b5927e589a76a4f3cc7e3b8161996512
Author: Kousuke Saruta <sa...@oss.nttdata.co.jp>
Date: 2014-09-16T15:03:40Z
Modified ExecutorTable.scala to display cache hit ratio
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55766313
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20390/consoleFull) for PR 2411 at commit [`05724f8`](https://github.com/apache/spark/commit/05724f84b5927e589a76a4f3cc7e3b8161996512).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55757731
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20390/consoleFull) for PR 2411 at commit [`05724f8`](https://github.com/apache/spark/commit/05724f84b5927e589a76a4f3cc7e3b8161996512).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55820856
Hey, so I think there are a few issues with this. Given the semantics of persisting RDDs, I don't think it's really possible to express a "hit ratio" that makes sense. If I cache my RDD with MEMORY_AND_DISK and the data is served from disk, is that considered a cache hit? We don't have a binary system of "cached" versus "not cached", so reducing the result to a single ratio doesn't make much sense.
Another issue is that the ratio has somewhat awkward semantics around pipelining. For instance:
```
val x = rdd1.cache()
x.count()
// This will be at most 33% cache ratio, even if all partitions of x are served from cache
x.filter(...).filter(...).count()
// This will be at most 25% cache ratio, even if all partitions of x are served from cache
x.filter(...).filter(...).filter(...).count()
```
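The 33% and 25% figures above can be reproduced with a small model. This is only a sketch under an assumption the thread does not spell out: that the ratio counts one cache lookup per RDD in the pipelined stage, and that only the persisted RDD can register a hit. `PipelinedHitRatio` and `maxHitRatio` are hypothetical names, not part of Spark or of this patch.

```scala
// Hypothetical model, not Spark code. Assumes one cache lookup per RDD
// in a pipelined stage, where only the persisted RDD can be a hit.
object PipelinedHitRatio {
  // Upper bound on the hit ratio when `pipelinedFilters` uncached filter
  // RDDs are pipelined on top of a single fully cached RDD.
  def maxHitRatio(pipelinedFilters: Int): Double = {
    require(pipelinedFilters >= 0, "pipelinedFilters must be non-negative")
    val lookups = pipelinedFilters + 1 // the cached RDD plus each filter RDD
    1.0 / lookups
  }

  def main(args: Array[String]): Unit = {
    println(f"${maxHitRatio(2) * 100}%.0f%%") // two filters:   33%
    println(f"${maxHitRatio(3) * 100}%.0f%%") // three filters: 25%
  }
}
```

Under this model the bound shrinks with every extra pipelined transformation, even though the cached data is always served from cache.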
So instead of this, I'd propose augmenting the existing InputMetrics with a count of the number of partitions coming from each input source. That way we just give the user all the relevant information. I think we almost have this already; we just need to add a partition counter for each input source.
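As a rough illustration of the per-source counter idea, the sketch below tallies partitions by where they were read from. It is not Spark's InputMetrics and does not reflect the patch: `ReadSource`, its three cases, and `countBySource` are all hypothetical names chosen for this example.

```scala
// Hypothetical sketch, not Spark's InputMetrics. All names are illustrative.
object PartitionSourceCounts {
  // Where a partition's data was obtained from (assumed categories).
  sealed trait ReadSource
  case object Memory extends ReadSource
  case object Disk extends ReadSource
  case object Recomputed extends ReadSource

  // Tally how many partitions were read from each source.
  def countBySource(reads: Seq[ReadSource]): Map[ReadSource, Int] =
    reads.groupBy(identity).map { case (source, rs) => source -> rs.size }

  def main(args: Array[String]): Unit = {
    // One entry per partition read by a stage.
    val reads: Seq[ReadSource] = Seq(Memory, Memory, Disk, Recomputed)
    countBySource(reads).foreach { case (source, n) => println(s"$source: $n") }
  }
}
```

Reporting the raw counts sidesteps the question of whether a disk read counts as a hit, because the reader sees every source separately instead of a single ratio.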
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by sarutak <gi...@git.apache.org>.
Github user sarutak closed the pull request at:
https://github.com/apache/spark/pull/2411
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55773715
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20395/consoleFull) for PR 2411 at commit [`8a2000a`](https://github.com/apache/spark/commit/8a2000a71c3f9c0a0d58f02b035f7470621d1474).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55794612
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20399/consoleFull) for PR 2411 at commit [`8a2000a`](https://github.com/apache/spark/commit/8a2000a71c3f9c0a0d58f02b035f7470621d1474).
* This patch **passes** unit tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-62332025
Let's close this issue for now.
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by sarutak <gi...@git.apache.org>.
Github user sarutak commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55783030
retest this please.
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55781856
[QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20395/consoleFull) for PR 2411 at commit [`8a2000a`](https://github.com/apache/spark/commit/8a2000a71c3f9c0a0d58f02b035f7470621d1474).
* This patch **fails** unit tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class NonASCIICharacterChecker extends ScalariformChecker`
[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/2411#issuecomment-55783640
[QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20399/consoleFull) for PR 2411 at commit [`8a2000a`](https://github.com/apache/spark/commit/8a2000a71c3f9c0a0d58f02b035f7470621d1474).
* This patch merges cleanly.