Posted to reviews@spark.apache.org by JerryLead <gi...@git.apache.org> on 2014/12/02 07:31:22 UTC
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
GitHub user JerryLead opened a pull request:
https://github.com/apache/spark/pull/3549
[SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten the lineage
The related JIRA is https://issues.apache.org/jira/browse/SPARK-4672
Iterative GraphX applications typically accumulate a long lineage, and calling checkpoint() on EdgeRDD and VertexRDD themselves cannot shorten it. In contrast, if we perform checkpoint() on their PartitionsRDD, the long lineage can be cut off. Moreover, existing operations such as cache() in this code are already performed on the PartitionsRDD, so checkpoint() should work the same way. More details and explanation can be found in the JIRA.
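To illustrate the reasoning above, here is a minimal toy model in plain Scala. It does not use Spark: ToyRDD, ToyWrapper, step and run are hypothetical names invented for this sketch, and the model assumes (based on the description above) that each iteration derives its new data from the wrapper's PartitionsRDD rather than from the wrapper itself. It also ignores that a real checkpoint is only materialized after an action runs.

```scala
object LineageSketch {
  // Toy stand-in for an RDD: it remembers only its parent (NOT Spark's RDD).
  class ToyRDD(private var parent: Option[ToyRDD]) {
    // Checkpointing is modeled as simply forgetting the parent.
    def checkpoint(): Unit = { parent = None }
    // Number of RDDs that would have to be walked to recompute this one.
    def depth: Int = 1 + parent.map(_.depth).getOrElse(0)
  }

  // Toy stand-in for VertexRDD/EdgeRDD: a thin wrapper around partitionsRDD.
  class ToyWrapper(val partitionsRDD: ToyRDD, forward: Boolean)
      extends ToyRDD(Some(partitionsRDD)) {
    // forward = true models the behaviour proposed in this pull request;
    // forward = false models checkpointing the wrapper itself.
    override def checkpoint(): Unit =
      if (forward) partitionsRDD.checkpoint() else super.checkpoint()

    // Assumption: the next iteration is built from partitionsRDD,
    // bypassing the wrapper, so the wrapper is not on the lineage path.
    def step(): ToyWrapper =
      new ToyWrapper(new ToyRDD(Some(partitionsRDD)), forward)
  }

  // Runs `iterations` steps, checkpointing after each one, and returns
  // the lineage depth of the final partitionsRDD.
  def run(iterations: Int, forward: Boolean): Int = {
    var w = new ToyWrapper(new ToyRDD(None), forward)
    for (_ <- 1 to iterations) {
      w = w.step()
      w.checkpoint()
    }
    w.partitionsRDD.depth
  }
}
```

In this model, LineageSketch.run(10, forward = false) gives a depth of 11, because checkpointing the wrapper never touches the chain of inner RDDs, while LineageSketch.run(10, forward = true) gives 1, because each inner RDD drops its parent as soon as it is checkpointed.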
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JerryLead/spark my_graphX_checkpoint
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3549.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3549
----
commit 52799e3ea2b22f4bcaec3d9cd4c8891e212be09e
Author: Lijie Xu <cs...@gmail.com>
Date: 2014-12-01T08:54:37Z
Merge pull request #1 from apache/master
update
commit c0169da181660281b3bd82678ae89a73f5926370
Author: JerryLead <je...@163.com>
Date: 2014-12-02T03:19:31Z
Merge branch 'master' of https://github.com/apache/spark
update to the latest version
commit ff08ed4a963127119d335a67d7977eaab0e4e437
Author: JerryLead <je...@163.com>
Date: 2014-12-02T04:42:43Z
Merge branch 'master' of https://github.com/apache/spark
commit d1aa8d88fd9af0d78066c9023ec7b30cd8341a3b
Author: JerryLead <je...@163.com>
Date: 2014-12-02T06:18:14Z
Perform checkpoint() on PartitionsRDD not VertexRDD and EdgeRDD themselves
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by jason-dai <gi...@git.apache.org>.
Github user jason-dai commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65343799
Maybe we can try something like:
class ZippedPartitionsRDD2 (sc, f, …) {
  val cleanF(part1, part2, ctx) = sc.clean(f(rdd1.iterator(part1, ctx), rdd2.iterator(part2, ctx)))
  override def compute(s: Partition, context: TaskContext): Iterator[V] = {
    …
    cleanF(partitions(0), partitions(1), context)
  }
  …
}
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65304304
[Test build #24056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24056/consoleFull) for PR 3549 at commit [`d1aa8d8`](https://github.com/apache/spark/commit/d1aa8d88fd9af0d78066c9023ec7b30cd8341a3b).
* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65316895
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24056/
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by ankurdave <gi...@git.apache.org>.
Github user ankurdave commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65303586
ok to test
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/3549
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65188935
Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65316886
[Test build #24056 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24056/consoleFull) for PR 3549 at commit [`d1aa8d8`](https://github.com/apache/spark/commit/d1aa8d88fd9af0d78066c9023ec7b30cd8341a3b).
* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4672][GraphX]Perform checkpoint() on Pa...
Posted by ankurdave <gi...@git.apache.org>.
Github user ankurdave commented on the pull request:
https://github.com/apache/spark/pull/3549#issuecomment-65336297
Thanks, merged into master and branch-1.2.