Posted to dev@spark.apache.org by mengxr <gi...@git.apache.org> on 2014/02/11 00:29:22 UTC
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
GitHub user mengxr opened a pull request:
https://github.com/apache/incubator-spark/pull/578
Adding assignRanks and assignUniqueIds to RDD
Assigning ranks to an ordered or unordered data set is a common operation. This could be done by first counting the records in each partition and then assigning ranks in parallel.
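The count-then-assign approach can be sketched as follows. This is only an illustration, not the code in this pull request: partitions are modelled as plain Python lists instead of an RDD, the function name assign_ranks is hypothetical, and ranks are assumed to start at 0.

```python
from itertools import accumulate


def assign_ranks(partitions):
    """Pair every record with its global rank (hypothetical helper, 0-based ranks assumed)."""
    # Pass 1: count the records in each partition.
    counts = [len(part) for part in partitions]
    # The first rank in partition i is the total size of partitions 0..i-1.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: with its offset known, each partition can be ranked
    # independently of the others (this is the step that runs in parallel).
    return [
        [(item, start + i) for i, item in enumerate(part)]
        for part, start in zip(partitions, offsets)
    ]
```

In a distributed setting the counting pass would be a job of its own, which is the cost the next paragraph refers to.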
The purpose of assigning ranks to an unordered set is usually to get a unique id for each item, e.g., to map feature names to feature indices. In such cases, the assignment could be done without counting records, saving one Spark job.
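One way to hand out unique ids without a counting pass is sketched below. The scheme is an assumption for illustration (the description above does not spell one out): with n partitions, the k-th item of partition i gets the id k * n + i. Partitions are again plain Python lists and the name assign_unique_ids is hypothetical.

```python
def assign_unique_ids(partitions):
    """Pair every record with a unique id without counting (hypothetical helper)."""
    n = len(partitions)
    # Assumed scheme: item k of partition i gets id k * n + i.
    # Partition i only ever produces ids congruent to i modulo n, so ids
    # from different partitions cannot collide, and no partition needs to
    # know how many records the others hold.
    return [
        [(item, k * n + i) for k, item in enumerate(part)]
        for i, part in enumerate(partitions)
    ]
```

Under this scheme the ids are unique but not necessarily consecutive: gaps appear whenever partitions differ in size.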
https://spark-project.atlassian.net/browse/SPARK-1076
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-spark rank
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-spark/pull/578.patch
----
commit 21b434b77f1a7ffd75ba2d1ad4ab2296f1914971
Author: Xiangrui Meng <me...@databricks.com>
Date: 2014-02-10T23:18:41Z
add assignRanks and assignUniqueIds to RDD
commit 630868c88f14ea955991acfd3d68caa8be6dedec
Author: Xiangrui Meng <me...@databricks.com>
Date: 2014-02-10T23:20:21Z
newline
----
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34848915
Merged build finished.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34842336
Merged build triggered.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34847422
Merged build started.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34846847
One or more automated tests failed
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12688/
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34710311
All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12672/
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34846846
Merged build finished.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34844284
BTW - the new dev rules require us to create a JIRA for this, and update the PR description to include a link to the JIRA.
[GitHub] incubator-spark pull request: Adding zipWithIndex and zipWithUniqu...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34845562
Merged build triggered.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34847266
Jenkins, retest this please.
[GitHub] incubator-spark pull request: Adding zipWithIndex and zipWithUniqu...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34845563
Merged build started.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by mengxr <gi...@git.apache.org>.
Github user mengxr closed the pull request at:
https://github.com/apache/incubator-spark/pull/578
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34705055
Merged build triggered.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34842337
Merged build started.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34710309
Merged build finished.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34846154
lgtm. will merge once jenkins is happy
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34705056
Merged build started.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34847421
Merged build triggered.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34843553
All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12685/
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34842407
@rxin Thanks! Please see the updated code.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34845415
The link is at the bottom of the PR description.
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34848917
All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12689/
[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34848995
Thanks. Merging this.
We might need to add Java / Python APIs too ... but that can be done in a later PR.
[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/incubator-spark/pull/578#issuecomment-34843552
Merged build finished.