Posted to dev@spark.apache.org by mengxr <gi...@git.apache.org> on 2014/02/11 00:29:22 UTC

[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

GitHub user mengxr opened a pull request:

    https://github.com/apache/incubator-spark/pull/578

    Adding assignRanks and assignUniqueIds to RDD

    Assigning ranks to an ordered or unordered data set is a common operation. This can be done by first counting the records in each partition and then assigning ranks in parallel.
    
    The purpose of assigning ranks to an unordered set is usually to get a unique id for each item, e.g., to map feature names to feature indices. In such cases, the assignment can be done without counting records, which saves one Spark job.
    
    https://spark-project.atlassian.net/browse/SPARK-1076
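
A minimal sketch of the two approaches described above, in plain Python with lists of lists standing in for RDD partitions. This is not the code in the pull request: the function names `assign_ranks` and `assign_unique_ids` are hypothetical, and the interleaved id scheme (item i of partition k gets id i * n + k, for n partitions) is one assumed way to get unique ids without counting.

```python
from itertools import accumulate


def assign_ranks(partitions):
    """Assign consecutive ranks; needs a counting pass before the numbering pass."""
    # Pass 1 (would be a separate job on a cluster): count records per partition.
    counts = [len(part) for part in partitions]
    # Starting rank of each partition is the total size of the partitions before it.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: each partition numbers its own items independently of the others.
    return [
        [(item, offset + i) for i, item in enumerate(part)]
        for part, offset in zip(partitions, offsets)
    ]


def assign_unique_ids(partitions):
    """Assign unique (not consecutive) ids using only the partition index (assumed scheme)."""
    n = len(partitions)
    # Item i of partition k gets id i * n + k, so no record counts are needed.
    return [
        [(item, i * n + k) for i, item in enumerate(part)]
        for k, part in enumerate(partitions)
    ]


# Hypothetical data: three partitions of unequal size.
partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
ranked = assign_ranks(partitions)
ids = assign_unique_ids(partitions)
```

In this sketch the ranks are gap-free and follow partition order, while the unique ids may contain gaps when partitions differ in size. That is the trade-off for skipping the counting pass.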

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-spark rank

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-spark/pull/578.patch

----
commit 21b434b77f1a7ffd75ba2d1ad4ab2296f1914971
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-02-10T23:18:41Z

    add assignRanks and assignUniqueIds to RDD

commit 630868c88f14ea955991acfd3d68caa8be6dedec
Author: Xiangrui Meng <me...@databricks.com>
Date:   2014-02-10T23:20:21Z

    newline

----


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34848915
  
    Merged build finished.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34842336
  
     Merged build triggered.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34847422
  
    Merged build started.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34846847
  
    One or more automated tests failed
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12688/


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34710311
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12672/


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34846846
  
    Merged build finished.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34844284
  
    BTW - the new dev rules require us to create a JIRA for this, and update the PR description to include a link to the JIRA.


[GitHub] incubator-spark pull request: Adding zipWithIndex and zipWithUniqu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34845562
  
     Merged build triggered.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34847266
  
    Jenkins, retest this please.


[GitHub] incubator-spark pull request: Adding zipWithIndex and zipWithUniqu...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34845563
  
    Merged build started.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr closed the pull request at:

    https://github.com/apache/incubator-spark/pull/578


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34705055
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34842337
  
    Merged build started.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34710309
  
    Merged build finished.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34846154
  
    LGTM. Will merge once Jenkins is happy.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34705056
  
    Merged build started.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34847421
  
     Merged build triggered.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34843553
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12685/


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34842407
  
    @rxin Thanks! Please see the updated code.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by mengxr <gi...@git.apache.org>.
Github user mengxr commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34845415
  
    The link is at the bottom of the PR description.


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34848917
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12689/


[GitHub] incubator-spark pull request: SPARK-1076: zipWithIndex and zipWith...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34848995
  
    Thanks. Merging this. 
    
    We might need to add Java / Python APIs too ... but that can be done in a later PR.


[GitHub] incubator-spark pull request: Adding assignRanks and assignUniqueI...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/incubator-spark/pull/578#issuecomment-34843552
  
    Merged build finished.