Posted to reviews@spark.apache.org by rxin <gi...@git.apache.org> on 2014/07/20 09:43:09 UTC

[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/1500

    [SPARK-2598] RangePartitioner's binary search does not use the given Ordering

    We should fix this in branch-1.0 as well.
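
For context, here is a minimal sketch of a binary search over range bounds that takes every comparison from the supplied Ordering, so keys that are not Comparable still work. It is not the code from this patch: `OrderingBinarySearch`, `partitionFor`, and the local `Row` type are hypothetical names chosen for illustration.

```scala
object OrderingBinarySearch {
  // Hypothetical key type: deliberately does not implement Comparable.
  final case class Row(value: Int)

  /** Index of the first bound that is >= key under `ord`, or bounds.length if none.
    * `bounds` is assumed to be sorted ascending according to the same `ord`. */
  def partitionFor[K](bounds: IndexedSeq[K], key: K)(implicit ord: Ordering[K]): Int = {
    var lo = 0
    var hi = bounds.length
    while (lo < hi) {
      val mid = lo + (hi - lo) / 2
      // Every comparison goes through the given Ordering, never through Comparable.
      if (ord.lt(bounds(mid), key)) lo = mid + 1 else hi = mid
    }
    lo
  }

  def main(args: Array[String]): Unit = {
    implicit val rowOrdering: Ordering[Row] = Ordering.by((r: Row) => r.value)
    val bounds = Vector(Row(10), Row(20), Row(30))
    println(partitionFor(bounds, Row(5)))   // 0
    println(partitionFor(bounds, Row(15)))  // 1
    println(partitionFor(bounds, Row(31)))  // 3
  }
}
```

A search that falls back to natural ordering cannot be applied to a key type like `Row` at all, which is the situation the test in this pull request exercises.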


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark rangePartitioner

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1500.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1500
    
----
commit c0a94f50cc9a422d1c513b808dada397cf3e7fcd
Author: Reynold Xin <rx...@apache.org>
Date:   2014-07-20T07:42:19Z

    [SPARK-2598] RangePartitioner's binary search does not use the given Ordering.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49565601
  
    @rxin does this need to go into branch-0.9 as well? We should make another maintenance release for it if this issue is there.



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49539744
  
    QA tests have started for PR 1500. This patch merges cleanly.
    View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16866/consoleFull



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49541588
  
    QA results for PR 1500:
    - This patch PASSES unit tests.
    - This patch merges cleanly.
    - This patch adds no public classes.

    For more information see test output:
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16866/consoleFull



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49569130
  
    Ah cool, that makes sense



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49554380
  
    Merged in master & branch-1.0.



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/1500



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by mateiz <gi...@git.apache.org>.
Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1500#discussion_r15153273
  
    --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala ---
    @@ -91,6 +91,17 @@ class PartitioningSuite extends FunSuite with SharedSparkContext with PrivateMet
         }
       }
     
    +  test("RangePartitioner for keys that are not Comparable (but with Ordering)") {
    +    // Row does not extend Comparable, but has an implicit Ordering defined.
    +    implicit object RowOrdering extends Ordering[Row] {
    +      override def compare(x: Row, y: Row) = x.value - y.value
    +    }
    +
    +    val rdd = sc.parallelize(1 to 4500).map(x => (Row(x), Row(x)))
    +    val partitioner = new RangePartitioner(1500, rdd)
    +    partitioner.getPartition(Row(100))
    --- End diff --
    
    Shouldn't we also add a test where we do sortByKey and then collect and make sure stuff is in order?
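
A rough sketch of what such an end-to-end check might look like, written as a standalone program rather than as a case in PartitioningSuite. The object name `SortByKeyOrderingCheck`, the local `Row` case class, and the local-mode master setting are assumptions made for illustration; the program needs Spark on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits on Spark 1.x; harmless on later versions

// Hypothetical key type: not Comparable, ordered only through an implicit Ordering.
case class Row(value: Int)

object SortByKeyOrderingCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("sortByKey-ordering-check")
    val sc = new SparkContext(conf)
    try {
      implicit object RowOrdering extends Ordering[Row] {
        // Integer.compare avoids the overflow that plain subtraction can hit on extreme values.
        override def compare(x: Row, y: Row): Int = Integer.compare(x.value, y.value)
      }

      val rdd = sc.parallelize(1 to 4500).map(x => (Row(x), Row(x)))
      // sortByKey range-partitions by key, so collect() should return keys in global order.
      val keys = rdd.sortByKey().collect().map(_._1.value)
      assert(keys.sameElements(1 to 4500), "keys came back out of order")
    } finally {
      sc.stop()
    }
  }
}
```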



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49565928
  
    0.9.x doesn't have this problem because there was no binary search.



[GitHub] spark pull request: [SPARK-2598] RangePartitioner's binary search ...

Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:

    https://github.com/apache/spark/pull/1500#issuecomment-49560613
  
    LGTM, FWIW

