You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2016/08/02 19:05:08 UTC

[GitHub] spark pull request #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeR...

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/14464

    [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

    ## What changes were proposed in this pull request?
    
    This patch fix the overflow in LongToUnsafeRowMap when the range of key is very wide (the key is much much smaller then minKey, for example, key is Long.MinValue, minKey is > 0).
    
    ## How was this patch tested?
    
    Added regression test (also for SPARK-16740)
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark fix_overflow

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14464.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14464
    
----
commit 97027f086f1e7a977832a1acfb9ebe29250a28ea
Author: Davies Liu <da...@databricks.com>
Date:   2016-08-02T19:02:13Z

    fix overflow in LongToUnsafeRowMap

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    **[Test build #63191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63191/consoleFull)** for PR 14464 at commit [`24169ac`](https://github.com/apache/spark/commit/24169acaeef7e53a1ed09bd4fd972c2a0b0f5866).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63133/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    **[Test build #63191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63191/consoleFull)** for PR 14464 at commit [`24169ac`](https://github.com/apache/spark/commit/24169acaeef7e53a1ed09bd4fd972c2a0b0f5866).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeR...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14464#discussion_r73224507
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala ---
    @@ -459,8 +459,8 @@ private[execution] final class LongToUnsafeRowMap(val mm: TaskMemoryManager, cap
        */
       def getValue(key: Long, resultRow: UnsafeRow): UnsafeRow = {
         if (isDense) {
    -      val idx = (key - minKey).toInt
    -      if (idx >= 0 && key <= maxKey && array(idx) > 0) {
    +      val idx = (key - minKey).toInt  // could overflow
    +      if (key >= minKey && key <= maxKey && array(idx) > 0) {
    --- End diff --
    
    Yeah I see where this is going but I think this doesn't totally eliminate the problem. `key - minKey` could still overflow such that the `int` is positive and even `>= minKey`. It seems like we need to test the keys against each other as longs, and only then covert to an `int` to index into the array?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeR...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14464#discussion_r73231143
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala ---
    @@ -459,8 +459,8 @@ private[execution] final class LongToUnsafeRowMap(val mm: TaskMemoryManager, cap
        */
       def getValue(key: Long, resultRow: UnsafeRow): UnsafeRow = {
         if (isDense) {
    -      val idx = (key - minKey).toInt
    -      if (idx >= 0 && key <= maxKey && array(idx) > 0) {
    +      val idx = (key - minKey).toInt  // could overflow
    +      if (key >= minKey && key <= maxKey && array(idx) > 0) {
    --- End diff --
    
    I think having both key >= minKey and key <= maxKey could make sure that there is no overflow, then we can safely use `(key - minKey).toInt`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    **[Test build #63133 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63133/consoleFull)** for PR 14464 at commit [`97027f0`](https://github.com/apache/spark/commit/97027f086f1e7a977832a1acfb9ebe29250a28ea).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63191/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeR...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14464#discussion_r73254410
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala ---
    @@ -459,8 +459,8 @@ private[execution] final class LongToUnsafeRowMap(val mm: TaskMemoryManager, cap
        */
       def getValue(key: Long, resultRow: UnsafeRow): UnsafeRow = {
         if (isDense) {
    -      val idx = (key - minKey).toInt
    -      if (idx >= 0 && key <= maxKey && array(idx) > 0) {
    +      val idx = (key - minKey).toInt  // could overflow
    +      if (key >= minKey && key <= maxKey && array(idx) > 0) {
    --- End diff --
    
    Ah yeah OK this should be OK. I might suggest the following as a little simpler, but whatever:
    
    ```
    if (key >= minKey && key <= maxKey) {
      val value = array((key - minKey).toInt)
      if (value > 0) {
        return getRow(value, resultRow)
      }
    }
    ```
    ?
      


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14464
  
    **[Test build #63133 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63133/consoleFull)** for PR 14464 at commit [`97027f0`](https://github.com/apache/spark/commit/97027f086f1e7a977832a1acfb9ebe29250a28ea).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14464: [SPARK-16802] [SQL] fix overflow in LongToUnsafeR...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14464


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org