You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by ueshin <gi...@git.apache.org> on 2018/11/29 07:11:48 UTC

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

GitHub user ueshin opened a pull request:

    https://github.com/apache/spark/pull/23176

    [SPARK-26211][SQL] Fix InSet for binary, and struct and array with null.

    ## What changes were proposed in this pull request?
    
    Currently `InSet` doesn't work properly for binary type, or struct and array type with null value in the set.
    Because, as for binary type, the `HashSet` doesn't work properly for `Array[Byte]`, and as for struct and array type with null value in the set, the `ordering` will throw a `NPE`.
    
    ## How was this patch tested?
    
    Added a few tests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-26211/inset

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23176.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23176
    
----
commit 277c48f63b9d36028b5dfbb3c850418713197cc4
Author: Takuya UESHIN <ue...@...>
Date:   2018-11-29T06:47:29Z

    Fix InSet for binary, and struct and array with null.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    **[Test build #99435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99435/testReport)** for PR 23176 at commit [`277c48f`](https://github.com/apache/spark/commit/277c48f63b9d36028b5dfbb3c850418713197cc4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    cc @cloud-fan @gatorsmile 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237397442
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -367,11 +367,29 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       }
     
       @transient lazy val set: Set[Any] = child.dataType match {
    -    case _: AtomicType => hset
    +    case t: AtomicType if !t.isInstanceOf[BinaryType] => hset
         case _: NullType => hset
         case _ =>
    +      val ord = TypeUtils.getInterpretedOrdering(child.dataType)
    +      val ordering = if (hasNull) {
    +        new Ordering[Any] {
    +          override def compare(x: Any, y: Any): Int = {
    --- End diff --
    
    InSet overrides nullSafeEval, and for codegen we look into `set` only if `!ev.isNull`, so I think we only need to consider the case one side is null.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5511/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/23176


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237382990
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -367,11 +367,29 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       }
     
       @transient lazy val set: Set[Any] = child.dataType match {
    -    case _: AtomicType => hset
    +    case t: AtomicType if !t.isInstanceOf[BinaryType] => hset
         case _: NullType => hset
         case _ =>
    +      val ord = TypeUtils.getInterpretedOrdering(child.dataType)
    +      val ordering = if (hasNull) {
    +        new Ordering[Any] {
    +          override def compare(x: Any, y: Any): Int = {
    +            if (x == null && y == null) {
    +              0
    +            } else if (x == null) {
    +              -1
    +            } else if (y == null) {
    +              1
    +            } else {
    +              ord.compare(x, y)
    +            }
    +          }
    +        }
    +      } else {
    +        ord
    +      }
           // for structs use interpreted ordering to be able to compare UnsafeRows with non-UnsafeRows
    -      TreeSet.empty(TypeUtils.getInterpretedOrdering(child.dataType)) ++ hset
    +      TreeSet.empty(ordering) ++ hset
    --- End diff --
    
    and udpate eval to
    ```
    if (value == null) {
      null
    } else if (set.contains(value)) {
      true
    } else if (hasNull) {
      null
    } else {
      false
    }
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237382322
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -367,11 +367,29 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       }
     
       @transient lazy val set: Set[Any] = child.dataType match {
    -    case _: AtomicType => hset
    +    case t: AtomicType if !t.isInstanceOf[BinaryType] => hset
         case _: NullType => hset
         case _ =>
    +      val ord = TypeUtils.getInterpretedOrdering(child.dataType)
    +      val ordering = if (hasNull) {
    +        new Ordering[Any] {
    +          override def compare(x: Any, y: Any): Int = {
    +            if (x == null && y == null) {
    +              0
    +            } else if (x == null) {
    +              -1
    +            } else if (y == null) {
    +              1
    +            } else {
    +              ord.compare(x, y)
    +            }
    +          }
    +        }
    +      } else {
    +        ord
    +      }
           // for structs use interpreted ordering to be able to compare UnsafeRows with non-UnsafeRows
    -      TreeSet.empty(TypeUtils.getInterpretedOrdering(child.dataType)) ++ hset
    +      TreeSet.empty(ordering) ++ hset
    --- End diff --
    
    shall we just filter out nulls when building the tree set?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    **[Test build #99443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99443/testReport)** for PR 23176 at commit [`5de541b`](https://github.com/apache/spark/commit/5de541bdfd950d9b05ab61d9079a74592c141055).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237771176
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala ---
    @@ -293,6 +293,54 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper {
         }
       }
     
    +  test("INSET: binary") {
    --- End diff --
    
    Sure, I'll do it later. Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237383211
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala ---
    @@ -293,6 +293,54 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper {
         }
       }
     
    +  test("INSET: binary") {
    +    val hS = HashSet[Any]() + Array(1.toByte, 2.toByte) + Array(3.toByte)
    +    val nS = HashSet[Any]() + Array(1.toByte, 2.toByte) + Array(3.toByte) + null
    +    val onetwo = Literal(Array(1.toByte, 2.toByte))
    +    val three = Literal(Array(3.toByte))
    +    val threefour = Literal(Array(3.toByte, 4.toByte))
    +    val nl = Literal(null, onetwo.dataType)
    +    checkEvaluation(InSet(onetwo, hS), true)
    +    checkEvaluation(InSet(three, hS), true)
    +    checkEvaluation(InSet(three, nS), true)
    --- End diff --
    
    this line is duplicated


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5507/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237826533
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala ---
    @@ -293,6 +293,54 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper {
         }
       }
     
    +  test("INSET: binary") {
    --- End diff --
    
    Submitted #23187.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99435/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    LGTM, thanks for the fix!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237399670
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -367,11 +367,29 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       }
     
       @transient lazy val set: Set[Any] = child.dataType match {
    -    case _: AtomicType => hset
    +    case t: AtomicType if !t.isInstanceOf[BinaryType] => hset
         case _: NullType => hset
         case _ =>
    +      val ord = TypeUtils.getInterpretedOrdering(child.dataType)
    +      val ordering = if (hasNull) {
    +        new Ordering[Any] {
    +          override def compare(x: Any, y: Any): Int = {
    --- End diff --
    
    Thanks! yeah, I'm updating as @cloud-fan's idea.
    Also we can use `nullSafeCodeGen` for codegen path, I'll update it as well.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237398085
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -367,11 +367,29 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       }
     
       @transient lazy val set: Set[Any] = child.dataType match {
    -    case _: AtomicType => hset
    +    case t: AtomicType if !t.isInstanceOf[BinaryType] => hset
         case _: NullType => hset
         case _ =>
    +      val ord = TypeUtils.getInterpretedOrdering(child.dataType)
    +      val ordering = if (hasNull) {
    +        new Ordering[Any] {
    +          override def compare(x: Any, y: Any): Int = {
    --- End diff --
    
    Or simply filter out null from the tree set as @cloud-fan's idea.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    **[Test build #99443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99443/testReport)** for PR 23176 at commit [`5de541b`](https://github.com/apache/spark/commit/5de541bdfd950d9b05ab61d9079a74592c141055).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    LGTM, thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    good catch!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99439/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237770687
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala ---
    @@ -293,6 +293,54 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper {
         }
       }
     
    +  test("INSET: binary") {
    --- End diff --
    
    good idea! we should test `In` and `InSet` together


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    thanks, merging to master/2.4/2.3!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99443/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5514/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    **[Test build #99439 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99439/testReport)** for PR 23176 at commit [`277c48f`](https://github.com/apache/spark/commit/277c48f63b9d36028b5dfbb3c850418713197cc4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237400321
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -367,11 +367,29 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       }
     
       @transient lazy val set: Set[Any] = child.dataType match {
    -    case _: AtomicType => hset
    +    case t: AtomicType if !t.isInstanceOf[BinaryType] => hset
         case _: NullType => hset
         case _ =>
    +      val ord = TypeUtils.getInterpretedOrdering(child.dataType)
    +      val ordering = if (hasNull) {
    +        new Ordering[Any] {
    +          override def compare(x: Any, y: Any): Int = {
    +            if (x == null && y == null) {
    +              0
    +            } else if (x == null) {
    +              -1
    +            } else if (y == null) {
    +              1
    +            } else {
    +              ord.compare(x, y)
    +            }
    +          }
    +        }
    +      } else {
    +        ord
    +      }
           // for structs use interpreted ordering to be able to compare UnsafeRows with non-UnsafeRows
    -      TreeSet.empty(TypeUtils.getInterpretedOrdering(child.dataType)) ++ hset
    +      TreeSet.empty(ordering) ++ hset
    --- End diff --
    
    Actually we are using `nullSafeEval`, so we don't need to update it.
    Instead, I'm updating to use `nullSafeCodeGen` for codegen path.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23176#discussion_r237766198
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/PredicateSuite.scala ---
    @@ -293,6 +293,54 @@ class PredicateSuite extends SparkFunSuite with ExpressionEvalHelper {
         }
       }
     
    +  test("INSET: binary") {
    --- End diff --
    
    Regarding the semantics, InSet is equal to In. Could we combine the test cases? Test both?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    **[Test build #99435 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99435/testReport)** for PR 23176 at commit [`277c48f`](https://github.com/apache/spark/commit/277c48f63b9d36028b5dfbb3c850418713197cc4).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    **[Test build #99439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99439/testReport)** for PR 23176 at commit [`277c48f`](https://github.com/apache/spark/commit/277c48f63b9d36028b5dfbb3c850418713197cc4).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23176: [SPARK-26211][SQL] Fix InSet for binary, and struct and ...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/23176
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org