You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2018/05/15 12:30:53 UTC

[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/21331

    [SPARK-24276][SQL] Order of literals in IN should not affect semantic equality

    ## What changes were proposed in this pull request?
    
    When two `In` operators are created with the same list of values, but different order, we are considering them as semantically different. This is wrong, since they have the same semantic meaning.
    
    The PR adds a canonicalization rule which orders the literals in the `In` operator so the semantic equality works properly.
    
    ## How was this patch tested?
    
    added UT


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-24276

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21331.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21331
    
----
commit 9595237a3178de7fb4c8e280a55072ed050fffc3
Author: Marco Gaido <ma...@...>
Date:   2018-05-15T12:13:39Z

    [SPARK-24276][SQL] Order of literals in IN should not affect semantic equality

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90860 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90860/testReport)** for PR 21331 at commit [`6cd34d9`](https://github.com/apache/spark/commit/6cd34d9322dc8025df3cb0fe5cf596de09ad10eb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91153/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90641/testReport)** for PR 21331 at commit [`9595237`](https://github.com/apache/spark/commit/9595237a3178de7fb4c8e280a55072ed050fffc3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90641/testReport)** for PR 21331 at commit [`9595237`](https://github.com/apache/spark/commit/9595237a3178de7fb4c8e280a55072ed050fffc3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #91309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91309/testReport)** for PR 21331 at commit [`5c88d86`](https://github.com/apache/spark/commit/5c88d869cd4750d9c5e02789a075a29d96357eaa).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #91309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91309/testReport)** for PR 21331 at commit [`5c88d86`](https://github.com/apache/spark/commit/5c88d869cd4750d9c5e02789a075a29d96357eaa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90860/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Thanks! Merged to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189458810
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    +      val literals = list.map(_.asInstanceOf[Literal])
    +      val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType)
    +      In(value, literals.sortBy(_.value)(ordering))
    --- End diff --
    
    actually it works. The problem in your example is that `array(1, 2)` is not a literal, but it is a `CreateArray` expression. So we are not reordering it. Using literal arrays it works as shown in the updated UT.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21331


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189411395
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    --- End diff --
    
    Let me check again. I might forget something here. :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189483055
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    +      val literals = list.map(_.asInstanceOf[Literal])
    +      val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType)
    +      In(value, literals.sortBy(_.value)(ordering))
    --- End diff --
    
    Thanks. BTW, it comes from [your example](https://github.com/apache/spark/pull/21331/files#r189407673). Anyway, my bad.  :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189488666
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    +      val literals = list.map(_.asInstanceOf[Literal])
    +      val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType)
    --- End diff --
    
    For non-ordering type, this will throw match error.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90890/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r190038976
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,10 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    case In(value, list) =>
    +      In(value, list.sortBy(_.semanticHash()))
    --- End diff --
    
    The only difference is that the elements in the list are canonicalized before the hash.  I can't think of any meaningful example. The only one I can think of is something like ` mybool in (5 < 2)` and ` mybool in (not 5 >= 2)`. But in the future we may have more rules here making meaningful using the `semanticHash` which is logically what we want here IMHO


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r190518402
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,10 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    case In(value, list) =>
    +      In(value, list.sortBy(_.semanticHash()))
    --- End diff --
    
    thanks for your great comment. This is very helpful and I'll keep in mind for the future. I think I addressed all the points now. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3391/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189715760
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,10 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    case In(value, list) =>
    +      In(value, list.sortBy(_.semanticHash()))
    --- End diff --
    
    Why using `semanticHash` instead of `hashCode`? Do you have a test case to show the difference?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90860 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90860/testReport)** for PR 21331 at commit [`6cd34d9`](https://github.com/apache/spark/commit/6cd34d9322dc8025df3cb0fe5cf596de09ad10eb).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189411116
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    --- End diff --
    
    Does isSetConvertable accept that? Also, does your example works with the very next line `val literals = list.map(_.asInstanceOf[Literal])`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189413969
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    +      val literals = list.map(_.asInstanceOf[Literal])
    +      val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType)
    +      In(value, literals.sortBy(_.value)(ordering))
    --- End diff --
    
    For complex literals like `array`, this doesn't work. Please add a test case for complex types and handle them.
    ```
    scala> sql("select * from t where array(1,2) in (array(1,2),array(2,1))").queryExecution.logical.canonicalized.semanticHash()
    res4: Int = -1398094385
    
    scala> sql("select * from t where array(1,2) in (array(2,1),array(1,2))").queryExecution.logical.canonicalized.semanticHash()
    res5: Int = -1233982198
    ```


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189413304
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    --- End diff --
    
    Yep. Right. I confused those cases.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    cc @dongjoon-hyun @gatorsmile @viirya 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90995/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90995 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90995/testReport)** for PR 21331 at commit [`0c484f1`](https://github.com/apache/spark/commit/0c484f16a7bcb0f4b476decdf6820c4d89d4cfb0).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90890/testReport)** for PR 21331 at commit [`ccbdd11`](https://github.com/apache/spark/commit/ccbdd11a1f2ff6f08db47694f315109b61c8726e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3476/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3707/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r190062760
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,10 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    case In(value, list) =>
    +      In(value, list.sortBy(_.semanticHash()))
    --- End diff --
    
    A few general suggestions. 
    - We need to add test cases when we want to cover the extra cases. 
    - Update the comments in the description when we add new code
    - Try our best to make the code simple.
    - Find the best test suite when we add a new test. If unable to find a good one, create a new test suite. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189324907
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    --- End diff --
    
    Thank you for pinging me, @mgaido91 . 
    `isSetConvertible` ensures all types are literal. So, `Literal.dataType.asNullable` doesn't do anything for here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r190790891
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +86,9 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    case In(value, list) => In(value, list.sortBy(_.hashCode()))
    --- End diff --
    
    Let us exclude IN subqueries from this case?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189525577
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    --- End diff --
    
    nice idea, thanks. I am doing it. It will allow also to cover more cases than simple literals. Thank you.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189714473
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -17,6 +17,8 @@
     
     package org.apache.spark.sql.catalyst.expressions
     
    +import org.apache.spark.sql.catalyst.util.TypeUtils
    --- End diff --
    
    Useless?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90890 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90890/testReport)** for PR 21331 at commit [`ccbdd11`](https://github.com/apache/spark/commit/ccbdd11a1f2ff6f08db47694f315109b61c8726e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91309/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189525415
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    +      val literals = list.map(_.asInstanceOf[Literal])
    +      val ordering = TypeUtils.getInterpretedOrdering(literals.head.dataType)
    +      In(value, literals.sortBy(_.value)(ordering))
    --- End diff --
    
    you're right, it's my fault. Anyway, with @viirya's suggestion, also this case works properly. I updated the UT too. Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #91099 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91099/testReport)** for PR 21331 at commit [`a0af525`](https://github.com/apache/spark/commit/a0af52524e30a9ace9d9a6239de79a7251a2499c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #90995 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90995/testReport)** for PR 21331 at commit [`0c484f1`](https://github.com/apache/spark/commit/0c484f16a7bcb0f4b476decdf6820c4d89d4cfb0).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #91153 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91153/testReport)** for PR 21331 at commit [`5c88d86`](https://github.com/apache/spark/commit/5c88d869cd4750d9c5e02789a075a29d96357eaa).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3584/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #91099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91099/testReport)** for PR 21331 at commit [`a0af525`](https://github.com/apache/spark/commit/a0af52524e30a9ace9d9a6239de79a7251a2499c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    **[Test build #91153 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91153/testReport)** for PR 21331 at commit [`5c88d86`](https://github.com/apache/spark/commit/5c88d869cd4750d9c5e02789a075a29d96357eaa).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90641/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3232/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189488893
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    --- End diff --
    
    Can't we just reorder elements in list by `hashCode`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3547/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3412/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21331: [SPARK-24276][SQL] Order of literals in IN should...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21331#discussion_r189407673
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Canonicalize.scala ---
    @@ -85,6 +87,14 @@ object Canonicalize {
         case Not(GreaterThanOrEqual(l, r)) => LessThan(l, r)
         case Not(LessThanOrEqual(l, r)) => GreaterThan(l, r)
     
    +    // order the list in the In operator
    +    // we can do this only if all the elements in the list are literals with the same datatype
    +    case i @ In(value, list)
    +        if i.inSetConvertible && list.map(_.dataType.asNullable).distinct.size == 1 =>
    --- End diff --
    
    thanks for your comment @dongjoon-hyun, but I am not sure I agree with you. What if we have something like ` in (array(null, 1), array(1, 2, 3), array(3, 2, 1))`? The first literal would contain an array which can contain nulls while the others would not be, so in this case we would have 2 distinct datatypes (because of nullability).
    Am I missing something? Thanks.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21331: [SPARK-24276][SQL] Order of literals in IN should not af...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21331
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91099/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org