Posted to reviews@spark.apache.org by dilipbiswal <gi...@git.apache.org> on 2015/10/05 20:49:08 UTC

[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

GitHub user dilipbiswal opened a pull request:

    https://github.com/apache/spark/pull/8983

    [SPARK-8654][SQL] Fix Analysis exception when using NULL IN (...)

    In the analysis phase, while processing the rules for the IN predicate, we
    compare the in-list types to the LHS expression type and generate a
    cast operation if necessary. In the case of NULL [NOT] IN (expr1, ...), we end up
    generating a cast from the in-list types to NULL, such as cast(1 as NULL), which
    is not a valid cast.
    
    The fix is to not generate such a cast if the LHS type is a NullType; instead,
    we translate the expression to Literal(null).
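    
    Concretely, the added case in the `InConversion` rule in HiveTypeCoercion.scala
    looks roughly like the following (a minimal sketch of the change; the exact
    diff is quoted in the review comments below):
    
    ```
    import org.apache.spark.sql.catalyst.expressions.{In, Literal}
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.Rule
    import org.apache.spark.sql.types.{BooleanType, NullType}
    
    object InConversion extends Rule[LogicalPlan] {
      def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
        // Skip nodes whose children have not been resolved yet.
        case e if !e.childrenResolved => e
    
        // NULL IN (...) always evaluates to NULL, so replace the predicate with a
        // boolean-typed null literal instead of casting the in-list to NullType.
        case i @ In(a, b) if a.dataType == NullType =>
          Literal.create(null, BooleanType)
      }
    }
    ```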

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dilipbiswal/spark spark_8654

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8983
    
----
commit 38f973bb124c63c1caabe14ee6e5cca7b764b15a
Author: Dilip Biswal <db...@us.ibm.com>
Date:   2015-10-02T23:20:56Z

    [SPARK-8654] Analysis exception when using NULL IN (...) : invalid cast
    
    In the analysis phase, while processing the rules for the IN predicate, we
    compare the in-list types to the LHS expression type and generate a
    cast operation if necessary. In the case of NULL [NOT] IN (expr1, ...), we end up
    generating a cast from the in-list types to NULL, such as cast(1 as NULL), which
    is not a valid cast.
    
    The fix is to not generate such a cast if the LHS type is a NullType; instead,
    we translate the expression to Literal(null).

----




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145977094
  
    yup, another JIRA please.
    
    You need to ask a committer like @marmbrus to review your PR to get it merged.
    
    My final thoughts on this PR:
    When the types conflict, for example `null in (true, array(2,3))`, should we return null or report a type-conflict error during analysis? What is Hive's behavior in this case?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145689983
  
    If you change the `value` type to boolean, then we have to change each element in `list` to boolean type too, which may be dangerous, as a lot of types can't be cast to boolean. See https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146404345
  
    @cloud-fan
    Hi Wenchen, can you please take a look at the changes and let me know what you think?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145639934
  
    Thanks for reviewing the code, Wenchen. I was trying to model the test case on what was put in the JIRA, which used caseInsensitiveAnalyze. I have fixed it now.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146634337
  
    Thanks, merging to master.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146679919
  
    test this please




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8983#discussion_r41212281
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -305,12 +305,17 @@ object HiveTypeCoercion {
     
       /**
        * Convert all expressions in in() list to the left operator type
    +   * except when the left operator type is NullType. In case when left hand
    +   * operator type is NullType create a Literal(Null).
        */
       object InConversion extends Rule[LogicalPlan] {
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
           // Skip nodes who's children have not been resolved yet.
           case e if !e.childrenResolved => e
     
    +      case i @ In(a, b) if (a.dataType == NullType) =>
    +        Literal.create(null, BooleanType)
    --- End diff --
    
    Ah, sorry for my mistake. I thought you were casting `a` to boolean type, but actually you just turn the result into a boolean null.
    
    Can you reference a Hive doc that says something like "if the value is null, the In operation will always return null"? I think we should follow Hive's semantics here.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146688197
  
    Opening a new one is also OK.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145711558
  
    Hi Wenchen,
    
    Here is the link I could find; it's a bit confusing regarding the equality operator.
    
    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-RelationalOperators
    
    In order to test it out, I tried the following queries on Hive:
    
    hive> select * from tnull ;
    OK
    1
    2
    NULL
    
    
    hive> select null = 1 from tnull ;
    OK
    NULL
    NULL
    NULL
    Time taken: 0.118 seconds, Fetched: 3 row(s)
    hive> select null in (1,2) from tnull ;
    OK
    NULL
    NULL
    NULL
    
    Please let me know what you think and thanks again for your help.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145630551
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146631777
  
    I only found one rule in `Optimizer` that optimizes `In` to `InSet`, and only when the values in `list` are all literals.
    I think it's better to add a case in `NullPropagation` that optimizes `In` with a literal null `key` to literal null, even when the `list` is not all literals.
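    
    A minimal sketch of what such a case might look like (the exact match shape and its placement inside `NullPropagation` are assumptions on my part, not existing code):
    
    ```
    // Hypothetical extra case for NullPropagation: if the value being tested is a
    // literal null, the whole In expression is null regardless of what is in the
    // list, so it can be folded even when the list is not all literals.
    case In(Literal(null, _), _) => Literal.create(null, BooleanType)
    ```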
    
    Anyway, that should be another PR; this PR LGTM.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146682761
  
    Just reopen this PR and we will trigger a test on our Jenkins for it.
    
    For local testing, you can do `./build/sbt catalyst/test`




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146047895
  
    @marmbrus
    
    Thanks a lot, Michael, for looking into this. I debugged Hive to understand the
    behaviour and would like to share my findings. I wanted to make sure we are doing
    the right thing here. Here are the comments at the top of GenericUDFIn() in Hive.
    
    /**
     * GenericUDFIn
     *
     * Example usage:
     * SELECT key FROM src WHERE key IN ("238", "1");
     *
     * From MySQL page on IN(): To comply with the SQL standard, IN returns NULL
     * not only if the expression on the left hand side is NULL, but also if no
     * match is found in the list and one of the expressions in the list is NULL.
     *
     * Also noteworthy: type conversion behavior is different from MySQL. With
     * expr IN expr1, expr2... in MySQL, exprN will each be converted into the same
     * type as expr. In the Hive implementation, all expr(N) will be converted into
     * a common type for conversion consistency with other UDF's, and to prevent
     * conversions from a big type to a small type (e.g. int to tinyint)
     */
    
    **(case 1) expr in (expr1, ... exprN)**
    * **1.1** Per the SQL standard, if expr is NULL then IN should return NULL
            (with this PR we are attempting to achieve this).
    * **1.2** If any of the expressions on the right hand side is NULL and
            no match is found in the list, then IN should also return NULL.
            We also enforce these semantics in our
            [implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L124-L144).
    
    **(case 2) Type conversion semantics.**
    * **2.1** In MySQL, all the expressions on the right hand side are converted
            to the left hand side type. Our behaviour matches these semantics. I am
            not sure if this is the standard, though.
    * **2.2** In Hive, they seem to find a common type (probably the wider type) and
            promote both the left hand side and the right hand side to that common type.
            I believe this is where it throws the SemanticException.
    
    Our behaviour seems to match that of MySQL more closely at present. Do we want to change this?
    Also, about case 1: it is not clear from the Hive comments what they intended to
    do vs. what the external behaviour is. Please let me know what you think.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146470802
  
    @cloud-fan 
    Thanks a lot. I have implemented the review comments. Please take a look. I looked at the optimizer code; we already seem to be transforming NULL IN (...) to "Filter null" in the ConstantFolding rule. So we should be okay here, right?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8983#discussion_r41475573
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -305,12 +305,22 @@ object HiveTypeCoercion {
     
       /**
        * Convert all expressions in in() list to the left operator type
    +   * except when the left operator type is NullType. In case when left hand
    +   * operator type is NullType create a Literal(Null).
        */
       object InConversion extends Rule[LogicalPlan] {
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
           // Skip nodes who's children have not been resolved yet.
           case e if !e.childrenResolved => e
     
    +      case i @ In(a, b) if (a.dataType == NullType) =>
    +        var inTypes : Seq[DataType] = Seq.empty
    +        b.foreach(e => inTypes = inTypes ++ Seq(e.dataType))
    +        findWiderCommonType(inTypes) match {
    +          case Some(finalDataType) => Literal.create(null, BooleanType)
    +          case None => i
    +        }
    --- End diff --
    
    Instead of returning literal null, I think we should just widen the types, as that also fixes the bug (we can add a rule in `Optimizer` to return null for this case).
    
    the code can be:
    ```
    case i @ In(a, b) if b.exists(_.dataType != a.dataType) =>
      findWiderCommonType(i.children.map(_.dataType)) match {
        case Some(finalDataType) => i.withNewChildren(i.children.map(Cast(_, finalDataType)))
        case None => i
      }
    ```




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8983#discussion_r41188938
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
    @@ -305,12 +305,17 @@ object HiveTypeCoercion {
     
       /**
        * Convert all expressions in in() list to the left operator type
    +   * except when the left operator type is NullType. In case when left hand
    +   * operator type is NullType create a Literal(Null).
        */
       object InConversion extends Rule[LogicalPlan] {
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
           // Skip nodes who's children have not been resolved yet.
           case e if !e.childrenResolved => e
     
    +      case i @ In(a, b) if (a.dataType == NullType) =>
    +        Literal.create(null, BooleanType)
    --- End diff --
    
    Instead of just casting null to boolean, can we come up with a better idea based on the data types of `b`?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145696762
  
    Thanks, Wenchen. You are right that not all types can be cast to boolean.
    
    However, in this case we are not trying to cast the in-list types to the LHS type (null type in our case), because we know that in this special case the predicate will always evaluate to NULL. That is why we simply transform the IN predicate to a NULL literal and drop the in-list altogether.
    
    == Parsed Logical Plan ==
    'Project [unresolvedalias(*)]
     'Filter NOT null IN (1,2,3,4) => original one
      'UnresolvedRelation [inttab], None
    
    == Analyzed Logical Plan ==
    c1: int
    Project [c1#0]
     Filter NOT null    => rewritten one
      Subquery inttab
       LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    Please let me know what you think. If you have a test case in mind that would exhibit a problem, I would like to try it out.
    
    Thanks a lot for your help.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145712756
  
    Can you try `select null in (1,2,null)`? I want to make sure Hive doesn't execute the `In` operation when `value` is null type.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145717934
  
    I checked our [implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L126-L127); we should return null for this case.
    
    Can you also add a rule in `Optimizer` so that if `value` is a literal null, we just return null?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by rick-ibm <gi...@git.apache.org>.
Github user rick-ibm commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146271123
  
    Hi Michael,
    
    Postgres and Derby raise an error if the expressions in the IN list can't be implicitly cast to a common type. MySQL is more forgiving.
    
    Thanks,
    -Rick
    
    ---------------------------------
    
    MySQL:
    
    mysql> SELECT NULL IN ( 1, 'abc' );
    SELECT NULL IN ( 1, 'abc' );
    +----------------------+
    | NULL IN ( 1, 'abc' ) |
    +----------------------+
    |                 NULL |
    +----------------------+
    1 row in set (0.00 sec)
    
    
    ---------------------------------
    
    Postgres:
    
    mydb=# SELECT NULL IN ( 1, 'abc' );
    SELECT NULL IN ( 1, 'abc' );
    ERROR:  invalid input syntax for integer: "abc"
    LINE 1: SELECT NULL IN ( 1, 'abc' );
    
    
    ---------------------------------
    
    Derby:
    
    ij> VALUES CAST (NULL AS INT) IN (1, 'abc' );
    ERROR 42818: Comparisons between 'INTEGER' and 'CHAR (UCS_BASIC)' are not supported. Types must be comparable. String types must also have matching collation. If collation does not match, a possible solution is to cast operands to force them to the default collation (e.g. SELECT tablename FROM sys.systables WHERE CAST(tablename AS VARCHAR(128)) = 'T1')





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145718166
  
    BTW, you can send another PR to add a rule in `Optimizer` so that if `value` is a literal null, we just return null.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8983#discussion_r41475635
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ---
    @@ -135,4 +135,26 @@ class AnalysisSuite extends AnalysisTest {
         plan = testRelation.select(CreateStructUnsafe(Seq(a, (a + 1).as("a+1"))).as("col"))
         checkAnalysis(plan, plan)
       }
    +
    +  test("SPARK-8654: invalid CAST in NULL IN(...) expression") {
    +    val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
    +      LocalRelation()
    +    )
    +    assertAnalysisSuccess(plan)
    +  }
    +
    +  test("SPARK-8654: different types in inlist but can be converted to a commmon type") {
    +    val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(1.2345))), "a")() :: Nil,
    +      LocalRelation()
    +    )
    +    assertAnalysisSuccess(plan)
    +  }
    +
    +  test("SPARK-8654: check type compatibility error") {
    +    val plan = Project(Alias(In(Literal(null), Seq(Literal(true), Literal(1))), "a")() :: Nil,
    +      LocalRelation()
    +    )
    +    assertAnalysisError(plan, Seq("cannot resolve 'null IN (true,1)' due to data " +
    +                                 "type mismatch: Arguments must be same type"))
    --- End diff --
    
    This can be simplified to `assertAnalysisError(plan, Seq("data type mismatch: Arguments must be same type"))`




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145953466
  
    OK, I was misled by the imperative code style, sorry for that...
    So we will return null if the `key` is null, or if we can't find a value in `list` matching our `key` and there is a null value inside `list`.
    
    This PR makes the `In` operation return null when `key` is NullType, which is right. But I think we can go further by returning null for any type of `key` that is a literal null (probably another PR).




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146077599
  
    If `null in (true, array(2,3))` returns null in Hive, then we should probably first cast `key` and all values in `list` to a common type. If there is no such common type, throw an exception.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145664718
  
    Thanks !! 
    
    Do we need to look at the in-list types in this case? The in-list types could be literals of different types, right? For example, NULL NOT IN (1, 'a').
    
    Since the result of an IN predicate is a boolean type, I thought it would be safe to transform it to
    a null literal of boolean type. Are you thinking of a case where this would not work?
    
    Thanks a lot in advance for your help.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by rick-ibm <gi...@git.apache.org>.
Github user rick-ibm commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146280859
  
    I don't find any guidance in the Standard for what should be done if the left side of the IN operator is an untyped NULL literal. Technically, there is no such thing in the Standard. The NULL needs to be cast to a legal type.
    
    Section 8.4 provides no guidance about the type correspondence of the IN list values. However, section 8.9 implies that the IN list is equivalent to the result of a subquery, which means that we must be able to cast all of the values on the right side to a common type.
    
    Thanks,
    -Rick





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145734618
  
    Can you please help clarify? Are you referring to the case when one of the values in the
    in-list is a literal null, like 1 IN (1, NULL)? If so, I don't think we can evaluate this to NULL...
    
    I ran the following two queries on Hive:
    hive> select 1 in (1,NULL) from tnull;
    OK
    true
    true
    true
    Time taken: 1.21 seconds, Fetched: 3 row(s)
    hive> select 1 in (NULL,1) from tnull;
    OK
    true
    true
    true
    Time taken: 0.168 seconds, Fetched: 3 row(s)
    
    We have already taken care of the case where the LHS type is null. Let me know what you think.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145930866
  
    Looks like our [implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L124-L144) is wrong.... We return null if any value in the list is null.
    Would you like to send another PR to fix it?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146003985
  
    Very good point, thanks. Actually, Hive reports an error in this case.
    
    hive> select * from tnull where array(2,3) in (1, array(2,3));
    FAILED: SemanticException [Error 10014]: Line 1:37 Wrong arguments '3': The arguments for IN should be the same type! Types are: {array<int> IN (int, array<int>)}
    
    I am not sure what the right thing to do here is.
    
    Any comments @marmbrus ?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/8983




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by rick-ibm <gi...@git.apache.org>.
Github user rick-ibm commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146244545
  
    According to my reading of the SQL Standard,
    
      NULL IN (expr1, ...)
    
    should always evaluate to NULL. Here is my reasoning:
    
    The 2011 SQL Standard, part 2, section 8.4 (in predicate), syntax rule 5 says that
    
      expr IN (expr1, ...)
    
    is equivalent to
    
      expr = ANY (expr1, ...)
    
    Section 8.9 (quantified comparison predicate), general rule 2, subrules (c) and (d), say that
    
      expr = ANY (expr1, ...)
    
    evaluates to the following:
    
      TRUE if (expr = exprN) is TRUE for at least one of the expressions on the right side
    
      FALSE if the right side is an empty list or if (expr = exprN) is FALSE for every exprN on the right side
    
      UNKNOWN (NULL) otherwise
    
    Since (NULL = exprN) is always UNKNOWN and since an IN list must be non-empty (see the BNF in section 8.4), it follows that
    
      NULL IN (expr1, ...)
    
    always evaluates to UNKNOWN (NULL). So Dilip's transformation of
    
      NULL IN (expr1, ...) -> NULL
    
    looks correct to me. There is no need to cast the expressions on the right side to a common type. That is, not unless you want to raise syntax errors in situations where there is no implicit conversion to a common type.
    
    As the following examples show, Postgres, MySQL, and Derby all exhibit the correct Standard behavior.
    
    Thanks,
    -Rick
    
    ----------------------------------------------------
    
    MySQL behavior:
    
    mysql> SELECT NULL IN (1, 2, 3);
    SELECT NULL IN (1, 2, 3);
    +-------------------+
    | NULL IN (1, 2, 3) |
    +-------------------+
    |              NULL |
    +-------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT NULL IN (1, 2, NULL);
    SELECT NULL IN (1, 2, NULL);
    +----------------------+
    | NULL IN (1, 2, NULL) |
    +----------------------+
    |                 NULL |
    +----------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT NULL IN ();
    SELECT NULL IN ();
    ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 1
    
    
    
    ----------------------------------------------------
    
    Postgres behavior:
    
    mydb=# SELECT NULL IN (1, 2, 3);
    SELECT NULL IN (1, 2, 3);
     ?column? 
    ----------
     
    (1 row)
    
    mydb=# SELECT NULL IN (1, 2, NULL);
    SELECT NULL IN (1, 2, NULL);
     ?column? 
    ----------
     
    (1 row)
    
    mydb=# SELECT NULL IN ();
    SELECT NULL IN ();
    ERROR:  syntax error at or near ")"
    LINE 1: SELECT NULL IN ();
    
    
    
    ----------------------------------------------------
    
    Derby behavior:
    
    ij> VALUES CAST (NULL AS INT) IN (1, 2, 3);
    1    
    -----
    NULL 
    
    1 row selected
    ij> VALUES CAST (NULL AS INT) IN (1, 2, CAST (NULL AS INT));
    1    
    -----
    NULL 
    
    1 row selected
    ij> VALUES CAST (NULL AS INT) IN ();
    ERROR 42X01: Syntax error: Encountered ")" at line 1, column 31.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145713214
  
    hive> select null in (1,2,null) from tnull; 
    OK
    NULL
    NULL
    NULL
    Time taken: 0.139 seconds, Fetched: 3 row(s)





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146016743
  
    Let's follow Hive.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146266287
  
    I think we are all in agreement that `null IN (...)` should return null.  
    
    The only question here is whether we should throw an error when the stuff in `(...)` cannot be coerced to a common type.  It seems to me that the user must be confused if they are attempting to compare a value with an incompatible set of options.  It would be good to see what other systems do here.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146680375
  
    Sorry, this never passed tests and broke something.  I'm going to revert.  Please reopen the PR.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145958925
  
    Thanks a lot, @cloud-fan. Sure, I will look into it. When you say another PR,
    do you mean another JIRA?
    
    Asking as I am new to the process.
    
    One other question: what is the process to get this change integrated? Do I need to initiate any action from my end?





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146645486
  
    Thanks a lot @marmbrus .
    
    Many thanks to @cloud-fan for his help.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8983#discussion_r41183045
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ---
    @@ -135,4 +135,11 @@ class AnalysisSuite extends AnalysisTest {
         plan = testRelation.select(CreateStructUnsafe(Seq(a, (a + 1).as("a+1"))).as("col"))
         checkAnalysis(plan, plan)
       }
    +
    +  test("SPARK-8654: invalid CAST in NULL IN(...) expression") {
    +    val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
    +      LocalRelation()
    +    )
    +    assertAnalysisSuccess(plan, false)
    --- End diff --
    
    why change the default value of `caseSensitive`?




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146682265
  
    @marmbrus, sorry about that. Is there a way I can look at the list of failures?
    I had run:
    build/mvn -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0 -Dmaven.test.failure.ignore=true test
    
    and it reported success. But this is my first time, so I may not have the right configuration.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145948447
  
    @cloud-fan, please confirm my understanding of the code (I'm fairly new to the codebase :-)).
    In the code we go through the entire in-list and evaluate each expression, flagging hasNull when
    we see a null. We continue with the next items and return true if we see a match. If we never see a match, we look at the hasNull flag and return null or false.
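    
    In other words, my reading of the evaluation logic is roughly the following (a paraphrased sketch of how I understand it, not the exact source in predicates.scala; member names such as `value` and `list` are assumed):
    
    ```
    // Sketch of In evaluation: a null value short-circuits to null; otherwise scan
    // the list, remembering whether any element evaluated to null.
    def eval(input: InternalRow): Any = {
      val evaluatedValue = value.eval(input)
      if (evaluatedValue == null) {
        null
      } else {
        var hasNull = false
        list.foreach { e =>
          val v = e.eval(input)
          if (v == evaluatedValue) {
            return true          // found a match
          } else if (v == null) {
            hasNull = true       // the list contained a null, remember it
          }
        }
        if (hasNull) null else false
      }
    }
    ```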
    
    To confirm whether there is an issue, I tried running the following two queries again. The output looks
    OK to me.
    
    select * from inttab where 1 in (1,2,NULL)
    var2: org.apache.spark.sql.DataFrame = [c1: int]
    +---+
    | c1|
    +---+
    |  1|
    |  2|
    |  3|
    |  4|
    |  5|
    +---+
    
    == Parsed Logical Plan ==
    'Project [unresolvedalias(*)]
     'Filter 1 IN (1,2,null)
      'UnresolvedRelation [inttab], None
    
    == Analyzed Logical Plan ==
    c1: int
    Project [c1#0]
     Filter 1 IN (cast(1 as int),cast(2 as int),cast(null as int))
      Subquery inttab
       LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Optimized Logical Plan ==
    LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Physical Plan ==
    Scan PhysicalRDD[c1#0]
    
    Code Generation: true
    
    select * from inttab where 1 in (NULL,1,2)
    var2: org.apache.spark.sql.DataFrame = [c1: int]
    +---+
    | c1|
    +---+
    |  1|
    |  2|
    |  3|
    |  4|
    |  5|
    +---+
    
    == Parsed Logical Plan ==
    'Project [unresolvedalias(*)]
     'Filter 1 IN (null,1,2)
      'UnresolvedRelation [inttab], None
    
    == Analyzed Logical Plan ==
    c1: int
    Project [c1#0]
     Filter 1 IN (cast(null as int),cast(1 as int),cast(2 as int))
      Subquery inttab
       LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Optimized Logical Plan ==
    LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Physical Plan ==
    Scan PhysicalRDD[c1#0]
    
    Code Generation: true
    
    Please let me know your thoughts.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146685208
  
    @cloud-fan 
    I didn't see a re-open option on this pull request. Do I have to create a new pull request?
    Please let me know.




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

Posted by dilipbiswal <gi...@git.apache.org>.
Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-146287283
  
    Thank you @marmbrus @rick-ibm @cloud-fan 
    
    I checked the behavior of DB2. It also raises an error if the in-list types are not compatible.
    
    db2 => select * from f1 where NULL in (1, true)
    SQL0401N  The data types of the operands for the operation "IN" are not 
    compatible or comparable.  SQLSTATE=42818
    
    I am studying the code now to figure out how to detect this and raise an error.
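    
    One way to detect it would be an expression-level type check on `In`, roughly along these lines (a hypothetical sketch that matches the "Arguments must be same type" message from the analysis test added in this PR; the actual change may end up looking different):
    
    ```
    // Hypothetical check: after type coercion, every element of the in-list must
    // have the same data type as the value being tested, otherwise fail analysis.
    override def checkInputDataTypes(): TypeCheckResult = {
      if (list.exists(l => l.dataType != value.dataType)) {
        TypeCheckResult.TypeCheckFailure("Arguments must be same type")
      } else {
        TypeCheckResult.TypeCheckSuccess
      }
    }
    ```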

