You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/09/22 12:39:06 UTC

[jira] [Updated] (SPARK-10707) Set operation output columns may have incorrect nullability

     [ https://issues.apache.org/jira/browse/SPARK-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-10707:
------------------------------
    Component/s: SQL

> Set operation output columns may have incorrect nullability
> -----------------------------------------------------------
>
>                 Key: SPARK-10707
>                 URL: https://issues.apache.org/jira/browse/SPARK-10707
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Mikhail Bautin
>            Assignee: Mark Hamstra
>
> If "dual" is a one-row table, the query
> SELECT count(v) FROM ( 
>   SELECT v FROM ( 
>     SELECT NULL AS v FROM dual 
>     UNION ALL 
>     SELECT 'foo' AS v FROM dual
>   ) my_union WHERE isnull(v)
> ) my_subview;
> returns 0 and the same query with union'ed sub-queries switched returns 1:
> SELECT count(v) FROM ( 
>   SELECT v FROM ( 
>     SELECT 'foo' AS v FROM dual
>     UNION ALL 
>     SELECT NULL AS v FROM dual 
>   ) my_union WHERE isnull(v)
> ) my_subview;
> Example output (with Catalyst tracing turned on): https://gist.githubusercontent.com/mbautin/c916a2a7ce733d039137/raw
> This is caused by the behavior in set operation implementation where only the nullability of the first child is taken into account:
> SetOperation in org.apache.spark.sql.catalyst.plans.logical:
> https://github.com/apache/spark/blob/82268f07abfa658869df2354ae72f8d6ddd119e8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala#L93
> Union in org.apache.spark.sql.execution:
> https://github.com/apache/spark/blob/e626ac5f5c27dcc74113070f2fec03682bcd12bd/sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala#L178
> Intersect in org.apache.spark.sql.execution:
> https://github.com/apache/spark/blob/e626ac5f5c27dcc74113070f2fec03682bcd12bd/sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala#L320
> Except in org.apache.spark.sql.execution:
> https://github.com/apache/spark/blob/e626ac5f5c27dcc74113070f2fec03682bcd12bd/sql/core/src/main/scala/org/apache/spark/sql/execution/basicOperators.scala#L307



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org