Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/04/19 13:24:00 UTC

[jira] [Resolved] (SPARK-21811) Inconsistency when finding the widest common type of a combination of DateType, StringType, and NumericType

     [ https://issues.apache.org/jira/browse/SPARK-21811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-21811.
----------------------------------
       Resolution: Fixed
         Assignee: Jiang Xingbo
    Fix Version/s: 2.4.0

Fixed in https://github.com/apache/spark/pull/21074

> Inconsistency when finding the widest common type of a combination of DateType, StringType, and NumericType
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21811
>                 URL: https://issues.apache.org/jira/browse/SPARK-21811
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Ryan Bald
>            Assignee: Jiang Xingbo
>            Priority: Minor
>             Fix For: 2.4.0
>
>
> Finding the widest common type for the arguments of a variadic function (such as IN or COALESCE) is order-dependent when the argument types are a combination of DateType/TimestampType, StringType, and NumericType: some argument orders fail with an AnalysisException, while others succeed with a common type of StringType.
> The examples below, which reproduce the error, assume a schema of:
> {{[c1: date, c2: string, c3: int]}}
> The following succeeds:
> {{SELECT coalesce(c1, c2, c3) FROM table}}
> While the following produces an exception:
> {{SELECT coalesce(c1, c3, c2) FROM table}}
> The order of arguments affects the behavior because the widest common type appears to be found by repeatedly looking at two arguments at a time: the widest common type found so far and the next argument. My initial thought is that a fix would have to change how the widest common type is found, so that it looks at all arguments first before deciding what the widest common type should be.
> As my boss is out of office for the rest of the day, I will give a pull request a shot, but as I am not very familiar with Scala or Spark's coding style guidelines, a pull request is not promised. In my attempted pull request, I will assume that having DateType/TimestampType, StringType, and NumericType arguments in an IN expression or COALESCE function (and in any other function/expression where this combination of argument types can occur) is valid. I would also find it quite reasonable for this combination of argument types to be invalid, so if that's what is decided, then oh well.
> If I were a betting man, I'd say the fix would be made in the following file: [TypeCoercion.scala|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala]
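
The order dependence described in the report can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual code: the type names are plain strings, and the helper names (wider_of_two, widest_pairwise, widest_all_at_once) and the simplified pairwise rules are hypothetical. The only rules taken from the report are that a string combined with a date or an int widens to string, while a date combined with an int has no common type.

```python
from functools import reduce
from typing import Optional, Sequence

# Toy model only: type names are plain strings, not Spark DataType objects.


def wider_of_two(a: Optional[str], b: Optional[str]) -> Optional[str]:
    """Hypothetical pairwise rule: return a common type, or None if there is none."""
    if a is None or b is None:
        return None          # an earlier failure propagates through the fold
    if a == b:
        return a
    if "string" in (a, b):
        return "string"      # assumed: string absorbs date and int
    return None              # e.g. date vs. int: no common type


def widest_pairwise(types: Sequence[str]) -> Optional[str]:
    """Fold left to right, two types at a time, as the report suspects."""
    return reduce(wider_of_two, types)


def widest_all_at_once(types: Sequence[str]) -> Optional[str]:
    """Hypothetical order-independent variant: inspect all arguments first."""
    if "string" in types:
        return "string"
    return reduce(wider_of_two, types)


# Same three types, different order, different outcome for the pairwise fold.
print(widest_pairwise(["date", "string", "int"]))     # coalesce(c1, c2, c3)
print(widest_pairwise(["date", "int", "string"]))     # coalesce(c1, c3, c2)
print(widest_all_at_once(["date", "int", "string"]))
```

In this model the pairwise fold fails for (date, int, string) because the first step, date with int, has no common type, and that failure is carried forward before the string argument is ever seen. The second helper shows one way the "look at all arguments first" idea could remove the dependence on order; it is not a description of the fix merged for this issue.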


