You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/05/10 13:25:00 UTC

[jira] [Commented] (SPARK-34246) New type coercion syntax rules in ANSI mode

    [ https://issues.apache.org/jira/browse/SPARK-34246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341908#comment-17341908 ] 

Apache Spark commented on SPARK-34246:
--------------------------------------

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/32493

> New type coercion syntax rules in ANSI mode
> -------------------------------------------
>
>                 Key: SPARK-34246
>                 URL: https://issues.apache.org/jira/browse/SPARK-34246
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.1.2
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Add new implicit cast syntax rules in ANSI mode.
> In Spark ANSI mode, the type coercion rules are based on the type precedence lists of the input data types. 
> As per the section "Type precedence list determination" of "ISO/IEC 9075-2:2011
> Information technology — Database languages - SQL — Part 2: Foundation (SQL/Foundation)", the type precedence lists of primitive
>  data types are as following:
> * Byte: Byte, Short, Int, Long, Decimal, Float, Double
> * Short: Short, Int, Long, Decimal, Float, Double
> * Int: Int, Long, Decimal, Float, Double
> * Long: Long, Decimal, Float, Double
> * Decimal: Any wider Numeric type
> * Float: Float, Double
> * Double: Double
> * String: String
> * Date: Date, Timestamp
> * Timestamp: Timestamp
> * Binary: Binary
> * Boolean: Boolean
> * Interval: Interval
> As for complex data types, Spark will determine the precedent list recursively based on their sub-types.
> With the definition of type precedent list, the general type coercion rules are as following:
> * Data type S is allowed to be implicitly cast as type T iff T is in the precedence list of S
> * Comparison is allowed iff the data type precedence list of both sides has at least one common element. When evaluating the comparison, Spark casts both sides as the tightest common data type of their precedent lists.
> * There should be at least one common data type among all the children's precedence lists for the following operators. The data type of the operator is the tightest common precedent data type.
> {code:java}
> In
> Except(odd)
> Intersect
> Greatest
> Least
> Union
> If
> CaseWhen
> CreateArray
> Array Concat
> Sequence
> MapConcat
> CreateMap
> {code}
> * For complex types (struct, array, map), Spark recursively looks into the element type and applies the rules above. If the element nullability is converted from true to false, add runtime null check to the elements.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org