Posted to issues@spark.apache.org by "Gengliang Wang (Jira)" <ji...@apache.org> on 2021/11/03 06:04:00 UTC

[jira] [Updated] (SPARK-37179) ANSI mode: Add a config to allow casting between Datetime and Numeric

     [ https://issues.apache.org/jira/browse/SPARK-37179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang updated SPARK-37179:
-----------------------------------
    Description: 
Add a config `spark.sql.ansi.allowCastBetweenDatetimeAndNumeric` to allow casting between Datetime and Numeric. The default value of the configuration is `false`.
Also, casting double/float type to timestamp should raise exceptions if there is overflow or the input is NaN/infinite.

This is for better adoption of ANSI SQL mode:
- As we did some data science, we found that many Spark SQL users are actually using `Cast(Timestamp as Numeric)` and `Cast(Numeric as Timestamp)`. There are also some usages of `Cast(Date as Numeric)`.
- The Spark SQL connector for Tableau is using this feature for DateTime math. e.g.
 `CAST(FROM_UNIXTIME(CAST(CAST(%1 AS BIGINT) + (%2 * 86400) AS BIGINT)) AS TIMESTAMP)`
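The arithmetic in the Tableau expression above can be sketched outside Spark as follows. This is only an illustration, assuming `%1` is a timestamp expressed as epoch seconds and `%2` is a number of days; the function name is hypothetical, and UTC is used here for determinism, whereas Spark would use the session time zone.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400  # the constant used in the Tableau expression


def add_days_to_epoch_seconds(epoch_seconds: int, days: int) -> datetime:
    """Hypothetical helper mirroring CAST(%1 AS BIGINT) + (%2 * 86400)."""
    # Timestamp-as-number plus a whole number of days, in seconds.
    total_seconds = int(epoch_seconds) + days * SECONDS_PER_DAY
    # Convert the numeric result back to a timestamp (UTC is an assumption).
    return datetime.fromtimestamp(total_seconds, tz=timezone.utc)


print(add_days_to_epoch_seconds(0, 1).isoformat())
```

The point is that the expression depends on treating a timestamp as a number and back again, which is the cast the proposed config would allow under ANSI mode.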

So, the new configuration gives users who rely on these casts a way to turn on ANSI mode.
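The overflow and NaN/infinite checks described above for double/float to timestamp casts can be sketched as follows. This is not Spark's implementation: the function name and exception types are hypothetical, and the sketch assumes the timestamp is stored as microseconds since the epoch in a signed 64-bit integer.

```python
import math

MICROS_PER_SECOND = 1_000_000
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1  # signed 64-bit range


def double_to_timestamp_micros(seconds: float) -> int:
    """Hypothetical strict cast: seconds since epoch -> microseconds."""
    # Reject inputs that have no meaningful timestamp value.
    if math.isnan(seconds) or math.isinf(seconds):
        raise ValueError(f"cannot cast {seconds} to timestamp")
    scaled = seconds * MICROS_PER_SECOND
    # The scaling itself can overflow the double range to infinity.
    if math.isinf(scaled):
        raise OverflowError(f"{seconds} overflows the timestamp range")
    micros = int(scaled)  # truncates toward zero
    # Reject values that do not fit in a 64-bit microsecond count.
    if not LONG_MIN <= micros <= LONG_MAX:
        raise OverflowError(f"{seconds} overflows the timestamp range")
    return micros
```

For instance, `double_to_timestamp_micros(1.5)` returns `1500000`, while `float("nan")` or `1e30` raise instead of silently producing a wrong value.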

  was:
We should allow the casting between Timestamp and Numeric types:
* As we did some data science, we found that many Spark SQL users are actually using `Cast(Timestamp as Numeric)` and `Cast(Numeric as Timestamp)`. 
* The Spark SQL connector for Tableau is using this feature for DateTime math. e.g.
{code:java}
CAST(FROM_UNIXTIME(CAST(CAST(%1 AS BIGINT) + (%2 * 86400) AS BIGINT)) AS TIMESTAMP)
{code}
* In the current syntax, we specially allow Numeric <=> Boolean and String <=> Binary since they are straightforward and frequently used. I suggest we allow Timestamp <=> Numeric as well for better ANSI mode adoption.


> ANSI mode: Add a config to allow casting between Datetime and Numeric
> ---------------------------------------------------------------------
>
>                 Key: SPARK-37179
>                 URL: https://issues.apache.org/jira/browse/SPARK-37179
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>             Fix For: 3.3.0
>
>
> Add a config `spark.sql.ansi.allowCastBetweenDatetimeAndNumeric` to allow casting between Datetime and Numeric. The default value of the configuration is `false`.
> Also, casting double/float type to timestamp should raise exceptions if there is overflow or the input is NaN/infinite.
> This is for better adoption of ANSI SQL mode:
> - As we did some data science, we found that many Spark SQL users are actually using `Cast(Timestamp as Numeric)` and `Cast(Numeric as Timestamp)`. There are also some usages of `Cast(Date as Numeric)`.
> - The Spark SQL connector for Tableau is using this feature for DateTime math. e.g.
>  `CAST(FROM_UNIXTIME(CAST(CAST(%1 AS BIGINT) + (%2 * 86400) AS BIGINT)) AS TIMESTAMP)`
> So, the new configuration gives users who rely on these casts a way to turn on ANSI mode.


