Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/03/19 13:18:00 UTC
[jira] [Assigned] (SPARK-38604) ceil and floor return different types when called from scala than sql

     [ https://issues.apache.org/jira/browse/SPARK-38604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38604:
------------------------------------

    Assignee: Apache Spark

> ceil and floor return different types when called from scala than sql
> ---------------------------------------------------------------------
>
>                 Key: SPARK-38604
>                 URL: https://issues.apache.org/jira/browse/SPARK-38604
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Robert Joseph Evans
>            Assignee: Apache Spark
>            Priority: Critical
>
> In Spark 3.3.0, SPARK-37475 ([PR|https://github.com/apache/spark/pull/34729]) went in and added support for a scale parameter to floor and ceil. There was [discussion|https://github.com/apache/spark/pull/34729#discussion_r761157050] about potential incompatibilities, specifically with respect to the return types. It looks like it was [decided|https://github.com/apache/spark/pull/34729#discussion_r767446855] to keep the old behavior if no scale parameter was passed in, but use the new functionality if a scale is passed in.
>  
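> But the Scala API didn't get updated to do the same thing as the SQL API: its one-argument ceil returns the new decimal type, while the one-argument SQL call keeps the old long type.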
> {code:scala}
> scala> spark.range(1).selectExpr("id", "ceil(id) as one_arg_sql", "ceil(id, 0) as two_arg_sql").select(col("*"), ceil(col("id")).alias("one_arg_func"), ceil(col("id"), lit(0)).alias("two_arg_func")).printSchema
> root
>  |-- id: long (nullable = false)
>  |-- one_arg_sql: long (nullable = true)
>  |-- two_arg_sql: decimal(20,0) (nullable = true)
>  |-- one_arg_func: decimal(20,0) (nullable = true)
>  |-- two_arg_func: decimal(20,0) (nullable = true)
>  
> scala> spark.range(1).selectExpr("cast(id as double) as id").selectExpr("id", "ceil(id) as one_arg_sql", "ceil(id, 0) as two_arg_sql").select(col("*"), ceil(col("id")).alias("one_arg_func"), ceil(col("id"), lit(0)).alias("two_arg_func")).printSchema
> root
>  |-- id: double (nullable = false)
>  |-- one_arg_sql: long (nullable = true)
>  |-- two_arg_sql: decimal(30,0) (nullable = true)
>  |-- one_arg_func: decimal(30,0) (nullable = true)
>  |-- two_arg_func: decimal(30,0) (nullable = true)
> {code}
> And because the Python code calls into this too, it has the same problem. I suspect that the Java and R APIs expose it as well, but I didn't check.
>  
>  
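Until the Scala API matches SQL, one possible workaround is to route the one-argument call through the SQL parser with functions.expr, or to cast the result explicitly. The following is only a minimal sketch under stated assumptions, not a proposed fix: the object name CeilTypeWorkaround, the column aliases, and the local[1] master are hypothetical choices made for illustration.

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{ceil, col, expr}
import org.apache.spark.sql.types.LongType

// Hypothetical object name, used only for this illustration.
object CeilTypeWorkaround {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is acceptable for the check.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("ceil-type-workaround")
      .getOrCreate()

    val df = spark.range(1).select(
      col("id"),
      // Parsed as SQL, so it should follow the one-argument SQL behavior
      // shown in the report (one_arg_sql).
      expr("ceil(id)").alias("via_expr"),
      // Explicit cast pins the column type regardless of what
      // functions.ceil returns on the affected build.
      ceil(col("id")).cast(LongType).alias("via_cast")
    )

    // Compare the reported types of via_expr and via_cast with the
    // one_arg_sql column in the schemas above.
    df.printSchema()
    spark.stop()
  }
}
{code}

The expr form relies on the SQL behavior described in the report, while the cast form does not depend on it. Note that casting a decimal result back to long can overflow for large double inputs, so the cast is only safe when values are known to fit in a long.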



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
