Posted to issues@spark.apache.org by "xsys (Jira)" <ji...@apache.org> on 2022/10/02 00:16:00 UTC

[jira] [Created] (SPARK-40629) FLOAT/DOUBLE division by 0 gives Infinity/-Infinity/NaN in DataFrame but NULL in SparkSQL

xsys created SPARK-40629:
----------------------------

             Summary: FLOAT/DOUBLE division by 0 gives Infinity/-Infinity/NaN in DataFrame but NULL in SparkSQL
                 Key: SPARK-40629
                 URL: https://issues.apache.org/jira/browse/SPARK-40629
             Project: Spark
          Issue Type: Bug
          Components: Spark Shell, SQL
    Affects Versions: 3.2.1
            Reporter: xsys


h3. Describe the bug

Storing the result of a FLOAT/DOUBLE division by 0 (e.g. {{( 1.0/0 ).floatValue()}}) in a DataFrame via {{spark-shell}} yields {{Infinity}}. However, the same division ({{cast ( 1.0/0 as float)}}) evaluates to {{NULL}} when the value is inserted into a FLOAT/DOUBLE column of a table via {{spark-sql}}.
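
For reference, the {{spark-shell}} side of this report matches plain JVM floating-point arithmetic. The following is a minimal sketch in plain Scala with no Spark involved; the object name {{DivByZeroDemo}} and its field names are hypothetical and chosen only for illustration. It shows the three values named in the summary:

{code:scala}
// Illustrative only: plain Scala, no Spark dependency.
// JVM floating-point division by zero does not throw; it yields
// the special values Infinity, -Infinity, or NaN.
object DivByZeroDemo {
  val posInf: Float = (1.0 / 0).toFloat   // positive value divided by zero
  val negInf: Float = (-1.0 / 0).toFloat  // negative value divided by zero
  val nan: Float    = (0.0 / 0).toFloat   // zero divided by zero

  def main(args: Array[String]): Unit = {
    println(posInf) // Infinity
    println(negInf) // -Infinity
    println(nan)    // NaN
  }
}
{code}

This only illustrates where the {{Infinity}} value in the DataFrame path comes from: the division is performed by Scala before the value ever reaches Spark.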
h3. To Reproduce

On Spark 3.2.1 (commit {{4f25b3f712}}), using {{spark-sql}}:

{code:java}
$SPARK_HOME/bin/spark-sql{code}

Execute the following:

{code:java}
spark-sql> create table float_vals(c1 float) stored as ORC;
spark-sql> insert into float_vals select cast ( 1.0/0  as float);
spark-sql> select * from float_vals;
NULL{code}

Using {{spark-shell}}:
{code:java}
$SPARK_HOME/bin/spark-shell{code}
Execute the following:
{code:java}
scala> import org.apache.spark.sql.Row
scala> import org.apache.spark.sql.types._
scala> val rdd = sc.parallelize(Seq(Row(( 1.0/0 ).floatValue())))
rdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = ParallelCollectionRDD[180] at parallelize at <console>:28
scala> val schema = new StructType().add(StructField("c1", FloatType, true))
schema: org.apache.spark.sql.types.StructType = StructType( StructField(c1,FloatType,true))
scala> val df = spark.createDataFrame(rdd, schema)
df: org.apache.spark.sql.DataFrame = [c1: float]
scala> df.show(false)
+---------+
|c1       |
+---------+
|Infinity |
+---------+
{code}
h3. Expected behavior

We expect the two Spark interfaces ({{spark-sql}} and {{spark-shell}}) to behave consistently for the same data type, input, and configuration (here, {{FLOAT/DOUBLE}} and {{1.0/0}}).


