Posted to issues@spark.apache.org by "Shafique Jamal (JIRA)" <ji...@apache.org> on 2017/10/13 05:16:04 UTC
[jira] [Created] (SPARK-22271) Describe results in "null" for the
value of "mean" of a numeric variable
Shafique Jamal created SPARK-22271:
--------------------------------------
Summary: Describe results in "null" for the value of "mean" of a numeric variable
Key: SPARK-22271
URL: https://issues.apache.org/jira/browse/SPARK-22271
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.1.0
Environment:
Reporter: Shafique Jamal
Priority: Minor
Please excuse me if this issue was addressed already - I was unable to find it.
Calling .describe().show() on my dataframe results in a value of null for the row "mean":
{{val foo = spark.read.parquet("decimalNumbers.parquet")
foo.select(col("numericvariable")).describe().show()
foo: org.apache.spark.sql.DataFrame = [numericvariable: decimal(38,32)]
+-------+--------------------+
|summary| numericvariable|
+-------+--------------------+
| count| 299|
| mean| null|
| stddev| 0.2376438793946738|
| min|0.037815489727642...|
| max|2.138189366554511...|
+-------+--------------------+}}
But all of the rows for this column seem OK (I can attach a parquet file). When I round the column, however, all is fine:
{{foo.select(bround(col("numericvariable"), 31)).describe().show()
+-------+---------------------------+
|summary|bround(numericvariable, 31)|
+-------+---------------------------+
| count| 299|
| mean| 0.139522503183236...|
| stddev| 0.2376438793946738|
| min| 0.037815489727642...|
| max| 2.138189366554511...|
+-------+---------------------------+}}
Rounding to 32 also gives null, though.
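For anyone hitting the same symptom, here is a minimal sketch of one possible workaround: casting the column to double before calling describe(). It assumes the null comes from the decimal(38,32) type itself rather than from the data, which this report does not confirm. The values below are made-up stand-ins (not the contents of decimalNumbers.parquet), and the names raw and asDouble are only illustrative. Whether the null itself reproduces may depend on the Spark version.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, DoubleType}

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("describe-decimal-sketch")
  .getOrCreate()
import spark.implicits._

// Made-up stand-in values, NOT the reporter's data. They are parsed from
// strings and cast so the column has the type seen in the report: decimal(38,32).
val foo = Seq("0.0378", "0.1395", "2.1381")
  .toDF("raw")
  .select(col("raw").cast(DecimalType(38, 32)).as("numericvariable"))

// Assumed workaround: summarize the values as doubles instead of decimals.
// This gives up the extra decimal digits in exchange for double arithmetic.
val asDouble = foo.select(col("numericvariable").cast(DoubleType).as("numericvariable"))
asDouble.describe().show()
```

The cast trades precision for robustness: a double carries roughly 15 to 17 significant decimal digits, so this only suits cases where summary statistics at that precision are acceptable.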