Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/06/14 21:46:06 UTC
[jira] [Assigned] (SPARK-21100) describe should give quartiles similar to Pandas
[ https://issues.apache.org/jira/browse/SPARK-21100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-21100:
------------------------------------
Assignee: (was: Apache Spark)
> describe should give quartiles similar to Pandas
> ------------------------------------------------
>
> Key: SPARK-21100
> URL: https://issues.apache.org/jira/browse/SPARK-21100
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.1
> Reporter: Andrew Ray
> Priority: Minor
>
> The DataFrame describe method should also report quartiles (the 25th, 50th, and 75th percentiles), as the pandas describe method does.
> Example pandas output:
> {code}
> In [4]: df.describe()
> Out[4]:
>        Unnamed: 0       displ         year         cyl         cty         hwy
> count  234.000000  234.000000   234.000000  234.000000  234.000000  234.000000
> mean   117.500000    3.471795  2003.500000    5.888889   16.858974   23.440171
> std     67.694165    1.291959     4.509646    1.611534    4.255946    5.954643
> min      1.000000    1.600000  1999.000000    4.000000    9.000000   12.000000
> 25%     59.250000    2.400000  1999.000000    4.000000   14.000000   18.000000
> 50%    117.500000    3.300000  2003.500000    6.000000   17.000000   24.000000
> 75%    175.750000    4.600000  2008.000000    8.000000   19.000000   27.000000
> max    234.000000    7.000000  2008.000000    8.000000   35.000000   44.000000
> {code}
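
For reference, the quartile rows in the pandas output above can be reproduced with a few lines of plain Python. This is only a rough sketch of the statistics being requested, not Spark's or pandas' implementation. The `describe` helper is a hypothetical name, and the input assumes that the "Unnamed: 0" column is simply a row index running from 1 to 234 (which is consistent with its count, min, max, and mean, but is not stated in the issue). The quartiles use linear interpolation between order statistics, which is the pandas default.

```python
import statistics


def describe(values):
    """Return pandas-style summary statistics for a sequence of numbers.

    Hypothetical helper for illustration only; not part of Spark or pandas.
    """
    data = sorted(values)
    # method="inclusive" interpolates linearly between order statistics,
    # treating the smallest and largest values as the 0th and 100th percentiles.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "std": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": data[0],
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": data[-1],
    }


# Assumption: the "Unnamed: 0" column is a 1..234 row index.
summary = describe(range(1, 235))
for name, value in summary.items():
    print(f"{name:<6}{value:>12.6f}")
```

Under that assumption, the 25%, 50%, and 75% values come out as 59.25, 117.5, and 175.75, matching the first column of the table. On a distributed DataFrame, exact quartiles require a global sort, so an approximate quantile algorithm may be the more practical choice there.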
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)