Posted to issues@spark.apache.org by "Andrew Ray (JIRA)" <ji...@apache.org> on 2017/06/14 21:43:00 UTC

[jira] [Created] (SPARK-21100) describe should give quartiles similar to Pandas

Andrew Ray created SPARK-21100:
----------------------------------

             Summary: describe should give quartiles similar to Pandas
                 Key: SPARK-21100
                 URL: https://issues.apache.org/jira/browse/SPARK-21100
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.1.1
            Reporter: Andrew Ray
            Priority: Minor


The DataFrame describe method should also report quartiles (the 25th, 50th, and 75th percentiles), as the pandas describe method does.

Example pandas output:
{code}
In [4]: df.describe()
Out[4]:
       Unnamed: 0       displ         year         cyl         cty         hwy
count  234.000000  234.000000   234.000000  234.000000  234.000000  234.000000
mean   117.500000    3.471795  2003.500000    5.888889   16.858974   23.440171
std     67.694165    1.291959     4.509646    1.611534    4.255946    5.954643
min      1.000000    1.600000  1999.000000    4.000000    9.000000   12.000000
25%     59.250000    2.400000  1999.000000    4.000000   14.000000   18.000000
50%    117.500000    3.300000  2003.500000    6.000000   17.000000   24.000000
75%    175.750000    4.600000  2008.000000    8.000000   19.000000   27.000000
max    234.000000    7.000000  2008.000000    8.000000   35.000000   44.000000
{code}
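To make the requested rows concrete, here is a minimal sketch in plain Python (standard library only) of how the 25%, 50%, and 75% rows can be computed. It is not Spark's or pandas' implementation; the `describe` helper is a hypothetical name used for illustration. It assumes the linear-interpolation quantile definition, which `statistics.quantiles(..., method="inclusive")` provides, and it assumes the `Unnamed: 0` column above is a row index holding the integers 1 through 234, as its count, min, and max suggest.

```python
import statistics


def describe(values):
    """Return pandas-style summary statistics for a sequence of numbers.

    Hypothetical helper for illustration only; needs at least two values.
    """
    data = sorted(values)
    # method="inclusive" interpolates linearly between the two nearest
    # order statistics, treating the data as the full population range.
    q25, q50, q75 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.fmean(data),
        "std": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": data[0],
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": data[-1],
    }


# Assumed stand-in for the "Unnamed: 0" column: the integers 1..234.
summary = describe(range(1, 235))
for name, value in summary.items():
    print(f"{name:<6}{value:>12.6f}")
```

On this input the sketch gives 59.25, 117.5, and 175.75 for the three quartile rows, matching the first column of the pandas output. On a large distributed DataFrame an exact computation like this can be expensive, so an approximate method such as the existing `DataFrame.approxQuantile` may be a more practical basis.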



