Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2014/11/23 05:08:12 UTC

[jira] [Created] (SPARK-4561) PySparkSQL's Row.asDict() should convert nested rows to dictionaries

Josh Rosen created SPARK-4561:
---------------------------------

             Summary: PySparkSQL's Row.asDict() should convert nested rows to dictionaries
                 Key: SPARK-4561
                 URL: https://issues.apache.org/jira/browse/SPARK-4561
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 1.2.0
            Reporter: Josh Rosen


In PySpark, you can call {{.asDict()}} on a SparkSQL {{Row}} to convert it to a dictionary. Unfortunately, though, this does not convert nested rows to dictionaries. For example:

{code}
>>> sqlContext.sql("select results from results").first()
Row(results=[Row(time=3.762), Row(time=3.47), Row(time=3.559), Row(time=3.458), Row(time=3.229), Row(time=3.21), Row(time=3.166), Row(time=3.276), Row(time=3.239), Row(time=3.149)])
>>> sqlContext.sql("select results from results").first().asDict()
{u'results': [(3.762,),
  (3.47,),
  (3.559,),
  (3.458,),
  (3.229,),
  (3.21,),
  (3.166,),
  (3.276,),
  (3.239,),
  (3.149,)]}
{code}

I ran into this issue when trying to use Pandas dataframes to display nested data that I queried from Spark SQL.
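In the meantime, a small recursive helper can serve as a workaround. This is only a rough sketch, not part of the PySpark API: the name {{row_to_dict}} is made up, and it duck-types on {{asDict()}} because query-result rows may be instances of dynamically generated Row classes rather than {{pyspark.sql.Row}}.

{code}
def row_to_dict(obj):
    """Recursively convert Row-like objects (anything with asDict()) to plain dicts.

    Hypothetical workaround sketch, not part of the PySpark API.
    """
    if hasattr(obj, "asDict"):
        # Row-like: convert to a dict, then recurse into each value.
        return dict((k, row_to_dict(v)) for k, v in obj.asDict().items())
    if isinstance(obj, (list, tuple)):
        # Recurse into collections; note this normalizes tuples to lists.
        return [row_to_dict(v) for v in obj]
    return obj

# Example usage against the query from the report above (output is approximate):
# row = sqlContext.sql("select results from results").first()
# row_to_dict(row)
# -> {u'results': [{u'time': 3.762}, {u'time': 3.47}, ...]}
{code}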


