Posted to issues@spark.apache.org by "Carlos M. Casas (JIRA)" <ji...@apache.org> on 2017/06/27 10:43:00 UTC

[jira] [Created] (SPARK-21226) Save empty dataframe in pyspark prints nothing

Carlos M. Casas created SPARK-21226:
---------------------------------------

             Summary: Save empty dataframe in pyspark prints nothing
                 Key: SPARK-21226
                 URL: https://issues.apache.org/jira/browse/SPARK-21226
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.1.0, 2.0.0
            Reporter: Carlos M. Casas


I try the following:

from pyspark.sql.types import StructType, StructField, IntegerType

schema = StructType([StructField("id", IntegerType())])  # any example schema
df1 = sqlContext.createDataFrame(sc.emptyRDD(), schema)  # DataFrame built from an empty RDD
df1.write.parquet("as1")

and I just get a directory as1 containing only a _SUCCESS file. If I try to read that directory back, I get an exception.
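
For reference, a minimal sketch of the failing read (same PySpark session as above, where sqlContext is already defined; the exact exception message depends on the Spark version):

try:
    sqlContext.read.parquet("as1")
except Exception as e:
    # fails: as1 contains only _SUCCESS, no parquet files to infer a schema from
    print(e)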

On the other hand, if I run:

schema = StructType([StructField("id", IntegerType())])  # same example schema as above
df2 = sqlContext.createDataFrame([], schema)  # empty local list instead of an empty RDD
df2.write.parquet("as2")

I get a directory as2 with some files in it (parquet metadata describing the field types?). If I try to read it back, it works: I get an empty DataFrame with the proper schema.
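
By contrast, reading the second directory back succeeds (again a minimal sketch in the same session; as2 is the path written above):

sqlContext.read.parquet("as2").printSchema()  # shows the original schema
sqlContext.read.parquet("as2").count()        # returns 0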



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
