
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/03 13:08:31 UTC

[GitHub] [spark] dlindelof opened a new pull request #26747: [SPARK-29188][PySpark] toPandas gets wrong dtypes when applied on empty DF

dlindelof opened a new pull request #26747: [SPARK-29188][PySpark] toPandas gets wrong dtypes when applied on empty DF
URL: https://github.com/apache/spark/pull/26747
 
 
   ### What changes were proposed in this pull request?
   
   An empty Spark DataFrame converted to a pandas DataFrame did not get the right column dtypes, because several Spark-to-pandas type mappings were missing. This PR adds the missing mappings.
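
The idea of a type mapping can be sketched without Spark as a plain lookup from Spark SQL type names to dtype names. This is a minimal illustration only: `SPARK_TO_PANDAS_DTYPE` and `expected_dtypes` are hypothetical names, and the entries below are assumptions for the sake of the example, not the mappings this PR actually adds.

```python
# Hypothetical lookup table: Spark SQL type name -> NumPy/pandas dtype name.
# The entries are illustrative assumptions, not the PR's actual mapping.
SPARK_TO_PANDAS_DTYPE = {
    "ByteType": "int8",
    "ShortType": "int16",
    "IntegerType": "int32",
    "LongType": "int64",
    "FloatType": "float32",
    "DoubleType": "float64",
    "BooleanType": "bool",
}


def expected_dtypes(schema):
    """Return {column: dtype name} for a list of (column, Spark type name) pairs.

    Types without an entry fall back to "object" in this sketch, so a missing
    mapping silently yields a generic dtype instead of the intended one.
    """
    return {
        name: SPARK_TO_PANDAS_DTYPE.get(spark_type, "object")
        for name, spark_type in schema
    }


# With no rows to infer from, the schema is the only source of dtype information.
schema = [("id", "LongType"), ("score", "DoubleType"), ("label", "StringType")]
print(expected_dtypes(schema))
```

When a DataFrame has no rows, there is no data to infer dtypes from, so any type missing from such a table ends up with the fallback dtype. That is the kind of mismatch described above.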
   
   ### Why are the changes needed?
   
   Empty Spark DataFrames are useful in unit tests, where results are often verified by first converting them to pandas. Such checks can fail when the resulting column dtypes are wrong.
   
   ### Does this PR introduce any user-facing change?
   
   Yes. The error reported in the JIRA issue (SPARK-29188) should no longer occur.
   
   ### How was this patch tested?
   
   Through unit tests in `pyspark.sql.tests.test_dataframe.DataFrameTests#test_to_pandas_from_empty_dataframe`
