Posted to issues@spark.apache.org by "Bryan Cutler (JIRA)" <ji...@apache.org> on 2018/09/19 21:34:00 UTC

[jira] [Created] (SPARK-25471) Fix tests for Python 3.6 with Pandas 0.23+

Bryan Cutler created SPARK-25471:
------------------------------------

             Summary: Fix tests for Python 3.6 with Pandas 0.23+
                 Key: SPARK-25471
                 URL: https://issues.apache.org/jira/browse/SPARK-25471
             Project: Spark
          Issue Type: Bug
          Components: PySpark, Tests
    Affects Versions: 2.4.0
            Reporter: Bryan Cutler
            Assignee: Bryan Cutler


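Running the PySpark tests produces at least one error when using Python 3.6 and Pandas 0.23 or higher.  With that combination, the Pandas DataFrame constructor can create columns in the order in which they were defined, whereas earlier versions might put them in alphabetical order.  A test that relies on the alphabetical order then pairs the column values with the wrong schema fields, which leads to the following failure: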
{noformat}
======================================================================
ERROR: test_create_dataframe_from_pandas_with_timestamp (pyspark.sql.tests.SQLTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bryan/git/spark/python/pyspark/sql/tests.py", line 3275, in test_create_dataframe_from_pandas_with_timestamp
    df = self.spark.createDataFrame(pdf, schema="d date, ts timestamp")
  File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 748, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 413, in _createFromLocal
    data = list(data)
  File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 730, in prepare
    verify_func(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1389, in verify
    verify_value(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1370, in verify_struct
    verifier(v)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1389, in verify
    verify_value(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1383, in verify_default
    verify_acceptable_types(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1278, in verify_acceptable_types
    % (dataType, obj, type(obj))))
TypeError: field ts: TimestampType can not accept object datetime.date(2018, 9, 19) in type <class 'datetime.date'>
{noformat}
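
The mismatch can be illustrated without Pandas or Spark. The following is only a minimal sketch: the SCHEMA list and the first_mismatch helper are hypothetical stand-ins for the schema "d date, ts timestamp" and for Spark's type verification, and the row contents (including the order in which the keys are written) are assumptions, not the actual test data.

```python
import datetime

# Hypothetical stand-in for the schema "d date, ts timestamp": field names
# paired with the Python type each field is expected to hold.
SCHEMA = [("d", datetime.date), ("ts", datetime.datetime)]


def first_mismatch(column_order, row_by_name):
    """Pair row values with schema fields by position and return the name of
    the first field whose value has the wrong type, or None if all match."""
    values = [row_by_name[name] for name in column_order]
    for (field, expected), value in zip(SCHEMA, values):
        # datetime.datetime subclasses datetime.date, so a datetime passes
        # the "d" check, but a plain date fails the "ts" check.
        if not isinstance(value, expected):
            return field
    return None


# Assumed input: the keys are written in the order "ts", then "d".
row = {
    "ts": datetime.datetime(2017, 10, 31, 1, 1, 1),
    "d": datetime.date(2018, 9, 19),
}

definition_order = list(row)      # insertion order: ["ts", "d"]
alphabetical_order = sorted(row)  # sorted order: ["d", "ts"]

print(first_mismatch(definition_order, row))    # ts
print(first_mismatch(alphabetical_order, row))  # None
```

In this sketch only the definition order trips the check, and it does so on the "ts" field, which matches the field named in the TypeError above. One possible way to make such a test independent of the Pandas version is to pin the column order explicitly, for example through the columns argument of the DataFrame constructor.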


