Posted to issues@spark.apache.org by "Bryan Cutler (JIRA)" <ji...@apache.org> on 2018/09/19 21:34:00 UTC
[jira] [Created] (SPARK-25471) Fix tests for Python 3.6 with Pandas 0.23+
Bryan Cutler created SPARK-25471:
------------------------------------
Summary: Fix tests for Python 3.6 with Pandas 0.23+
Key: SPARK-25471
URL: https://issues.apache.org/jira/browse/SPARK-25471
Project: Spark
Issue Type: Bug
Components: PySpark, Tests
Affects Versions: 2.4.0
Reporter: Bryan Cutler
Assignee: Bryan Cutler
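Running the pyspark tests produces at least one error when using Python 3.6 and Pandas 0.23 or higher. This is because the Pandas DataFrame constructor can now create columns in the order in which they are defined, whereas earlier versions might order them alphabetically. The test below passes a schema of "d date, ts timestamp", so when the columns arrive in a different order the values no longer line up with the schema fields. This leads to the following failure: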
{noformat}
======================================================================
ERROR: test_create_dataframe_from_pandas_with_timestamp (pyspark.sql.tests.SQLTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bryan/git/spark/python/pyspark/sql/tests.py", line 3275, in test_create_dataframe_from_pandas_with_timestamp
    df = self.spark.createDataFrame(pdf, schema="d date, ts timestamp")
  File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 748, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 413, in _createFromLocal
    data = list(data)
  File "/home/bryan/git/spark/python/pyspark/sql/session.py", line 730, in prepare
    verify_func(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1389, in verify
    verify_value(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1370, in verify_struct
    verifier(v)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1389, in verify
    verify_value(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1383, in verify_default
    verify_acceptable_types(obj)
  File "/home/bryan/git/spark/python/pyspark/sql/types.py", line 1278, in verify_acceptable_types
    % (dataType, obj, type(obj))))
TypeError: field ts: TimestampType can not accept object datetime.date(2018, 9, 19) in type <class 'datetime.date'>
{noformat}
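
The mechanism behind this failure can be sketched without Spark or Pandas. The snippet below is a simplified model only: SCHEMA, column_order, and verify_positional are hypothetical names made up for illustration, and the sample values are assumptions, not the contents of the actual test. It assumes the data is built from a dict whose keys are listed as "ts" then "d", and that values are matched to schema fields by position.

```python
import datetime

# Hypothetical stand-in for the schema "d date, ts timestamp".
SCHEMA = [("d", datetime.date), ("ts", datetime.datetime)]


def column_order(data, preserve_insertion_order):
    """Model the two orderings: definition (insertion) order vs. alphabetical."""
    keys = list(data)
    return keys if preserve_insertion_order else sorted(keys)


def verify_positional(columns, data, schema):
    """Match the i-th column to the i-th schema field and check its first value."""
    for col, (field, expected) in zip(columns, schema):
        value = data[col][0]
        # datetime.datetime is a subclass of datetime.date, so a datetime
        # passes the date check, but a plain date fails the timestamp check.
        if not isinstance(value, expected):
            raise TypeError("field %s: can not accept object %r in type %s"
                            % (field, value, type(value)))


# Assumed sample data, with "ts" defined before "d".
data = {
    "ts": [datetime.datetime(2017, 10, 31, 1, 1, 1)],
    "d": [datetime.date(2018, 9, 19)],
}

# Alphabetical order (d, ts) happens to line up with the schema.
verify_positional(column_order(data, False), data, SCHEMA)

# Definition order (ts, d) puts the date value on the timestamp field.
try:
    verify_positional(column_order(data, True), data, SCHEMA)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# One possible remedy: select columns by schema field name, not by position.
explicit_columns = [name for name, _ in SCHEMA]
verify_positional(explicit_columns, data, SCHEMA)
```

In this model the mismatch only surfaces on the "ts" field, because the datetime value is still acceptable for the date field. That is consistent with the TypeError above, which names "field ts".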