Posted to issues@spark.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2018/02/01 04:27:00 UTC

[jira] [Created] (SPARK-23292) python tests related to pandas are skipped

Yin Huai created SPARK-23292:
--------------------------------

             Summary: python tests related to pandas are skipped
                 Key: SPARK-23292
                 URL: https://issues.apache.org/jira/browse/SPARK-23292
             Project: Spark
          Issue Type: Bug
          Components: Tests
    Affects Versions: 2.3.0
            Reporter: Yin Huai


While running the Python tests, I found that [pyspark.sql.tests.GroupbyAggPandasUDFTests.test_unsupported_types|https://github.com/apache/spark/blob/52e00f70663a87b5837235bdf72a3e6f84e11411/python/pyspark/sql/tests.py#L4528-L4548] does not run under Python 2 because the test calls "assertRaisesRegex" (available in Python 3) instead of "assertRaisesRegexp" (available in Python 2). However, Spark Jenkins does not fail because of this issue (see the run history [here|https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-master-test-sbt-hadoop-2.7/]). After looking into it, [it seems the test script skips pandas-related tests if pandas is not installed|https://github.com/apache/spark/blob/2ac895be909de7e58e1051dc2a1bba98a25bf4be/python/pyspark/sql/tests.py#L51-L63], which suggests that Jenkins does not have pandas installed. 
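
To illustrate how a skip can hide a broken test, here is a minimal sketch. It is not the code from tests.py: the have_pandas flag, the class name, and the failing body are hypothetical stand-ins for the guard and the test described above.

```python
import unittest


def make_case(have_pandas):
    """Build a test case that is skipped when the (hypothetical) flag is False."""

    @unittest.skipIf(not have_pandas, "pandas is not installed")
    class PandasGuardedTests(unittest.TestCase):
        def test_unsupported_types(self):
            # Stand-in for a body that breaks on one interpreter, such as
            # calling a TestCase method that does not exist there.
            getattr(self, "method_missing_on_this_python")

    return PandasGuardedTests


def run(have_pandas):
    """Run the guarded case and return the collected unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_case(have_pandas))
    result = unittest.TestResult()
    suite.run(result)
    return result


skipped = run(have_pandas=False)
executed = run(have_pandas=True)
print("without pandas: successful =", skipped.wasSuccessful(),
      "skipped =", len(skipped.skipped))
print("with pandas:    successful =", executed.wasSuccessful(),
      "errors =", len(executed.errors))
```

With the flag off, the run counts as successful even though the test body is broken, which is consistent with a green build on a machine without pandas.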
 
Since pyarrow-related tests have the same skipping logic, we also need to check whether Jenkins has pyarrow installed correctly. 
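
One possible way to check a worker is a short script like the sketch below. It assumes a Python 3 interpreter (importlib.util.find_spec is not available on Python 2), and the missing_modules helper is a hypothetical name. It only reports whether the modules can be found, not whether their versions meet Spark's requirements.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["pandas", "pyarrow"])
if missing:
    print("Not installed, related tests would be skipped:", ", ".join(missing))
else:
    print("pandas and pyarrow are both importable")
```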
 
Since features using pandas and pyarrow are in 2.3, we should fix the test issue and make sure all tests pass before we make the release.
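
For the assertion name mismatch, one possible approach is a small compatibility alias, sketched below. This is only an illustration and not necessarily the fix that should go into Spark: CompatTestCase and ExampleTests are hypothetical names, and switching the test to a name that exists on both interpreters would be another option.

```python
import unittest


class CompatTestCase(unittest.TestCase):
    """Hypothetical base class exposing assertRaisesRegex on every interpreter."""

    # Python 2.7 only provides assertRaisesRegexp, so alias it under the
    # Python 3 name when that name is missing.
    if not hasattr(unittest.TestCase, "assertRaisesRegex"):
        assertRaisesRegex = unittest.TestCase.assertRaisesRegexp


class ExampleTests(CompatTestCase):
    def test_message_is_checked(self):
        # Passes only if a ValueError whose message matches the pattern is raised.
        with self.assertRaisesRegex(ValueError, "unsupported"):
            raise ValueError("unsupported type")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
result = unittest.TestResult()
suite.run(result)
print("tests run:", result.testsRun, "successful:", result.wasSuccessful())
```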


