Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2018/04/22 07:06:00 UTC

[jira] [Updated] (SPARK-24044) Explicitly print out skipped tests from unittest module

     [ https://issues.apache.org/jira/browse/SPARK-24044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-24044:
---------------------------------
    Description: 
There was an actual issue, SPARK-23300, which we fixed by manually checking whether the package is installed. That approach required duplicated code and could only check dependencies, while there are many other skip conditions, for example Python-version-specific skips or other packages such as NumPy. I think this is something we should fix.

The `unittest` module can print out skip messages, but so far they have been swallowed by our own testing script. This PR prints the messages out, sorted.

This PR proposes to remove the duplicated dependency-checking logic and to explicitly print out the skipped tests from unittest, for example:


{code}
Skipped tests in pyspark.sql.tests with pypy:
    test_createDataFrame_column_name_encoding (pyspark.sql.tests.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
    test_createDataFrame_does_not_modify_input (pyspark.sql.tests.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
...

Skipped tests in pyspark.sql.tests with python3:
    test_createDataFrame_column_name_encoding (pyspark.sql.tests.ArrowTests) ... skipped 'PyArrow >= 0.8.0 must be installed; however, it was not found.'
    test_createDataFrame_does_not_modify_input (pyspark.sql.tests.ArrowTests) ... skipped 'PyArrow >= 0.8.0 must be installed; however, it was not found.'
...
{code}

The actual format may vary per the discussion in the PR; please check the PR for the exact format.
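As a rough illustration (not the exact script change in the PR), the skip reasons that `unittest` already records can be surfaced from `TestResult.skipped`, which holds `(test, reason)` tuples. The `ArrowLikeTests` class below is a hypothetical stand-in for cases like `pyspark.sql.tests.ArrowTests`:

```python
import unittest

# Hypothetical test case illustrating a conditional skip, in the style of
# the dependency checks in pyspark.sql.tests (not the real ArrowTests).
class ArrowLikeTests(unittest.TestCase):
    @unittest.skip("Pandas >= 0.19.2 must be installed; however, it was not found.")
    def test_createDataFrame_column_name_encoding(self):
        pass

    def test_plain(self):
        pass

# Run the suite; unittest records skips on the result object even when
# the per-test output is suppressed.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArrowLikeTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)

# result.skipped is a list of (test, reason) tuples; print them sorted
# by test id instead of swallowing them.
for test, reason in sorted(result.skipped, key=lambda t: t[0].id()):
    print("    %s ... skipped %r" % (test.id(), reason))
```

A testing script can aggregate these per interpreter (pypy, python3, ...) and print one sorted section per module, as in the sample output above.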

  was:
There was an actual issue, SPARK-23300, and we fixed this by manually checking if the package is installed. This way needed duplicated codes and could only check dependencies. There are many conditions, for example, Python version specific or other packages like NumPy.  I think this is something we should fix.

`unittest` module can print out the skipped messages but these were swallowed so far in our own testing script. This PR prints out the messages below after sorted.

It would be nicer if we remove the duplications and print out all the skipped tests. For example, as below:

{code}
This PR proposes to remove duplicated dependency checking logics and also print out skipped tests from unittests. 

For example, as below:


{code}
Skipped tests in pyspark.sql.tests with pypy:
    test_createDataFrame_column_name_encoding (pyspark.sql.tests.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
    test_createDataFrame_does_not_modify_input (pyspark.sql.tests.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
...

Skipped tests in pyspark.sql.tests with python3:
    test_createDataFrame_column_name_encoding (pyspark.sql.tests.ArrowTests) ... skipped 'PyArrow >= 0.8.0 must be installed; however, it was not found.'
    test_createDataFrame_does_not_modify_input (pyspark.sql.tests.ArrowTests) ... skipped 'PyArrow >= 0.8.0 must be installed; however, it was not found.'
...
{code}

Actual format can be a bit varied per the discussion in the PR. Please check out the PR for exact format.


> Explicitly print out skipped tests from unittest module
> -------------------------------------------------------
>
>                 Key: SPARK-24044
>                 URL: https://issues.apache.org/jira/browse/SPARK-24044
>             Project: Spark
>          Issue Type: Test
>          Components: PySpark
>    Affects Versions: 2.4.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> There was an actual issue, SPARK-23300, which we fixed by manually checking whether the package is installed. That approach required duplicated code and could only check dependencies, while there are many other skip conditions, for example Python-version-specific skips or other packages such as NumPy. I think this is something we should fix.
> The `unittest` module can print out skip messages, but so far they have been swallowed by our own testing script. This PR prints the messages out, sorted.
> This PR proposes to remove the duplicated dependency-checking logic and to explicitly print out the skipped tests from unittest, for example:
> {code}
> Skipped tests in pyspark.sql.tests with pypy:
>     test_createDataFrame_column_name_encoding (pyspark.sql.tests.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
>     test_createDataFrame_does_not_modify_input (pyspark.sql.tests.ArrowTests) ... skipped 'Pandas >= 0.19.2 must be installed; however, it was not found.'
> ...
> Skipped tests in pyspark.sql.tests with python3:
>     test_createDataFrame_column_name_encoding (pyspark.sql.tests.ArrowTests) ... skipped 'PyArrow >= 0.8.0 must be installed; however, it was not found.'
>     test_createDataFrame_does_not_modify_input (pyspark.sql.tests.ArrowTests) ... skipped 'PyArrow >= 0.8.0 must be installed; however, it was not found.'
> ...
> {code}
> The actual format may vary per the discussion in the PR; please check the PR for the exact format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org