Posted to issues@spark.apache.org by "Haejoon Lee (Jira)" <ji...@apache.org> on 2021/08/13 00:56:00 UTC

[jira] [Created] (SPARK-36504) Improve test coverage for pandas API on Spark

Haejoon Lee created SPARK-36504:
-----------------------------------

             Summary: Improve test coverage for pandas API on Spark
                 Key: SPARK-36504
                 URL: https://issues.apache.org/jira/browse/SPARK-36504
             Project: Spark
          Issue Type: Umbrella
          Components: PySpark
    Affects Versions: 3.3.0
            Reporter: Haejoon Lee


Much of the code in pandas-on-Spark is not being tested, for example:
 * (Series|DataFrame).to_clipboard

           !image-2021-08-13-09-28-12-342.png|width=516,height=108!
 * the `value` and `method` arguments of Series.fillna

           !image-2021-08-13-09-30-45-563.png|width=508,height=34!

 

The red lines in the screen captures above mark lines that are not being tested.

Currently, the test coverage of pandas-on-Spark is 89.93% in total: 93.43% for frame.py (which contains the DataFrame API), 89.04% for indexing.py (which contains the Index API) and 93.43% for series.py (which contains the Series API).

We do not necessarily need to cover 100% of the code, since some APIs, such as `DataFrame.to_delta`, are not easy to test for now, but we should cover as much of the code as possible for the health of the project.
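
As an illustration of the kind of test that could close one of the gaps listed above, here is a minimal sketch that exercises both the `value` and `method` arguments of Series.fillna. The class name `SeriesFillnaCoverageTest` and the sample data are hypothetical and are not taken from the Spark test suite. The sketch assumes that `pyspark.pandas` is importable and that a local Spark session can be started; otherwise the tests are skipped.

```python
import unittest

try:
    import pyspark.pandas as ps
    HAVE_PANDAS_ON_SPARK = True
except ImportError:
    # pyspark (or one of its dependencies) is not installed.
    HAVE_PANDAS_ON_SPARK = False


@unittest.skipUnless(HAVE_PANDAS_ON_SPARK, "pyspark.pandas is not installed")
class SeriesFillnaCoverageTest(unittest.TestCase):
    """Hypothetical test case; names and data are illustrative only."""

    def setUp(self):
        # A small series with missing values in the middle and at the end.
        self.psser = ps.Series([1.0, None, 3.0, None])

    def _as_list(self, psser):
        # Sort by index so the comparison does not depend on row order.
        return psser.sort_index().to_pandas().tolist()

    def test_fillna_value(self):
        # Covers the branch that fills missing entries with a constant.
        filled = self.psser.fillna(value=0.0)
        self.assertEqual(self._as_list(filled), [1.0, 0.0, 3.0, 0.0])

    def test_fillna_method(self):
        # Covers the branch that propagates the last valid observation.
        filled = self.psser.fillna(method="ffill")
        self.assertEqual(self._as_list(filled), [1.0, 1.0, 3.0, 3.0])
```

Comparing against hard-coded expected lists keeps the sketch independent of the installed pandas version. An actual test in the project would more likely compare the result with the equivalent pandas call.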

You can find more missing tests and the coverage percentages in the [code cov report|https://app.codecov.io/gh/apache/spark].

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
