Posted to commits@spark.apache.org by gu...@apache.org on 2020/08/10 09:47:58 UTC

[spark] branch branch-3.0 updated: [MINOR] add test_createDataFrame_empty_partition in pyspark arrow tests

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new eaae91b  [MINOR] add test_createDataFrame_empty_partition in pyspark arrow tests
eaae91b is described below

commit eaae91b589b12273c429da1b49578802434157de
Author: Weichen Xu <we...@databricks.com>
AuthorDate: Mon Aug 10 18:43:41 2020 +0900

    [MINOR] add test_createDataFrame_empty_partition in pyspark arrow tests
    
    ### What changes were proposed in this pull request?
    Add test_createDataFrame_empty_partition to the PySpark Arrow test suite
    (python/pyspark/sql/tests/test_arrow.py).
    
    ### Why are the changes needed?
    Test the edge case where the input pandas DataFrame has fewer rows than
    spark.sparkContext.defaultParallelism, so some partitions of the resulting
    Spark DataFrame are empty.
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    N/A
    
    Closes #29398 from WeichenXu123/add_one_pyspark_arrow_test.
    
    Authored-by: Weichen Xu <we...@databricks.com>
    Signed-off-by: HyukjinKwon <gu...@apache.org>
    (cherry picked from commit fc62d720769e3267132f31ee847f2783923b3195)
    Signed-off-by: HyukjinKwon <gu...@apache.org>
---
 python/pyspark/sql/tests/test_arrow.py | 6 ++++++
 1 file changed, 6 insertions(+)
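
The new test can presumably be run on its own from a Spark checkout with pandas
and pyarrow installed, e.g. via the Python test runner:
python/run-tests --testnames 'pyspark.sql.tests.test_arrow ArrowTests'
(the --testnames selection syntax follows Spark's developer tooling; treat the
exact invocation as a sketch rather than part of this commit).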

diff --git a/python/pyspark/sql/tests/test_arrow.py b/python/pyspark/sql/tests/test_arrow.py
index 7a41d24..42f064a 100644
--- a/python/pyspark/sql/tests/test_arrow.py
+++ b/python/pyspark/sql/tests/test_arrow.py
@@ -428,6 +428,12 @@ class ArrowTests(ReusedSQLTestCase):
         self.assertEqual(len(pdf), 0)
         self.assertEqual(list(pdf.columns), ["col1"])
 
+    def test_createDataFrame_empty_partition(self):
+        pdf = pd.DataFrame({"c1": [1], "c2": ["string"]})
+        df = self.spark.createDataFrame(pdf)
+        self.assertEqual([Row(c1=1, c2='string')], df.collect())
+        self.assertGreater(self.spark.sparkContext.defaultParallelism, len(pdf))
+
 
 @unittest.skipIf(
     not have_pandas or not have_pyarrow,

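For context, the scenario the test exercises can be reproduced outside the test
harness. Below is a minimal sketch assuming a local Spark 3.0 session with
pandas and pyarrow available; the master URL, app name, and the Arrow config
line are assumptions for a standalone run, not part of the commit:

    import pandas as pd
    from pyspark.sql import Row, SparkSession

    spark = (
        SparkSession.builder
        .master("local[4]")                 # 4 local cores => defaultParallelism of 4
        .appName("empty-partition-sketch")  # hypothetical app name
        # ArrowTests runs with Arrow conversion enabled; mirrored here.
        .config("spark.sql.execution.arrow.pyspark.enabled", "true")
        .getOrCreate()
    )

    pdf = pd.DataFrame({"c1": [1], "c2": ["string"]})  # a single row
    df = spark.createDataFrame(pdf)

    # The one row survives the conversion even though the default parallelism
    # exceeds the row count, i.e. at least one partition is empty.
    assert df.collect() == [Row(c1=1, c2="string")]
    assert spark.sparkContext.defaultParallelism > len(pdf)

    spark.stop()

The final assertion mirrors the test's assertGreater check: it documents that
defaultParallelism exceeds the row count, which is exactly the condition that
forces empty partitions during the conversion.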

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org