Posted to commits@spark.apache.org by ru...@apache.org on 2022/11/22 09:16:01 UTC

[spark] branch master updated: [SPARK-41212][CONNECT][PYTHON] Implement `DataFrame.isEmpty`

This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new e16dd7c0cfe [SPARK-41212][CONNECT][PYTHON] Implement `DataFrame.isEmpty`
e16dd7c0cfe is described below

commit e16dd7c0cfed8745a49bd46c30c05fc82ac292d5
Author: Ruifeng Zheng <ru...@apache.org>
AuthorDate: Tue Nov 22 17:15:42 2022 +0800

    [SPARK-41212][CONNECT][PYTHON] Implement `DataFrame.isEmpty`
    
    ### What changes were proposed in this pull request?
    Implement `DataFrame.isEmpty`
    
    ### Why are the changes needed?
    API Coverage
    
    ### Does this PR introduce _any_ user-facing change?
    Yes. It adds a new API, `DataFrame.isEmpty`, to the Spark Connect Python client.
    
    ### How was this patch tested?
    Added a unit test, `test_is_empty`, to `test_connect_basic.py`.
    
    Closes #38734 from zhengruifeng/connect_df_is_empty.
    
    Authored-by: Ruifeng Zheng <ru...@apache.org>
    Signed-off-by: Ruifeng Zheng <ru...@apache.org>
---
 python/pyspark/sql/connect/dataframe.py                | 12 ++++++++++++
 python/pyspark/sql/tests/connect/test_connect_basic.py |  5 +++++
 2 files changed, 17 insertions(+)

diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index b8fa640a42f..579403299fe 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -125,6 +125,18 @@ class DataFrame(object):
         new_frame._plan = plan
         return new_frame
 
+    def isEmpty(self) -> bool:
+        """Returns ``True`` if this :class:`DataFrame` is empty.
+
+        .. versionadded:: 3.4.0
+
+        Returns
+        -------
+        bool
+            Whether it's empty DataFrame or not.
+        """
+        return len(self.take(1)) == 0
+
     def select(self, *cols: "ExpressionOrString") -> "DataFrame":
         return DataFrame.withPlan(plan.Project(self._plan, *cols), session=self._session)
 
diff --git a/python/pyspark/sql/tests/connect/test_connect_basic.py b/python/pyspark/sql/tests/connect/test_connect_basic.py
index 8bf28bf8a75..49973ba70c3 100644
--- a/python/pyspark/sql/tests/connect/test_connect_basic.py
+++ b/python/pyspark/sql/tests/connect/test_connect_basic.py
@@ -346,6 +346,11 @@ class SparkConnectTests(SparkConnectSQLTestCase):
         self.assertEqual(1, len(pdf.columns))  # one column
         self.assertEqual("X", pdf.columns[0])
 
+    def test_is_empty(self):
+        # SPARK-41212: Test is empty
+        self.assertFalse(self.connect.sql("SELECT 1 AS X").isEmpty())
+        self.assertTrue(self.connect.sql("SELECT 1 AS X LIMIT 0").isEmpty())
+
     def test_session(self):
         self.assertEqual(self.connect, self.connect.sql("SELECT 1").sparkSession())
 


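The emptiness check in this commit relies on `take(1)`: it asks for at most one row and tests whether anything came back. The sketch below illustrates that idea without a Spark cluster. `LazyRows`, `rows_pulled`, and `is_empty` are hypothetical names invented for this illustration; they are not part of PySpark or Spark Connect, and the sketch makes no claim about how the server executes the plan.

```python
from itertools import islice
from typing import Any, Callable, Iterable, List


class LazyRows:
    """Hypothetical stand-in for a lazily evaluated DataFrame (not a PySpark class)."""

    def __init__(self, source: Callable[[], Iterable[Any]]) -> None:
        # `source` produces a fresh iterable of rows each time it is called.
        self._source = source
        # Counts how many rows were actually materialized, for illustration only.
        self.rows_pulled = 0

    def take(self, n: int) -> List[Any]:
        # Pull at most `n` rows from the source and stop.
        rows = list(islice(iter(self._source()), n))
        self.rows_pulled += len(rows)
        return rows

    def is_empty(self) -> bool:
        # Same shape as the committed method: empty means take(1) returned nothing.
        return len(self.take(1)) == 0


# A large source is reported as non-empty after materializing a single row.
non_empty = LazyRows(lambda: range(1_000_000))
non_empty_result = non_empty.is_empty()

# A source with no rows is reported as empty.
empty = LazyRows(lambda: [])
empty_result = empty.is_empty()
```

In this sketch the check touches at most one row, which is the appeal of the `take(1)` pattern over counting every row just to compare the count with zero.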