Posted to reviews@spark.apache.org by kokes <gi...@git.apache.org> on 2018/06/28 07:28:33 UTC

[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

GitHub user kokes opened a pull request:

    https://github.com/apache/spark/pull/21654

    [SPARK-24671][PySpark] DataFrame length using a dunder/magic method

    ## What changes were proposed in this pull request?
    
    `len(df)` should work. This PR implements a `__len__` method on class `DataFrame` that just invokes `self.count()`.
    
    ## How was this patch tested?
    
    It was not, because local tests failed early on (lint-scala), before they got to PySpark and I wasn't sure how to skip them. I'm relying on Jenkins here.
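
For readers without the diff at hand, here is a minimal sketch of the behavior the description proposes. It uses a hypothetical stand-in class, `FakeDataFrame`, rather than the real `pyspark.sql.DataFrame`, and the `count_calls` counter is added purely for illustration. The diff quoted in this thread shows only the `def __len__(self):` line, so the method body below is an assumption based on the description.

```python
class FakeDataFrame:
    """Hypothetical stand-in for pyspark.sql.DataFrame; not the real class."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.count_calls = 0  # illustration only: how many times count() ran

    def count(self):
        # In Spark this would launch a job; here it just counts a local list.
        self.count_calls += 1
        return len(self._rows)

    def __len__(self):
        # The proposed change as described: len(df) delegates to df.count().
        return self.count()


df = FakeDataFrame([("a", 1), ("b", 2), ("c", 3)])
assert len(df) == df.count() == 3
```

Every `len(df)` call re-runs the count, so nothing is cached in this sketch.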

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kokes/spark dflen

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21654.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21654
    
----
commit 4d0afaf3cd046b11e8bae43dc00ddf4b1eb97732
Author: Ondrej Kokes <on...@...>
Date:   2018-06-27T19:50:58Z

    len(df) == df.count()

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by rgbkrk <gi...@git.apache.org>.
Github user rgbkrk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r216414567
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    I'd argue for bringing this in, if you don't think we're providing people a footgun where they'd incidentally use `len()` on a dataframe often. As for making a plan around built-in function support, I'm happy to be part of a `_repr_*_` campaign. I wouldn't have the background to participate in others (`__lt__`, etc.) as I wouldn't be able to weigh their maintainability, performance, and utility like I could visual elements like reprs.
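
One way `len()` can be used incidentally is through truth testing: when a class defines `__len__` but not `__bool__`, Python falls back to `__len__` to decide truthiness. The sketch below uses a hypothetical stand-in class (`CountingFrame`, not the real `pyspark.sql.DataFrame`) with an illustrative call counter to show this.

```python
class CountingFrame:
    """Hypothetical stand-in; not the real pyspark.sql.DataFrame."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.count_calls = 0  # illustration only

    def count(self):
        # Stands in for a Spark job that scans the data.
        self.count_calls += 1
        return len(self._rows)

    def __len__(self):
        return self.count()


df = CountingFrame([1, 2, 3])

# None of these lines mention len() or count(), yet each one triggers a count,
# because truth testing falls back to __len__ when __bool__ is not defined.
if df:
    pass
is_empty = not df
as_bool = bool(df)

implicit_counts = df.count_calls
```

Defining `__bool__` explicitly would be one way to avoid the fallback. That is a design option, not something the PR proposes.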


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Interesting, WDYT Python people like .. @holdenk ? This could be implemented on other classes like RDD, I guess. Any downside? Does it help people mix up a local collection and a distributed data structure?


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95823/
    Test FAILed.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    hey @kokes this is out of sync with master, can you merge in the latest master? I'm going to follow up on the dev@ list for the plan which @HyukjinKwon wants to see (please feel free to join in that discussion).


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    cc @rgbkrk 


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    **[Test build #98211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98211/testReport)** for PR 21654 at commit [`e580442`](https://github.com/apache/spark/commit/e5804422c2711b3b8f7989a909ef27ef4cacb056).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r217779879
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    How about `iter(df)` .. how about `df[0]` ..


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r216385714
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    Yeah, eventually we could allow it. My question was whether we should make an explicit plan for built-in function support before we go bit by bit.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    ok to test


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    **[Test build #98095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98095/testReport)** for PR 21654 at commit [`4d0afaf`](https://github.com/apache/spark/commit/4d0afaf3cd046b11e8bae43dc00ddf4b1eb97732).


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by rgbkrk <gi...@git.apache.org>.
Github user rgbkrk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r216718402
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    > what would you expect from `list(df)` and `len(df)` ..
    
    😨
    



---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by rgbkrk <gi...@git.apache.org>.
Github user rgbkrk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r216381942
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    This is roughly the same line of thinking I'd have. I don't know how expensive it is to allow `count` to be called within iterators, etc.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Thanks, @holdenk for addressing my concern. I will try to join as well.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98211/
    Test PASSed.


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r217874544
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    Adding some more reasonable helpers can actually confuse users. It might be better to add nothing as well. I agree with adding those in the end, but I think we had better have an explicit plan before adding them bit by bit.


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r217808692
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    Well, those are a bit harder to say. I _think_ `iter` might be reasonable (the main concern is if folks tried to use `map(lambda x, df)`), but those aren't the parts of the API we're talking about right now, and this is starting to border on a broader design decision we should consider taking to the list. Given the timeline of 3 this seems like a good time to have these discussions anyway -- maybe we can look at Dask for some inspiration on how to provide a more Python-friendly API while still encouraging good design on the part of our users.
    
    That being said, I think the potential confusion of `iter` or indexing into a DF shouldn't block adding other more reasonable helpers.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    **[Test build #95823 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95823/testReport)** for PR 21654 at commit [`4d0afaf`](https://github.com/apache/spark/commit/4d0afaf3cd046b11e8bae43dc00ddf4b1eb97732).
     * This patch **fails due to an unknown error code, -9**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98095/
    Test FAILed.


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r216121454
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    Can we just not define this? RDD doesn't have this one either. IMHO, allowing such things bit by bit wouldn't be ideal. For example, `column.py` ended up with a weird limit:
    
    ```python
    >>> iter(spark.range(1).id)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/.../spark/python/pyspark/sql/column.py", line 344, in __iter__
        raise TypeError("Column is not iterable")
    TypeError: Column is not iterable
    >>> isinstance(spark.range(1).id, collections.Iterable)
    True
    ```
    
    It makes sense in general, though.
    
    This `__iter__` can't be removed BTW because we implement `__getitem__` and `__getattr__` to access columns in dataframes IIRC.
    
    `__repr__` was added because it's commonly used and it had a strong use case for notebooks, etc. However, I wouldn't add `len()` for now. Think about `if len(df) ...`, which is eagerly evaluated.
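
The `Column` quirk described above can be reproduced in plain Python without Spark. The sketch below uses two hypothetical classes (`IndexOnly` and `GuardedColumn`, neither taken from PySpark) to show why a class with `__getitem__` needs an `__iter__` guard to stop iteration, and why that guard makes the `Iterable` check return `True`. It imports `Iterable` from `collections.abc`, since the `collections.Iterable` alias used in the session above was removed in Python 3.10.

```python
from collections.abc import Iterable


class IndexOnly:
    """Defines only __getitem__, so iter() falls back to the sequence protocol."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


class GuardedColumn(IndexOnly):
    """Hypothetical Column-like class that blocks iteration explicitly."""

    def __iter__(self):
        raise TypeError("Column is not iterable")


# Without __iter__, iteration works through __getitem__ with 0, 1, 2, ...
legacy_items = list(IndexOnly())

# ... yet the ABC check only looks for an __iter__ method, so it reports False.
legacy_is_iterable = isinstance(IndexOnly(), Iterable)

# With the guard in place the ABC check reports True, even though iter() raises.
guarded_is_iterable = isinstance(GuardedColumn(), Iterable)
try:
    iter(GuardedColumn())
    guarded_iter_failed = False
except TypeError:
    guarded_iter_failed = True
```

In both cases the `isinstance` answer is the opposite of what iteration actually does, which is the "weird limit" in question.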


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Jenkins, ok to test.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Jenkins ok to test.
    Given our move with repr this seems to be in the same line, but let me think on it.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by rgbkrk <gi...@git.apache.org>.
Github user rgbkrk commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    If I was an admin I'd say go for it Jenkins.


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r217773474
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    I mean I _think_ a reasonable thing to do if someone calls `list(df)` is `collect` - they clearly want the dataframe as a list. Whether that's a good idea or not is up to the developer.
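
As a rough illustration of that idea, here is a sketch in which iteration delegates to `collect()`. The class `CollectingFrame` is a hypothetical stand-in, and the `__iter__` method is an assumption for discussion only; it is not part of the PR under review.

```python
class CollectingFrame:
    """Hypothetical stand-in; not the real pyspark.sql.DataFrame."""

    def __init__(self, rows):
        self._rows = list(rows)

    def collect(self):
        # Real Spark would pull every row to the driver here.
        return list(self._rows)

    def __iter__(self):
        # Assumption for illustration: iterating means collecting first.
        return iter(self.collect())


df = CollectingFrame([("a", 1), ("b", 2)])

rows = list(df)  # behaves like df.collect()

# Once iteration works, built-ins such as map() accept the object too,
# and the lambda runs locally instead of as a distributed job.
doubled = list(map(lambda row: row[1] * 2, df))
```

PySpark's existing `DataFrame.toLocalIterator()` is an explicit alternative for iterating over rows on the driver.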


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    **[Test build #98095 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98095/testReport)** for PR 21654 at commit [`4d0afaf`](https://github.com/apache/spark/commit/4d0afaf3cd046b11e8bae43dc00ddf4b1eb97732).
     * This patch **fails Spark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Build finished. Test FAILed.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    Build finished. Test FAILed.


---



[GitHub] spark pull request #21654: [SPARK-24671][PySpark] DataFrame length using a d...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21654#discussion_r216554397
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -375,6 +375,9 @@ def _truncate(self):
             return int(self.sql_ctx.getConf(
                 "spark.sql.repl.eagerEval.truncate", "20"))
     
    +    def __len__(self):
    --- End diff --
    
    I would say `_repr_*_` was added because I saw a strong need and use case for it. I am worried about the `len(..)` we are trying to add here. For instance, what would you expect from `list(df)` and `len(df)` ..
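
One place where `list(...)` and `len(...)` can interact is Python's length-hint protocol (PEP 424), which consults `__len__` before anything else. The sketch below shows this through `operator.length_hint`, using a hypothetical stand-in class (`HintedFrame`) with an illustrative call counter. CPython's `list()` constructor is understood to use the same hint to presize its result, but that is an implementation detail and is treated here as an assumption, not asserted by the code.

```python
import operator


class HintedFrame:
    """Hypothetical stand-in; not the real pyspark.sql.DataFrame."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.count_calls = 0  # illustration only

    def count(self):
        # Stands in for a Spark job that scans the data.
        self.count_calls += 1
        return len(self._rows)

    def __len__(self):
        return self.count()


df = HintedFrame(["r1", "r2"])

# length_hint() tries len(obj) first, so asking for a mere hint runs a full count.
hint = operator.length_hint(df)
counts_after_hint = df.count_calls
```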


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    **[Test build #95823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95823/testReport)** for PR 21654 at commit [`4d0afaf`](https://github.com/apache/spark/commit/4d0afaf3cd046b11e8bae43dc00ddf4b1eb97732).


---



[GitHub] spark issue #21654: [SPARK-24671][PySpark] DataFrame length using a dunder/m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21654
  
    **[Test build #98211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98211/testReport)** for PR 21654 at commit [`e580442`](https://github.com/apache/spark/commit/e5804422c2711b3b8f7989a909ef27ef4cacb056).


---
