Posted to reviews@spark.apache.org by gianmarcodonetti <gi...@git.apache.org> on 2018/01/13 08:44:17 UTC

[GitHub] spark pull request #20258: [SPARK-23060][Python] New feature - apply method ...

GitHub user gianmarcodonetti opened a pull request:

    https://github.com/apache/spark/pull/20258

    [SPARK-23060][Python] New feature - apply method to extend rdd's functionality

    ## What changes were proposed in this pull request?
    
    (Please fill in changes proposed in this fix)
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gianmarcodonetti/spark feature/apply-func-to-rdd

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20258.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20258
    
----
commit d8463db0e665087426f9a0d656f8730f704a66c2
Author: Gianmarco Donetti <gi...@...>
Date:   2018-01-12T15:31:40Z

    added function apply to rdd

commit 8e79b4691ca61c31d346b0581e7dd733049e2b19
Author: Gianmarco Donetti <gi...@...>
Date:   2018-01-12T15:48:24Z

    refactor and add todo

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    Oh, I see! Yes, they look quite similar.




[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    Can one of the admins verify this patch?




[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    That resembles [pipe](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pipe.html?) as I pointed out in the JIRA. It's just a small convenience, and I don't think that alone justifies adding a new API.
    
    BTW, we should also consider the Java / Scala APIs and how it would work with Dataset and DataFrame.




[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    Isn't it just a helper function?
    
    ```python
    def apply(self, func):
        return func(self)
    ```
    
    I don't think it's worth adding.
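
As a rough illustration of why the helper may not need to live in the API, the same left-to-right composition can be sketched as a free function. The names `pipe`, `drop_negatives`, and `double` are hypothetical, and plain lists stand in for RDDs; nothing here is part of Spark.

```python
from functools import reduce


def pipe(value, *funcs):
    """Apply each function in funcs to value, left to right."""
    return reduce(lambda acc, f: f(acc), funcs, value)


# Hypothetical transformation steps; plain lists stand in for RDDs.
def drop_negatives(xs):
    return [x for x in xs if x >= 0]


def double(xs):
    return [2 * x for x in xs]


result = pipe([3, -1, 2], drop_negatives, double)
print(result)
```

Because `pipe` only calls its arguments in order, it works with any object, which is one reading of the "just a helper function" objection.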




[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    I'm +1 to @srowen on this; I don't believe this is a change we're going to make to the API. @gianmarcodonetti please close this PR.




[GitHub] spark pull request #20258: [SPARK-23060][Python] New feature - apply method ...

Posted by gianmarcodonetti <gi...@git.apache.org>.
Github user gianmarcodonetti closed the pull request at:

    https://github.com/apache/spark/pull/20258







[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    Is this similar to `Dataset.transform()` in the Java/Scala API? We don't have a similar API for RDDs, though.




[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    At best, the functionality already exists in some form in the new API. This should be closed.







[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...

Posted by gianmarcodonetti <gi...@git.apache.org>.
Github user gianmarcodonetti commented on the issue:

    https://github.com/apache/spark/pull/20258
  
    @HyukjinKwon in my opinion, it helps a lot.
    My goal is to avoid this:
    
    `final_rdd = func_3(func_2(func_1(initial_rdd)))`
    
    and allow this instead:
    
    `final_rdd = initial_rdd.apply(func_1).apply(func_2).apply(func_3)`
    
    The chained form is more functional and more readable.
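
The two call styles compared above can be checked side by side with a small sketch. `FakeRDD` is a hypothetical stand-in class, not a Spark type, and the bodies of `func_1`, `func_2`, and `func_3` are assumptions chosen only to make the example runnable; the `apply` method is the one-line helper quoted earlier in the thread.

```python
class FakeRDD:
    """Hypothetical stand-in for an RDD; it just holds a list of elements."""

    def __init__(self, items):
        self.items = list(items)

    def apply(self, func):
        # The helper under discussion: hand this object to func.
        return func(self)


# Assumed example transformations; each takes and returns a FakeRDD.
def func_1(rdd):
    return FakeRDD(x + 1 for x in rdd.items)


def func_2(rdd):
    return FakeRDD(x * 2 for x in rdd.items)


def func_3(rdd):
    return FakeRDD(x for x in rdd.items if x > 4)


initial_rdd = FakeRDD([1, 2, 3])

# Nested form: reads inside-out.
nested = func_3(func_2(func_1(initial_rdd)))

# Chained form: reads left to right, same sequence of calls.
chained = initial_rdd.apply(func_1).apply(func_2).apply(func_3)

print(nested.items == chained.items)
```

Both forms run the same functions in the same order, so the difference is one of reading order rather than behavior.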

