Posted to reviews@spark.apache.org by gianmarcodonetti <gi...@git.apache.org> on 2018/01/13 08:44:17 UTC
[GitHub] spark pull request #20258: [SPARK-23060][Python] New feature - apply method ...
GitHub user gianmarcodonetti opened a pull request:
https://github.com/apache/spark/pull/20258
[SPARK-23060][Python] New feature - apply method to extend rdd's functionality
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
Please review http://spark.apache.org/contributing.html before opening a pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gianmarcodonetti/spark feature/apply-func-to-rdd
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20258.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20258
----
commit d8463db0e665087426f9a0d656f8730f704a66c2
Author: Gianmarco Donetti <gi...@...>
Date: 2018-01-12T15:31:40Z
added function apply to rdd
commit 8e79b4691ca61c31d346b0581e7dd733049e2b19
Author: Gianmarco Donetti <gi...@...>
Date: 2018-01-12T15:48:24Z
refactor and add todo
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20258
Oh, I see! Yeah, they look much the same.
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/20258
Can one of the admins verify this patch?
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20258
That resembles [pipe](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.pipe.html?), as I pointed out in the JIRA. It's just a little trick, and I don't think it's worth adding an API for that alone.
BTW, we should consider Java / Scala APIs and how it's going to work with Dataset and DataFrame too.
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/20258
Isn't it just a helper function?
```python
def apply(self, func):
    return func(self)
```
I don't think it's really worth adding.
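The helper above can be tried without a Spark installation. The sketch below is an illustration only: `ToyRDD` is a hypothetical stand-in for `pyspark.RDD` (it wraps a Python list and is not part of Spark), and `double_all` is a made-up transformation. It suggests that `x.apply(f)` gives the same result as `f(x)`, which is the point being made about the method.

```python
class ToyRDD:
    """Hypothetical stand-in for pyspark.RDD; wraps a plain list."""

    def __init__(self, items):
        self.items = list(items)

    def map(self, f):
        # Eager, local version of RDD.map, for illustration only.
        return ToyRDD(f(x) for x in self.items)

    def collect(self):
        return list(self.items)

    def apply(self, func):
        # The proposed method: hand the whole object to func.
        return func(self)


def double_all(rdd):
    # Made-up transformation that takes and returns a ToyRDD.
    return rdd.map(lambda x: x * 2)


rdd = ToyRDD([1, 2, 3])
via_method = rdd.apply(double_all).collect()  # method form
via_call = double_all(rdd).collect()          # plain call form
```

Both forms run the same function on the same object; only the spelling at the call site differs.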
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/20258
I'm +1 with @srowen on this; I don't believe this is a change we're going to make to the API. @gianmarcodonetti please close this PR.
[GitHub] spark pull request #20258: [SPARK-23060][Python] New feature - apply method ...
Posted by gianmarcodonetti <gi...@git.apache.org>.
Github user gianmarcodonetti closed the pull request at:
https://github.com/apache/spark/pull/20258
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/20258
Is this similar to `Dataset.transform()` in the Java/Scala API? We don't have a similar API for RDDs, though.
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/20258
At best, this functionality already exists in some form in the newer API. This should be closed.
[GitHub] spark issue #20258: [SPARK-23060][Python] New feature - apply method to exte...
Posted by gianmarcodonetti <gi...@git.apache.org>.
Github user gianmarcodonetti commented on the issue:
https://github.com/apache/spark/pull/20258
@HyukjinKwon in my opinion, it helps a lot.
My goal is to avoid this case:
`final_rdd = func_3(func_2(func_1(initial_rdd)))`
And allow this:
`final_rdd = initial_rdd.apply(func_1).apply(func_2).apply(func_3)`
More functional and readable...
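To make the comparison concrete, here is a minimal sketch that runs both spellings side by side. Everything in it is an assumption for illustration: `ToyRDD` is a hypothetical stand-in for `pyspark.RDD`, the bodies of `func_1`, `func_2`, and `func_3` are placeholders (the thread does not say what they do), and `pipeline` is a made-up helper showing one way to get left-to-right reading order without adding a method.

```python
from functools import reduce


class ToyRDD:
    """Hypothetical stand-in for pyspark.RDD; wraps a plain list."""

    def __init__(self, items):
        self.items = list(items)

    def apply(self, func):
        # The method proposed in this pull request.
        return func(self)


# Placeholder bodies; the thread does not say what func_1..func_3 do.
def func_1(rdd):
    return ToyRDD(x + 1 for x in rdd.items)


def func_2(rdd):
    return ToyRDD(x * 10 for x in rdd.items)


def func_3(rdd):
    return ToyRDD(x for x in rdd.items if x > 20)


def pipeline(value, *funcs):
    # Apply funcs left to right without adding a method to the class.
    return reduce(lambda acc, f: f(acc), funcs, value)


initial_rdd = ToyRDD([1, 2, 3])

nested = func_3(func_2(func_1(initial_rdd)))
chained = initial_rdd.apply(func_1).apply(func_2).apply(func_3)
piped = pipeline(initial_rdd, func_1, func_2, func_3)
```

All three variables hold the same elements; the difference is only in reading order at the call site.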