Posted to dev@spark.apache.org by Maciej Szymkiewicz <ms...@gmail.com> on 2017/01/07 20:39:21 UTC

[SQL][PYTHON] UDF improvements.

Hi,

I've been looking at the PySpark UserDefinedFunction and I have a couple
of suggestions for how it could be improved, including:

  * Full-featured decorator syntax.
  * Docstring handling improvements.
  * Lazy initialization.

I summarized all suggestions, with links to possible solutions, in a gist
(https://gist.github.com/zero323/88953975361dbb6afd639b35368a97b4) and
I'll be happy to open a JIRA and submit a PR if there is any interest in
that.
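
To make the three points concrete, here is a rough pure-Python sketch of
the general idea. It is not the PySpark implementation and not the code
from the gist: the LazyUDF class, the return_type parameter and the
_create_judf placeholder are hypothetical names used for illustration
only, and the "expensive" step is a stand-in for whatever setup a real
wrapper would need.

```python
import functools


class LazyUDF:
    """Hypothetical stand-in for a UDF wrapper (not PySpark's class)."""

    def __init__(self, func, return_type="string"):
        self.func = func
        self.return_type = return_type
        # Expensive handle; deliberately NOT created at definition time.
        self._judf = None
        # Docstring handling: copy __name__, __doc__, etc. from func.
        functools.update_wrapper(self, func)

    def _create_judf(self):
        # Placeholder for the costly setup a real wrapper would perform.
        return ("registered", self.func.__name__, self.return_type)

    @property
    def judf(self):
        # Lazy initialization: build the handle on first access only.
        if self._judf is None:
            self._judf = self._create_judf()
        return self._judf

    def __call__(self, *args):
        _ = self.judf  # first call triggers initialization
        return self.func(*args)


def udf(f=None, return_type="string"):
    """Decorator usable as @udf, @udf() or @udf(return_type=...)."""
    if f is None:
        # Called with arguments only: return the real decorator.
        return functools.partial(udf, return_type=return_type)
    return LazyUDF(f, return_type)


@udf
def shout(s):
    """Upper-case a string."""
    return s.upper()


@udf(return_type="int")
def add_one(x):
    """Add one to an integer."""
    return x + 1
```

With this shape, defining shout or add_one does no expensive work, the
wrapped objects keep the original names and docstrings, and the same
decorator works with or without arguments.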

-- 
Best,
Maciej


Re: [SQL][PYTHON] UDF improvements.

Posted by Maciej Szymkiewicz <ms...@gmail.com>.
Thanks for your response, Ryan. Here you are:
https://issues.apache.org/jira/browse/SPARK-19159


On 01/09/2017 07:30 PM, Ryan Blue wrote:
> Maciej, this looks great.
>
> Could you open a JIRA issue for improving the @udf decorator and
> possibly sub-tasks for the specific features from the gist? Thanks!
>
> rb
>
> On Sat, Jan 7, 2017 at 12:39 PM, Maciej Szymkiewicz
> <mszymkiewicz@gmail.com> wrote:
>
>     Hi,
>
>     I've been looking at the PySpark UserDefinedFunction and I have a
>     couple of suggestions how it could be improved including:
>
>       * Full featured decorator syntax.
>       * Docstring handling improvements.
>       * Lazy initialization.
>
>     I summarized all suggestions with links to possible solutions in
>     gist
>     (https://gist.github.com/zero323/88953975361dbb6afd639b35368a97b4)
>     and I'll be happy to open a JIRA and submit a PR if there is any
>     interest in that.
>
>     -- 
>     Best,
>     Maciej
>
>
>
>
> -- 
> Ryan Blue
> Software Engineer
> Netflix

-- 
Best,
Maciej


Re: [SQL][PYTHON] UDF improvements.

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
Maciej, this looks great.

Could you open a JIRA issue for improving the @udf decorator and possibly
sub-tasks for the specific features from the gist? Thanks!

rb

On Sat, Jan 7, 2017 at 12:39 PM, Maciej Szymkiewicz <ms...@gmail.com>
wrote:

> Hi,
>
> I've been looking at the PySpark UserDefinedFunction and I have a couple
> of suggestions how it could be improved including:
>
>    - Full featured decorator syntax.
>    - Docstring handling improvements.
>    - Lazy initialization.
>
> I summarized all suggestions with links to possible solutions in gist (
> https://gist.github.com/zero323/88953975361dbb6afd639b35368a97b4) and
> I'll be happy to open a JIRA and submit a PR if there is any interest in
> that.
>
> --
> Best,
> Maciej
>
>


-- 
Ryan Blue
Software Engineer
Netflix