Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2019/12/30 09:45:40 UTC

Revisiting Python / pandas UDF (new proposal)

Hi all,

I happened to come up with another idea for the pandas UDF redesign.
Thanks to Reynold, Bryan, Xiangrui, Takuya and Tim for the offline discussions and
for helping me write this proposal.

Please take a look and let me know what you guys think.

- https://docs.google.com/document/d/1-kV0FS_LF2zvaRh_GhkV32Uqksm_Sq8SvnBBmRyxm30/edit?usp=sharing
- https://issues.apache.org/jira/browse/SPARK-28264

I know it's the holiday season, but please take some time to look at it so
we can make it in time for the code freeze (31st Jan).

Re: Revisiting Python / pandas UDF (new proposal)

Posted by Hyukjin Kwon <gu...@gmail.com>.
Hi all, I made a PR - https://github.com/apache/spark/pull/27165
Please have a look when you find some time.

I also addressed another point (raised by Maciej), "A couple of less-intuitive pandas
UDF types", in the same PR because
the more I looked at it, the more I felt it should be handled together with the
proposal.


On Mon, Jan 6, 2020 at 10:52 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> I have proposed a fairly large refactoring PR in preparation for
> this.
> Basically, it groups all the related code into one sub-package, since currently
> the pandas- and PyArrow-related code is scattered here and there.
> I would appreciate it if you guys could review it and give some feedback.
>
> https://github.com/apache/spark/pull/27109
>
> Thanks!
>
>
> On Sat, Jan 4, 2020 at 5:11 AM, Li Jin <ic...@gmail.com> wrote:
>
>> Hyukjin,
>>
>> Thanks for putting this together. I took a look at the proposal and left
>> some comments. At a high level, I like using type hints to specify
>> input/output types, but I am not so sure about using type hints for
>> cardinality. I have left more detailed comments in the doc.
>>
>> Li
>>
>> On Thu, Jan 2, 2020 at 9:42 AM Li Jin <ic...@gmail.com> wrote:
>>
>>> I am going to review this carefully today. Thanks for the work!
>>>
>>> Li
>>>
>>> On Wed, Jan 1, 2020 at 10:34 PM Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>>> Thanks for the comments, Maciej - I am addressing them.
>>>> Adding Li Jin too.
>>>>
>>>> I plan to proceed with this late this week or early next week to make it in
>>>> time for the code freeze.
>>>> I am going to respond pretty actively, so please give feedback if
>>>> there is any :-).
>>>>
>>>>
>>>>
>>>> On Mon, Dec 30, 2019 at 6:45 PM, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I happened to come up with another idea for the pandas UDF redesign.
>>>>> Thanks to Reynold, Bryan, Xiangrui, Takuya and Tim for the offline
>>>>> discussions and
>>>>> for helping me write this proposal.
>>>>>
>>>>> Please take a look and let me know what you guys think.
>>>>>
>>>>> - https://docs.google.com/document/d/1-kV0FS_LF2zvaRh_GhkV32Uqksm_Sq8SvnBBmRyxm30/edit?usp=sharing
>>>>> - https://issues.apache.org/jira/browse/SPARK-28264
>>>>>
>>>>> I know it's the holiday season, but please take some time to look at it
>>>>> so we can make it in time for the code freeze (31st Jan).
>>>>>
>>>>>

Re: Revisiting Python / pandas UDF (new proposal)

Posted by Hyukjin Kwon <gu...@gmail.com>.
I have proposed a fairly large refactoring PR in preparation for
this.
Basically, it groups all the related code into one sub-package, since currently
the pandas- and PyArrow-related code is scattered here and there.
I would appreciate it if you guys could review it and give some feedback.

https://github.com/apache/spark/pull/27109

Thanks!



Re: Revisiting Python / pandas UDF (new proposal)

Posted by Li Jin <ic...@gmail.com>.
Hyukjin,

Thanks for putting this together. I took a look at the proposal and left
some comments. At a high level, I like using type hints to specify
input/output types, but I am not so sure about using type hints for
cardinality. I have left more detailed comments in the doc.

Li


Re: Revisiting Python / pandas UDF (new proposal)

Posted by Li Jin <ic...@gmail.com>.
I am going to review this carefully today. Thanks for the work!

Li


Re: Revisiting Python / pandas UDF (new proposal)

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks for the comments, Maciej - I am addressing them.
Adding Li Jin too.

I plan to proceed with this late this week or early next week to make it in time
for the code freeze.
I am going to respond pretty actively, so please give feedback if there is
any :-).


