Posted to dev@spark.apache.org by Thomas Graves <tg...@apache.org> on 2019/05/29 21:49:51 UTC

[RESULT][VOTE] SPIP: Public APIs for extended Columnar Processing Support

Hi all,

The vote passed with 9 +1's (4 binding), 1 +0, and no -1's.

 +1s (* = binding) :
Bobby Evans*
Thomas Graves*
DB Tsai*
Felix Cheung*
Bryan Cutler
Kazuaki Ishizaki
Tyson Condie
Dongjoon Hyun
Jason Lowe

+0s:
Xiangrui Meng

Thanks,
Tom Graves

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: [RESULT][VOTE] SPIP: Public APIs for extended Columnar Processing Support

Posted by Bobby Evans <re...@gmail.com>.
Let me put up an initial patch, probably around the beginning of next week,
and we can talk about the maintenance involved with it there, once you have
something more concrete to look at.

Thanks,

Bobby

On Wed, May 29, 2019 at 5:04 PM Reynold Xin <rx...@databricks.com> wrote:

> Thanks Tom.
>
> I finally had time to look at the updated SPIP 10 mins ago. I support the
> high level idea and +1 on the SPIP.
>
> That said, I think the proposed API is too complicated and too invasive a
> change to the existing internals. A much simpler API would be to expose a
> columnar batch iterator interface, i.e. an uber column-oriented UDF with the
> ability to manage its life cycle. Once we have that, we can also refactor
> the existing Python UDFs to use that interface.
>
> As I said earlier (a couple of months ago, when this was first surfaced?), I
> support the idea of enabling *external* column-oriented processing logic,
> but not changing Spark itself to have two processing modes, which is simply
> very complicated and would create a very high maintenance burden for the
> project.
>
>
>
>
> On Wed, May 29, 2019 at 9:49 PM, Thomas Graves <tg...@apache.org> wrote:
>
>> Hi all,
>>
>> The vote passed with 9 +1's (4 binding), 1 +0, and no -1's.
>>
>> +1s (* = binding) :
>> Bobby Evans*
>> Thomas Graves*
>> DB Tsai*
>> Felix Cheung*
>> Bryan Cutler
>> Kazuaki Ishizaki
>> Tyson Condie
>> Dongjoon Hyun
>> Jason Lowe
>>
>> +0s:
>> Xiangrui Meng
>>
>> Thanks,
>> Tom Graves
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>
>
>

Re: [RESULT][VOTE] SPIP: Public APIs for extended Columnar Processing Support

Posted by Reynold Xin <rx...@databricks.com>.
Thanks Tom.

I finally had time to look at the updated SPIP 10 mins ago. I support the high level idea and +1 on the SPIP.

That said, I think the proposed API is too complicated and too invasive a change to the existing internals. A much simpler API would be to expose a columnar batch iterator interface, i.e. an uber column-oriented UDF with the ability to manage its life cycle. Once we have that, we can also refactor the existing Python UDFs to use that interface.
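
To make the suggestion concrete, here is a minimal sketch of what such a columnar batch iterator interface could look like. This is an illustration only, not the API proposed in the SPIP: the trait name and method signatures are hypothetical, though ColumnarBatch is Spark's existing org.apache.spark.sql.vectorized.ColumnarBatch.

    // Hypothetical sketch only, not the SPIP's proposed API.
    import org.apache.spark.sql.vectorized.ColumnarBatch

    trait ColumnarBatchTransform {
      // Called once before any batches are processed, e.g. to allocate
      // off-heap buffers or initialize an accelerator context.
      def open(): Unit = {}

      // Core of the interface: map an iterator of input batches to an
      // iterator of output batches, pulling input lazily.
      def transform(batches: Iterator[ColumnarBatch]): Iterator[ColumnarBatch]

      // Called once after the last batch, so native resources can be freed.
      def close(): Unit = {}
    }

An implementation would own the life cycle of the batches it produces (the life-cycle point above), and the existing Python UDF path could, in principle, be refactored into one such transform.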

As I said earlier (a couple of months ago, when this was first surfaced?), I support the idea of enabling *external* column-oriented processing logic, but not changing Spark itself to have two processing modes, which is simply very complicated and would create a very high maintenance burden for the project.

On Wed, May 29, 2019 at 9:49 PM, Thomas Graves <tgraves@apache.org> wrote:

> Hi all,
>
> The vote passed with 9 +1's (4 binding), 1 +0, and no -1's.
>
> +1s (* = binding) :
> Bobby Evans*
> Thomas Graves*
> DB Tsai*
> Felix Cheung*
> Bryan Cutler
> Kazuaki Ishizaki
> Tyson Condie
> Dongjoon Hyun
> Jason Lowe
>
> +0s:
> Xiangrui Meng
>
> Thanks,
> Tom Graves
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org