Posted to user@flink.apache.org by Aljoscha Krettek <al...@apache.org> on 2017/01/09 13:41:20 UTC

Re: Efficiently splitting a stream 3 ways

I think the split/select variant should be a bit faster because it creates
fewer object copies internally. It should also be more future-proof because
it will benefit from any improvements in the way split/select works.
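
As a rough illustration only, here is a plain-Scala sketch (no Flink
involved) of the per-record work in the two variants. It does not model
Flink's internals or the object copies mentioned above; it only counts how
often user code runs per record. The `Row` case class, the object name and
the `calls` counter are hypothetical names made up for this sketch.

```scala
// Hypothetical record type; the thread only shows that records have a field `n`.
final case class Row(n: Int)

object SplitVsFilterSketch {
  // Counts how many times a predicate or selector was invoked by the last call.
  var calls = 0

  // Three independent filters: every record is tested by all three predicates.
  def viaFilter(rows: Seq[Row]): (Seq[Row], Seq[Row], Seq[Row]) = {
    calls = 0
    val greater = rows.filter { r => calls += 1; r.n > 0 }
    val less    = rows.filter { r => calls += 1; r.n < 0 }
    val equal   = rows.filter { r => calls += 1; r.n == 0 }
    (greater, less, equal)
  }

  // Split-style: one selector classifies each record exactly once.
  def viaSplit(rows: Seq[Row]): (Seq[Row], Seq[Row], Seq[Row]) = {
    calls = 0
    val byName = rows.groupBy { r =>
      calls += 1
      if (r.n > 0) "greater" else if (r.n < 0) "less" else "equal"
    }
    (byName.getOrElse("greater", Nil),
     byName.getOrElse("less", Nil),
     byName.getOrElse("equal", Nil))
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(3), Row(-1), Row(0), Row(7))
    val filtered = viaFilter(rows); val filterCalls = calls
    val split    = viaSplit(rows);  val splitCalls  = calls
    assert(filtered == split) // both variants produce the same three groups
    println(s"filter: $filterCalls predicate calls, split: $splitCalls selector calls")
  }
}
```

For the four sample records this prints 12 predicate calls for the filter
variant and 4 selector calls for the split variant. Whether that difference
matters in a real job depends on how expensive the predicate is compared
with everything else in the pipeline.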

There is also ongoing work on adding support for side outputs, which
allow emitting to several streams from one user function:
https://github.com/apache/flink/pull/2982
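
For orientation, a minimal sketch of the same 3-way split using side
outputs. It assumes the API as it appeared in later Flink releases
(`OutputTag`, `ProcessFunction`, `getSideOutput`) and needs the
flink-streaming-scala dependency; the code in the pull request above may
differ. The `Item` type, the tag names and the job name are hypothetical.

```scala
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

// Hypothetical record type standing in for the rows in the question.
case class Item(n: Int)

object SideOutputSketch {
  // One tag per additional output stream (assumed names).
  val lessTag  = OutputTag[Item]("less")
  val equalTag = OutputTag[Item]("equal")

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Item] = env.fromElements(Item(3), Item(-1), Item(0))

    // A single user function routes each record to exactly one output.
    val greater: DataStream[Item] = stream.process(new ProcessFunction[Item, Item] {
      override def processElement(
          item: Item,
          ctx: ProcessFunction[Item, Item]#Context,
          out: Collector[Item]): Unit = {
        if (item.n > 0) out.collect(item)               // main output
        else if (item.n < 0) ctx.output(lessTag, item)  // side output "less"
        else ctx.output(equalTag, item)                 // side output "equal"
      }
    })

    val less  = greater.getSideOutput(lessTag)
    val equal = greater.getSideOutput(equalTag)

    greater.print()
    less.print()
    equal.print()
    env.execute("side-output-sketch")
  }
}
```

Here the main output carries the "greater" records and the two tags carry
the rest; all three could equally be side outputs.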

Cheers,
Aljoscha



On Thu, 22 Dec 2016 at 16:34 Lawrence Wagerfield <
lawrence@dmz.wagerfield.com> wrote:

> Hi,
>
> I'd like to know which is more efficient: splitting a stream 3 ways via
> `split` or via `filter`?
>
> --- FILTER ------
> val greater = stream.filter(_.n > 0)
> val less = stream.filter(_.n < 0)
> val equal = stream.filter(_.n == 0)
> -----------------
>
> - VS -
>
> --- SPLIT -------
> val split = stream.split(row =>
>   if (row.n > 0)
>     List("greater")
>   else if (row.n < 0)
>     List("less")
>   else
>     List("equal")
> )
>
> val greater = split select "greater"
> val less = split select "less"
> val equal = split select "equal"
> -----------------
>
> Thanks!
> Lawrence
>

Re: Efficiently splitting a stream 3 ways

Posted by C B <ch...@gmail.com>.