Posted to user@flink.apache.org by Al-Isawi Rami <Ra...@comptel.com> on 2016/06/07 12:41:52 UTC

Multi-field "sum" function just like "keyBy"

Hi,

Is there any reason why "keyBy" accepts multiple fields, while "sum", for example, does not?

-Rami
Disclaimer: This message and any attachments thereto are intended solely for the addressed recipient(s) and may contain confidential information. If you are not the intended recipient, please notify the sender by reply e-mail and delete the e-mail (including any attachments thereto) without producing, distributing or retaining any copies thereof. Any review, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient(s) is prohibited. Thank you.

Re: Multi-field "sum" function just like "keyBy"

Posted by Gábor Gévay <gg...@gmail.com>.
Ah, sorry, you are right. You could also call keyBy again before the
second sum, but maybe someone else has a better idea.

Best,
Gábor



2016-06-07 16:18 GMT+02:00 Al-Isawi Rami <Ra...@comptel.com>:
> Thanks Gábor, but the first sum call will return
>
> SingleOutputStreamOperator
>
> I could not do another sum call on that. Could you tell me how you managed to
> do
>
> stream.sum().sum()
>
> Regards,
> -Rami
>
> On 7 Jun 2016, at 16:13, Gábor Gévay <gg...@gmail.com> wrote:
>
> Hello,
>
> In the case of "sum", you can just specify them one after the other, like:
>
> stream.sum(1).sum(2)
>
> This works because summing one field is independent of summing the other. However,
> in the case of "keyBy", the information is needed from both fields at
> the same time to produce the key.
>
> Best,
> Gábor
>
>
>
> 2016-06-07 14:41 GMT+02:00 Al-Isawi Rami <Ra...@comptel.com>:
>
> Hi,
>
> Is there any reason why "keyBy" accepts multiple fields, while "sum", for example,
> does not?
>
> -Rami

Re: Multi-field "sum" function just like "keyBy"

Posted by Al-Isawi Rami <Ra...@comptel.com>.
Thanks Gábor, but the first sum call will return

SingleOutputStreamOperator

I could not do another sum call on that. Could you tell me how you managed to do

stream.sum().sum()

Regards,
-Rami

On 7 Jun 2016, at 16:13, Gábor Gévay <gg...@gmail.com> wrote:

Hello,

In the case of "sum", you can just specify them one after the other, like:

stream.sum(1).sum(2)

This works because summing one field is independent of summing the other. However,
in the case of "keyBy", the information is needed from both fields at
the same time to produce the key.

Best,
Gábor



2016-06-07 14:41 GMT+02:00 Al-Isawi Rami <Ra...@comptel.com>:
Hi,

Is there any reason why "keyBy" accepts multiple fields, while "sum", for example, does not?

-Rami

Re: Multi-field "sum" function just like "keyBy"

Posted by Gábor Gévay <gg...@gmail.com>.
Hello,

In the case of "sum", you can just specify them one after the other, like:

stream.sum(1).sum(2)

This works because summing one field is independent of summing the other. However,
in the case of "keyBy", the information is needed from both fields at
the same time to produce the key.
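
The distinction between building a key and summing a field can be illustrated in plain Java, without any Flink API. This is only a sketch under assumptions: the class name, the method names, and the row layout (two key columns followed by two value columns) are hypothetical and chosen purely for illustration. It says nothing about whether chained sum calls compile on a DataStream.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class KeyVersusSum {
    // Grouping: the key is built from both columns together, so (1, 1) and (1, 2)
    // fall into different groups even though they share the first column.
    static Map<List<Long>, Integer> countByCompositeKey(List<long[]> rows, int k1, int k2) {
        Map<List<Long>, Integer> counts = new HashMap<>();
        for (long[] row : rows) {
            List<Long> key = Arrays.asList(row[k1], row[k2]);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    // Aggregation: the total of one column never looks at any other column.
    static long sumField(List<long[]> rows, int index) {
        long total = 0;
        for (long[] row : rows) {
            total += row[index];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical layout: columns 0 and 1 form the key, columns 2 and 3 are values.
        List<long[]> rows = Arrays.asList(
                new long[] {1, 1, 5, 7},
                new long[] {1, 2, 6, 8},
                new long[] {1, 1, 1, 1});
        System.out.println(countByCompositeKey(rows, 0, 1));
        System.out.println(sumField(rows, 2) + " " + sumField(rows, 3));
    }
}
```

Here the grouping step cannot be split into one pass per key column, while each column total can be computed without reference to the other.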

Best,
Gábor



2016-06-07 14:41 GMT+02:00 Al-Isawi Rami <Ra...@comptel.com>:
> Hi,
>
> Is there any reason why "keyBy" accepts multiple fields, while "sum", for example, does not?
>
> -Rami

Re: Multi-field "sum" function just like "keyBy"

Posted by Al-Isawi Rami <Ra...@comptel.com>.
Thanks Jamie, yes, your assumption is correct.

I can use keyBy as follows:
stream.keyBy("pojo.field1", "pojo.field2", …)
It would make sense to be able to use sum the same way, so that it does its job for more than one field:
stream.sum("pojo.field1", "pojo.field2", …)

I have created this Jira issue for it; hopefully it will get picked up someday.
https://issues.apache.org/jira/browse/FLINK-4029
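
To make the request concrete, here is a rough sketch of the semantics a multi-field sum might have, written in plain Java. Nothing here is a Flink API: the class MultiFieldSumSketch, the helper sumFields, and the choice to address fields by array position instead of by name are all assumptions made for illustration.

```java
import java.util.Arrays;

final class MultiFieldSumSketch {
    // Hypothetical helper, not a Flink API: returns a copy of 'acc' in which every
    // listed position holds acc[p] + next[p]; unlisted positions keep acc's value.
    static long[] sumFields(long[] acc, long[] next, int... positions) {
        long[] result = Arrays.copyOf(acc, acc.length);
        for (int p : positions) {
            result[p] = acc[p] + next[p];
        }
        return result;
    }

    public static void main(String[] args) {
        // Position 0 plays the role of the key; positions 1 and 2 are summed together.
        long[] total = sumFields(new long[] {7, 1, 10}, new long[] {7, 2, 20}, 1, 2);
        System.out.println(Arrays.toString(total));
    }
}
```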

-Rami


On 8 Jun 2016, at 04:25, Jamie Grier <ja...@data-artisans.com> wrote:

I'm assuming what you're trying to do is essentially sum over two different fields of your data.  I would do this with my own ReduceFunction.


stream
  .keyBy("someKey")
  .reduce(CustomReduceFunction) // sum whatever fields you want and return the result

I think it does make sense that Flink could provide a generic sum function that could sum over multiple fields, though.

-Jamie


On Tue, Jun 7, 2016 at 5:41 AM, Al-Isawi Rami <Ra...@comptel.com> wrote:
Hi,

Is there any reason why "keyBy" accepts multiple fields, while "sum", for example, does not?

-Rami



--

Jamie Grier
data Artisans, Director of Applications Engineering
@jamiegrier <https://twitter.com/jamiegrier>
jamie@data-artisans.com



Re: Multi-field "sum" function just like "keyBy"

Posted by Jamie Grier <ja...@data-artisans.com>.
I'm assuming what you're trying to do is essentially sum over two different
fields of your data.  I would do this with my own ReduceFunction.


stream
  .keyBy("someKey")
  .reduce(CustomReduceFunction) // sum whatever fields you want and return the result

I think it does make sense that Flink could provide a generic sum function
that could sum over multiple fields, though.
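
A minimal sketch of what such a reduce step could look like, written in plain Java so it runs without Flink on the classpath. The Reading type, its field names, and the keyedRollingReduce helper are hypothetical. The static reduce method has the same shape as Flink's ReduceFunction<T>.reduce(T, T), so its body could be moved into a ReduceFunction implementation. The helper only imitates a keyed stream by keeping one running aggregate per key and emitting it after every element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type; the field names are illustrative, not from the thread.
final class Reading {
    final String key;
    final long field1;
    final long field2;

    Reading(String key, long field1, long field2) {
        this.key = key;
        this.field1 = field1;
        this.field2 = field2;
    }
}

final class MultiFieldSum {
    // Combines two values of the same key by summing both numeric fields in one step.
    static Reading reduce(Reading a, Reading b) {
        return new Reading(a.key, a.field1 + b.field1, a.field2 + b.field2);
    }

    // Plain-Java stand-in for keyBy(...).reduce(...): keeps one running aggregate
    // per key and emits the updated aggregate after every input element.
    static List<Reading> keyedRollingReduce(List<Reading> input) {
        Map<String, Reading> state = new HashMap<>();
        List<Reading> out = new ArrayList<>();
        for (Reading r : input) {
            Reading agg = state.containsKey(r.key) ? reduce(state.get(r.key), r) : r;
            state.put(r.key, agg);
            out.add(agg);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Reading> out = keyedRollingReduce(Arrays.asList(
                new Reading("a", 1, 10),
                new Reading("b", 2, 20),
                new Reading("a", 3, 30)));
        for (Reading r : out) {
            System.out.println(r.key + " " + r.field1 + " " + r.field2);
        }
    }
}
```

Because both fields are updated inside a single reduce call, the result stays one keyed aggregation and no second pass over the stream is needed.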

-Jamie


On Tue, Jun 7, 2016 at 5:41 AM, Al-Isawi Rami <Ra...@comptel.com>
wrote:

> Hi,
>
> Is there any reason why "keyBy" accepts multiple fields, while "sum", for
> example, does not?
>
> -Rami
>



-- 

Jamie Grier
data Artisans, Director of Applications Engineering
@jamiegrier <https://twitter.com/jamiegrier>
jamie@data-artisans.com