You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@beam.apache.org by Sigalit Eliazov <e....@gmail.com> on 2022/07/25 12:09:46 UTC

sink triggers

Hi all,

I have a pipeline that reads input from a few sources, combines them and
creates a view of the data.
I need to send an output to kafka every X minutes.
What will be the best way to implement this?
Thanks
Sigalit

Re: sink triggers

Posted by Jan Lukavský <je...@seznam.cz>.
Hi Sigalit,

there might be several options, which one is the best would depend on 
the actual use-case. You might:

  a) use the CoGroupByKey transform with a fixed window [1], or

  b) use stateful processing [2] with timer triggering your output

Which one is the best depends on if you can use windowing semantics 
provided by Beam [3]. Windowing is needed for the CoGBK approach, the 
stateful approach works with globally windowed PCollections.

Hope this help, feel free to ask more questions you might have.

Best,

  Jan

[1] 
https://beam.apache.org/documentation/transforms/java/aggregation/cogroupbykey/

[2] https://beam.apache.org/blog/stateful-processing/

[3] https://beam.apache.org/documentation/programming-guide/#windowing

On 7/25/22 14:09, Sigalit Eliazov wrote:
> Hi all,
>
> I have a pipeline that reads input from a few sources, combines them 
> and creates a view of the data.
> I need to send an output to kafka every X minutes.
> What will be the best way to implement this?
> Thanks
> Sigalit
>