Posted to user@flink.apache.org by Shailesh Jain <sh...@stellapps.com> on 2017/11/08 15:48:44 UTC

Generate watermarks per key in a KeyedStream

Hi,

I'm implementing a use case in which different physical devices send
events, and network/power issues can delay their arrival at the Flink
source. One of the operators in the Flink job is the Pattern operator, and
some of the patterns are time sensitive, so I'm using the event time
characteristic. The problem is that events from a particular device (or
devices) can be delayed unpredictably, which causes those events to be
dropped, as I cannot really define a static bound for the allowed lateness.

Since I'm using a KeyedStream, keyed on the source device ID, is there a
way to allow each CEP operator instance (one per key) to progress its time
based on the event time in the corresponding stream partition? In other
words, is there a way to generate watermarks per partition in a KeyedStream?

Thanks,
Shailesh

Re: Generate watermarks per key in a KeyedStream

Posted by Shailesh Jain <sh...@stellapps.com>.
Thanks for your reply, Xingcan.

On Wed, Nov 8, 2017 at 10:42 PM, Xingcan Cui <xi...@gmail.com> wrote:

> Hi Shailesh,
>
> actually, the watermarks are generated per partition, but all of them will
> be forcibly aligned to the minimum one during processing. That is decided
> by the semantics of watermark and KeyedStream, i.e., the watermarks belong
> to a whole stream and a stream is made up of different partitions (one per
> key).
>
> If the physical devices work in different time systems due to delay, the
> event streams from them should be treated separately.
>
> Hope that helps.
>
> Best,
> Xingcan

Re: Generate watermarks per key in a KeyedStream

Posted by Xingcan Cui <xi...@gmail.com>.
Hi Shailesh,

Actually, watermarks are generated per partition, but during processing all
of them are forcibly aligned to the minimum one. This follows from the
semantics of watermarks and KeyedStream: a watermark belongs to the stream
as a whole, and a stream is made up of different partitions (one per
key).
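
To make the alignment concrete, here is a toy model in plain Java. It is
not Flink code: the class name WatermarkAlignment and the channel ids are
made up for illustration. It assumes an operator that remembers the latest
watermark from each input channel and uses the minimum as its own event
time, so a single slow input holds back time for every key.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Flink code) of an operator combining input watermarks. */
class WatermarkAlignment {
    // Latest watermark seen on each input channel (hypothetical channel ids).
    private final Map<Integer, Long> perChannel = new HashMap<>();
    private final int numChannels;

    WatermarkAlignment(int numChannels) {
        this.numChannels = numChannels;
    }

    /** Records a watermark for one channel and returns the combined watermark. */
    long onWatermark(int channel, long watermark) {
        // A channel's watermark never moves backwards.
        perChannel.merge(channel, watermark, Long::max);
        return currentWatermark();
    }

    /** Minimum over all channels; undefined (MIN_VALUE) until every channel reported. */
    long currentWatermark() {
        if (perChannel.size() < numChannels) {
            return Long.MIN_VALUE;
        }
        long min = Long.MAX_VALUE;
        for (long w : perChannel.values()) {
            min = Math.min(min, w);
        }
        return min;
    }

    public static void main(String[] args) {
        WatermarkAlignment op = new WatermarkAlignment(2);
        op.onWatermark(0, 100L);                     // only one channel has reported
        System.out.println(op.onWatermark(1, 40L));  // 40: the slow channel decides
        System.out.println(op.onWatermark(0, 200L)); // still 40
        System.out.println(op.onWatermark(1, 150L)); // 150: time advances for everyone
    }
}
```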

If the delays mean the physical devices effectively run on different
timelines, the event streams from them should be treated separately.
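
A minimal sketch of what treating each device separately could look like,
again in plain Java rather than any Flink API. The class PerKeyEventClock,
its methods, and the out-of-orderness bound are hypothetical: each device
ID gets its own event-time clock, so an event is only judged late against
earlier events from the same device, never against a faster device.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-device event-time clock; an illustration, not a Flink API. */
class PerKeyEventClock {
    // Highest event timestamp seen so far for each device ID.
    private final Map<String, Long> clockByDevice = new HashMap<>();
    // Assumed tolerance for out-of-order events within one device's stream.
    private final long allowedOutOfOrderMs;

    PerKeyEventClock(long allowedOutOfOrderMs) {
        this.allowedOutOfOrderMs = allowedOutOfOrderMs;
    }

    /** Returns false only if the event is too old relative to its own device's clock. */
    boolean accept(String deviceId, long timestampMs) {
        Long clock = clockByDevice.get(deviceId);
        if (clock != null && timestampMs < clock - allowedOutOfOrderMs) {
            return false; // late with respect to this device only
        }
        // Advance this device's clock; other devices are unaffected.
        clockByDevice.put(deviceId, clock == null ? timestampMs : Math.max(clock, timestampMs));
        return true;
    }

    /** Current clock of a device, or Long.MIN_VALUE if it has sent nothing yet. */
    long clockOf(String deviceId) {
        return clockByDevice.getOrDefault(deviceId, Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        PerKeyEventClock clocks = new PerKeyEventClock(1000L);
        System.out.println(clocks.accept("A", 10_000L)); // true
        System.out.println(clocks.accept("B", 2_000L));  // true: B lags A but has its own clock
        System.out.println(clocks.accept("B", 500L));    // false: too old for device B
    }
}
```

Logic like this would have to live in user code that keeps state per key;
it does not change the watermarks the job itself uses.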

Hope that helps.

Best,
Xingcan
