Posted to user@flink.apache.org by Averell <lv...@gmail.com> on 2018/10/04 14:15:13 UTC

Identifying missing events in keyed streams

Hi everyone, 

I have a keyed stream that expects an event at a fixed interval (let's say
every 1 minute). I want to raise an alarm for any key that has received no
events in n periods. What would be the cheapest way (in terms of performance)
to do this?
I thought of some solutions, but don't know which one is the best:
1. Sliding window, then count the number of events in each window <<< this
seems quite expensive when n is big.
2. Register a timer for every single event, record the last event timestamp,
and check that timestamp when the timer expires. (This would be the best if
there were an option to cancel/modify a timer, but it seems that feature is
not available yet.)
3. Session window: I haven't implemented this to verify its feasibility.
I'm thinking of firing the alarm each time a window is cleared.
4. CEP: I don't know whether it's possible or not. I haven't found a guide
for defining patterns of missing events.

Could you please give some advice?

Thanks and best regards, 
Averell



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Identifying missing events in keyed streams

Posted by Averell <lv...@gmail.com>.
Thank you Fabian.

Tried (2), and it's working well.
One more benefit of (2) over (3) is that it allows me to easily raise
multiple levels of alarms for each keyed stream (e.g., minor: missed 2
cycles; major: missed 5 cycles; ...).
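
The multi-level idea can be sketched in plain Java, independent of any Flink
API. The class name, the thresholds as constants, and the rule that every
whole interval elapsed since the last event counts as one missed cycle are
all assumptions made for illustration:

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: maps silence duration to an alarm level. Not Flink API. */
class AlarmLevels {
    enum Level { NONE, MINOR, MAJOR }

    // Thresholds taken from the example in the message above.
    static final int MINOR_CYCLES = 2;
    static final int MAJOR_CYCLES = 5;

    /** Assumption: each whole interval elapsed since the last event is one missed cycle. */
    static long missedCycles(long lastSeen, long now, long intervalMs) {
        return Math.max(0L, (now - lastSeen) / intervalMs);
    }

    /** Picks the highest level whose threshold has been reached. */
    static Level classify(long missed) {
        if (missed >= MAJOR_CYCLES) return Level.MAJOR;
        if (missed >= MINOR_CYCLES) return Level.MINOR;
        return Level.NONE;
    }

    /** Times at which a per-level timer would have to fire for a given last-seen event. */
    static Map<Level, Long> timerTimes(long lastSeen, long intervalMs) {
        Map<Level, Long> times = new EnumMap<>(Level.class);
        times.put(Level.MINOR, lastSeen + MINOR_CYCLES * intervalMs);
        times.put(Level.MAJOR, lastSeen + MAJOR_CYCLES * intervalMs);
        return times;
    }
}
```

In a timer-based job, one timer per level could be registered at the times
returned by timerTimes, with classify deciding which alarm to emit when a
timer fires.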

Thanks for your help.

Regards,
Averell




Re: Identifying missing events in keyed streams

Posted by Fabian Hueske <fh...@gmail.com>.
I'd go with 2) because the logic is simple and it is (IMO) much easier to
understand what is going on and what state is kept.

On Thu, Oct 11, 2018 at 12:42, Averell <lv...@gmail.com> wrote:

> Hi Fabian,
>
> Thanks for the suggestion.
> I will try it using the new support for removing timers.
>
> I have also tried approach (3), using session windows, and it works: I set
> the session gap to 2 minutes and use an aggregating window function to keep
> the amount of in-memory data for each key to a minimum.
>
> Could you please explain why (2) is better?
>
> Thanks and best regards,
> Averell
>

Re: Identifying missing events in keyed streams

Posted by Averell <lv...@gmail.com>.
Hi Fabian,

Thanks for the suggestion.
I will try it using the new support for removing timers.

I have also tried approach (3), using session windows, and it works: I set
the session gap to 2 minutes and use an aggregating window function to keep
the amount of in-memory data for each key to a minimum.
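
To illustrate why a closing session window signals missing events, here is a
simplified plain-Java model (not Flink's windowing code). The class and method
names are hypothetical, and it assumes one key, timestamps sorted in ascending
order, and a new session starting whenever two consecutive events are more
than the gap apart:

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of session windows for ONE key. Hypothetical, not Flink API. */
class SessionGapModel {

    /**
     * Returns the end time (last event + gap) of every session that has
     * closed by the given watermark. Each returned time marks a point where
     * the key had been silent for a full gap, i.e. where an alarm could fire.
     */
    static List<Long> closedSessionEnds(long[] sortedTimestamps, long gapMs, long watermark) {
        List<Long> ends = new ArrayList<>();
        for (int i = 0; i < sortedTimestamps.length; i++) {
            // An event ends its session if it is the last one seen, or if the
            // next event arrives more than gapMs later.
            boolean lastOfSession = i == sortedTimestamps.length - 1
                    || sortedTimestamps[i + 1] - sortedTimestamps[i] > gapMs;
            long end = sortedTimestamps[i] + gapMs;
            if (lastOfSession && end <= watermark) {
                ends.add(end);
            }
        }
        return ends;
    }
}
```

With a 2-minute gap and events at 0, 1, 2, 6:40 and 7:40 (minutes:seconds),
the model reports one closed session ending at 4:00 once the watermark has
reached 8:20; the second session is still open at that point.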

Could you please explain why (2) is better?

Thanks and best regards,
Averell 




Re: Identifying missing events in keyed streams

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Averell,

I'd go with approach 2). As of Flink 1.6.0 you can delete timers.
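
With deletable timers, the bookkeeping reduces to one pending timer per key
that is replaced on every event. The following is a plain-Java model of that
idea, not Flink code; the class name, the String key type, and the timeout
parameter (n periods expressed in milliseconds) are assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Model of "delete the old timer, register a new one" per key. Not Flink API. */
class DeletableTimers {
    private final long timeoutMs;                                // n periods, in ms
    private final Map<String, Long> pending = new HashMap<>();   // key -> timer time

    DeletableTimers(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Every event cancels the key's pending timer and schedules a fresh one. */
    void onEvent(String key, long timestamp) {
        pending.remove(key);                      // delete the old timer, if any
        pending.put(key, timestamp + timeoutMs);  // register the new timer
    }

    /** Fires all timers due at or before now; returns the keys that went silent. */
    List<String> advanceTo(long now) {
        List<String> silentKeys = new ArrayList<>();
        pending.entrySet().removeIf(e -> {
            if (e.getValue() <= now) {
                silentKeys.add(e.getKey());
                return true;   // a fired timer is removed
            }
            return false;
        });
        Collections.sort(silentKeys);  // deterministic order for the caller
        return silentKeys;
    }
}
```

Because a timer only survives if no later event replaced it, every timer that
fires in this model corresponds to a real alarm, so no last-seen check is
needed in the timer callback.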

But even if you are on a pre-1.6 version, a ProcessFunction would be the
way to go, IMO.
You don't need to register a timer for each event.
Instead, you can register the first timer with the first event and have a
state that is updated with the timestamp of the last seen event.
When the timer fires, you check whether you need to raise an alert and
register a new timer such that it fires 1 minute after the last seen event
(at last-seen + 1 minute, i.e. after a delay of 1 minute - (now - last-seen)).
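
A minimal plain-Java model of this re-arming logic for a single key follows.
It deliberately uses no Flink classes, and all names are hypothetical. Two
assumptions go beyond the description above: events arrive in timestamp
order, and after an alert no timer stays registered until the next event
arrives:

```java
import java.util.ArrayList;
import java.util.List;

/** Model of the state kept for ONE key: last-seen timestamp plus one timer. */
class LastSeenMonitor {
    private final long intervalMs;
    private long lastSeen = -1;  // timestamp of the most recent event
    private long timerAt = -1;   // when the single pending timer fires; -1 = none
    final List<Long> alerts = new ArrayList<>();  // times at which an alert was raised

    LastSeenMonitor(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /** Every event updates the state; a timer is registered only if none is pending. */
    void onEvent(long timestamp) {
        lastSeen = timestamp;
        if (timerAt < 0) {
            timerAt = timestamp + intervalMs;
        }
    }

    /** Advances the clock and fires the pending timer each time it becomes due. */
    void advanceTo(long now) {
        while (timerAt >= 0 && timerAt <= now) {
            onTimer(timerAt);
        }
    }

    private void onTimer(long firedAt) {
        if (firedAt - lastSeen >= intervalMs) {
            alerts.add(firedAt);  // no event for a full interval: raise an alert
            timerAt = -1;         // assumption: the next event re-arms the timer
        } else {
            timerAt = lastSeen + intervalMs;  // fire one interval after the last event
        }
    }
}
```

The point of the design is that the number of timers per key stays at one no
matter how many events arrive, which is what makes it usable without timer
deletion.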

Best, Fabian

On Thu, Oct 4, 2018 at 16:15, Averell <lv...@gmail.com> wrote:

> Hi everyone,
>
> I have a keyed stream that expects an event at a fixed interval (let's say
> every 1 minute). I want to raise an alarm for any key that has received no
> events in n periods. What would be the cheapest way (in terms of
> performance) to do this?
> I thought of some solutions, but don't know which one is the best:
> 1. Sliding window, then count the number of events in each window <<< this
> seems quite expensive when n is big.
> 2. Register a timer for every single event, record the last event timestamp,
> and check that timestamp when the timer expires. (This would be the best if
> there were an option to cancel/modify a timer, but it seems that feature is
> not available yet.)
> 3. Session window: I haven't implemented this to verify its feasibility.
> I'm thinking of firing the alarm each time a window is cleared.
> 4. CEP: I don't know whether it's possible or not. I haven't found a guide
> for defining patterns of missing events.
>
> Could you please give some advice?
>
> Thanks and best regards,
> Averell