Posted to dev@flink.apache.org by Yufei Liu <li...@gmail.com> on 2019/08/30 08:11:51 UTC

[DISCUSS] Rework Behavior of "within" In CEP Library

Hi all,
I've run into several problems while using the CEP library.

1. The function "within" in the Pattern API is somewhat misleading. I can set a within time on each part of the pattern, but only the smallest one takes effect.
Pattern.begin("begin").where(...)
  .followedBy("middle0").where(...).within(Time.seconds(1))
  .followedBy("middle1").where(...).within(Time.seconds(2))
  .followedBy("middle2").where(...).within(Time.seconds(3))

2. "within" takes effect only when subsequent events arrive and advance the time, so it might cause a state leak in some cases.

3. CEP doesn't support ending a pattern with "notFollowedBy" because that is meaningless on an unbounded stream, but ending with "notFollowedBy().within()" is meaningful, e.g. detecting that a user has been inactive for a period of time.
Maybe there is a way to work around this limit, such as excluding the matched followedBy events, but I think it would be better if the framework supported this feature.
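
To picture point 1, here is a minimal standalone sketch. It is a simplified model and not Flink code: the class name, the method name and the convention of using -1 for "no within declared" are all hypothetical. It assumes, as observed above, that the effective window of the whole pattern is the smallest declared value.

```java
import java.util.Arrays;

// Simplified model, NOT Flink code: every stage may declare a "within" value,
// but a single effective window is derived for the whole pattern.
class EffectiveWindow {

    // Stages without "within" are passed as -1 (hypothetical convention).
    // Returns the smallest declared window, or -1 if none was declared.
    static long effectiveWindowMillis(long... withinMillis) {
        return Arrays.stream(withinMillis)
                .filter(w -> w > 0)
                .min()
                .orElse(-1L);
    }

    public static void main(String[] args) {
        // "begin" declares nothing; middle0..middle2 declare 1s, 2s and 3s.
        System.out.println(effectiveWindowMillis(-1, 1000, 2000, 3000)); // prints 1000
    }
}
```

Under this assumption the 2s and 3s values in the example never come into play, which is why declaring them per stage is misleading.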


Here is my opinion:
I found that "within" is implemented as a property "windowTime" in the NFA, which decides whether the current partial matches have timed out when advanceTime is called. To me it looks like a state retention time, so it is more like a configuration of the pattern stream than a condition in the Pattern API.
I think the real meaning of "within" is the maximum time interval between adjacent parts of the pattern, and it should be settable separately on each part.
(Changing the meaning of the current API is not a good idea; we can use another keyword instead of "within".)
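
The difference between the two readings can be sketched with a small standalone model. This is not Flink code: the class and method names are hypothetical, and the global check only assumes that a partial match times out once the elapsed time since its first event reaches the window.

```java
// Simplified model, NOT Flink code. ts holds the timestamps (ms) of the
// events matched so far, in pattern order.
class WithinSemantics {

    // Current reading: one window for the whole match, measured from the first event.
    static boolean withinGlobalWindow(long[] ts, long windowMillis) {
        return ts[ts.length - 1] - ts[0] < windowMillis;
    }

    // Proposed reading: each transition has its own maximum gap.
    // maxGapMillis[i] limits the step from stage i-1 to stage i; index 0 is unused.
    static boolean withinPerStepGaps(long[] ts, long[] maxGapMillis) {
        for (int i = 1; i < ts.length; i++) {
            if (ts[i] - ts[i - 1] > maxGapMillis[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] ts = {0, 900, 2500, 5000};       // begin, middle0, middle1, middle2
        long[] gaps = {0, 1000, 2000, 3000};    // per-stage limits from the example
        System.out.println(withinGlobalWindow(ts, 1000)); // prints false
        System.out.println(withinPerStepGaps(ts, gaps));  // prints true
    }
}
```

The same event sequence is rejected under a single 1s window but accepted when each step is checked against its own limit.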

These are my initial ideas about the features:
1. Implement a TimeCondition that extends IterativeCondition, and treat "within" as a condition on state transitions. Its filter would compare the creation time of the previous state with that of the current state.

2. Register a timer to clean up timed-out computationState, although this can increase memory usage.

3. Create a special node if the pattern ends with notFollowedBy().within(); when this node is reached, register a timer that injects an empty event once the time is up.
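
The timer idea in point 3 could look roughly like the following standalone sketch. It is only a model under stated assumptions, not a proposal for the actual NFA code: the class AbsenceTimer and all of its methods are hypothetical, a single partial match is tracked, and "registering a timer" is reduced to remembering a deadline.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model, NOT Flink code: tracks one partial match that has reached
// the special final node of a pattern ending in notFollowedBy().within().
class AbsenceTimer {
    private long deadline = -1; // -1 means no timer is registered
    private final List<String> output = new ArrayList<>();

    // The partial match reached the special node: register the timer.
    void onReachedFinalNode(long now, long withinMillis) {
        deadline = now + withinMillis;
    }

    // An event satisfying the notFollowedBy condition arrived:
    // if it is before the deadline, the partial match is discarded.
    void onForbiddenEvent(long timestamp) {
        if (deadline >= 0 && timestamp < deadline) {
            deadline = -1;
        }
    }

    // Time advanced (the "empty event"): if the deadline passed, emit the match.
    void onTimer(long now) {
        if (deadline >= 0 && now >= deadline) {
            output.add("match@" + deadline);
            deadline = -1; // the state is released here, which also covers the cleanup in point 2
        }
    }

    List<String> output() {
        return output;
    }
}
```

For example, after onReachedFinalNode(0, 1000), a call to onTimer(1000) emits "match@1000", whereas an intervening onForbiddenEvent(500) cancels the match. The cost noted in point 2 shows up as one stored deadline per partial match.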

The design of CEP may have its own concerns or trade-offs. I have been involved in this project for a relatively short time; these are just my personal opinions, and I may not have considered some aspects. It would be great if we could discuss these features and get some advice.

Best! :)