You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@flink.apache.org by Lucas Konstantin Bärenfänger <lu...@stud.tu-darmstadt.de> on 2016/10/20 10:27:41 UTC

Feasability Question: Distributed FlinkCEP

Hi all,

here's what I want to do: Consider a query such as *(A and B) followed_by
(C or D)*. (Pseudo code, obviously.) I want to create a graph of
independent processing nodes, each running an independent instance of
FlinkCEP. I want each of them to represent an operator of the query above,
like so:


*  followed_by       <-- Processing node 1*

*  /         \*

* and        or      <-- Processing nodes 2 and 3*

*/   \      /   \A   B      C   D*

The three nodes would have to process the following (sub-)queries,
respectively. (Pseudo code, again.)


*Node1: {{node2ResultStream}} followed_by {{node3ResultStream}}*

*Node2: A and B*


*Node3: C or D*
Long story short: I want to execute the query in a distributed fashion. Is
that currently possible using FlinkCEP?

Thank you very much in advance!

Best
Lucas

Re: Feasability Question: Distributed FlinkCEP

Posted by Sameer W <sa...@axiomine.com>.

Could you not do separate followedBy and then perform a join on the
resulting alert stream.

Pattern p1= followedBy(/*1st*/)
Pattern p2= followedBy(/*1st*/)
DataStream alertStream1  = CEP.pattern(keyedDs, p1)
DataStream alertStream2  = CEP.pattern(keyedDs, p2)

Then just join the two alertStream's using a keyBy (some common key in the
Alert events) on Event Time, only emit the ones with alerts from both sides
if and'ing and either side if or'ing. Or another CEP operation on the two
Alert Streams after keying by on something common in the alert events. Or
if you just union the two streams and apply CEP on the resulting stream.

The pattern you mentioned seems only possible if each pattern works on
separate keys but you still want to decide if two separate keys produced an
alert.

Sameer

On Thu, Oct 20, 2016 at 6:27 AM, Lucas Konstantin Bärenfänger <
lucas.baerenfaenger@stud.tu-darmstadt.de> wrote:

> Hi all,
>
> here's what I want to do: Consider a query such as *(A and B) followed_by
> (C or D)*. (Pseudo code, obviously.) I want to create a graph of
> independent processing nodes, each running an independent instance of
> FlinkCEP. I want each of them to represent an operator of the query above,
> like so:
>
>
> *  followed_by       <-- Processing node 1*
>
> *  /         \*
>
> * and        or      <-- Processing nodes 2 and 3*
>
> */   \      /   \A   B      C   D*
>
> The three nodes would have to process the following (sub-)queries,
> respectively. (Pseudo code, again.)
>
>
> *Node1: {{node2ResultStream}} followed_by {{node3ResultStream}}*
>
> *Node2: A and B*
>
>
> *Node3: C or D*
> Long story short: I want to execute the query in a distributed fashion. Is
> that currently possible using FlinkCEP?
>
> Thank you very much in advance!
>
> Best
> Lucas
>