Posted to user@flink.apache.org by Pedro Silva <pe...@gmail.com> on 2021/09/10 16:33:57 UTC

CEP library support in Python

Hello,

Is Flink's CEP library
<https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/libs/cep/>
available in Python? I see no references to it in the documentation, so I'm
guessing the answer is no, but I wanted confirmation from the community or
developers.

Are there plans to support this library in Python or, alternatively, is there
another library that can be used from Python?

Thank you and have a nice weekend,
Pedro Silva

Re: CEP library support in Python

Posted by Pedro Silva <pe...@gmail.com>.
Understood. I was looking for a way of defining these metrics that would be accessible to non-programmers.

Thank you for the answer, Seth.

Pedro

> On 15 Sep 2021, at 18:38, Seth Wiesman <sj...@gmail.com> wrote:
> 
> 
> Honestly, I don't think you need CEP or MATCH_RECOGNIZE for that use case. It can be solved with a simple process function that tracks the state for each id. Output a 1 when a job completes and a -1 if canceled. Output the sum. You can use a simple timer to clear the state for a job after 6 months have passed. 
> 
> Seth 

Re: CEP library support in Python

Posted by Seth Wiesman <sj...@gmail.com>.
Honestly, I don't think you need CEP or MATCH_RECOGNIZE for that use case.
It can be solved with a simple process function that tracks the state for
each id. Output a 1 when a job completes and a -1 if canceled. Output the
sum. You can use a simple timer to clear the state for a job after 6 months
have passed.

Seth
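
The approach described above could be sketched roughly as follows. This is plain Python that only simulates per-id state and timers with a dictionary; it is not Flink API code. The class name `JobCounter`, the status strings, and the 183-day retention period are assumptions made for illustration. In a real job the same logic would sit in a keyed process function, with Flink's keyed state and timers taking the place of the dictionary.

```python
SIX_MONTHS_MS = 183 * 24 * 60 * 60 * 1000  # assumed retention period


class JobCounter:
    """Plain-Python stand-in for a keyed process function (not Flink API)."""

    def __init__(self, ttl_ms=SIX_MONTHS_MS):
        self.ttl_ms = ttl_ms
        self.jobs = {}   # job_id -> {"counted": bool, "expires_at": int}
        self.total = 0   # running sum of the emitted +1 / -1 values

    def process(self, job_id, status, ts_ms):
        """Handle one event and return the value emitted for it (+1, -1 or 0)."""
        # Stand-in for timers: drop state whose retention period has passed.
        for expired in [j for j, s in self.jobs.items() if s["expires_at"] <= ts_ms]:
            del self.jobs[expired]

        # State is created on the first event seen for an id.
        state = self.jobs.setdefault(
            job_id, {"counted": False, "expires_at": ts_ms + self.ttl_ms}
        )

        delta = 0
        if status == "completed" and not state["counted"]:
            state["counted"] = True    # this job now contributes to the sum
            delta = 1
        elif status == "cancelled" and state["counted"]:
            state["counted"] = False   # undo the earlier +1
            delta = -1

        self.total += delta
        return delta


counter = JobCounter()
for job_id, status, ts in [
    ("a", "started", 0),
    ("a", "completed", 10),
    ("b", "started", 20),
    ("b", "cancelled", 30),  # never completed, so nothing to undo
]:
    counter.process(job_id, status, ts)
print(counter.total)  # 1
```

Keeping a `counted` flag per id is what makes a cancel after "started" or "progress" a no-op while a cancel after "completed" emits -1.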

On Wed, Sep 15, 2021 at 12:34 PM Pedro Silva <pe...@gmail.com> wrote:

> Hello,
>
> Has anyone used streaming SQL pattern matching, as shown in this email
> thread, to count certain transitions on a stream?
> Is it feasible?
>
> Thank you,
> Pedro Silva

Re: CEP library support in Python

Posted by Pedro Silva <pe...@gmail.com>.
Hello,

Has anyone used streaming SQL pattern matching, as shown in this email thread, to count certain transitions on a stream?
Is it feasible?

Thank you,
Pedro Silva

> On 13 Sep 2021, at 11:16, Pedro Silva <pe...@gmail.com> wrote:
> 
> 
> Hello Seth,
> 
> Thank you very much for your reply. I've taken a look at MATCH_RECOGNIZE, but I have the following question: can I implement a state machine that detects patterns with multiple end states?
> To give you a concrete example:
> 
> I'm trying to count the number of Jobs that have been cancelled and completed. The state machine associated with this Job concept is as follows:
> Started -> On-Going (Multiple Progress messages) -> Closed -> Completed
>    |          |                                       |           |
>    \----------\---------------------------------------\-----------\---> Cancelled
> 
> At any point the Job can be cancelled from the previous state.
> This cancel message can take anywhere from 1-2 weeks to be received.
> The duration of this state machine (Job lifecycle) is roughly 6 months.
> 
> How can I keep a count of the Jobs that have been completed but not cancelled? When a cancel arrives for a Job that was previously completed or closed, the counter should decrease. When a cancel arrives after a started or progress state, the counter should not change (no increment or decrement).
> 
> I hope this example was clear.
> 
> Thank you for your time!
> Pedro Silva

Re: CEP library support in Python

Posted by Pedro Silva <pe...@gmail.com>.
Hello Seth,

Thank you very much for your reply. I've taken a look at MATCH_RECOGNIZE,
but I have the following question: can I implement a state machine that
detects patterns with multiple end states?
To give you a concrete example:

I'm trying to count the number of *Jobs* that have been *cancelled* and
*completed*. The state machine associated with this Job concept is as
follows:
Started -> On-Going (Multiple Progress messages) -> Closed -> Completed
   |          |                                       |           |
   \----------\---------------------------------------\-----------\---> Cancelled

At any point the Job can be cancelled from the previous state.
This cancel message can take anywhere from 1-2 weeks to be received.
The duration of this state machine (Job lifecycle) is roughly 6 months.

How can I keep a count of the Jobs that have been completed but not
cancelled? When a cancel arrives for a Job that was previously completed or
closed, the counter should decrease. When a cancel arrives after a started or
progress state, the counter should not change (no increment or decrement).

I hope this example was clear.

Thank you for your time!
Pedro Silva
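
One way to pin down the lifecycle and the cancel rule described above is a small transition table. This is only a sketch: the lower-case state names, `TRANSITIONS`, `is_valid_transition`, and `cancel_delta` are illustrative names that do not come from the thread, and the table encodes nothing beyond the diagram and the rule stated in the question.

```python
# Allowed transitions, read off the diagram: every state may move to
# "cancelled", and "on_going" may repeat for multiple progress messages.
TRANSITIONS = {
    "started": {"on_going", "cancelled"},
    "on_going": {"on_going", "closed", "cancelled"},
    "closed": {"completed", "cancelled"},
    "completed": {"cancelled"},
    "cancelled": set(),  # terminal state
}


def is_valid_transition(previous, new):
    """Return True if the lifecycle allows moving from previous to new."""
    return new in TRANSITIONS.get(previous, set())


def cancel_delta(previous):
    """Counter change when a cancel arrives for a job last seen in previous.

    A cancel after "closed" or "completed" decreases the counter; a cancel
    after "started" or "on_going" leaves it unchanged.
    """
    return -1 if previous in {"closed", "completed"} else 0


print(is_valid_transition("started", "closed"))  # False
print(cancel_delta("completed"))                 # -1
print(cancel_delta("on_going"))                  # 0
```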


On Fri, 10 Sep 2021 at 20:18, Seth Wiesman <sj...@gmail.com> wrote:

> Hi Pedro,
>
> The DataStream CEP library is not available in Python, but from Python you
> can use `MATCH_RECOGNIZE` in the Table API, which is implemented on top of
> the CEP library.
>
>
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/match_recognize/
>
>
> Seth

Re: CEP library support in Python

Posted by Seth Wiesman <sj...@gmail.com>.
Hi Pedro,

The DataStream CEP library is not available in Python, but from Python you
can use `MATCH_RECOGNIZE` in the Table API, which is implemented on top of
the CEP library.

https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/match_recognize/


Seth
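
As a rough illustration of what this looks like from Python, the query below is passed to a table environment as a SQL string. The table `job_events` and its columns `job_id`, `status`, and `event_time` are hypothetical, and `event_time` is assumed to be declared as a time attribute, which `MATCH_RECOGNIZE` needs for its `ORDER BY`. The helper only calls `execute_sql` on whatever table environment it is given, so creating the environment and registering the table are left out.

```python
# Hypothetical query: per job, match a "completed" event that is directly
# followed by a "cancelled" event.
CANCELLED_AFTER_COMPLETED = """
SELECT *
FROM job_events
    MATCH_RECOGNIZE (
        PARTITION BY job_id
        ORDER BY event_time
        MEASURES
            C.event_time AS completed_at,
            X.event_time AS cancelled_at
        ONE ROW PER MATCH
        AFTER MATCH SKIP PAST LAST ROW
        PATTERN (C X)
        DEFINE
            C AS C.status = 'completed',
            X AS X.status = 'cancelled'
    )
"""


def find_cancelled_after_completed(t_env):
    """Submit the query through the given table environment's execute_sql."""
    return t_env.execute_sql(CANCELLED_AFTER_COMPLETED)
```

With a PyFlink `TableEnvironment` in which `job_events` has been registered, `find_cancelled_after_completed(t_env)` would return the table result for the matches.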

On Fri, Sep 10, 2021 at 11:34 AM Pedro Silva <pe...@gmail.com> wrote:

> Hello,
>
> Is Flink's CEP library
> <https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/libs/cep/>
> available in Python? I see no references to it in the documentation, so I'm
> guessing the answer is no, but I wanted confirmation from the community or
> developers.
>
> Are there plans to support this library in Python or, alternatively, is
> there another library that can be used from Python?
>
> Thank you and have a nice weekend,
> Pedro Silva
>