Posted to dev@kafka.apache.org by Rajesh Kalyanasundaram <Ra...@fico.com> on 2019/04/09 11:09:45 UTC

Dynamic window size for aggregations

Hi all,
We have a requirement to implement aggregations with TimeWindows whose window sizes may vary. For example, I may want a tumbling window that tumbles on the 31st of January, the 28th of February, the 31st of March, and so on.
We did an initial analysis of TimeWindows and found that the window size is fixed in many places in the Streams code.

Any guidance in supporting this would be very much appreciated.
Thank you.
Regards,
Rajesh
This email and any files transmitted with it are confidential, proprietary and intended solely for the individual or entity to whom they are addressed. If you have received this email in error please delete it immediately.

Re: Dynamic window size for aggregations

Posted by Boyang Chen <bc...@outlook.com>.
Hey Rajesh,

I do like the idea of customized windows. To make that work, we need to define rule-based window size generation. In practice, fully randomized windows are rare, and those are usually handled as session windows. For a calendar-based window strategy, the goal is to define a cyclic formula for window creation. For example, we should define an API like:
new WindowStore(List[Duration] months -> (31, 28, 31, 30...))

and note that the correct cycle is 4 years (1 extra day in February). This seems to be a very business-specific use case that would need a lot of work to make happen. Considering the amount of work, a day-based window should definitely be easier for you; you would just need to do a secondary aggregation on top of the raw data. Do you want to talk more about your use case here, especially how much it would matter for your case if a rule-based window store API were developed?

Boyang
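
A minimal sketch of the secondary aggregation suggested above, assuming the per-day results are already available as a map from calendar day to a count. The names `MonthlyRollup` and `rollUp` are hypothetical and not part of Kafka Streams; in a real topology the daily values would presumably come from a one-day tumbling window.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not a Kafka Streams class): rolls per-day
// aggregates up into per-calendar-month aggregates.
final class MonthlyRollup {

    // Sums every daily total into the month that contains that day.
    // Month lengths (28, 29, 30, 31 days) are handled by java.time,
    // so no explicit list of durations is needed.
    static Map<YearMonth, Long> rollUp(Map<LocalDate, Long> dailyTotals) {
        Map<YearMonth, Long> monthly = new TreeMap<>();
        for (Map.Entry<LocalDate, Long> entry : dailyTotals.entrySet()) {
            monthly.merge(YearMonth.from(entry.getKey()), entry.getValue(), Long::sum);
        }
        return monthly;
    }
}
```

The trade-off is that the monthly figure is only as fresh as the latest daily result, but the windowing itself stays fixed-size.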

________________________________
From: Rajesh Kalyanasundaram <Ra...@fico.com>
Sent: Wednesday, April 10, 2019 1:05 AM
To: dev@kafka.apache.org
Subject: Re: Dynamic window size for aggregations

Thanks Boyang & Matthias for your replies.
Boyang, you are right. More generally, I want to create a calendar-based window. If my window is in months, then the window shall tumble or hop at the end of every month. If my window is in days, then the window shall tumble or advance at the end of every day.
Perhaps the thread's subject is misleading, in the sense that this is not about dynamic windows but rather calendar-based windows. Obviously, if my unit is days, then my window size is NOT dynamic.
Thanks
Regards,
Rajesh

On 09/04/19, 10:12 PM, "Boyang Chen" <bc...@outlook.com> wrote:

    Hey Rajesh,


    My understanding is that you want to create a month-based time window, is that correct?


    Boyang

    ________________________________
    From: Matthias J. Sax <ma...@confluent.io>
    Sent: Tuesday, April 9, 2019 11:42 PM
    To: dev@kafka.apache.org
    Subject: Re: Dynamic window size for aggregations

    Feel free to create a feature request JIRA.

    For now, you could use a custom processor (via `.transform()`) that uses
    an attached window store to implement the logic you need.


    -Matthias

    On 4/9/19 4:09 AM, Rajesh Kalyanasundaram wrote:
    > Hi all,
    > We have a requirement to implement aggregations with TimeWindows whose window sizes may vary. For example, I may want a tumbling window that tumbles on the 31st of January, the 28th of February, the 31st of March, and so on.
    > We did an initial analysis of TimeWindows and found that the window size is fixed in many places in the Streams code.
    >
    > Any guidance in supporting this would be very much appreciated.
    > Thank you.
    > Regards,
    > Rajesh

Re: Dynamic window size for aggregations

Posted by Rajesh Kalyanasundaram <Ra...@fico.com>.
Thanks Boyang & Matthias for your replies.
Boyang, you are right. More generally, I want to create a calendar-based window. If my window is in months, then the window shall tumble or hop at the end of every month. If my window is in days, then the window shall tumble or advance at the end of every day.
Perhaps the thread's subject is misleading, in the sense that this is not about dynamic windows but rather calendar-based windows. Obviously, if my unit is days, then my window size is NOT dynamic.
Thanks
Regards,
Rajesh
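
To illustrate what a calendar-based window means in terms of boundaries, here is a sketch that maps a record timestamp to the day or month window containing it, using only `java.time`. `CalendarWindow` and `forTimestamp` are hypothetical names, not Kafka Streams API, and the time zone is an assumption the application would have to choose.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not a Kafka Streams class.
final class CalendarWindow {
    final long startMs; // inclusive
    final long endMs;   // exclusive

    CalendarWindow(long startMs, long endMs) {
        this.startMs = startMs;
        this.endMs = endMs;
    }

    // Returns the calendar day or calendar month (in the given zone)
    // that contains the timestamp. Only DAYS and MONTHS are supported here.
    static CalendarWindow forTimestamp(long timestampMs, ChronoUnit unit, ZoneId zone) {
        ZonedDateTime t = Instant.ofEpochMilli(timestampMs).atZone(zone);
        ZonedDateTime start;
        switch (unit) {
            case DAYS:
                start = t.toLocalDate().atStartOfDay(zone);
                break;
            case MONTHS:
                start = t.toLocalDate().withDayOfMonth(1).atStartOfDay(zone);
                break;
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
        // Adding one calendar unit yields the next boundary, so the
        // window length follows the calendar rather than a fixed size.
        ZonedDateTime end = start.plus(1, unit);
        return new CalendarWindow(start.toInstant().toEpochMilli(),
                                  end.toInstant().toEpochMilli());
    }
}
```

With `ChronoUnit.MONTHS` the resulting window is 28 to 31 days long depending on the month; with `ChronoUnit.DAYS` in UTC it is always 24 hours, which matches the point that day-based windows are not dynamic.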

Re: Dynamic window size for aggregations

Posted by Boyang Chen <bc...@outlook.com>.
Hey Rajesh,


My understanding is that you want to create a month-based time window, is that correct?


Boyang

Re: Dynamic window size for aggregations

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Feel free to create a feature request JIRA.

For now, you could use a custom processor (via `.transform()`) that uses
an attached window store to implement the logic you need.


-Matthias
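
One way to picture the custom processor approach is the dependency-free sketch below. It counts records per key and per calendar month. The `HashMap` only stands in for the attached window store, `MonthlyCountProcessor` is a hypothetical name, and UTC month boundaries are an assumption; a real implementation would put the same logic inside the transformer and read and write the state store instead.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Kafka Streams API.
final class MonthlyCountProcessor {

    // Stand-in for a window store: key -> (window start in ms -> count).
    private final Map<String, Map<Long, Long>> store = new HashMap<>();

    // Start of the UTC calendar month containing the timestamp.
    static long monthStartMs(long timestampMs) {
        YearMonth month = YearMonth.from(Instant.ofEpochMilli(timestampMs).atZone(ZoneOffset.UTC));
        return month.atDay(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Called once per record: finds the record's month window and
    // increments that window's count. Returns the updated count.
    long process(String key, long timestampMs) {
        long windowStart = monthStartMs(timestampMs);
        return store.computeIfAbsent(key, k -> new HashMap<>())
                    .merge(windowStart, 1L, Long::sum);
    }

    // Current count for a key and window start, or null if none was recorded.
    Long countFor(String key, long windowStartMs) {
        Map<Long, Long> windows = store.get(key);
        return windows == null ? null : windows.get(windowStartMs);
    }
}
```

Because the window start is derived from the record timestamp rather than from a fixed size, the same code handles 28-day and 31-day months alike. Retention, late records, and emitting final results are left out of this sketch.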
