Posted to users@kafka.apache.org by "Parthasarathy, Mohan" <mp...@hpe.com> on 2019/09/24 19:58:40 UTC

Event time semantics across windows

Hi,

Here is a simple example:

The application reads a topic where messages arrive from various clients identified by “client_id”. The messages are grouped by “client_id” and windowed with windowedBy into 10-second windows with a grace period of 5 seconds. The event time of the stream progresses when we receive any message from any of the clients. If one client gets stuck and stops sending data for longer than the grace period while the other clients keep sending, stream time advances and the data from the stuck client may never be processed.

I am wondering why the event time takes effect before the “groupBy”. If the event time were tracked per window (where there is one per “client_id”), it would have worked well in this case. Is there a reason for the current design? Is there any way to solve this problem?
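The behavior described above can be sketched with a small self-contained toy model. This is not Kafka Streams code; the tumbling-window math and the drop rule are simplified assumptions, and "stream time" is modeled as the maximum event time seen so far on the partition:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of per-partition stream-time tracking: stream time is the max
// event time seen on the partition, and a record is dropped once its window
// end plus the grace period falls behind stream time.
public class StreamTimeDemo {
    static final long WINDOW_MS = 10_000, GRACE_MS = 5_000;
    long streamTime = Long.MIN_VALUE;
    List<String> accepted = new ArrayList<>(), dropped = new ArrayList<>();

    void process(String clientId, long eventTimeMs) {
        streamTime = Math.max(streamTime, eventTimeMs);
        // End of the tumbling window this record falls into.
        long windowEnd = (eventTimeMs / WINDOW_MS) * WINDOW_MS + WINDOW_MS;
        if (windowEnd + GRACE_MS > streamTime) {
            accepted.add(clientId + "@" + eventTimeMs);
        } else {
            dropped.add(clientId + "@" + eventTimeMs);
        }
    }

    public static void main(String[] args) {
        StreamTimeDemo d = new StreamTimeDemo();
        d.process("a", 1_000);   // advances stream time to 1s, accepted
        d.process("b", 40_000);  // another client keeps sending; stream time = 40s
        d.process("a", 2_000);   // a's window [0s,10s) closed at 15s -> dropped
        System.out.println("accepted=" + d.accepted);
        System.out.println("dropped=" + d.dropped);
    }
}
```

Client "a"'s late record lands in a window whose close time (window end 10 s + grace 5 s = 15 s) is already behind the stream time of 40 s that client "b" advanced, so it is dropped even though "a" itself sent nothing in between.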

Thanks
Mohan


Re: Event time semantics across windows

Posted by "Parthasarathy, Mohan" <mp...@hpe.com>.
I agree that there is overhead. The same thing happens when multiple applications produce into a single topic: if one application crashes or stops running for some time, applications consuming from that topic see a similar problem.

Though it might be application specific, are there any guidelines on the overhead associated with a long grace period? Is it reasonable for it to be in hours? I guess there is state held in memory and in RocksDB; is there anything else to worry about?
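As a back-of-the-envelope sketch of why a multi-hour grace period matters for state size: a windowed store retains entries for roughly windowSize + grace, so the number of live windows per key (and hence the retained state) grows with the grace period. All numbers below are hypothetical placeholders, not measurements:

```java
import java.time.Duration;

public class StateSizeEstimate {
    public static void main(String[] args) {
        // Hypothetical workload numbers -- plug in your own.
        Duration windowSize = Duration.ofSeconds(10);
        Duration grace = Duration.ofHours(1);
        long distinctKeys = 10_000;       // active client_ids
        long bytesPerWindowEntry = 200;   // serialized key + aggregate + overhead (guess)

        // Entries are retained for about windowSize + grace, so the number
        // of live tumbling windows per key is roughly:
        long retentionMs = windowSize.plus(grace).toMillis();
        long windowsPerKey = retentionMs / windowSize.toMillis();

        long approxBytes = distinctKeys * windowsPerKey * bytesPerWindowEntry;
        System.out.println("windows per key ~ " + windowsPerKey);
        System.out.println("approx state ~ " + approxBytes / (1024 * 1024) + " MiB");
    }
}
```

Under these assumed numbers, a 1-hour grace on 10-second windows keeps a few hundred windows alive per key, so state grows by orders of magnitude compared to a few-seconds grace; whether that is acceptable depends on the key cardinality and available disk for RocksDB.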

Thanks
Mohan

On 9/25/19, 11:25 PM, "Matthias J. Sax" <ma...@confluent.io> wrote:

    Time is currently tracked per partition, not per key. There is a ticket
    to add key-based time tracking already:
    https://issues.apache.org/jira/browse/KAFKA-8769
    
    One issue with key-based time tracking is the increased overhead; that's
    why time is currently tracked per partition.
    
    The only workaround you can apply is to increase the grace period.
    Increasing `max.task.idle.ms` config may also help.
    
    
    -Matthias
    


Re: Event time semantics across windows

Posted by "Matthias J. Sax" <ma...@confluent.io>.
Time is currently tracked per partition, not per key. There is a ticket
to add key-based time tracking already:
https://issues.apache.org/jira/browse/KAFKA-8769

One issue with key-based time tracking is the increased overhead; that's
why time is currently tracked per partition.

The only workaround you can apply is to increase the grace period.
Increasing `max.task.idle.ms` config may also help.
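Both workarounds can be sketched as follows; the 30-minute grace and the 30-second `max.task.idle.ms` values below are illustrative assumptions, not recommendations:

```java
import java.time.Duration;
import java.util.Properties;

public class GraceWorkaroundSketch {
    public static void main(String[] args) {
        // Workaround 1: widen the grace period so late records from a
        // stalled client are still accepted. In the Streams DSL this is
        // expressed roughly as (sketch, values are assumptions):
        //   TimeWindows.of(Duration.ofSeconds(10)).grace(Duration.ofMinutes(30))
        Duration windowSize = Duration.ofSeconds(10);
        Duration grace = Duration.ofMinutes(30);

        // Workaround 2: raise max.task.idle.ms so a task waits longer for
        // records on an empty input partition before advancing stream time.
        Properties props = new Properties();
        props.put("max.task.idle.ms", "30000"); // assumed value; tune per workload

        System.out.println("retention per window = "
                + windowSize.plus(grace).toSeconds() + "s");
        System.out.println("max.task.idle.ms = "
                + props.getProperty("max.task.idle.ms"));
    }
}
```

The trade-off is the one discussed earlier in the thread: a wider grace period keeps windows open (and state retained) longer, and a higher idle time adds processing latency whenever a partition is merely slow rather than stalled.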


-Matthias
