Posted to issues@flink.apache.org by "Philipp Grulich (JIRA)" <ji...@apache.org> on 2018/05/28 09:34:00 UTC

[jira] [Comment Edited] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization

    [ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492438#comment-16492438 ] 

Philipp Grulich edited comment on FLINK-7001 at 5/28/18 9:33 AM:
-----------------------------------------------------------------

Hi [~walterddr],

I think that sounds like a good starting point and I would definitely support you in this effort. 
 Should we maybe create a FLIP document to discuss the open problems?

I think for the RocksDB compatibility of the slicing approach we could also rely on primitives that are already in use, such as AppendingState.
 However, backward compatibility could be a bigger problem, because no migration function can convert a window stored in buckets into a slice.

Best,
 Philipp


was (Author: pgrulich):
Hi [~walterddr],

I think that sounds like a good starting point and I would definitely support you in this effort. 
Should we maybe create a FIP document to discuss the open problems?

I think for the RockSDB compatibility of the slicing approach we could also rely on the already used primitives like AppendingState.
However, the backward compatibility could be a bigger problem, because there is no possible migration function from an in buckets stored window to a slice.

Best,
Philipp

> Improve performance of Sliding Time Window with pane optimization
> -----------------------------------------------------------------
>
>                 Key: FLINK-7001
>                 URL: https://issues.apache.org/jira/browse/FLINK-7001
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API
>            Reporter: Jark Wu
>            Assignee: Jark Wu
>            Priority: Major
>
> Currently, the implementation of time-based sliding windows treats each window individually and replicates records to each window. For a window of 10 minutes that slides by 1 second, the data is replicated 600-fold (10 minutes / 1 second). We can optimize sliding windows by dividing them into panes (aligned with the slide), so that we can avoid record duplication and leverage the checkpoint.
> I will attach a more detailed design doc to the issue.
> The following issues are similar to this issue: FLINK-5387, FLINK-6990
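
To make the pane idea in the description concrete, here is a minimal sketch in plain Java. It is not Flink code and not the design proposed in this issue: the class PaneSketch and all of its methods are hypothetical names, it keeps only a running sum per pane, and it assumes that the slide divides the window size evenly and that window starts are aligned with the slide. Each record is written to exactly one pane, and a window result is assembled from size / slide panes instead of being maintained from size / slide copies of every record.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of pane-based sliding windows; not a Flink API. */
public class PaneSketch {
    private final long size;   // window size in milliseconds
    private final long slide;  // slide in milliseconds (assumed to divide size evenly)
    private final Map<Long, Long> paneSums = new HashMap<>(); // pane start -> partial sum

    public PaneSketch(long size, long slide) {
        if (slide <= 0 || size <= 0 || size % slide != 0) {
            throw new IllegalArgumentException("slide must be positive and divide size evenly");
        }
        this.size = size;
        this.slide = slide;
    }

    /** Number of windows each record would be copied into without panes. */
    public long windowsPerRecord() {
        return size / slide;
    }

    /** Adds a record to exactly one pane, the one whose interval contains the timestamp. */
    public void add(long timestamp, long value) {
        long paneStart = timestamp - Math.floorMod(timestamp, slide);
        paneSums.merge(paneStart, value, Long::sum);
    }

    /** Sum over [windowStart, windowStart + size); windowStart is assumed aligned with the slide. */
    public long windowSum(long windowStart) {
        long sum = 0;
        for (long pane = windowStart; pane < windowStart + size; pane += slide) {
            sum += paneSums.getOrDefault(pane, 0L);
        }
        return sum;
    }

    /** Number of panes currently holding state. */
    public int storedPanes() {
        return paneSums.size();
    }

    public static void main(String[] args) {
        // 10 minute window sliding by 1 second, as in the issue description.
        PaneSketch panes = new PaneSketch(600_000L, 1_000L);
        panes.add(500L, 3);      // lands in pane [0, 1000)
        panes.add(1_500L, 4);    // lands in pane [1000, 2000)
        panes.add(600_200L, 5);  // lands in pane [600000, 601000)
        System.out.println(panes.windowsPerRecord()); // 600
        System.out.println(panes.windowSum(0L));      // 7
        System.out.println(panes.windowSum(1_000L));  // 9
        System.out.println(panes.storedPanes());      // 3
    }
}
```

In this sketch three records produce three stored partial sums, whereas the per-window approach described above would store each record 600 times. A sum is used only because it is trivially mergeable; aggregates that cannot be combined from partial results would need a different pane representation.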


