Posted to issues@flink.apache.org by "Rong Rong (JIRA)" <ji...@apache.org> on 2018/05/04 18:26:00 UTC
[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time
Window with pane optimization
[ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464239#comment-16464239 ]
Rong Rong commented on FLINK-7001:
----------------------------------
Hi [~jark],
Is there a FLIP being proposed based on the pain points discussed here?
This inefficiency in windowing has been observed more and more frequently in our day-to-day operations lately.
We would like to contribute to the design and the implementation of this improvement if possible :-)
Thanks,
Rong
> Improve performance of Sliding Time Window with pane optimization
> -----------------------------------------------------------------
>
> Key: FLINK-7001
> URL: https://issues.apache.org/jira/browse/FLINK-7001
> Project: Flink
> Issue Type: Improvement
> Components: DataStream API
> Reporter: Jark Wu
> Assignee: Jark Wu
> Priority: Major
>
> Currently, the implementation of time-based sliding windows treats each window individually and replicates records to each window. For a 10-minute window that slides by 1 second, the data is replicated 600-fold (10 minutes / 1 second). We can optimize sliding windows by dividing them into panes (aligned with the slide), so that we can avoid record duplication and leverage the checkpoint.
> I will attach a more detailed design doc to the issue.
> The following issues are similar to this issue: FLINK-5387, FLINK-6990
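
For readers new to the idea, the pane approach described in the issue can be illustrated with a small standalone sketch. This is not Flink code and not the design proposed in the issue: the function names (replication_factor, sliding_sums_naive, sliding_sums_with_panes) are hypothetical, the aggregate is assumed to be a simple sum, and the window size is assumed to be a multiple of the slide. It only shows that each record can be stored once, in the pane aligned with the slide, and that window results can then be assembled from the per-pane partial sums.

```python
from collections import defaultdict


def replication_factor(size_ms, slide_ms):
    """Number of sliding windows that each record falls into."""
    return size_ms // slide_ms


def sliding_sums_naive(events, size_ms, slide_ms):
    """Baseline: add every record to every window that contains it."""
    windows = defaultdict(int)
    for ts, value in events:
        last_start = ts - ts % slide_ms  # latest window start covering ts
        for k in range(size_ms // slide_ms):
            windows[last_start - k * slide_ms] += value
    return dict(windows)


def sliding_sums_with_panes(events, size_ms, slide_ms):
    """Pane variant: each record is stored once, windows merge pane sums."""
    if size_ms % slide_ms != 0:
        raise ValueError("this sketch assumes size is a multiple of slide")

    # Step 1: one partial aggregate per pane; a record touches exactly one pane.
    panes = defaultdict(int)
    for ts, value in events:
        panes[ts - ts % slide_ms] += value

    # Step 2: a window starting at s covers the panes s, s + slide, ...,
    # so each pane contributes to size / slide consecutive windows.
    windows = defaultdict(int)
    for pane_start, partial in panes.items():
        for k in range(size_ms // slide_ms):
            windows[pane_start - k * slide_ms] += partial
    return dict(windows)
```

In this sketch the per-record work drops from size / slide updates to a single pane update. The merge in step 2 still touches size / slide windows, but once per pane rather than once per record. A streaming implementation would presumably merge panes when a window fires instead of in a batch pass as shown here.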
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)