Posted to issues@flink.apache.org by Aitozi <gi...@git.apache.org> on 2018/02/03 01:24:12 UTC

[GitHub] flink pull request #5405: [FLINK-8477][Window]Add api to support user to ski...

GitHub user Aitozi opened a pull request:

    https://github.com/apache/flink/pull/5405

    [FLINK-8477][Window] Add API to let users skip several broken windows

    In production, some applications, such as monitoring jobs, need accurate data. Consider this scenario: if we start a job at 10:45:20 with a 1-minute window aggregation, it may emit an incomplete (broken) result for the 10:45 window, which can lead to wrong conclusions. We can offer a user-facing API that lets users choose to skip several windows and so avoid the broken data.
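
The arithmetic behind that scenario can be sketched in plain Java, without any Flink dependency. This is only an illustration, assuming tumbling windows aligned to the start of the day with no offset; the class and method names (`PartialWindowExample`, `windowStart`, `missingPrefix`) are hypothetical and not part of the patch.

```java
import java.time.Duration;
import java.time.LocalTime;

// Illustrative sketch only: shows why the first window after a job start is partial.
final class PartialWindowExample {

    // Start of the tumbling window that contains t (assumes windows aligned to midnight, no offset).
    static LocalTime windowStart(LocalTime t, Duration size) {
        long sizeNanos = size.toNanos();
        long nanos = t.toNanoOfDay();
        return LocalTime.ofNanoOfDay(nanos - (nanos % sizeNanos));
    }

    // Portion of the first window that elapsed before the job started, i.e. the data that is missing.
    static Duration missingPrefix(LocalTime jobStart, Duration size) {
        return Duration.between(windowStart(jobStart, size), jobStart);
    }

    public static void main(String[] args) {
        LocalTime jobStart = LocalTime.of(10, 45, 20);
        Duration size = Duration.ofMinutes(1);
        // The 10:45 window has lost its first 20 seconds of input.
        System.out.println(missingPrefix(jobStart, size).getSeconds()); // prints 20
    }
}
```

With a start at 10:45:20, the 10:45 window only aggregates the last 40 of its 60 seconds, which is the "broken" result described above.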
    
    ## Brief change log
    
      - Add a streaming API
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Aitozi/flink FLINK-8477

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5405.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5405
    
----
commit 9c6b77077bac2e0dfa4ea3bddf11bd27831ba3e4
Author: minwenjun <mi...@...>
Date:   2018-02-02T15:46:11Z

    Add api to support user to skip serval broken window

----


---

[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

Posted by Aitozi <gi...@git.apache.org>.
Github user Aitozi commented on the issue:

    https://github.com/apache/flink/pull/5405
  
    Hi @aljoscha, you mentioned two points:
    1. Events may arrive out of order in event-time processing.
    2. We can use a WindowFunction or ProcessWindowFunction to filter out several windows by specifying the window start time and end time.
    
    I have some different ideas:
    1. When we deal with an out-of-order event-time stream, we may specify maxOutOfOrder so that not too many late elements are skipped. So when the job starts or restarts, the maximum number of windows to skip can be set to maxOutOfOrder / (the length of the tumbling window), so that late elements do not produce incorrect results. The number of windows that need to be skipped depends on the degree of out-of-orderness.
    2. We need to skip several broken windows, but we don't know which windows are broken. We can only detect which window fires first; the several windows after it are broken too. The number should vary by production job (according to maxOutOfOrder and the length of the window).
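
The ratio in point 1 can be written out as a small helper. This is a minimal sketch in plain Java, not code from the patch: the class and method names are hypothetical, and rounding the ratio up to a whole number of windows is an assumption made here (the comment only gives the ratio itself).

```java
import java.time.Duration;

// Illustrative sketch only: derives a skip count from maxOutOfOrder and the window length.
final class SkipCount {

    // Number of tumbling windows to skip = maxOutOfOrder / windowSize.
    // Assumption: the ratio is rounded up so that a partially affected window is skipped too.
    static long windowsToSkip(Duration maxOutOfOrder, Duration windowSize) {
        long size = windowSize.toMillis();
        if (size <= 0) {
            throw new IllegalArgumentException("windowSize must be at least 1 ms");
        }
        long lateness = Math.max(0L, maxOutOfOrder.toMillis());
        return (lateness + size - 1) / size; // integer ceiling division
    }

    public static void main(String[] args) {
        // 150 s of allowed out-of-orderness over 1-minute windows
        System.out.println(windowsToSkip(Duration.ofSeconds(150), Duration.ofMinutes(1))); // prints 3
    }
}
```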


---

[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

Posted by Aitozi <gi...@git.apache.org>.
Github user Aitozi commented on the issue:

    https://github.com/apache/flink/pull/5405
  
    ping @aljoscha 


---

[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the issue:

    https://github.com/apache/flink/pull/5405
  
    I commented on the issue: https://issues.apache.org/jira/browse/FLINK-8477?focusedCommentId=16359834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16359834


---

[GitHub] flink issue #5405: [FLINK-8477][Window]Add api to support user to skip serva...

Posted by Aitozi <gi...@git.apache.org>.
Github user Aitozi commented on the issue:

    https://github.com/apache/flink/pull/5405
  
    cc @aljoscha please help review this patch.
    ![image](https://user-images.githubusercontent.com/9486140/35761522-6e00f4b8-08c4-11e8-8063-7ec015802428.png)
    See the picture above. Suppose a user runs without a checkpoint to avoid catching up on data after a crash, and uses kafka#setStartFromLatest to consume the latest data. Without the skip API, the job can produce broken data, which may trigger an alert in a monitoring scenario. A user who wants to skip the broken windows can choose to skip several windows after the first fire.
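
The "skip several windows after the first fire" behaviour can be sketched as a small guard in plain Java. This is not the API proposed in the patch; `BrokenWindowGuard` and `shouldEmit` are hypothetical names, and the sketch assumes tumbling windows identified by their start timestamp in milliseconds. In a Flink job, similar logic could live in a window function that reads the start of the fired window, as suggested earlier in the thread.

```java
// Illustrative sketch only: suppresses the first N windows, counted from the first window that fires.
final class BrokenWindowGuard {
    private final long windowSizeMillis;
    private final int windowsToSkip;
    private Long firstFiredStart; // null until the first window fires

    BrokenWindowGuard(long windowSizeMillis, int windowsToSkip) {
        if (windowSizeMillis <= 0 || windowsToSkip < 0) {
            throw new IllegalArgumentException("invalid window size or skip count");
        }
        this.windowSizeMillis = windowSizeMillis;
        this.windowsToSkip = windowsToSkip;
    }

    // Returns true if the result of the window starting at windowStartMillis should be emitted.
    boolean shouldEmit(long windowStartMillis) {
        if (firstFiredStart == null) {
            firstFiredStart = windowStartMillis; // remember the first (possibly broken) window
        }
        long firstTrustedStart = firstFiredStart + (long) windowsToSkip * windowSizeMillis;
        return windowStartMillis >= firstTrustedStart;
    }

    public static void main(String[] args) {
        BrokenWindowGuard guard = new BrokenWindowGuard(60_000L, 2);
        System.out.println(guard.shouldEmit(0L));       // prints false (first fired window)
        System.out.println(guard.shouldEmit(60_000L));  // prints false (second skipped window)
        System.out.println(guard.shouldEmit(120_000L)); // prints true
    }
}
```

Keeping the decision keyed on the window start, rather than on a simple counter, means a late re-firing of an already skipped window is still suppressed.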



---