You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/03/18 15:55:33 UTC

[GitHub] [flink] realdengziqi opened a new pull request #19158: [FLINK-26334][datastream] for 1.14.0 . Modified getWindowStartWithOff…

realdengziqi opened a new pull request #19158:
URL: https://github.com/apache/flink/pull/19158


   Co-authored-by: Lin WanNi <li...@foxmail.com>
   Co-authored-by: Guo YuanFang <16...@qq.com>
   
   ## What is the purpose of the change
   [https://issues.apache.org/jira/browse/FLINK-26334](https://issues.apache.org/jira/browse/FLINK-26334)
   about: [https://github.com/apache/flink/pull/18982](https://github.com/apache/flink/pull/18982)
   The goal of this PR is to fix the bug that:  the element couldn't be assigned to the correct window-start, if it's *timestamp - offset + windowSize < 0*.
   
   This bug located at _org.apache.flink.streaming.api.windowing.windows.TimeWindow_ .
   
    This problem will be triggered by the negative timestamp, and is caused by the calculation method of remainder in the JAVA compiler. 
   
   Specifically, when we try to calculate the window-start of an incoming element,  if _timestamp - offset + windowSize < 0_, based on the current calculation formula for window-start, **the element will be right shifted to the next window, which has a start time larger than the timestamp of current element**, seems violated the assignment principle for elements on window.
   
   ![image](https://user-images.githubusercontent.com/42276568/156824315-b3d277ce-1775-426d-a86e-76535e58b55e.png)
   
   This problem can be fixed by modifying the calculation formula inside the getWindowStartWithOffset() method as below:
   ```java
   public static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
       return timestamp
               - (timestamp - offset) % windowSize
               - (windowSize & (timestamp - offset) >> 63);
   }
   ```
   After this modify, for the element who has negative timestamp, we can still get the correct window-start. Like the below graph showing:
   ![image](https://user-images.githubusercontent.com/42276568/156824911-8625f715-618b-4bf0-a7bd-85f3d7bde21b.png)
   
   ### The getWindowStartWithOffset() method in other package
   
   Considered the common usage of this method, we checked out the other getWindowStartWithOffset() method in the project, found one in the _org.apache.flink.table.runtime.operators.window.grouping.WindowsGrouping_
   
   Turn out this method correctly handled the negative timestamp situation. Below is the source code.
   
   ```java
   private long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
           long remainder = (timestamp - offset) % windowSize;
           // handle both positive and negative cases
           if (remainder < 0) {
               return timestamp - (remainder + windowSize);
           } else {
               return timestamp - remainder;
           }
       }
   ```
   ### further
   When we wrote the test case, we found that the algorithm we wrote would violate the convention that the window is closed on the left and open on the right. In addition, considering the readability of the code, we decided to use the same code as in `org.apache.flink.table.runtime.operators.window.grouping.WindowsGrouping`.
   
   ```java
   private long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
       long remainder = (timestamp - offset) % windowSize;
       // handle both positive and negative cases
       if (remainder < 0){
            return timestamp - (remainder + windowSize);
        }else {
            return timestamp - remainder;
        }
   } 
   ```
   
   In addition, in the process of modification, we found that the algorithm of `getWindowStartWithOffset` in `org.apache.flink.table.runtime.operators.window.TimeWindow` is the same as that in `org.apache.flink.streaming.api.windowing.windows.TimeWindow`. So it should cause the same problem. I think it should also be modified to support negative timestamps
   
   ## Brief change log
   - Fix getWindowStartWithOffset in *TimeWindow.java*
   
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as the tests in the flink-streaming-java [mvn clean verify]
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] fapaul merged pull request #19158: [FLINK-26334][datastream] for 1.14.0 . Modified getWindowStartWithOff…

Posted by GitBox <gi...@apache.org>.
fapaul merged pull request #19158:
URL: https://github.com/apache/flink/pull/19158


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] realdengziqi commented on pull request #19158: [FLINK-26334][datastream] for 1.14.0 . Modified getWindowStartWithOff…

Posted by GitBox <gi...@apache.org>.
realdengziqi commented on pull request #19158:
URL: https://github.com/apache/flink/pull/19158#issuecomment-1073643070


   @fapaul hi , paul. This is a backport PR for  the release-1.14 branch. will u help have a look and merge it if it's ok?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #19158: [FLINK-26334][datastream] for 1.14.0 . Modified getWindowStartWithOff…

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #19158:
URL: https://github.com/apache/flink/pull/19158#issuecomment-1072551258


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33376",
       "triggerID" : "a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33376) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #19158: [FLINK-26334][datastream] for 1.14.0 . Modified getWindowStartWithOff…

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #19158:
URL: https://github.com/apache/flink/pull/19158#issuecomment-1072551258


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #19158: [FLINK-26334][datastream] for 1.14.0 . Modified getWindowStartWithOff…

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #19158:
URL: https://github.com/apache/flink/pull/19158#issuecomment-1072551258


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33376",
       "triggerID" : "a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a2eb4bc43077668f7f9bddc51aa3eb8de9386b0a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=33376) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org