Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/07/04 14:56:55 UTC

[GitHub] [flink] reswqa opened a new pull request, #20157: [FLINK-28326][runtime] fix unstable test ResultPartitionTest.testIdleAndBackPressuredTime.

reswqa opened a new pull request, #20157:
URL: https://github.com/apache/flink/pull/20157

   ## What is the purpose of the change
   
   Fix the unstable test ResultPartitionTest.testIdleAndBackPressuredTime. Because of a race condition, the main thread may recycle the buffer before the request thread has requested one from the bufferPool, which causes the test to fail.
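
The race described above can be sketched outside Flink in plain Java. This is only an illustration under stated assumptions: an empty `ArrayBlockingQueue` stands in for the buffer pool, and `BlockingWaitSketch` and `isInMethod` are hypothetical names, not the helpers used in the actual test. The point is that a start latch only proves the request thread is running, so the main thread additionally polls until the thread is really waiting inside the blocking call before it hands over the buffer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

public class BlockingWaitSketch {

    /** Returns true if any frame of the trace belongs to the given class and method. */
    public static boolean isInMethod(
            StackTraceElement[] trace, String className, String methodName) {
        for (StackTraceElement frame : trace) {
            if (frame.getClassName().equals(className)
                    && frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the buffer pool (assumption): the queue is empty, so take() must block.
        BlockingQueue<String> pool = new ArrayBlockingQueue<>(1);
        CountDownLatch started = new CountDownLatch(1);

        Thread requestThread =
                new Thread(
                        () -> {
                            started.countDown();
                            try {
                                pool.take(); // blocks until the main thread "recycles" a buffer
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            }
                        });
        requestThread.start();

        // The latch only tells us the thread has started, not that it is blocked yet.
        started.await();

        // Poll until the thread has entered take() and is parked, instead of sleeping
        // for a fixed time and hoping that it got there.
        String queueClass = ArrayBlockingQueue.class.getName();
        while (!(isInMethod(requestThread.getStackTrace(), queueClass, "take")
                && requestThread.getState() == Thread.State.WAITING)) {
            Thread.sleep(10);
        }

        // Only now hand over the buffer, so the request thread has measurably waited.
        pool.put("buffer");
        requestThread.join();
        System.out.println("request thread was blocked before the buffer was handed over");
    }
}
```

With a fixed `Thread.sleep(100)` in place of the polling loop, a slow scheduler could let `put` run before `take` is reached, and the request thread would then never block at all.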
   
   
   ## Brief change log
   
     - *Migrate ResultPartitionTest and PartitionTestUtils to JUnit 5 and AssertJ.*
     - *Fix the unstable test ResultPartitionTest.testIdleAndBackPressuredTime.*
   
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as *(please describe tests)*.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] flinkbot commented on pull request #20157: [FLINK-28326][runtime] fix unstable test ResultPartitionTest.testIdleAndBackPressuredTime.

Posted by GitBox <gi...@apache.org>.
flinkbot commented on PR #20157:
URL: https://github.com/apache/flink/pull/20157#issuecomment-1173919858

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "c3e0f700b95236c15eaa2bf5703b08a105396d00",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c3e0f700b95236c15eaa2bf5703b08a105396d00",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * c3e0f700b95236c15eaa2bf5703b08a105396d00 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>




[GitHub] [flink] reswqa commented on a diff in pull request #20157: [FLINK-28326][runtime] fix unstable test ResultPartitionTest.testIdleAndBackPressuredTime.

Posted by GitBox <gi...@apache.org>.
reswqa commented on code in PR #20157:
URL: https://github.com/apache/flink/pull/20157#discussion_r914982785


##########
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/ResultPartitionTest.java:
##########
@@ -405,20 +395,37 @@ public void testIdleAndBackPressuredTime() throws IOException, InterruptedExcept
         // wait until request thread start to run.
         syncLock.await();
 
-        Thread.sleep(100);
+        // wait until request buffer blocking.
+        boolean bufferPoolBlocking = false;
+
+        for (int i = 0; i < 50; i++) {
+            StackTraceElement[] stackTrace = requestThread.getStackTrace();
+            bufferPoolBlocking = isInBlockingBufferRequest(stackTrace);
+
+            if (bufferPoolBlocking) {
+                break;
+            } else {
+                // Retry
+                Thread.sleep(500);
+            }
+        }
+
+        // Verify that Thread was in blocking request
+        assertThat(bufferPoolBlocking)
+                .isTrue()
+                .withFailMessage("Did not trigger blocking buffer request.");

Review Comment:
   Thanks for your review, @zentol. If the loop is changed as you suggest, we need to add a timeout to this test, right?
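
One possible way to bound an otherwise open-ended polling loop is to give it a deadline. The sketch below is plain Java and makes no claim about what the PR finally adopted; `BoundedPollSketch` and `waitUntil` are hypothetical names, not Flink or JUnit API.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class BoundedPollSketch {

    /**
     * Polls the condition until it holds, sleeping for the given interval between checks.
     * Throws TimeoutException if the condition still does not hold when the timeout expires.
     */
    public static void waitUntil(BooleanSupplier condition, Duration timeout, Duration interval)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            // Compare via subtraction so the check stays correct if nanoTime overflows.
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Condition not met within " + timeout);
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        // Example condition (assumption): becomes true about 20 ms after start.
        waitUntil(
                () -> System.nanoTime() - start > 20_000_000L,
                Duration.ofSeconds(5),
                Duration.ofMillis(5));
        System.out.println("condition met");
    }
}
```

In a test, the condition would be the stack-trace check on the request thread. A test-level timeout achieves the same goal with less code, at the cost of a less specific failure message.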





[GitHub] [flink] zentol commented on a diff in pull request #20157: [FLINK-28326][runtime] fix unstable test ResultPartitionTest.testIdleAndBackPressuredTime.

Posted by GitBox <gi...@apache.org>.
zentol commented on code in PR #20157:
URL: https://github.com/apache/flink/pull/20157#discussion_r914777339


##########
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/ResultPartitionTest.java:
##########
@@ -405,20 +395,37 @@ public void testIdleAndBackPressuredTime() throws IOException, InterruptedExcept
         // wait until request thread start to run.
         syncLock.await();
 
-        Thread.sleep(100);
+        // wait until request buffer blocking.
+        boolean bufferPoolBlocking = false;
+
+        for (int i = 0; i < 50; i++) {
+            StackTraceElement[] stackTrace = requestThread.getStackTrace();
+            bufferPoolBlocking = isInBlockingBufferRequest(stackTrace);
+
+            if (bufferPoolBlocking) {
+                break;
+            } else {
+                // Retry
+                Thread.sleep(500);
+            }
+        }
+
+        // Verify that Thread was in blocking request
+        assertThat(bufferPoolBlocking)
+                .isTrue()
+                .withFailMessage("Did not trigger blocking buffer request.");

Review Comment:
   ```suggestion
           while (!isInBlockingBufferRequest(requestThread.getStackTrace())) {
               Thread.sleep(50);
           }
   ```





[GitHub] [flink] reswqa commented on pull request #20157: [FLINK-28326][runtime] fix unstable test ResultPartitionTest.testIdleAndBackPressuredTime.

Posted by GitBox <gi...@apache.org>.
reswqa commented on PR #20157:
URL: https://github.com/apache/flink/pull/20157#issuecomment-1179124729

   Hi @zentol, I have updated this pull request according to your suggestion, PTAL~




[GitHub] [flink] zentol merged pull request #20157: [FLINK-28326][runtime] fix unstable test ResultPartitionTest.testIdleAndBackPressuredTime.

Posted by GitBox <gi...@apache.org>.
zentol merged PR #20157:
URL: https://github.com/apache/flink/pull/20157

