You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/04/27 14:54:46 UTC

[GitHub] [ozone] sodonnel opened a new pull request, #3357: HDDS-6658. BackgroundPipelineCreator does not always stop quickly

sodonnel opened a new pull request, #3357:
URL: https://github.com/apache/ozone/pull/3357

   ## What changes were proposed in this pull request?
   
   On my laptop, TestSCMSafeModeManager.testSafeModePipelineExitRule() always takes just over 200 seconds. Checking a few PR runs, it does not seem to be slow on PRs, just locally.
   
   Debugging the code, it seems to hang in the BackgroundPipelineCreator.stop() method, where it is waiting for the tread to join:
   
   ```
     public void stop() {
       if (!running.compareAndSet(true, false)) {
         LOG.warn("{} is not running, just ignore.", THREAD_NAME);
         return;
       }
   
       LOG.info("Stopping {}.", THREAD_NAME);
   
       // in case RatisPipelineUtilsThread is sleeping
       synchronized (monitor) {
         monitor.notifyAll();
       }
   
       try {
         thread.join();  //  ----> Hangs here
       } catch (InterruptedException e) {
         LOG.warn("Interrupted during join {}.", THREAD_NAME);
         Thread.currentThread().interrupt();
       }
     }
   ```
   
   It is clearly hanging as the background thread did not exit, and I believe it is because `notify()` is being used to try to exit the thread, when it should really be interrupted. There is a chance that notify is called while the thread is not waiting, and if so, it will just fall back into the wait state and not exit until it wakes up again.
   
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6658
   
   ## How was this patch tested?
   
   Validated slow test is now fast locally.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3357: HDDS-6658. BackgroundPipelineCreator does not always stop quickly

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3357:
URL: https://github.com/apache/ozone/pull/3357#issuecomment-1111516617

   Unit test failure:
   
   ```
   Failures: 
   Error:    TestBackgroundPipelineScrubber.testRun:97 
   ```
   This is https://github.com/apache/ozone/pull/3359 / #3359


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo commented on a diff in pull request #3357: HDDS-6658. BackgroundPipelineCreator does not always stop quickly

Posted by GitBox <gi...@apache.org>.
szetszwo commented on code in PR #3357:
URL: https://github.com/apache/ozone/pull/3357#discussion_r860482773


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/BackgroundPipelineCreator.java:
##########
@@ -148,8 +148,8 @@ public void stop() {
     LOG.info("Stopping {}.", THREAD_NAME);
 
     // in case RatisPipelineUtilsThread is sleeping
-    synchronized (monitor) {
-      monitor.notifyAll();
+    synchronized (this) {
+      thread.interrupt();

Review Comment:
   We should remove "synchronized" since interrupt() does not require it.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/BackgroundPipelineCreator.java:
##########
@@ -80,7 +81,6 @@ public class BackgroundPipelineCreator implements SCMService {
    * configured interval and tries to create pipelines.
    */
   private Thread thread;
-  private final Object monitor = new Object();

Review Comment:
   I suggest to keep the private monitor field.  It can avoid code outside this class calling wait/notify/notifyAll.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] sodonnel commented on pull request #3357: HDDS-6658. BackgroundPipelineCreator does not always stop quickly

Posted by GitBox <gi...@apache.org>.
sodonnel commented on PR #3357:
URL: https://github.com/apache/ozone/pull/3357#issuecomment-1112028735

   @szetszwo Thanks for the review. I have rolled back the earlier changes and implemented again in a new commit. Please have another look when you get a chance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] szetszwo merged pull request #3357: HDDS-6658. BackgroundPipelineCreator does not always stop quickly

Posted by GitBox <gi...@apache.org>.
szetszwo merged PR #3357:
URL: https://github.com/apache/ozone/pull/3357


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org