Posted to issues@ozone.apache.org by "Siddhant Sangwan (Jira)" <ji...@apache.org> on 2022/06/20 11:10:00 UTC

[jira] [Commented] (HDDS-6928) ozone container balancer CLI went in hung state due to deadlock

    [ https://issues.apache.org/jira/browse/HDDS-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17556328#comment-17556328 ] 

Siddhant Sangwan commented on HDDS-6928:
----------------------------------------

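This is a deadlock. ContainerBalancer#stopBalancingThread() calls balancingThread.join() while its callers, SCMService#stop() and ContainerBalancer#stopBalancer(), still hold the lock. The lock.unlock() inside stopBalancingThread() releases only that method's own acquisition, so the caller keeps holding the lock for the whole join(). If the balancing thread needs the same lock before it can exit, the join() never returns and the lock is never released.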

{code}
  private void stopBalancingThread() {
    Thread balancingThread;
    lock.lock();
    try {
      balancingThread = currentBalancingThread;
      currentBalancingThread = null;
    } finally {
      lock.unlock();
    }
    // wait for balancingThread to die
    if (balancingThread != null &&
        balancingThread.getId() != Thread.currentThread().getId()) {
      balancingThread.interrupt();
      try {
        balancingThread.join();
      } catch (InterruptedException exception) {
        Thread.currentThread().interrupt();
      }
    }
    LOG.info("Container Balancer stopped successfully.");
  }
{code} 
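
For illustration only, here is a minimal standalone sketch of the same pattern. It is not Ozone code: the class name, the worker, and the method names are hypothetical. It assumes the lock is a java.util.concurrent.locks.ReentrantLock and that the worker needs the lock once more before it can exit. A join timeout is used so that the demo reports the stuck join instead of hanging.

```java
import java.util.concurrent.locks.ReentrantLock;

public class JoinUnderLockDemo {
  // Assumption: a reentrant lock shared by the stopper and the worker.
  private static final ReentrantLock LOCK = new ReentrantLock();

  // Hypothetical worker: sleeps until interrupted, then needs LOCK to exit.
  private static Thread startWorker() {
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE);
      } catch (InterruptedException e) {
        // interrupted by the stopper; fall through to the final step
      }
      LOCK.lock();
      try {
        // final bookkeeping that requires the lock
      } finally {
        LOCK.unlock();
      }
    });
    worker.setDaemon(true);
    worker.start();
    return worker;
  }

  // Bounded join; returns true if the thread has exited.
  private static boolean joinWithTimeout(Thread thread, long timeoutMillis) {
    try {
      thread.join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !thread.isAlive();
  }

  // Same shape as stopBalancingThread(): lock, unlock, interrupt, join.
  private static boolean stopWorker(Thread worker, long timeoutMillis) {
    LOCK.lock();
    try {
      // a real implementation would clear its reference to the worker here
    } finally {
      LOCK.unlock(); // releases only this method's own acquisition
    }
    worker.interrupt();
    return joinWithTimeout(worker, timeoutMillis);
  }

  // Caller already holds LOCK, so the join runs with the lock held.
  public static boolean stopFromCallerHoldingLock(long timeoutMillis) {
    Thread worker = startWorker();
    LOCK.lock();
    try {
      return stopWorker(worker, timeoutMillis);
    } finally {
      LOCK.unlock();
    }
  }

  // Caller does not hold LOCK, so the worker can take it and exit.
  public static boolean stopFromCallerWithoutLock(long timeoutMillis) {
    Thread worker = startWorker();
    return stopWorker(worker, timeoutMillis);
  }

  public static void main(String[] args) {
    System.out.println("caller holds lock, worker exited: "
        + stopFromCallerHoldingLock(200));
    System.out.println("caller holds no lock, worker exited: "
        + stopFromCallerWithoutLock(2000));
  }
}
```

In the first call the worker cannot exit before the timeout because the caller's outer acquisition is still in place; with an unbounded join() that would be a permanent hang. In the second call the worker should exit promptly. One possible direction for a fix, under these assumptions, is to make sure no caller holds the lock when the join happens.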

> ozone container balancer CLI went in hung state due to deadlock
> ---------------------------------------------------------------
>
>                 Key: HDDS-6928
>                 URL: https://issues.apache.org/jira/browse/HDDS-6928
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Nilotpal Nandi
>            Assignee: Siddhant Sangwan
>            Priority: Major
>
> Steps taken:
> ------------
> 1. Start the container balancer using the CLI; the balancer enters the running state.
> 2. Trigger an SCM failover.
> 3. Run the container balancer CLI again.
> The container balancer CLI (stop/status) hangs.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
