Posted to issues@ozone.apache.org by "Janus Chow (Jira)" <ji...@apache.org> on 2022/02/25 09:47:00 UTC

[jira] [Created] (HDDS-6377) Redundant loop while doing triggerHeartbeat in DatanodeStateMachine

Janus Chow created HDDS-6377:
--------------------------------

             Summary: Redundant loop while doing triggerHeartbeat in DatanodeStateMachine
                 Key: HDDS-6377
                 URL: https://issues.apache.org/jira/browse/HDDS-6377
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Janus Chow
            Assignee: Janus Chow


The code related to checking heartbeat is as follows.

 
{code:java}
L1    while (context.getState() != DatanodeStates.SHUTDOWN) {
L2      try {
L3        LOG.debug("Executing cycle Number : {}", context.getExecutionCount());
L4        long heartbeatFrequency = context.getHeartbeatFrequency();
L5        nextHB.set(Time.monotonicNow() + heartbeatFrequency);
L6        context.execute(executorService, heartbeatFrequency,
L7            TimeUnit.MILLISECONDS);
L8      } catch (InterruptedException e) {
L9        // Someone has sent interrupt signal, this could be because
L10       // 1. Trigger heartbeat immediately
L11       // 2. Shutdown has been initiated.
L12       Thread.currentThread().interrupt();
L13     } catch (Exception e) {
L14       LOG.error("Unable to finish the execution.", e);
L15     }
L16
L17     now = Time.monotonicNow();
L18     if (now < nextHB.get()) {
L19       if (!Thread.interrupted()) {
L20         try {
L21           Thread.sleep(nextHB.get() - now);
L22         } catch (InterruptedException e) {
L23           // triggerHeartbeat is called during the sleep
L24           Thread.currentThread().interrupt();
L25         }
L26       }
L27     }
L28   }
{code}
The redundant case happens as follows:
 # triggerHeartbeat() is called while stateMachineThread is sleeping at L21.
 # The InterruptedException is caught at L22; throwing it resets the thread's "interrupted" status to false.
 # L24 sets the "interrupted" status back to true.
 # Back at the top of the while loop, the try block at L2 runs with the status set to true, so an InterruptedException is thrown and caught at L8, and L12 sets the status to true again.
 # At L19, Thread.interrupted() returns true, so the sleep is skipped and the loop moves on to its next iteration. This call also resets the status to false.
 # In that next iteration, the try block at L2 runs with the status set to false, and only now is the heartbeat actually triggered.
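
The steps above depend on how the JDK treats the "interrupted" status. The following is a minimal, standalone sketch of those rules only; it is not Ozone code. The class name InterruptFlagDemo and the method observeFlags() are hypothetical, and a self-interrupt stands in for triggerHeartbeat().

```java
import java.util.Arrays;

final class InterruptFlagDemo {

  /** Records the "interrupted" status at four points, in order. */
  static boolean[] observeFlags() {
    boolean[] seen = new boolean[4];
    Thread.interrupted();                  // start from a clean status
    Thread.currentThread().interrupt();    // stand-in for triggerHeartbeat()
    try {
      Thread.sleep(1000);                  // throws at once: status is already set
    } catch (InterruptedException e) {
      // Throwing InterruptedException cleared the status (step 2).
      seen[0] = Thread.currentThread().isInterrupted();
      Thread.currentThread().interrupt();  // same call as L24 (step 3)
    }
    seen[1] = Thread.currentThread().isInterrupted(); // set again by interrupt()
    seen[2] = Thread.interrupted();        // same check as L19: reads and clears
    seen[3] = Thread.currentThread().isInterrupted(); // cleared by the line above
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(observeFlags()));
    // prints [false, true, true, false]
  }
}
```

Note that isInterrupted() only reads the status, while the static Thread.interrupted() reads and clears it, which is why the check at L19 leaves the status false for the following iteration.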

The problem is step 3: L24 does not need to set the "interrupted" status back to true. Without that call, the next loop iteration can execute the heartbeat directly instead of spending one extra pass through the loop.
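
To make the cost of the extra pass concrete, here is a simplified single-threaded model of the loop; it is a sketch under stated assumptions, not the actual patch. HeartbeatLoopSketch, execute() and passesUntilHeartbeat() are hypothetical names, and execute() is assumed to behave like a blocking call that throws InterruptedException when the status is set.

```java
final class HeartbeatLoopSketch {

  /** Hypothetical stand-in for context.execute(...): honours a pending interrupt. */
  static void execute() throws InterruptedException {
    if (Thread.interrupted()) {            // reads and clears the status
      throw new InterruptedException();
    }
  }

  /**
   * Counts loop passes until the heartbeat runs, starting just after the
   * InterruptedException was caught at L22 (status already cleared).
   */
  static int passesUntilHeartbeat(boolean reinterruptAtL24) {
    Thread.interrupted();                  // model the cleared status at L22
    if (reinterruptAtL24) {
      Thread.currentThread().interrupt();  // current behaviour at L24
    }
    int passes = 0;
    while (true) {
      passes++;
      try {
        execute();                         // L6: the heartbeat itself
        return passes;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // L12
      }
      Thread.interrupted();                // L19: status is set, sleep is skipped
    }
  }

  public static void main(String[] args) {
    System.out.println("with L24 re-interrupt:    " + passesUntilHeartbeat(true));
    System.out.println("without L24 re-interrupt: " + passesUntilHeartbeat(false));
  }
}
```

In this model the heartbeat runs on the second pass when L24 re-interrupts the thread and on the first pass when it does not, which matches the redundant iteration described in steps 4 and 5.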

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
