Posted to issues@ozone.apache.org by "Janus Chow (Jira)" <ji...@apache.org> on 2022/02/25 09:47:00 UTC
[jira] [Created] (HDDS-6377) Redundant loop while doing triggerHeartbeat in DatanodeStateMachine
Janus Chow created HDDS-6377:
--------------------------------
Summary: Redundant loop while doing triggerHeartbeat in DatanodeStateMachine
Key: HDDS-6377
URL: https://issues.apache.org/jira/browse/HDDS-6377
Project: Apache Ozone
Issue Type: Bug
Reporter: Janus Chow
Assignee: Janus Chow
The code related to checking heartbeat is as follows.
{code:java}
L1  while (context.getState() != DatanodeStates.SHUTDOWN) {
L2    try {
L3      LOG.debug("Executing cycle Number : {}", context.getExecutionCount());
L4      long heartbeatFrequency = context.getHeartbeatFrequency();
L5      nextHB.set(Time.monotonicNow() + heartbeatFrequency);
L6      context.execute(executorService, heartbeatFrequency,
L7          TimeUnit.MILLISECONDS);
L8    } catch (InterruptedException e) {
L9      // Someone has sent interrupt signal, this could be because
L10     // 1. Trigger heartbeat immediately
L11     // 2. Shutdown has be initiated.
L12     Thread.currentThread().interrupt();
L13   } catch (Exception e) {
L14     LOG.error("Unable to finish the execution.", e);
L15   }
L16
L17   now = Time.monotonicNow();
L18   if (now < nextHB.get()) {
L19     if (!Thread.interrupted()) {
L20       try {
L21         Thread.sleep(nextHB.get() - now);
L22       } catch (InterruptedException e) {
L23         //triggerHeartbeat is called during the sleep
L24         Thread.currentThread().interrupt();
L25       }
L26     }
L27   }
{code}
The redundant case happens as follows:
# triggerHeartbeat() is called while stateMachineThread is sleeping at L21.
# InterruptedException is caught at L22; throwing it resets the "interrupted" state to false.
# L24 sets the "interrupted" state back to true.
# Back at the top of the while loop, an InterruptedException is raised inside the try block at L2 because the "interrupted" state is true. Control goes to L8, and L12 sets the "interrupted" state to true again.
# At L19, "Thread.interrupted()" returns true, so the sleep is skipped and the "interrupted" state is reset to false. The while loop then moves on to its next iteration.
# In the try block at L2, the "interrupted" state is now false, so the heartbeat is finally triggered.
The issue is in step 3 above: L24 does not need to set the "interrupted" state back to true. Without it, the next loop iteration can execute the heartbeat directly.
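The flag transitions in the steps above can be reproduced in isolation. The following is a minimal sketch using only the standard java.lang.Thread API; the class name InterruptFlagDemo and the method observe() are made up for illustration and are not part of Ozone. The thread interrupts itself to stand in for triggerHeartbeat() arriving during the sleep.

```java
import java.util.Arrays;

public class InterruptFlagDemo {

  /** Returns the "interrupted" state observed at four points, in order. */
  static boolean[] observe() {
    Thread self = Thread.currentThread();
    Thread.interrupted(); // start from a clean flag
    boolean[] seen = new boolean[4];

    // Stand-in for triggerHeartbeat() interrupting the sleeping thread.
    self.interrupt();
    try {
      Thread.sleep(1000); // throws at once because the flag is already set
    } catch (InterruptedException e) {
      seen[0] = self.isInterrupted(); // throwing the exception cleared the flag
      self.interrupt();               // re-assert the flag, as L24 does
    }
    seen[1] = self.isInterrupted();   // isInterrupted() reads without clearing
    seen[2] = Thread.interrupted();   // static check, as at L19: reads and clears
    seen[3] = self.isInterrupted();   // the flag is clear again
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(observe()));
  }
}
```

The point of the sketch is that Thread.sleep() and Thread.interrupted() both clear the flag, so every re-assertion of the flag costs one more pass through a check before the thread can do useful work.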
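To compare the two behaviours, here is a rough, single-threaded sketch that counts how many loop passes are needed before a heartbeat runs, with and without the re-interrupt at L24. It is not the DatanodeStateMachine code: HeartbeatLoopSketch, execute() and passesUntilHeartbeat() are hypothetical names, and execute() simply assumes that the real context.execute() fails with InterruptedException whenever the "interrupted" state is set.

```java
public class HeartbeatLoopSketch {

  /** Hypothetical stand-in for context.execute(...): fails if the flag is set. */
  static void execute() throws InterruptedException {
    if (Thread.interrupted()) {
      throw new InterruptedException();
    }
  }

  /** Counts loop passes needed to run a heartbeat after an interrupted sleep. */
  static int passesUntilHeartbeat(boolean restoreFlagAfterSleep) {
    Thread.interrupted();               // start from a clean flag
    Thread.currentThread().interrupt(); // stand-in for triggerHeartbeat()
    try {
      Thread.sleep(1000);               // L21: throws at once, clearing the flag
    } catch (InterruptedException e) {
      if (restoreFlagAfterSleep) {
        Thread.currentThread().interrupt(); // L24
      }
    }

    int passes = 0;
    boolean heartbeatSent = false;
    while (!heartbeatSent) {
      passes++;
      try {
        execute();                      // L6
        heartbeatSent = true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // L12
      }
      // L19: checking the flag also clears it; the real sleep is omitted here.
      Thread.interrupted();
    }
    return passes;
  }

  public static void main(String[] args) {
    System.out.println("with L24:    " + passesUntilHeartbeat(true));
    System.out.println("without L24: " + passesUntilHeartbeat(false));
  }
}
```

Under these assumptions the loop needs two passes when L24 restores the flag and one pass when it does not. The sketch only covers the triggerHeartbeat path; how shutdown interacts with the flag is not modelled here.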