You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/09/29 06:38:23 UTC

[GitHub] [ozone] ChenSammi opened a new pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

ChenSammi opened a new pull request #2693:
URL: https://github.com/apache/ozone/pull/2693


   https://issues.apache.org/jira/browse/HDDS-5794


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r718520243



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableTimeSum.incrementAndGet();

Review comment:
       Will this be `threadPoolNotAvailableCount ` instead?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r730742018



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime = threadPoolNotAvailableTimeSum.addAndGet(
+            lastHeartbeatSent.get() - System.currentTimeMillis());

Review comment:
       I am not able to understand this calculation of `unavailableTime`
   If the lastHeartbeatSent was at 5.
   ThreadPool goes unavailable at 8. So the value we set is 5-8=(-3) Negative 3?
   In the next iteration say it is now 11, & thread pool isn't available we set -3 -11 = -15 Negative 15?
   The pool isn't available from 5 to 11? so should 6 right?
   @ChenSammi Can you explain a bit more about the logic here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r731530625



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime = threadPoolNotAvailableTimeSum.addAndGet(
+            System.currentTimeMillis() - lastHeartbeatSent.get());

Review comment:
       So basically there is no need to sum up the unavailable time.   "System.currentTimeMillis() - lastHeartbeatSent.get()" is enough.  I will update it accordingly. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r731496350



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime = threadPoolNotAvailableTimeSum.addAndGet(
+            System.currentTimeMillis() - lastHeartbeatSent.get());

Review comment:
       Trying to decode the calculation after the change:
   Say:
   Last Heartbeat sent at : 5
   First Unavailable at 8: So here unavailableTime = 8 - 5 = 3
   Second Unavailable at 11: So here unavailableTime = 11 - 5 + 3 = 9
   Shouldn't be this 6? Last heartbeat was at 5 & we are at 11, the 
   unavailable time should be 11-5=6
   Can you help clarify




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#issuecomment-948330777


   Thanks @ayushtkn  for the code review. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r718520243



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableTimeSum.incrementAndGet();

Review comment:
       Will this be `threadPoolNotAvailableCount ` instead?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r730859614



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime = threadPoolNotAvailableTimeSum.addAndGet(
+            lastHeartbeatSent.get() - System.currentTimeMillis());

Review comment:
       Sorry,my mistake, it should be “System.currentTimeMillis() - lastHeartbeatSent.get() ”




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r730859614



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime = threadPoolNotAvailableTimeSum.addAndGet(
+            lastHeartbeatSent.get() - System.currentTimeMillis());

Review comment:
       Sorry, it should be “System.currentTimeMillis() - lastHeartbeatSent.get() ”.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r724652747



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableTimeSum.incrementAndGet();

Review comment:
       Yes,  thanks for pointing it out.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r732435509



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      
       threadPoolNotAvailableCount.set(0);
       task.execute(service);
+      lastHeartbeatSent.set(System.currentTimeMillis());

Review comment:
       I wrote a small test to show the System.currentTimeMills.  Between the two runs,  I changed the Linux system time.  The System.currentTimeMills is really infected by the Linux time.  That's the meaning of "because it will be broken by settimeofday".   I will change it to  Time.monotonicNow().  Thanks @ayushtkn for the info. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r731596554



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      
       threadPoolNotAvailableCount.set(0);
       task.execute(service);
+      lastHeartbeatSent.set(System.currentTimeMillis());

Review comment:
       Any reason for not using ``Time.monotonicNow()``? That should be a safer bet IMO?

##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      

Review comment:
       nit:
   Avoid this




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r724652747



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +620,20 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableTimeSum.incrementAndGet();

Review comment:
       Yes,  thanks for pointing it out.   @ayushtkn , could you help to take another look of the change? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi merged pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi merged pull request #2693:
URL: https://github.com/apache/ozone/pull/2693


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * sencond" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r732354311



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      
       threadPoolNotAvailableCount.set(0);
       task.execute(service);
+      lastHeartbeatSent.set(System.currentTimeMillis());

Review comment:
       Yes, I see that Time.monotonicNow() used in some places, but it involves a division operation.  What's the paticular benifit of using Time.monotonicNow over System.currentTimeMills?  I don't know the story behind, and would you like to shed some light on it? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r732361413



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      
       threadPoolNotAvailableCount.set(0);
       task.execute(service);
+      lastHeartbeatSent.set(System.currentTimeMillis());

Review comment:
       I think that is to be used if we want to calculate difference in times. The first cut info I got is from the javaoc
   https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Time.java#L51
   
   In hadoop they changed everywhere- https://issues.apache.org/jira/browse/HDFS-6841
   
   Will try to find some more descriptive doc & share




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ChenSammi commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ChenSammi commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r732435509



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      
       threadPoolNotAvailableCount.set(0);
       task.execute(service);
+      lastHeartbeatSent.set(System.currentTimeMillis());

Review comment:
       I wrote a small test to show the System.currentTimeMills.  Between the two runs,  I changed the Linux system.  The System.currentTimeMills is really infected by the Linux time.  That's the meaning of "because it will be broken by settimeofday".   I will change it to  Time.monotonicNow().  Thanks @ayushtkn for the info. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ayushtkn commented on a change in pull request #2693: HDDS-5794. The misleading "No available thread in pool for past * second" log message in DN StateContext

Posted by GitBox <gi...@apache.org>.
ayushtkn commented on a change in pull request #2693:
URL: https://github.com/apache/ozone/pull/2693#discussion_r732361413



##########
File path: hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java
##########
@@ -616,16 +618,19 @@ public void execute(ExecutorService service, long time, TimeUnit unit)
       }
 
       if (!isThreadPoolAvailable(service)) {
-        long count = threadPoolNotAvailableCount.getAndIncrement();
-        if (count % getLogWarnInterval(conf) == 0) {
-          LOG.warn("No available thread in pool for past {} seconds.",
-              unit.toSeconds(time) * (count + 1));
+        long count = threadPoolNotAvailableCount.incrementAndGet();
+        long unavailableTime =
+            System.currentTimeMillis() - lastHeartbeatSent.get();
+        if (unavailableTime > time && count % getLogWarnInterval(conf) == 0) {
+          LOG.warn("No available thread in pool for the past {} seconds " +
+              "and {} times.", unit.toSeconds(unavailableTime), count);
         }
         return;
       }
-
+      
       threadPoolNotAvailableCount.set(0);
       task.execute(service);
+      lastHeartbeatSent.set(System.currentTimeMillis());

Review comment:
       I think that is to be used if we want to calculate difference in times. The first cut info I got is from the javaoc
   https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Time.java#L51
   
   Will try to find some more descriptive doc & share




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org