You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/11/30 17:59:00 UTC

[GitHub] [accumulo] dlmarion opened a new pull request #2372: Exception and cancellation changes in Compactor process

dlmarion opened a new pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372


   The Thread in the Compactor that runs the FileCompactor was throwing an
   exception as part of the error handling routine. It's not necessary for
   that to occur to convey the exception information to the main thread,
   the exception information was already being conveyed. This will reduce
   the error logging in the Compactor log regarding the uncaught exception
   being thrown and handled by the AccumuloUncaughtExceptionHandler.
   
   Another background thread checks every 5 seconds to determine whether
   or not the current compaction should be canceled. Call this method
   when the compaction completes in case something happens that would
   cause the compaction to be canceled and it has not been noticed yet.
   
   Closes #2350


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765999980



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -712,6 +711,8 @@ public void run() {
           compactionThread.join();
           LOG.trace("Compaction thread finished.");
 
+          checkIfCanceled();

Review comment:
       Would something like the following achieve the goal of less noisy logs while not always doing the check?
   
   ```suggestion
      if(err.get() != null){
              //maybe the error occured because the table was deleted or something like that, so force a cancel check to reduce noise in the logs
             checkIfCanceled();
      }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765974395



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -712,6 +711,8 @@ public void run() {
           compactionThread.join();
           LOG.trace("Compaction thread finished.");
 
+          checkIfCanceled();

Review comment:
       The intent of doing periodic cancelation checks is to stop long running external compactions that are canceled.  If lots of quick external compaction (like 30 s) run, then a cancelation check will now be done for each of those where it was not before adding this.   During the commit process atomic checks will be done, so doing the check here for every compaction does not offer a benefit over that. The compaction could still be canceled after this check before the atomic checks are done.  So maybe checking here places additional load on the metadata table and ZK w/o a clear benefit.
   
   Thinking about only checking long running compactions, we could also make the checkIfCanceled method ignore young compactions. We currently check every 5 mins, if  a compaction is running and its age is <5min we could return and not do the check until a later call when the age is > 5min (or the check period).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] dlmarion commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
dlmarion commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765984183



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -712,6 +711,8 @@ public void run() {
           compactionThread.join();
           LOG.trace("Compaction thread finished.");
 
+          checkIfCanceled();

Review comment:
       I think this was an issue with short-lived compactions. IIRC, the original issue was that @milleruntime deleted the ci table during some testing and a bunch of errors were logged and the compaction was not canceled (at least not in coordinator log).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] dlmarion commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
dlmarion commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765743836



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -184,7 +184,7 @@ protected void startCancelChecker(ScheduledThreadPoolExecutor schedExecutor,
         TimeUnit.MILLISECONDS);
   }
 
-  protected void checkIfCanceled() {
+  protected synchronized void checkIfCanceled() {

Review comment:
       I have removed the synchronized modifier from this method.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r763268668



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -184,7 +184,7 @@ protected void startCancelChecker(ScheduledThreadPoolExecutor schedExecutor,
         TimeUnit.MILLISECONDS);
   }
 
-  protected void checkIfCanceled() {
+  protected synchronized void checkIfCanceled() {

Review comment:
       > I noticed the cancel() method call on CompactionJobHolder is synchronized, so maybe that is enough.
   
   I think the existing sync in CompactionJobHolder will prevent any race conditions from happening.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] dlmarion commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
dlmarion commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r760442481



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -184,7 +184,7 @@ protected void startCancelChecker(ScheduledThreadPoolExecutor schedExecutor,
         TimeUnit.MILLISECONDS);
   }
 
-  protected void checkIfCanceled() {
+  protected synchronized void checkIfCanceled() {

Review comment:
       it's being called by a background thread every 5 seconds and from the main thread when the compaction is completed. I was just trying to serialize the calls.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] dlmarion commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
dlmarion commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r760301214



##########
File path: test/src/main/java/org/apache/accumulo/test/compaction/ExternalDoNothingCompactor.java
##########
@@ -53,6 +53,9 @@ protected Runnable createCompactionJob(TExternalCompactionJob job, LongAdder tot
       LongAdder totalInputBytes, CountDownLatch started, CountDownLatch stopped,
       AtomicReference<Throwable> err) {
 
+    // Set this to true so that only 1 external compaction is run
+    this.shutdown = true;

Review comment:
       This method is called [here](https://github.com/apache/accumulo/blob/main/server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java#L670) to create a Thread, then the Thread is started a few lines later. The shutdown variable is checked in the enclosing while loop, so I don't think it matters in this case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765974395



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -712,6 +711,8 @@ public void run() {
           compactionThread.join();
           LOG.trace("Compaction thread finished.");
 
+          checkIfCanceled();

Review comment:
       The intent of doing periodic cancelation checks is to stop long running external compactions that are canceled.  If lots of quick external compaction (like 30 s) run, then a cancelation check will now be done for each of those where it was not before adding this   During the commit process atomic checks will be done, so doing the check here for every compaction does not offer a benefit over that. The compaction could still be canceled after this check before the atomic checks are done.  So maybe checking here places additional load on the metadata table and ZK w/o a clear benefit.
   
   Thinking about only checking long running compactions, we could also make the checkIfCanceled method ignore young compactions. We currently check every 5 mins, if  a compaction is running and its age is <5min we could return and not do the check until a later call when the age is > 5min (or the check period).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765999980



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -712,6 +711,8 @@ public void run() {
           compactionThread.join();
           LOG.trace("Compaction thread finished.");
 
+          checkIfCanceled();

Review comment:
       Would something like the following achieve the goal of less noisy logs while not always doing the check?
   
   ```suggestion
      if(err.get() != null){
              //maybe the error occured because the table was deleted or something like that, so force a cancel check
             checkIfCanceled();
      }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r765999980



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -712,6 +711,8 @@ public void run() {
           compactionThread.join();
           LOG.trace("Compaction thread finished.");
 
+          checkIfCanceled();

Review comment:
       Would something like the following achieve the goal of less noisy logs while not always doing the check?
   
   ```suggestion
      if(err.get() != null){
              //maybe the error occured because the table was deleted or something like that, so force a cancel check to possibly reduce noise in the logs
             checkIfCanceled();
      }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r760172587



##########
File path: test/src/main/java/org/apache/accumulo/test/compaction/ExternalDoNothingCompactor.java
##########
@@ -53,6 +53,9 @@ protected Runnable createCompactionJob(TExternalCompactionJob job, LongAdder tot
       LongAdder totalInputBytes, CountDownLatch started, CountDownLatch stopped,
       AtomicReference<Throwable> err) {
 
+    // Set this to true so that only 1 external compaction is run
+    this.shutdown = true;

Review comment:
       Is this the best place to set shutdown to true? Would it be better to set this right before `throw new CompactionCanceledException();`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r760432712



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -184,7 +184,7 @@ protected void startCancelChecker(ScheduledThreadPoolExecutor schedExecutor,
         TimeUnit.MILLISECONDS);
   }
 
-  protected void checkIfCanceled() {
+  protected synchronized void checkIfCanceled() {

Review comment:
       What does the addition of synchonized here defend against?   There are reads from metadata table and ZK in this method, so the sync could possibly cause problems.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] keith-turner commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
keith-turner commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r763269860



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -184,7 +184,7 @@ protected void startCancelChecker(ScheduledThreadPoolExecutor schedExecutor,
         TimeUnit.MILLISECONDS);
   }
 
-  protected void checkIfCanceled() {
+  protected synchronized void checkIfCanceled() {

Review comment:
       > Could add another check to this method to see if the compactionThread is already interrupted so we don't keep calling interrupt while waiting for the thread to die.
   
   It may be useful to keep interrupting a thread because some code may ignore interrupts w/o throwing an exception. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] dlmarion merged pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
dlmarion merged pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo] milleruntime commented on a change in pull request #2372: Exception and cancellation changes in Compactor process

Posted by GitBox <gi...@apache.org>.
milleruntime commented on a change in pull request #2372:
URL: https://github.com/apache/accumulo/pull/2372#discussion_r763209525



##########
File path: server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java
##########
@@ -184,7 +184,7 @@ protected void startCancelChecker(ScheduledThreadPoolExecutor schedExecutor,
         TimeUnit.MILLISECONDS);
   }
 
-  protected void checkIfCanceled() {
+  protected synchronized void checkIfCanceled() {

Review comment:
       I noticed the `cancel()` method call on `CompactionJobHolder` is synchronized, so maybe that is enough. Could add another check to this method to see if the compactionThread is already interrupted so we don't keep calling interrupt while waiting for the thread to die.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org