Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/05/13 22:10:00 UTC

[GitHub] [accumulo] keith-turner commented on a change in pull request #2084: Add retry counter for log recovery with MinC

keith-turner commented on a change in pull request #2084:
URL: https://github.com/apache/accumulo/pull/2084#discussion_r632132187



##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/MinorCompactor.java
##########
@@ -131,12 +134,21 @@ public CompactionStats call() {
           reportedProblem = true;
         } catch (RuntimeException | NoClassDefFoundError e) {
           // if this is coming from a user iterator, it is possible that the user could change the
-          // iterator config and that the
-          // minor compaction would succeed
+          // iterator config and that the minor compaction would succeed
+          // If the minor compaction stalls for too long during recovery, it can interfere with
+          // other tables loading
+          // Throw exception if this happens so assignments can be rescheduled.
+          if (retryCounter >= 4 && mincReason.equals(MinorCompactionReason.RECOVERY)) {
+            log.warn("Minc is stuck for too long during recovery, throwing error for reschedule.");
+            ProblemReports.getInstance(tabletServer.getContext()).report(new ProblemReport(
+                getExtent().tableId(), ProblemType.FILE_WRITE, outputFileName, e));
+            throw new IllegalStateException(e);

Review comment:
       IllegalStateException does not feel like the right exception to me; please ignore this if you don't agree.
   
   ```suggestion
               throw new RuntimeException(e);
   ```

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/MinorCompactor.java
##########
@@ -131,12 +134,21 @@ public CompactionStats call() {
           reportedProblem = true;
         } catch (RuntimeException | NoClassDefFoundError e) {
           // if this is coming from a user iterator, it is possible that the user could change the
-          // iterator config and that the
-          // minor compaction would succeed
+          // iterator config and that the minor compaction would succeed
+          // If the minor compaction stalls for too long during recovery, it can interfere with
+          // other tables loading
+          // Throw exception if this happens so assignments can be rescheduled.
+          if (retryCounter >= 4 && mincReason.equals(MinorCompactionReason.RECOVERY)) {
+            log.warn("Minc is stuck for too long during recovery, throwing error for reschedule.");
+            ProblemReports.getInstance(tabletServer.getContext()).report(new ProblemReport(

Review comment:
       Maybe the entire if statement could be moved after the existing ProblemReport code so that the report does not need to be repeated.
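
A rough sketch of the ordering suggested here, not the actual MinorCompactor code: the failure is reported once, and the recovery retry check runs afterwards, so the report call is not duplicated inside the if block. Everything below is a stand-in for illustration (`MincRetrySketch`, `Reason`, `handleFailure`, and the `reports` list replace the real Accumulo types and the ProblemReports call). Only the threshold of 4, the RECOVERY condition, and the `RuntimeException` wrapping come from the diff and the comments above.

```java
import java.util.ArrayList;
import java.util.List;

public class MincRetrySketch {
  // Stand-in for MinorCompactionReason; only RECOVERY matters for this check.
  enum Reason { USER, RECOVERY }

  // Threshold taken from the diff under review.
  static final int MAX_RECOVERY_RETRIES = 4;

  // Stand-in for the ProblemReports call: one entry is recorded per report.
  static final List<String> reports = new ArrayList<>();

  /** Handles one failed attempt; returns normally if the caller should retry. */
  static void handleFailure(RuntimeException e, int retryCounter, Reason reason, String extent) {
    // Report the problem first, in a single place.
    reports.add(extent + ": " + e.getMessage());

    // The retry check comes after the report, so it does not repeat it.
    if (retryCounter >= MAX_RECOVERY_RETRIES && reason == Reason.RECOVERY) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    int retryCounter = 0;
    try {
      while (true) {
        try {
          // Simulated failure, e.g. from a misconfigured user iterator.
          throw new RuntimeException("simulated iterator failure");
        } catch (RuntimeException e) {
          handleFailure(e, retryCounter, Reason.RECOVERY, "extent-placeholder");
        }
        retryCounter++;
      }
    } catch (RuntimeException giveUp) {
      System.out.println("gave up after " + reports.size() + " reports");
    }
  }
}
```

With this ordering, every failed attempt still produces exactly one report, including the final attempt that triggers the rethrow.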

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/MinorCompactor.java
##########
@@ -131,12 +134,21 @@ public CompactionStats call() {
           reportedProblem = true;
         } catch (RuntimeException | NoClassDefFoundError e) {
           // if this is coming from a user iterator, it is possible that the user could change the
-          // iterator config and that the
-          // minor compaction would succeed
+          // iterator config and that the minor compaction would succeed
+          // If the minor compaction stalls for too long during recovery, it can interfere with
+          // other tables loading
+          // Throw exception if this happens so assignments can be rescheduled.
+          if (retryCounter >= 4 && mincReason.equals(MinorCompactionReason.RECOVERY)) {
+            log.warn("Minc is stuck for too long during recovery, throwing error for reschedule.");

Review comment:
       Including the key extent in log messages can be invaluable for debugging.  The existing log messages around failures don't include the extent.
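
As an illustration only, the warning could carry the extent and the retry count roughly as follows. This sketch uses `java.util.logging` and a plain `String` so that it is self-contained; the class name, the `stuckMessage` helper, and the extent value are hypothetical, and the real code would presumably pass `getExtent()` to its existing logger instead.

```java
import java.util.logging.Logger;

public class ExtentLogSketch {
  private static final Logger log = Logger.getLogger(ExtentLogSketch.class.getName());

  // Builds the warning text with the extent included, so a failure can be
  // tied to a specific tablet when searching the logs.
  static String stuckMessage(String extent, int retries) {
    return String.format(
        "MinC failed %d times during recovery for %s, throwing error for reschedule.",
        retries, extent);
  }

  public static void main(String[] args) {
    // "example-extent" is a made-up placeholder, not a real extent string.
    log.warning(stuckMessage("example-extent", 4));
  }
}
```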




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org