Posted to notifications@accumulo.apache.org by "keith-turner (via GitHub)" <gi...@apache.org> on 2023/05/05 16:59:03 UTC

[GitHub] [accumulo] keith-turner commented on a diff in pull request #3380: Used Retry to backoff when processing tablet locator failures

keith-turner commented on code in PR #3380:
URL: https://github.com/apache/accumulo/pull/3380#discussion_r1186314959


##########
core/src/main/java/org/apache/accumulo/core/clientImpl/TabletServerBatchReaderIterator.java:
##########
@@ -274,11 +281,13 @@ private void binRanges(TabletLocator tabletLocator, List<Range> ranges,
               failures.size());
         }
 
+        retry.useRetry();
         try {
-          Thread.sleep(100);
-        } catch (InterruptedException e) {
-          throw new RuntimeException(e);
+          retry.waitForNextAttempt(log, "binRanges retry failures");
+        } catch (InterruptedException e1) {
+          log.debug("Retry interrupted", e1);

Review Comment:
   Why make this change?



##########
core/src/main/java/org/apache/accumulo/core/clientImpl/TabletServerBatchReaderIterator.java:
##########
@@ -248,6 +251,10 @@ private void binRanges(TabletLocator tabletLocator, List<Range> ranges,
 
     int lastFailureSize = Integer.MAX_VALUE;
 
+    Retry retry = Retry.builder().infiniteRetries().retryAfter(100, MILLISECONDS)
+        .incrementBy(100, MILLISECONDS).maxWait(1, SECONDS).backOffFactor(1.07)

Review Comment:
   I think the max wait should be higher. If lots of batch scan clients are repeatedly failing, it would be good for their collective pressure to be lower.
   
   ```suggestion
           .incrementBy(100, MILLISECONDS).maxWait(10, SECONDS).backOffFactor(1.07)
   ```
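
To make the "collective pressure" argument concrete, here is a rough sketch of how a capped backoff bounds the aggregate retry rate. It is a simplified, hypothetical model and not Accumulo's `Retry` implementation: the class `BackoffSketch`, its update rule (multiply by the factor, add the increment, clamp to the cap), and the client count of 1000 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical, simplified model of a capped backoff.
 * This is NOT Accumulo's Retry class; the update rule below is an assumption.
 */
public class BackoffSketch {

  /**
   * Returns the wait (in ms) used before each of the first {@code attempts} retries.
   * Assumed rule: next = ceil(current * factor) + incrementMs, clamped to maxMs.
   */
  public static List<Long> schedule(long initialMs, long incrementMs, double factor,
      long maxMs, int attempts) {
    List<Long> waits = new ArrayList<>();
    long current = initialMs;
    for (int i = 0; i < attempts; i++) {
      waits.add(current);
      long next = (long) Math.ceil(current * factor) + incrementMs;
      current = Math.min(maxMs, next);
    }
    return waits;
  }

  /**
   * Once every client has reached the cap, each one retries once per maxMs,
   * so the aggregate rate is clients / maxMs (converted to per-second).
   */
  public static double steadyStateRetriesPerSecond(int clients, long maxMs) {
    return clients * 1000.0 / maxMs;
  }

  public static void main(String[] args) {
    int clients = 1000; // assumed number of failing batch scan clients
    for (long maxMs : new long[] {1_000, 10_000}) {
      List<Long> waits = schedule(100, 100, 1.07, maxMs, 40);
      System.out.printf("maxWait=%dms lastWait=%dms steadyRate=%.0f retries/s%n", maxMs,
          waits.get(waits.size() - 1), steadyStateRetriesPerSecond(clients, maxMs));
    }
  }
}
```

Under this model, 1000 clients that have all reached the cap generate about 1000 retries per second with a 1 second max wait, and about 100 per second with a 10 second max wait. The real numbers depend on how `Retry` actually grows the wait and on whether jitter is applied.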



##########
core/src/main/java/org/apache/accumulo/core/clientImpl/TabletServerBatchReaderIterator.java:
##########
@@ -274,11 +281,13 @@ private void binRanges(TabletLocator tabletLocator, List<Range> ranges,
               failures.size());
         }
 
+        retry.useRetry();

Review Comment:
   Not a problem with this PR, but I have not been calling this. I looked at the code, and it will influence the message logged by waitForNextAttempt.


