Posted to notifications@accumulo.apache.org by "EdColeman (via GitHub)" <gi...@apache.org> on 2023/11/27 01:50:19 UTC

[PR] Add delay to allow for ZooKeeper, Update prop names [accumulo]

EdColeman opened a new pull request, #3983:
URL: https://github.com/apache/accumulo/pull/3983

   The ExternalCompaction tests seem to fail occasionally because the compaction coordinator is unable to read the information from ZooKeeper that it needs to create the metrics used in the test.
   
    - Adds a delay with back-off when initializing metrics, to allow more time for the test cluster's ZooKeeper to start up.
    - Modifies the compaction service property names to align with the changes in #3915 (reduces deprecated property name logging).
   
   This seems to be a partial fix; there seem to be other issues with running the tests reliably.
    - Tests still occasionally hang (this seems prevalent when running ExternalCompaction_2_IT in an IDE). That test was the most recent Jenkins failure, so it was run more often than the others; the hang could be generic to the other tests as well.
    - Some of the tests may be passing as false positives. The following sometimes appears in the logs:
   
   ```
   ExternalCompaction_2_IT.testDeleteTableCancelsUserExternalCompaction:
   2023-11-26T20:35:52,091 [lock.ServiceLock] WARN : Child found with invalid format: wsn2:9133 (does not start with zlock#)
   ```
   These issues can be addressed in a follow-up if these changes are accepted before they are resolved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] Add delay to allow for ZooKeeper, Update prop names [accumulo]

Posted by "EdColeman (via GitHub)" <gi...@apache.org>.
EdColeman commented on PR #3983:
URL: https://github.com/apache/accumulo/pull/3983#issuecomment-1919974745

   Closing in favor of #4210
   




Re: [PR] Add delay to allow for ZooKeeper, Update prop names [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3983:
URL: https://github.com/apache/accumulo/pull/3983#issuecomment-1828013669

   > > 2023-11-26T20:35:52,091 [lock.ServiceLock] WARN : Child found with invalid format: wsn2:9133 (does not start with zlock#)
   > 
   > I have been noticing this too in the `elasticity` branch. Interesting to see that it's in `main` also. I'm going to try and track this down.
   
   Looks like I introduced this in #3951. Working on a fix.




Re: [PR] Add delay to allow for ZooKeeper, Update prop names [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on PR #3983:
URL: https://github.com/apache/accumulo/pull/3983#issuecomment-1827740434

   > 2023-11-26T20:35:52,091 [lock.ServiceLock] WARN : Child found with invalid format: wsn2:9133 (does not start with zlock#)
   
   
   I have been noticing this too in the `elasticity` branch. Interesting to see that it's in `main` also. I'm going to try and track this down.




Re: [PR] Add delay to allow for ZooKeeper, Update prop names [accumulo]

Posted by "dlmarion (via GitHub)" <gi...@apache.org>.
dlmarion commented on code in PR #3983:
URL: https://github.com/apache/accumulo/pull/3983#discussion_r1406094731


##########
minicluster/src/main/java/org/apache/accumulo/miniclusterImpl/MiniAccumuloClusterControl.java:
##########
@@ -150,20 +150,28 @@ private static TExternalCompactionList getRunningCompactions(ClientContext conte
   @Override
   public synchronized void startCoordinator(Class<? extends CompactionCoordinator> coordinator)
       throws IOException {
+
+    final int maxTries = 10;
+    int retryCount = 0;
+    long retryDelay = 1000;
+
     if (coordinatorProcess == null) {
       coordinatorProcess = cluster
           ._exec(coordinator, ServerType.COMPACTION_COORDINATOR, new HashMap<>()).getProcess();
       // Wait for coordinator to start
       TExternalCompactionList metrics = null;
-      while (metrics == null) {
+      while (metrics == null && retryCount++ < maxTries) {
         try {
           metrics = getRunningCompactions(cluster.getServerContext());
         } catch (TException e) {
           log.debug(
-              "Error getting running compactions from coordinator, message: " + e.getMessage());
-          UtilWaitThread.sleep(250);
+              "Error getting running compactions from coordinator, will retry in {} message: {}",
+              retryDelay, e.getMessage());
+          UtilWaitThread.sleep(retryDelay);
+          retryDelay = (long) (retryDelay * 1.2); // max delay ~ 10 seconds.

Review Comment:
   Curious if changing the `duration` parameter of `Wait.waitFor(Condition, duration)` to a `Supplier<Long>` would allow you to implement a backoff in the Supplier and then just call `Wait.waitFor` here.
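
To make the suggestion concrete, here is a minimal sketch of a retry loop whose delay comes from a `Supplier<Long>`. It is not Accumulo's `Wait` class and does not use its signature; `BackoffWait`, `backoff`, and `waitFor` are hypothetical names chosen for illustration, and the growth factor and cap are assumptions rather than values taken from the project.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical helper, not part of Accumulo: shows a delay Supplier driving a bounded retry loop.
public class BackoffWait {

  // Returns a stateful supplier: each call yields the current delay, then grows it
  // by `factor`, never exceeding `maxMillis`.
  public static Supplier<Long> backoff(long initialMillis, double factor, long maxMillis) {
    return new Supplier<Long>() {
      private long next = Math.min(initialMillis, maxMillis);

      @Override
      public Long get() {
        long current = next;
        next = Math.min(maxMillis, (long) (next * factor));
        return current;
      }
    };
  }

  // Polls `condition` up to `maxTries` times, sleeping for the supplied delay between
  // failed attempts. Returns true as soon as the condition holds, false otherwise.
  public static boolean waitFor(BooleanSupplier condition, int maxTries,
      Supplier<Long> delayMillis) {
    for (int attempt = 0; attempt < maxTries; attempt++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (attempt < maxTries - 1) { // no point sleeping after the final attempt
        try {
          Thread.sleep(delayMillis.get());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt for the caller
          return false;
        }
      }
    }
    return false;
  }
}
```

With a helper of this shape, the loop in `startCoordinator` could reduce to a single call that passes a condition wrapping `getRunningCompactions(...)` and a supplier such as `BackoffWait.backoff(1000, 1.2, 10_000)`, keeping the retry policy in one place.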





Re: [PR] Add delay to allow for ZooKeeper, Update prop names [accumulo]

Posted by "EdColeman (via GitHub)" <gi...@apache.org>.
EdColeman closed pull request #3983: Add delay to allow for ZooKeeper, Update prop names
URL: https://github.com/apache/accumulo/pull/3983

