You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2020/08/12 18:57:21 UTC

[GitHub] [accumulo] ctubbsii commented on a change in pull request #1677: fixes #1086 - Disable Zookeeper retries from TabletServer

ctubbsii commented on a change in pull request #1677:
URL: https://github.com/apache/accumulo/pull/1677#discussion_r469468185



##########
File path: core/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReaderWriter.java
##########
@@ -46,16 +46,32 @@
 
   private static final Logger log = LoggerFactory.getLogger(ZooReaderWriter.class);
 
-  public ZooReaderWriter(AccumuloConfiguration conf) {
-    this(conf.get(Property.INSTANCE_ZK_HOST),
+  public static ZooReaderWriter retriesEnabled(AccumuloConfiguration conf) {
+    return new ZooReaderWriter(conf.get(Property.INSTANCE_ZK_HOST),
         (int) conf.getTimeInMillis(Property.INSTANCE_ZK_TIMEOUT),
-        conf.get(Property.INSTANCE_SECRET));
+        conf.get(Property.INSTANCE_SECRET), true);
+  }
+
+  public static ZooReaderWriter retriesDisabled(AccumuloConfiguration conf) {
+    return new ZooReaderWriter(conf.get(Property.INSTANCE_ZK_HOST),
+        (int) conf.getTimeInMillis(Property.INSTANCE_ZK_TIMEOUT),
+        conf.get(Property.INSTANCE_SECRET), false);
+  }
+
+  public static ZooReaderWriter retriesEnabled(String keepers, int timeoutInMillis, String secret) {
+    return new ZooReaderWriter(keepers, timeoutInMillis, secret, true);
+  }
+
+  public static ZooReaderWriter retriesDisabled(String keepers, int timeoutInMillis,
+      String secret) {
+    return new ZooReaderWriter(keepers, timeoutInMillis, secret, false);
   }
 
   private final byte[] auth;
 
-  public ZooReaderWriter(String keepers, int timeoutInMillis, String secret) {
-    super(keepers, timeoutInMillis);
+  private ZooReaderWriter(String keepers, int timeoutInMillis, String secret,
+      boolean enableRetries) {
+    super(keepers, timeoutInMillis, enableRetries);

Review comment:
       Could pass in the retry factory rather than pass a boolean around. That would make the code more clear.

##########
File path: server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java
##########
@@ -1384,4 +1385,14 @@ public final RateLimiter getMajorCompactionWriteLimiter() {
   public CompactionManager getCompactionManager() {
     return compactionManager;
   }
+
+  /**
+   * For the tablet server we don't want ZooReaderWriter to keep retrying after a error
+   * communicating with ZooKeeper.
+   */
+  @Override
+  protected ServerContext createServerContext(SiteConfiguration siteConfig) {
+    return new ServerContext(siteConfig, false);
+  }
+

Review comment:
       I worry that this will have a larger impact on the various ways ZooKeeper is used in a TabletServer, rather than be a narrow fix for ZooLock. The ServerContext is universal for the process. The only ZooReaderWriter instance for which we want to avoid retries is the one used by ZooLock. Is there a reason we need to change the behavior across the entire tserver?

##########
File path: core/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReader.java
##########
@@ -38,24 +38,34 @@
 public class ZooReader {
   private static final Logger log = LoggerFactory.getLogger(ZooReader.class);
 
-  protected static final RetryFactory RETRY_FACTORY = Retry.builder().maxRetries(10)
+  private static final RetryFactory DEFAULT_RETRY_FACTORY = Retry.builder().maxRetries(10)
+      .retryAfter(250, MILLISECONDS).incrementBy(250, MILLISECONDS).maxWait(5, TimeUnit.SECONDS)
+      .backOffFactor(1.5).logInterval(3, TimeUnit.MINUTES).createFactory();
+
+  private static final RetryFactory DISABLED_RETRY_FACTORY = Retry.builder().maxRetries(0)
       .retryAfter(250, MILLISECONDS).incrementBy(250, MILLISECONDS).maxWait(5, TimeUnit.SECONDS)
       .backOffFactor(1.5).logInterval(3, TimeUnit.MINUTES).createFactory();
 
   protected final String keepers;
   protected final int timeout;
+  protected final RetryFactory retryFactory;
 
   public ZooReader(String keepers, int timeout) {
+    this(keepers, timeout, true);
+  }
+
+  public ZooReader(String keepers, int timeout, boolean enableRetries) {

Review comment:
       Could pass in the RetryFactory rather than have a boolean param.

##########
File path: server/base/src/main/java/org/apache/accumulo/server/ServerContext.java
##########
@@ -68,13 +68,22 @@
   private CryptoService cryptoService = null;
 
   public ServerContext(SiteConfiguration siteConfig) {
-    this(new ServerInfo(siteConfig));
+    this(new ServerInfo(siteConfig), true);
+  }
+
+  public ServerContext(SiteConfiguration siteConfig, boolean enableZookeeperRetries) {
+    this(new ServerInfo(siteConfig), enableZookeeperRetries);

Review comment:
       Could pass in the retry factory rather than pass around booleans.

##########
File path: server/base/src/main/java/org/apache/accumulo/server/ServerContext.java
##########
@@ -68,13 +68,22 @@
   private CryptoService cryptoService = null;
 
   public ServerContext(SiteConfiguration siteConfig) {
-    this(new ServerInfo(siteConfig));
+    this(new ServerInfo(siteConfig), true);
+  }
+
+  public ServerContext(SiteConfiguration siteConfig, boolean enableZookeeperRetries) {
+    this(new ServerInfo(siteConfig), enableZookeeperRetries);
   }
 
   private ServerContext(ServerInfo info) {
+    this(info, true);
+  }
+
+  private ServerContext(ServerInfo info, boolean enableRetries) {
     super(SingletonReservation.noop(), info, info.getSiteConfiguration());
     this.info = info;
-    zooReaderWriter = new ZooReaderWriter(info.getSiteConfiguration());
+    zooReaderWriter = enableRetries ? ZooReaderWriter.retriesEnabled(info.getSiteConfiguration())
+        : ZooReaderWriter.retriesDisabled(info.getSiteConfiguration());

Review comment:
       This worries me. It seems outside the scope of ServerContext to be handling this kind of thing. ServerContext should merely be a holder of the ZRW that is available to the server for updating/reading from ZK. The affected code is in ZooLock, and is not a problem with ZooReaderWriter's retry behavior more generally.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org