You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/05/16 14:35:41 UTC

[GitHub] [ozone] siddhantsangwan opened a new pull request, #3423: HDDS-6280. Support Container Balancer HA

siddhantsangwan opened a new pull request, #3423:
URL: https://github.com/apache/ozone/pull/3423

   ## What changes were proposed in this pull request?
   
   Make Container Balancer highly available by persisting its configurations (state) through RocksDB and replicating through Ratis. This change introduces a new protobuf message, `ContainerBalancerConfiguration`, containing configurations that need to be persisted. On leader change or restart, `ContainerBalancer` checks if it needs to start by reading this persisted ContainerBalancerConfiguration proto in the new leader. 
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-6280
   
   ## How was this patch tested?
   
   WIP


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r878017833


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -801,24 +852,70 @@ private void resetState() {
   /**
    * Receives a notification for raft or safe mode related status changes.
    * Stops ContainerBalancer if it's running and the current SCM becomes a
-   * follower or enters safe mode.
+   * follower or enters safe mode. Starts ContainerBalancer if the current
+   * SCM becomes leader, is out of safe mode and balancer should run.
    */
   @Override
   public void notifyStatusChanged() {
-    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-      if (isBalancerRunning()) {
-        stopBalancingThread();
+    lock.lock();
+    try {
+      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+        if (isBalancerRunning()) {
+          stop();
+        }
+      } else {
+        if (shouldRun()) {
+          try {
+            start();
+          } catch (IllegalContainerBalancerStateException |
+              InvalidContainerBalancerConfigurationException e) {
+            LOG.warn("Could not start ContainerBalancer on raft/safe-mode " +
+                "status change.", e);
+          }
+        }
       }
+    } finally {
+      lock.unlock();
     }
   }
 
   /**
-   * Checks if ContainerBalancer should run.
-   * @return false
+   * Checks if ContainerBalancer should start (after a leader change, restart
+   * etc.) by reading persisted state.
+   * @return true if the persisted state is true, otherwise false
    */
   @Override
   public boolean shouldRun() {
-    return false;
+    try {
+      ContainerBalancerConfigurationProto proto =
+          readConfiguration(ContainerBalancerConfigurationProto.class);
+      if (proto == null) {
+        LOG.warn("Could not find persisted configuration for {} when checking" +
+            " if ContainerBalancer should run. ContainerBalancer should not " +
+            "run now.", this.getServiceName());
+        return false;
+      }
+      return proto.getShouldRun();
+    } catch (IOException e) {
+      LOG.warn("Could not read persisted configuration for checking if " +
+          "ContainerBalancer should start. ContainerBalancer should not start" +
+          " now.", e);
+      return false;
+    }

Review Comment:
   Agreed. However, since `start` might be called independently of `shouldRun` (like in SCMServiceManager), our only option is calling `shouldRun` in `start` itself. So, if we want to validate only once, we'll have to do this in `shouldRun`. But this has the following problems:
   
   1. Storing configuration inside `shouldRun` can be unexpected for users of this method.
   2. We need access to the configuration proto in `start` for ContainerBalancerConfigurationProto#getNextIterationIndex.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3423:
URL: https://github.com/apache/ozone/pull/3423#issuecomment-1141661942

   @JacksonYao287 @lokeshj1703 @nandakumar131 After the latest changes, balancer seems to be working as expected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r886533328


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/io/ByteStringCodec.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdds.scm.ha.io;
+
+import com.google.protobuf.ByteString;
+import com.google.protobuf.InvalidProtocolBufferException;
+
+/**
+ * A dummy codec that serializes a ByteString object to ByteString.
+ */
+public class ByteStringCodec implements Codec {

Review Comment:
   We already have a ByteStringCodec class. Is it possible to reuse it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r880292696


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -179,19 +189,46 @@ private void balance() {
         }
       }
 
-      // stop balancing if iteration is not initialized
+      // initialize this iteration. stop balancing on initialization failure
       if (!initializeIteration()) {
-        stopBalancer();
+        // just return if the reason for initialization failure is that
+        // balancer has been stopped in another thread
+        if (!isBalancerRunning()) {
+          return;
+        }

Review Comment:
   `stopBalancer` calls `validateState(true)`, which checks if balancer is currently running. Is this what you're looking for?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3423:
URL: https://github.com/apache/ozone/pull/3423#issuecomment-1132731318

   I think we need to somehow clearly distinguish between the `ContainerBalancer#startBalancer` and `SCMService#start` methods. `startBalancer` is a command for persisting configuration and starting balancer. `start` should be used when there's been a leader change or restart - to read configuration and start balancer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r879154865


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -830,12 +927,47 @@ public String getServiceName() {
   }
 
   /**
-   * Starts ContainerBalancer as an SCMService.
+   * Starts ContainerBalancer as an SCMService. Validates state, reads and
+   * validates persisted configuration, and then starts the balancing
+   * thread.
+   * @throws IllegalContainerBalancerStateException if balancer should not
+   * run according to persisted configuration
+   * @throws InvalidContainerBalancerConfigurationException if failed to
+   * retrieve persisted configuration, or the configuration is null
    */
   @Override
-  public void start() {
-    if (shouldRun()) {
+  public void start() throws IllegalContainerBalancerStateException,
+      InvalidContainerBalancerConfigurationException {
+    lock.lock();
+    try {
+      // should be leader-ready, out of safe mode, and not running already
+      validateState(false);
+      ContainerBalancerConfigurationProto proto;
+      try {
+        proto = readConfiguration(ContainerBalancerConfigurationProto.class);
+      } catch (IOException e) {
+        throw new InvalidContainerBalancerConfigurationException("Could not " +
+            "retrieve persisted configuration while starting " +
+            "Container Balancer as an SCMService. Will not start now.", e);
+      }
+      if (proto == null) {
+        throw new InvalidContainerBalancerConfigurationException("Persisted " +
+            "configuration for ContainerBalancer is null during start. Will " +
+            "not start now.");
+      }

Review Comment:
   In that case I think we should not throw an exception here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3423:
URL: https://github.com/apache/ozone/pull/3423#issuecomment-1138181332

   During manual testing in a cluster, I discovered that if a Storage Container Manager process is stopped, it would lead to balancer also stopping because of `containerBalancer.stopBalancer();` being called in `StorageContainerManager#stop`. This is undesirable since we want balancer to start running in the new leader. I'm now testing if calling `containerBalancer.stop()` instead has the desired effect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan merged pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan merged PR #3423:
URL: https://github.com/apache/ozone/pull/3423


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r877979963


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -830,12 +927,47 @@ public String getServiceName() {
   }
 
   /**
-   * Starts ContainerBalancer as an SCMService.
+   * Starts ContainerBalancer as an SCMService. Validates state, reads and
+   * validates persisted configuration, and then starts the balancing
+   * thread.
+   * @throws IllegalContainerBalancerStateException if balancer should not
+   * run according to persisted configuration
+   * @throws InvalidContainerBalancerConfigurationException if failed to
+   * retrieve persisted configuration, or the configuration is null
    */
   @Override
-  public void start() {
-    if (shouldRun()) {
+  public void start() throws IllegalContainerBalancerStateException,
+      InvalidContainerBalancerConfigurationException {
+    lock.lock();
+    try {
+      // should be leader-ready, out of safe mode, and not running already
+      validateState(false);
+      ContainerBalancerConfigurationProto proto;
+      try {
+        proto = readConfiguration(ContainerBalancerConfigurationProto.class);
+      } catch (IOException e) {
+        throw new InvalidContainerBalancerConfigurationException("Could not " +
+            "retrieve persisted configuration while starting " +
+            "Container Balancer as an SCMService. Will not start now.", e);
+      }
+      if (proto == null) {
+        throw new InvalidContainerBalancerConfigurationException("Persisted " +
+            "configuration for ContainerBalancer is null during start. Will " +
+            "not start now.");
+      }

Review Comment:
   Null is returned when no value is found for this key in RocksDB. Null is correct if balancer has not been started at all in any SCM, in which case we should not start here as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r886571635


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/io/ByteStringCodec.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdds.scm.ha.io;
+
+import com.google.protobuf.ByteString;
+import com.google.protobuf.InvalidProtocolBufferException;
+
+/**
+ * A dummy codec that serializes a ByteString object to ByteString.
+ */
+public class ByteStringCodec implements Codec {

Review Comment:
   `encode()` in `SCMRatisRequest` requires a codec to be present in `CodecFactory` that converts an Object to ByteString:
   
   ```
         final MethodArgument.Builder argBuilder = MethodArgument.newBuilder();
         // Set actual method parameter type, not actual argument type.
         // This is done to avoid MethodNotFoundException in case if argument is
         // subclass type, where as method is defined with super class type.
         argBuilder.setType(parameterTypes[paramCounter++].getName());
         argBuilder.setValue(CodecFactory.getCodec(argument.getClass())
             .serialize(argument));
         args.add(argBuilder.build());
   
   ```
   The existing `ByteStringCodec` class encodes a ByteString to a byte array and vice-versa. That's why I had to introduce this dummy class. I thought this would be simpler than making changes to `encode()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3423:
URL: https://github.com/apache/ozone/pull/3423#issuecomment-1143536453

   Thanks for the reviews! I've merged this PR to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r879181122


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -179,19 +188,53 @@ private void balance() {
         }
       }
 
-      // stop balancing if iteration is not initialized
+      // initialize this iteration. stop balancing on initialization failure
       if (!initializeIteration()) {
-        stopBalancer();
+        // just return if the reason for initialization failure is that
+        // balancer has been stopped in another thread
+        if (!isBalancerRunning()) {
+          return;
+        }
+        // otherwise, try to stop balancer
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when it " +
+              "could not initialize an iteration", e);
+        }
         return;
       }
 
-      //if no new move option is generated, it means the cluster can
-      //not be balanced any more , so just stop
       IterationResult iR = doIteration();
       metrics.incrementNumIterations(1);
       LOG.info("Result of this iteration of Container Balancer: {}", iR);
+
+      // persist next iteration index
+      if (iR == IterationResult.ITERATION_COMPLETED) {
+        lock.lock();
+        try {
+          saveConfiguration(
+              config.toProtobufBuilder()
+                  .setShouldRun(true)
+                  .setNextIterationIndex(i + 1)
+                  .build());
+        } catch (IOException e) {
+          LOG.warn("Could not persist next iteration index value for " +
+              "ContainerBalancer after completing an iteration", e);
+        } finally {
+          lock.unlock();
+        }
+      }
+
+      // if no new move option is generated, it means the cluster cannot be
+      // balanced anymore; so just stop balancer
       if (iR == IterationResult.CAN_NOT_BALANCE_ANY_MORE) {
-        stopBalancer();
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when result " +
+              "of the latest iteration was " + iR, e);
+        }

Review Comment:
   You have different logs based on when stop is called. So I am suggesting to make it a parameter.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r879166777


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -179,19 +188,53 @@ private void balance() {
         }
       }
 
-      // stop balancing if iteration is not initialized
+      // initialize this iteration. stop balancing on initialization failure
       if (!initializeIteration()) {
-        stopBalancer();
+        // just return if the reason for initialization failure is that
+        // balancer has been stopped in another thread
+        if (!isBalancerRunning()) {
+          return;
+        }
+        // otherwise, try to stop balancer
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when it " +
+              "could not initialize an iteration", e);
+        }
         return;
       }
 
-      //if no new move option is generated, it means the cluster can
-      //not be balanced any more , so just stop
       IterationResult iR = doIteration();
       metrics.incrementNumIterations(1);
       LOG.info("Result of this iteration of Container Balancer: {}", iR);
+
+      // persist next iteration index
+      if (iR == IterationResult.ITERATION_COMPLETED) {
+        lock.lock();
+        try {
+          saveConfiguration(
+              config.toProtobufBuilder()
+                  .setShouldRun(true)
+                  .setNextIterationIndex(i + 1)
+                  .build());
+        } catch (IOException e) {
+          LOG.warn("Could not persist next iteration index value for " +
+              "ContainerBalancer after completing an iteration", e);
+        } finally {
+          lock.unlock();
+        }
+      }
+
+      // if no new move option is generated, it means the cluster cannot be
+      // balanced anymore; so just stop balancer
       if (iR == IterationResult.CAN_NOT_BALANCE_ANY_MORE) {
-        stopBalancer();
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when result " +
+              "of the latest iteration was " + iR, e);
+        }

Review Comment:
   What would stopReason be?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on PR #3423:
URL: https://github.com/apache/ozone/pull/3423#issuecomment-1137099240

   @lokeshj1703 @JacksonYao287 thanks for reviewing! I've updated the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r881458132


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -801,24 +846,70 @@ private void resetState() {
   /**
    * Receives a notification for raft or safe mode related status changes.
    * Stops ContainerBalancer if it's running and the current SCM becomes a
-   * follower or enters safe mode.
+   * follower or enters safe mode. Starts ContainerBalancer if the current
+   * SCM becomes leader, is out of safe mode and balancer should run.
    */
   @Override
   public void notifyStatusChanged() {
-    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-      if (isBalancerRunning()) {
-        stopBalancingThread();
+    lock.lock();
+    try {
+      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+        if (isBalancerRunning()) {
+          stop();
+        }
+      } else {
+        if (shouldRun()) {
+          try {
+            start();
+          } catch (IllegalContainerBalancerStateException |
+              InvalidContainerBalancerConfigurationException e) {
+            LOG.warn("Could not start ContainerBalancer on raft/safe-mode " +
+                "status change.", e);
+          }
+        }
       }
+    } finally {
+      lock.unlock();
     }
   }
 
   /**
-   * Checks if ContainerBalancer should run.
-   * @return false
+   * Checks if ContainerBalancer should start (after a leader change, restart
+   * etc.) by reading persisted state.
+   * @return true if the persisted state is true, otherwise false
    */
   @Override
   public boolean shouldRun() {
-    return false;
+    try {

Review Comment:
   Since we're already reading the persisted status for telling whether balancer should run, we probably don't need to check and update ServiceStatus. The persisted status is the most reliable information in balancer's case since it's not a background service.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r876723181


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -227,7 +279,7 @@ private void balance() {
    */
   private boolean initializeIteration() {
     if (scmContext.isInSafeMode()) {
-      LOG.error("Container Balancer cannot operate while SCM is in Safe Mode.");
+      LOG.warn("Container Balancer cannot operate while SCM is in Safe Mode.");

Review Comment:
   I think we should keep this as error?



##########
hadoop-hdds/interface-client/src/main/proto/hdds.proto:
##########
@@ -431,3 +431,22 @@ message ReplicationManagerReportProto {
     repeated KeyIntValue stat = 2;
     repeated KeyContainerIDList statSample = 3;
 }
+
+message ContainerBalancerConfigurationProto {
+    optional string utilizationThreshold = 5;
+    optional int32 datanodesInvolvedMaxPercentagePerIteration = 6;
+    optional int64 sizeMovedMaxPerIteration = 7;
+    optional int64 sizeEnteringTargetMax = 8;
+    optional int64 sizeLeavingSourceMax = 9;
+    optional int32 iterations = 10;
+    optional string excludeContainers = 11;
+    optional int64 moveTimeout = 12;
+    optional int64 balancingIterationInterval = 13;
+    optional string includeDatanodes = 14;
+    optional string excludeDatanodes = 15;
+    optional bool moveNetworkTopologyEnable = 16;
+    optional bool triggerDuBeforeMoveEnable = 17;
+
+    required bool shouldRun = 18;
+    optional int32 nextIterationIndex = 19;
+}

Review Comment:
   Is it possible to convert ContainerBalancerConfiguration into a string format or some other format. I think we can convert it into xml and store the xml itself. We can avoid creating these fields using that approach.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -179,19 +188,53 @@ private void balance() {
         }
       }
 
-      // stop balancing if iteration is not initialized
+      // initialize this iteration. stop balancing on initialization failure
       if (!initializeIteration()) {
-        stopBalancer();
+        // just return if the reason for initialization failure is that
+        // balancer has been stopped in another thread
+        if (!isBalancerRunning()) {
+          return;
+        }
+        // otherwise, try to stop balancer
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when it " +
+              "could not initialize an iteration", e);
+        }
         return;
       }
 
-      //if no new move option is generated, it means the cluster can
-      //not be balanced any more , so just stop
       IterationResult iR = doIteration();
       metrics.incrementNumIterations(1);
       LOG.info("Result of this iteration of Container Balancer: {}", iR);
+
+      // persist next iteration index
+      if (iR == IterationResult.ITERATION_COMPLETED) {
+        lock.lock();
+        try {
+          saveConfiguration(
+              config.toProtobufBuilder()
+                  .setShouldRun(true)
+                  .setNextIterationIndex(i + 1)
+                  .build());
+        } catch (IOException e) {
+          LOG.warn("Could not persist next iteration index value for " +
+              "ContainerBalancer after completing an iteration", e);
+        } finally {
+          lock.unlock();
+        }
+      }
+
+      // if no new move option is generated, it means the cluster cannot be
+      // balanced anymore; so just stop balancer
       if (iR == IterationResult.CAN_NOT_BALANCE_ANY_MORE) {
-        stopBalancer();
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when result " +
+              "of the latest iteration was " + iR, e);
+        }

Review Comment:
   I think it is better to add try catch inside stopBalancer itself. stopBalancer can include stopReason as argument.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -179,19 +188,53 @@ private void balance() {
         }
       }
 
-      // stop balancing if iteration is not initialized
+      // initialize this iteration. stop balancing on initialization failure
       if (!initializeIteration()) {
-        stopBalancer();
+        // just return if the reason for initialization failure is that
+        // balancer has been stopped in another thread
+        if (!isBalancerRunning()) {
+          return;
+        }
+        // otherwise, try to stop balancer
+        try {
+          stopBalancer();
+        } catch (IOException | IllegalContainerBalancerStateException e) {
+          LOG.warn("Tried and failed to stop Container Balancer when it " +
+              "could not initialize an iteration", e);
+        }
         return;
       }
 
-      //if no new move option is generated, it means the cluster can
-      //not be balanced any more , so just stop
       IterationResult iR = doIteration();
       metrics.incrementNumIterations(1);
       LOG.info("Result of this iteration of Container Balancer: {}", iR);
+
+      // persist next iteration index
+      if (iR == IterationResult.ITERATION_COMPLETED) {
+        lock.lock();
+        try {
+          saveConfiguration(
+              config.toProtobufBuilder()
+                  .setShouldRun(true)
+                  .setNextIterationIndex(i + 1)
+                  .build());
+        } catch (IOException e) {
+          LOG.warn("Could not persist next iteration index value for " +
+              "ContainerBalancer after completing an iteration", e);
+        } finally {
+          lock.unlock();
+        }

Review Comment:
   We can include lock inside an internal function for saveConfiguration



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -830,12 +927,47 @@ public String getServiceName() {
   }
 
   /**
-   * Starts ContainerBalancer as an SCMService.
+   * Starts ContainerBalancer as an SCMService. Validates state, reads and
+   * validates persisted configuration, and then starts the balancing
+   * thread.
+   * @throws IllegalContainerBalancerStateException if balancer should not
+   * run according to persisted configuration
+   * @throws InvalidContainerBalancerConfigurationException if failed to
+   * retrieve persisted configuration, or the configuration is null
    */
   @Override
-  public void start() {
-    if (shouldRun()) {
+  public void start() throws IllegalContainerBalancerStateException,
+      InvalidContainerBalancerConfigurationException {
+    lock.lock();
+    try {
+      // should be leader-ready, out of safe mode, and not running already
+      validateState(false);
+      ContainerBalancerConfigurationProto proto;
+      try {
+        proto = readConfiguration(ContainerBalancerConfigurationProto.class);
+      } catch (IOException e) {
+        throw new InvalidContainerBalancerConfigurationException("Could not " +
+            "retrieve persisted configuration while starting " +
+            "Container Balancer as an SCMService. Will not start now.", e);
+      }
+      if (proto == null) {
+        throw new InvalidContainerBalancerConfigurationException("Persisted " +
+            "configuration for ContainerBalancer is null during start. Will " +
+            "not start now.");
+      }

Review Comment:
   Is this an error case? If the config is null could it also mean balancer has not been started yet?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java:
##########
@@ -131,6 +134,8 @@ public final class ContainerBalancerConfiguration {
           "data node is very high")
   private boolean triggerDuEnable = false;
 
+  private int nextIterationIndex = 0;
+

Review Comment:
   not required.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -801,24 +852,70 @@ private void resetState() {
   /**
    * Receives a notification for raft or safe mode related status changes.
    * Stops ContainerBalancer if it's running and the current SCM becomes a
-   * follower or enters safe mode.
+   * follower or enters safe mode. Starts ContainerBalancer if the current
+   * SCM becomes leader, is out of safe mode and balancer should run.
    */
   @Override
   public void notifyStatusChanged() {
-    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-      if (isBalancerRunning()) {
-        stopBalancingThread();
+    lock.lock();
+    try {
+      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+        if (isBalancerRunning()) {
+          stop();
+        }
+      } else {
+        if (shouldRun()) {
+          try {
+            start();
+          } catch (IllegalContainerBalancerStateException |
+              InvalidContainerBalancerConfigurationException e) {
+            LOG.warn("Could not start ContainerBalancer on raft/safe-mode " +
+                "status change.", e);
+          }
+        }
       }
+    } finally {
+      lock.unlock();
     }
   }
 
   /**
-   * Checks if ContainerBalancer should run.
-   * @return false
+   * Checks if ContainerBalancer should start (after a leader change, restart
+   * etc.) by reading persisted state.
+   * @return true if the persisted state is true, otherwise false
    */
   @Override
   public boolean shouldRun() {
-    return false;
+    try {
+      ContainerBalancerConfigurationProto proto =
+          readConfiguration(ContainerBalancerConfigurationProto.class);
+      if (proto == null) {
+        LOG.warn("Could not find persisted configuration for {} when checking" +
+            " if ContainerBalancer should run. ContainerBalancer should not " +
+            "run now.", this.getServiceName());
+        return false;
+      }
+      return proto.getShouldRun();
+    } catch (IOException e) {
+      LOG.warn("Could not read persisted configuration for checking if " +
+          "ContainerBalancer should start. ContainerBalancer should not start" +
+          " now.", e);
+      return false;
+    }

Review Comment:
   We are validating the configuration twice, once in start method and once in this function. I think we should validate only once and store the configuration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r879165642


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -830,12 +927,47 @@ public String getServiceName() {
   }
 
   /**
-   * Starts ContainerBalancer as an SCMService.
+   * Starts ContainerBalancer as an SCMService. Validates state, reads and
+   * validates persisted configuration, and then starts the balancing
+   * thread.
+   * @throws IllegalContainerBalancerStateException if balancer should not
+   * run according to persisted configuration
+   * @throws InvalidContainerBalancerConfigurationException if failed to
+   * retrieve persisted configuration, or the configuration is null
    */
   @Override
-  public void start() {
-    if (shouldRun()) {
+  public void start() throws IllegalContainerBalancerStateException,
+      InvalidContainerBalancerConfigurationException {
+    lock.lock();
+    try {
+      // should be leader-ready, out of safe mode, and not running already
+      validateState(false);
+      ContainerBalancerConfigurationProto proto;
+      try {
+        proto = readConfiguration(ContainerBalancerConfigurationProto.class);
+      } catch (IOException e) {
+        throw new InvalidContainerBalancerConfigurationException("Could not " +
+            "retrieve persisted configuration while starting " +
+            "Container Balancer as an SCMService. Will not start now.", e);
+      }
+      if (proto == null) {
+        throw new InvalidContainerBalancerConfigurationException("Persisted " +
+            "configuration for ContainerBalancer is null during start. Will " +
+            "not start now.");
+      }

Review Comment:
   I am considering this as an exceptional situation because `start` should have been called from `notifyStatusChanged()` only after checking that balancer should run. So we should be able to get persisted configuration here as well, instead of getting null.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r879299342


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -179,19 +189,46 @@ private void balance() {
         }
       }
 
-      // stop balancing if iteration is not initialized
+      // initialize this iteration. stop balancing on initialization failure
       if (!initializeIteration()) {
-        stopBalancer();
+        // just return if the reason for initialization failure is that
+        // balancer has been stopped in another thread
+        if (!isBalancerRunning()) {
+          return;
+        }

Review Comment:
   can we add a balancer running check `currentThread = null` inside `stopBalancer`, so that the two operations will be protected by a single lock. if we first call `isBalancerRunning()` and then  call `stopBalancer()` , there might be a case that between the two operation in a single thread, `stopbalancer` is called from another thread.



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -801,24 +846,70 @@ private void resetState() {
   /**
    * Receives a notification for raft or safe mode related status changes.
    * Stops ContainerBalancer if it's running and the current SCM becomes a
-   * follower or enters safe mode.
+   * follower or enters safe mode. Starts ContainerBalancer if the current
+   * SCM becomes leader, is out of safe mode and balancer should run.
    */
   @Override
   public void notifyStatusChanged() {
-    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-      if (isBalancerRunning()) {
-        stopBalancingThread();
+    lock.lock();
+    try {
+      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+        if (isBalancerRunning()) {
+          stop();
+        }
+      } else {
+        if (shouldRun()) {
+          try {
+            start();
+          } catch (IllegalContainerBalancerStateException |
+              InvalidContainerBalancerConfigurationException e) {
+            LOG.warn("Could not start ContainerBalancer on raft/safe-mode " +
+                "status change.", e);
+          }
+        }
       }
+    } finally {
+      lock.unlock();
     }
   }
 
   /**
-   * Checks if ContainerBalancer should run.
-   * @return false
+   * Checks if ContainerBalancer should start (after a leader change, restart
+   * etc.) by reading persisted state.
+   * @return true if the persisted state is true, otherwise false
    */
   @Override
   public boolean shouldRun() {
-    return false;
+    try {

Review Comment:
   should we check the status and the delay here?
   ```
     public boolean shouldRun() {
       serviceLock.lock();
       try {
         // If safe mode is off, then this SCMService starts to run with a delay.
         return serviceStatus == ServiceStatus.RUNNING &&
             clock.millis() - lastTimeToBeReadyInMillis >= waitTimeInMillis;
       } finally {
         serviceLock.unlock();
       }
     }
   ```



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -801,24 +846,70 @@ private void resetState() {
   /**
    * Receives a notification for raft or safe mode related status changes.
    * Stops ContainerBalancer if it's running and the current SCM becomes a
-   * follower or enters safe mode.
+   * follower or enters safe mode. Starts ContainerBalancer if the current
+   * SCM becomes leader, is out of safe mode and balancer should run.
    */
   @Override
   public void notifyStatusChanged() {
-    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-      if (isBalancerRunning()) {
-        stopBalancingThread();
+    lock.lock();
+    try {
+      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+        if (isBalancerRunning()) {

Review Comment:
   should we check and set the ServiceStatus here? for example 
   ```
         if (scmContext.isLeaderReady() && !scmContext.isInSafeMode()) {
           if (serviceStatus != ServiceStatus.RUNNING) {
             LOG.info("Service {} transitions to RUNNING.", getServiceName());
             serviceStatus = ServiceStatus.RUNNING;
             lastTimeToBeReadyInMillis = clock.millis();
           }
         } else {
           if (serviceStatus != ServiceStatus.PAUSING) {
             LOG.info("Service {} transitions to PAUSING.", getServiceName());
             serviceStatus = ServiceStatus.PAUSING;
           }
         }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r881473582


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java:
##########
@@ -801,24 +846,70 @@ private void resetState() {
   /**
    * Receives a notification for raft or safe mode related status changes.
    * Stops ContainerBalancer if it's running and the current SCM becomes a
-   * follower or enters safe mode.
+   * follower or enters safe mode. Starts ContainerBalancer if the current
+   * SCM becomes leader, is out of safe mode and balancer should run.
    */
   @Override
   public void notifyStatusChanged() {
-    if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
-      if (isBalancerRunning()) {
-        stopBalancingThread();
+    lock.lock();
+    try {
+      if (!scmContext.isLeader() || scmContext.isInSafeMode()) {
+        if (isBalancerRunning()) {

Review Comment:
   In addition to my previous reply, I think holding state in ServiceStatus along with persisting it in RocksDB would make the logic a bit complex. We'd then have three ways of checking state - ServiceStatus, RocksDB, and checking the current thread for null. What do you think @JacksonYao287 ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a diff in pull request #3423: HDDS-6280. Support Container Balancer HA

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on code in PR #3423:
URL: https://github.com/apache/ozone/pull/3423#discussion_r886533333


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/io/ByteStringCodec.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * <p>
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * <p>
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hdds.scm.ha.io;
+
+import com.google.protobuf.ByteString;
+import com.google.protobuf.InvalidProtocolBufferException;
+
+/**
+ * A dummy codec that serializes a ByteString object to ByteString.
+ */
+public class ByteStringCodec implements Codec {

Review Comment:
   We already have a ByteStringCodec class. Is it possible to reuse it?



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/StatefulServiceStateManagerImpl.java:
##########
@@ -52,10 +57,13 @@ private StatefulServiceStateManagerImpl(
   public void saveConfiguration(String serviceName, ByteString bytes)
       throws IOException {
     transactionBuffer.addToBuffer(statefulServiceConfig, serviceName, bytes);
+    LOG.debug("Added specified bytes to the transaction buffer for key {} to " +
+        "table {}", serviceName, statefulServiceConfig.getName());
     if (transactionBuffer instanceof SCMHADBTransactionBuffer) {
       SCMHADBTransactionBuffer buffer =
               (SCMHADBTransactionBuffer) transactionBuffer;
       buffer.flush();
+      LOG.debug("Transaction buffer flushed");

Review Comment:
   Enclose in LOG.isDebugEnabled



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/StatefulServiceStateManagerImpl.java:
##########
@@ -52,10 +57,13 @@ private StatefulServiceStateManagerImpl(
   public void saveConfiguration(String serviceName, ByteString bytes)
       throws IOException {
     transactionBuffer.addToBuffer(statefulServiceConfig, serviceName, bytes);
+    LOG.debug("Added specified bytes to the transaction buffer for key {} to " +
+        "table {}", serviceName, statefulServiceConfig.getName());
     if (transactionBuffer instanceof SCMHADBTransactionBuffer) {
       SCMHADBTransactionBuffer buffer =
               (SCMHADBTransactionBuffer) transactionBuffer;
       buffer.flush();
+      LOG.debug("Transaction buffer flushed");

Review Comment:
   Enclose in LOG.isDebugEnabled



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/StatefulServiceStateManagerImpl.java:
##########
@@ -52,10 +57,13 @@ private StatefulServiceStateManagerImpl(
   public void saveConfiguration(String serviceName, ByteString bytes)
       throws IOException {
     transactionBuffer.addToBuffer(statefulServiceConfig, serviceName, bytes);
+    LOG.debug("Added specified bytes to the transaction buffer for key {} to " +

Review Comment:
   Enclose in LOG.isDebugEnabled



##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/ha/StatefulServiceStateManagerImpl.java:
##########
@@ -52,10 +57,13 @@ private StatefulServiceStateManagerImpl(
   public void saveConfiguration(String serviceName, ByteString bytes)
       throws IOException {
     transactionBuffer.addToBuffer(statefulServiceConfig, serviceName, bytes);
+    LOG.debug("Added specified bytes to the transaction buffer for key {} to " +

Review Comment:
   Enclose in LOG.isDebugEnabled



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org