You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2021/10/29 11:33:36 UTC

[GitHub] [ozone] siddhantsangwan opened a new pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

siddhantsangwan opened a new pull request #2786:
URL: https://github.com/apache/ozone/pull/2786


   ## What changes were proposed in this pull request?
   
   Changes to support certain Datanodes being excluded from balancing. The configurations can be provided as a list of comma separated host names or IP addresses.
   
   Include Nodes: Include only these nodes in the balancing process.
   Exclude Nodes: Exclude these nodes from the balancing process.
   Nodes that are specified in both these lists should be excluded.
   
   The configurations are read and these nodes are excluded in the `initializeIteration` method in `ContainerBalancer`.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-5897
   
   ## How was this patch tested?
   
   Added a UT to `TestContainerBalancer`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-963028954


   > thanks @siddhantsangwan for this work, the changes generally looks good! i have a concern that if we have many Include Nodes and Exclude Nodes, how can we pass these host names or IP addresses to balancer? typing these host names or IP addresses one by one in command line , or specifying a local file in command line which contains these informations.
   
   We can have setters in `ContainerBalancerConfiguration` that accept a file containing hostnames. At the command line, can we have it such that either a string of hostnames or a file can be specified?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749287270



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
##########
@@ -731,6 +738,24 @@ boolean canSizeEnterTarget(DatanodeDetails target, long size) {
     return potentialTargets;
   }
 
+  /**
+   * Consults the configurations {@link ContainerBalancer#includeNodes} and
+   * {@link ContainerBalancer#excludeNodes} to check if the specified
+   * Datanode should be excluded from balancing.
+   * @param datanode DatanodeDetails to check
+   * @return true if Datanode should be excluded, else false
+   */
+  boolean shouldExcludeDatanode(DatanodeDetails datanode) {

Review comment:
       This method checks the exclude nodes list first, so it'll return true if that node is specified in both the lists.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-972617748


   @lokeshj1703 Thanks for reviewing. I've created [HDDS-6005](https://issues.apache.org/jira/browse/HDDS-6005) for CLI changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai merged pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
adoroszlai merged pull request #2786:
URL: https://github.com/apache/ozone/pull/2786


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749101823



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -292,6 +312,86 @@ public void setBalancingInterval(Duration balancingInterval) {
     }
   }
 
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be the exclusive
+   * participants in balancing.
+   * @return Set of hostname or ip address strings, or an empty set if the
+   * configuration is empty
+   */
+  public Set<String> getIncludeNodes() {
+    if (includeNodes.isEmpty()) {
+      return Collections.emptySet();
+    }
+    return Arrays.stream(includeNodes.split(","))
+        .map(String::trim)
+        .collect(Collectors.toSet());
+  }
+
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified string is non-empty.
+   * @param includeNodes a String of datanode hostnames or ip addresses
+   *                     separated by commas
+   */
+  public void setIncludeNodes(String includeNodes) {
+    this.includeNodes = includeNodes;
+  }
+
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified file is non-empty.
+   * @param includeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setIncludeNodes(File includeNodes) throws IOException {
+    try (Stream<String> strings = Files.lines(includeNodes.toPath(),
+        StandardCharsets.UTF_8)) {
+      this.includeNodes = strings.collect(Collectors.joining(","));
+    } catch (IOException e) {
+      LOG.debug("Could not read the specified includeNodes file.");
+      throw new IOException(e);
+    }
+  }
+
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be excluded
+   * from balancing.
+   * @return Set of hostname or ip address strings, or an empty set if the
+   * configuration is empty
+   */
+  public Set<String> getExcludeNodes() {
+    if (excludeNodes.isEmpty()) {
+      return Collections.emptySet();
+    }
+    return Arrays.stream(excludeNodes.split(","))
+        .map(String::trim)
+        .collect(Collectors.toSet());
+  }
+
+  /**
+   * Sets the datanodes that will be excluded from balancing.
+   * @param excludeNodes a String of datanode hostnames or ip addresses
+   *                     separated by commas
+   */
+  public void setExcludeNodes(String excludeNodes) {
+    this.excludeNodes = excludeNodes;
+  }
+
+  /**
+   * Sets the datanodes that will be excluded from balancing.
+   * @param excludeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setExcludeNodes(File excludeNodes) throws IOException {

Review comment:
       File can not be passed since command would be run in a remote cluster.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
##########
@@ -731,6 +738,24 @@ boolean canSizeEnterTarget(DatanodeDetails target, long size) {
     return potentialTargets;
   }
 
+  /**
+   * Consults the configurations {@link ContainerBalancer#includeNodes} and
+   * {@link ContainerBalancer#excludeNodes} to check if the specified
+   * Datanode should be excluded from balancing.
+   * @param datanode DatanodeDetails to check
+   * @return true if Datanode should be excluded, else false
+   */
+  boolean shouldExcludeDatanode(DatanodeDetails datanode) {

Review comment:
       We will need to handle case where node is both included and excluded.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r751905828



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
##########
@@ -731,6 +738,24 @@ boolean canSizeEnterTarget(DatanodeDetails target, long size) {
     return potentialTargets;
   }
 
+  /**
+   * Consults the configurations {@link ContainerBalancer#includeNodes} and
+   * {@link ContainerBalancer#excludeNodes} to check if the specified
+   * Datanode should be excluded from balancing.
+   * @param datanode DatanodeDetails to check
+   * @return true if Datanode should be excluded, else false
+   */
+  boolean shouldExcludeDatanode(DatanodeDetails datanode) {

Review comment:
       According to our latest discussion, in which the following points were identified:
   
   - We can recover from this scenario by excluding the node. In the worst case, this will lead to the exclusion of a datanode that the user actually wanted to include.
   - HDFS Balancer excludes the datanode if it's mentioned in both include and exclude lists.
   
   Let's exclude the datanode from balancing in this scenario.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-954672995


   @lokeshj1703 @JacksonYao287 Please take a look.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749114442



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -356,6 +377,21 @@ public void setExcludeNodes(String excludeNodes) {
     this.excludeNodes = excludeNodes;
   }
 
+  /**
+   * Sets the datanodes that will be excluded from balancing.
+   * @param excludeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setExcludeNodes(File excludeNodes) throws IOException {
+    try (Stream<String> strings = Files.lines(excludeNodes.toPath(),
+        StandardCharsets.UTF_8)) {
+      this.excludeNodes = strings.collect(Collectors.joining(","));
+    } catch (IOException e) {
+      LOG.debug("Could not read the specified excludeNodes file.");
+      throw new IOException(e);

Review comment:
       Same comments apply here.  BTW, can you please extract the logic duplicated in these methods?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -332,6 +337,22 @@ public void setIncludeNodes(String includeNodes) {
     this.includeNodes = includeNodes;
   }
 
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified file is non-empty.
+   * @param includeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setIncludeNodes(File includeNodes) throws IOException {
+    try (Stream<String> strings = Files.lines(includeNodes.toPath(),
+        StandardCharsets.UTF_8)) {
+      this.includeNodes = strings.collect(Collectors.joining(","));
+    } catch (IOException e) {
+      LOG.debug("Could not read the specified includeNodes file.");
+      throw new IOException(e);

Review comment:
       No need to wrap it in another `IOException`.
   
   ```suggestion
         throw e;
   ```

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -332,6 +337,22 @@ public void setIncludeNodes(String includeNodes) {
     this.includeNodes = includeNodes;
   }
 
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified file is non-empty.
+   * @param includeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setIncludeNodes(File includeNodes) throws IOException {
+    try (Stream<String> strings = Files.lines(includeNodes.toPath(),
+        StandardCharsets.UTF_8)) {
+      this.includeNodes = strings.collect(Collectors.joining(","));
+    } catch (IOException e) {
+      LOG.debug("Could not read the specified includeNodes file.");

Review comment:
       This message does not seem to be too helpful.  First I need to enable debug log level.  Second, it does not include the filename.
   
   ```suggestion
         LOG.info("Could not read includeNodes file: {}.", includeNodes.getPath());
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749963553



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
##########
@@ -731,6 +738,24 @@ boolean canSizeEnterTarget(DatanodeDetails target, long size) {
     return potentialTargets;
   }
 
+  /**
+   * Consults the configurations {@link ContainerBalancer#includeNodes} and
+   * {@link ContainerBalancer#excludeNodes} to check if the specified
+   * Datanode should be excluded from balancing.
+   * @param datanode DatanodeDetails to check
+   * @return true if Datanode should be excluded, else false
+   */
+  boolean shouldExcludeDatanode(DatanodeDetails datanode) {

Review comment:
       I think we should throw an IllegalArgumentsException. Otherwise it can be categorised as inconsistent behaviour.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r744455287



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -109,6 +109,20 @@
       "iteration of Container Balancer.")
   private long balancingInterval;
 
+  @Config(key = "include.datanodes", type = ConfigType.STRING, defaultValue =
+      "", tags = {ConfigTag.BALANCER}, description = "A list of Datanode " +
+      "hostnames or ip addresses separated by commas. Only the Datanodes " +
+      "specified in this list are balanced. This configuration is empty by " +
+      "default.")

Review comment:
       Sure!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749187234



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -332,6 +337,22 @@ public void setIncludeNodes(String includeNodes) {
     this.includeNodes = includeNodes;
   }
 
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified file is non-empty.
+   * @param includeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setIncludeNodes(File includeNodes) throws IOException {
+    try (Stream<String> strings = Files.lines(includeNodes.toPath(),
+        StandardCharsets.UTF_8)) {
+      this.includeNodes = strings.collect(Collectors.joining(","));
+    } catch (IOException e) {
+      LOG.debug("Could not read the specified includeNodes file.");

Review comment:
       My bad. Improving this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749892519



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
##########
@@ -731,6 +738,24 @@ boolean canSizeEnterTarget(DatanodeDetails target, long size) {
     return potentialTargets;
   }
 
+  /**
+   * Consults the configurations {@link ContainerBalancer#includeNodes} and
+   * {@link ContainerBalancer#excludeNodes} to check if the specified
+   * Datanode should be excluded from balancing.
+   * @param datanode DatanodeDetails to check
+   * @return true if Datanode should be excluded, else false
+   */
+  boolean shouldExcludeDatanode(DatanodeDetails datanode) {

Review comment:
       This is bad configuration. I think we should throw an error in this case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-968652061


   @JacksonYao287 @adoroszlai Thanks for reviewing. I've updated the PR. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r751969394



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -292,6 +312,86 @@ public void setBalancingInterval(Duration balancingInterval) {
     }
   }
 
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be the exclusive
+   * participants in balancing.
+   * @return Set of hostname or ip address strings, or an empty set if the
+   * configuration is empty
+   */
+  public Set<String> getIncludeNodes() {
+    if (includeNodes.isEmpty()) {
+      return Collections.emptySet();
+    }
+    return Arrays.stream(includeNodes.split(","))
+        .map(String::trim)
+        .collect(Collectors.toSet());
+  }
+
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified string is non-empty.
+   * @param includeNodes a String of datanode hostnames or ip addresses
+   *                     separated by commas
+   */
+  public void setIncludeNodes(String includeNodes) {
+    this.includeNodes = includeNodes;
+  }
+
+  /**
+   * Sets the datanodes that will be the exclusive participants in balancing.
+   * Applicable only if the specified file is non-empty.
+   * @param includeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setIncludeNodes(File includeNodes) throws IOException {
+    try (Stream<String> strings = Files.lines(includeNodes.toPath(),
+        StandardCharsets.UTF_8)) {
+      this.includeNodes = strings.collect(Collectors.joining(","));
+    } catch (IOException e) {
+      LOG.debug("Could not read the specified includeNodes file.");
+      throw new IOException(e);
+    }
+  }
+
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be excluded
+   * from balancing.
+   * @return Set of hostname or ip address strings, or an empty set if the
+   * configuration is empty
+   */
+  public Set<String> getExcludeNodes() {
+    if (excludeNodes.isEmpty()) {
+      return Collections.emptySet();
+    }
+    return Arrays.stream(excludeNodes.split(","))
+        .map(String::trim)
+        .collect(Collectors.toSet());
+  }
+
+  /**
+   * Sets the datanodes that will be excluded from balancing.
+   * @param excludeNodes a String of datanode hostnames or ip addresses
+   *                     separated by commas
+   */
+  public void setExcludeNodes(String excludeNodes) {
+    this.excludeNodes = excludeNodes;
+  }
+
+  /**
+   * Sets the datanodes that will be excluded from balancing.
+   * @param excludeNodes a File of datanode hostnames or ip addresses
+   * @throws IOException if an I/O error occurs when opening the file
+   */
+  public void setExcludeNodes(File excludeNodes) throws IOException {

Review comment:
       Do we not need these setters then? `setExcludeNodes` that accepts a string seems enough




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] siddhantsangwan commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
siddhantsangwan commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r749951651



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancer.java
##########
@@ -731,6 +738,24 @@ boolean canSizeEnterTarget(DatanodeDetails target, long size) {
     return potentialTargets;
   }
 
+  /**
+   * Consults the configurations {@link ContainerBalancer#includeNodes} and
+   * {@link ContainerBalancer#excludeNodes} to check if the specified
+   * Datanode should be excluded from balancing.
+   * @param datanode DatanodeDetails to check
+   * @return true if Datanode should be excluded, else false
+   */
+  boolean shouldExcludeDatanode(DatanodeDetails datanode) {

Review comment:
       Should I simply throw an error or log it instead and recover by excluding the node?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-981380768


   Thanks @siddhantsangwan for the patch, @lokeshj1703 and @JacksonYao287 for the review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] JacksonYao287 commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
JacksonYao287 commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-960524302


   thanks @siddhantsangwan  for this work, the changes generally looks good!
   i have a concern that if we have many Include Nodes and Exclude Nodes, how can we pass these host names or IP addresses to balancer? typing these host names or IP addresses one by one in command line , or specifying a local file in command line which contains these informations.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] adoroszlai commented on a change in pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
adoroszlai commented on a change in pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#discussion_r742662663



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -109,6 +109,20 @@
       "iteration of Container Balancer.")
   private long balancingInterval;
 
+  @Config(key = "include.datanodes", type = ConfigType.STRING, defaultValue =
+      "", tags = {ConfigTag.BALANCER}, description = "A list of Datanode " +
+      "hostnames or ip addresses separated by commas. Only the Datanodes " +
+      "specified in this list are balanced. This configuration is empty by " +
+      "default.")

Review comment:
       Can you please mention that the include list is only applicable if it is not empty?

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -292,6 +306,42 @@ public void setBalancingInterval(Duration balancingInterval) {
     }
   }
 
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be the exclusive
+   * participants in balancing.
+   * @return Set of hostname strings
+   */
+  public Set<String> getIncludeNodes() {
+    if (includeNodes.isEmpty()) {
+      return new HashSet<>();
+    }
+    return Arrays.stream(includeNodes.split(","))
+        .map(String::trim)
+        .collect(Collectors.toSet());
+  }
+
+  public void setIncludeNodes(String includeNodes) {
+    this.includeNodes = includeNodes;
+  }
+
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be excluded
+   * from balancing.
+   * @return Set of hostname strings
+   */
+  public Set<String> getExcludeNodes() {
+    if (excludeNodes.isEmpty()) {
+      return new HashSet<>();

Review comment:
       Please use `Collections.emptySet()` here, too.

##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/ContainerBalancerConfiguration.java
##########
@@ -292,6 +306,42 @@ public void setBalancingInterval(Duration balancingInterval) {
     }
   }
 
+  /**
+   * Gets a set of datanode hostnames or ip addresses that will be the exclusive
+   * participants in balancing.
+   * @return Set of hostname strings
+   */
+  public Set<String> getIncludeNodes() {
+    if (includeNodes.isEmpty()) {
+      return new HashSet<>();

Review comment:
       Nit: `Collections.emptySet()` is cheaper.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] lokeshj1703 commented on pull request #2786: HDDS-5897. Support configuration for including/excluding datanodes for balancing

Posted by GitBox <gi...@apache.org>.
lokeshj1703 commented on pull request #2786:
URL: https://github.com/apache/ozone/pull/2786#issuecomment-976374426


   @adoroszlai Do you have any other comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org