Posted to issues@ozone.apache.org by GitBox <gi...@apache.org> on 2022/06/15 13:01:10 UTC

[GitHub] [ozone] symious opened a new pull request, #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

symious opened a new pull request, #3519:
URL: https://github.com/apache/ozone/pull/3519

   ## What changes were proposed in this pull request?
   
   Currently, the volumeChoosingPolicy uses RoundRobin logic. This ticket adds a new policy that honors the capacity of the volumes when choosing a volume.
   
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-6891
   
   ## How was this patch tested?
   
   Unit test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org


[GitHub] [ozone] ferhui merged pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
ferhui merged PR #3519:
URL: https://github.com/apache/ozone/pull/3519




[GitHub] [ozone] sodonnel commented on a diff in pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3519:
URL: https://github.com/apache/ozone/pull/3519#discussion_r899060680


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/CapacityVolumeChoosingPolicy.java:
##########
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.container.common.volume;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.ozone.container.common.interfaces.VolumeChoosingPolicy;
+import org.apache.hadoop.util.DiskChecker.DiskOutOfSpaceException;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Random;
+import java.util.stream.Collectors;
+
+/**
+ * Volume choosing policy that randomly chooses a volume with enough
+ * remaining space to satisfy the size constraints.
+ * <p>
+ * The algorithm is as follows: pick 2 random volumes from the given
+ * volumeSet, then pick the volume with lower utilization. This gives a
+ * volume with lower utilization a higher probability of being picked.
+ * <p>
+ * Same algorithm as the SCMContainerPlacementCapacity.
+ */
+public class CapacityVolumeChoosingPolicy implements VolumeChoosingPolicy {
+
+  public static final Log LOG = LogFactory.getLog(
+      CapacityVolumeChoosingPolicy.class);
+
+  // Random source used to pick the two candidate volume indexes.
+  private final Random random = new Random();
+
+  @Override
+  public HddsVolume chooseVolume(List<HddsVolume> volumes,
+      long maxContainerSize) throws IOException {
+
+    // No volumes available to choose from
+    if (volumes.size() < 1) {
+      throw new DiskOutOfSpaceException("No more available volumes");
+    }
+
+    List<HddsVolume> filtered = volumes.stream()
+        .filter(v ->
+            v.getAvailable() - v.getCommittedBytes() > maxContainerSize)
+        .collect(Collectors.toList());
+    if (filtered.size() < 1) {
+      throw new DiskOutOfSpaceException("Out of space: "
+          + "All volumes are less than the container size (=" + maxContainerSize
+          + " B).");
+    } else if (filtered.size() == 1) {
+      return filtered.get(0);
+    } else {
+      int firstIndex = random.nextInt(filtered.size());
+      int secondIndex = random.nextInt(filtered.size());

Review Comment:
   With a small number of volumes, I think there is a reasonable chance of selecting the same index twice here. We might need to do something to ensure we get two unique indexes.
   
   Also, what about the case with only 2 volumes available? Do we always want to select the smaller one, or do we want to select the smaller one more often than the other? Maybe we need another if branch to handle the 2-volume case specially?
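
One way to get two unique indexes, as raised in this comment, is to draw the second index from the remaining n - 1 positions and shift it past the first. The following is a minimal Java sketch under that assumption, not code from this PR; the class and method names are hypothetical.

```java
import java.util.Random;

public class DistinctPairSketch {

  /**
   * Draws two different indexes in [0, n). Every ordered pair of distinct
   * indexes is equally likely. Requires n >= 2.
   */
  public static int[] drawDistinctPair(Random random, int n) {
    if (n < 2) {
      throw new IllegalArgumentException("need at least 2 candidates");
    }
    int first = random.nextInt(n);
    // Draw from the n - 1 remaining slots, then skip over 'first'.
    int second = random.nextInt(n - 1);
    if (second >= first) {
      second++;
    }
    return new int[] {first, second};
  }

  public static void main(String[] args) {
    int[] pair = drawDistinctPair(new Random(), 5);
    System.out.println(pair[0] + " " + pair[1]);
  }
}
```

With only two candidates, distinct draws always compare both of them, so whichever volume wins the comparison would be chosen every time.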





[GitHub] [ozone] sodonnel commented on a diff in pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
sodonnel commented on code in PR #3519:
URL: https://github.com/apache/ozone/pull/3519#discussion_r899233663


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/CapacityVolumeChoosingPolicy.java:
##########

Review Comment:
   I see that now. It would be worth adding a comment to the code like the table above, as it's not totally obvious that this is the intention of the code. It would help someone in the future if they need to figure out something related to this.





[GitHub] [ozone] symious commented on pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
symious commented on PR #3519:
URL: https://github.com/apache/ozone/pull/3519#issuecomment-1156443134

   @sodonnel @adoroszlai Could you help to review this PR?




[GitHub] [ozone] symious commented on a diff in pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
symious commented on code in PR #3519:
URL: https://github.com/apache/ozone/pull/3519#discussion_r899322346


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/CapacityVolumeChoosingPolicy.java:
##########

Review Comment:
   Sure, I added the comment. Please take a look.





[GitHub] [ozone] ferhui commented on pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
ferhui commented on PR #3519:
URL: https://github.com/apache/ozone/pull/3519#issuecomment-1164255364

   @symious Thanks for your contribution. @sodonnel Thanks for your review! Merged




[GitHub] [ozone] symious commented on a diff in pull request #3519: HDDS-6891. Add CapacityVolumeChoosingPolicy

Posted by GitBox <gi...@apache.org>.
symious commented on code in PR #3519:
URL: https://github.com/apache/ozone/pull/3519#discussion_r899068851


##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/CapacityVolumeChoosingPolicy.java:
##########

Review Comment:
   @sodonnel Thanks for the review.
   
   Say we have volume1 (available: 200G) and volume2 (available: 100G). The expected result is that we use volume1 more often than volume2.
   
   The possible outcomes are as follows:
   | first choice | second choice | probability | result  |
   | ------------ | ------------- | ----------- | ------- |
   | volume1      | volume2       | 25%         | volume1 |
   | volume2      | volume1       | 25%         | volume1 |
   | volume1      | volume1       | 25%         | volume1 |
   | volume2      | volume2       | 25%         | volume2 |
   
   So volume1 is chosen 75% of the time and volume2 25% of the time.
   I think that not enforcing two unique indexes is exactly what handles the special case of 2 volumes: if the two indexes were always distinct, volume1 would be chosen every time.
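
The 75% / 25% split can be checked by enumerating every (first, second) index pair. The following is a minimal Java sketch, assuming two independent uniform draws with repeats allowed and that the volume with more available space wins each comparison; `TwoChoiceSketch` and `selectionProbabilities` are hypothetical names, not part of the PR.

```java
public class TwoChoiceSketch {

  /**
   * Returns, for each volume, the probability of being chosen when two
   * indexes are drawn independently and uniformly (repeats allowed) and
   * the volume with more available space wins. Ties go to the first draw.
   */
  public static double[] selectionProbabilities(long[] availableBytes) {
    int n = availableBytes.length;
    double[] probability = new double[n];
    // Each of the n * n ordered pairs is equally likely.
    double pairWeight = 1.0 / ((double) n * n);
    for (int first = 0; first < n; first++) {
      for (int second = 0; second < n; second++) {
        int winner = availableBytes[first] >= availableBytes[second]
            ? first : second;
        probability[winner] += pairWeight;
      }
    }
    return probability;
  }

  public static void main(String[] args) {
    // volume1 has 200 units available and volume2 has 100.
    double[] p = selectionProbabilities(new long[] {200L, 100L});
    System.out.println("volume1=" + p[0] + " volume2=" + p[1]);
  }
}
```

For the two-volume case this yields 0.75 for volume1 and 0.25 for volume2. With three volumes of different sizes the same enumeration gives 5/9, 3/9 and 1/9, so the bias toward emptier volumes holds for larger pools as well.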


