You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by "rahulrane50 (via GitHub)" <gi...@apache.org> on 2023/05/25 01:10:46 UTC

[GitHub] [helix] rahulrane50 commented on a diff in pull request #2493: Waged - Hard constraint - Fix for n - n+1 issue.

rahulrane50 commented on code in PR #2493:
URL: https://github.com/apache/helix/pull/2493#discussion_r1204883835


##########
helix-core/src/main/java/org/apache/helix/controller/dataproviders/InstanceCapacityDataProvider.java:
##########
@@ -0,0 +1,59 @@
+package org.apache.helix.controller.dataproviders;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Map;
+
+/**
+ * An interface to provide capacity data for instance.
+ * It will consider the pending state transition and assigned partitions capacity.
+ * It will be dynamic and will always provide the available "headroom".
+ *
+ * The actual implementation will be stateful.
+ */
+public interface InstanceCapacityDataProvider {
+
+ /**
+   * Get the instance remaining capacity. 
+   * Capacity and weight both are represented as Key-Value.
+   * Returns the capacity map of available head room for the instance.
+   * @param instanceName - instance name to query
+   * @return Map<String, Integer> - capacity pair for all defined attributes for the instance.
+   */
+  public Map<String, Integer> getInstanceAvailableCapacity(String instanceName);
+
+  /**
+   * Check if partition can be placed on the instance.
+   *
+   * @param instanceName - instance name 
+   * @param partitionCapacity - Partition capacity expresed in capacity map.
+   * @return boolean - True if the partition can be placed, False otherwise
+   */
+  public boolean isInstanceCapacityAvailable(String instanceName, Map<String, Integer> partitionCapacity);

Review Comment:
   nit. Should it be "partitionWeight" but "partitionCapacity"? I see this class as one which provides "available" capacity of an instance and this method as to check given partition "weight" would it ok to place it on that instance.



##########
helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedInstanceCapacity.java:
##########
@@ -0,0 +1,164 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.Message;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.controller.dataproviders.InstanceCapacityDataProvider;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class WagedInstanceCapacity implements InstanceCapacityDataProvider {
+  private static final Logger LOG = LoggerFactory.getLogger(WagedInstanceCapacity.class);
+
+  // Available Capacity per Instance
+  private Map<String, Map<String, Integer>> _perInstance;
+  private ResourceControllerDataProvider _cache;
+
+  public WagedInstanceCapacity(ResourceControllerDataProvider clusterData) {
+    _cache = clusterData;
+    _perInstance = new HashMap<>();
+
+    ClusterConfig clusterConfig = _cache.getClusterConfig();
+    for (InstanceConfig instanceConfig : _cache.getInstanceConfigMap().values()) {
+      Map<String, Integer> instanceCapacity =
+        WagedValidationUtil.validateAndGetInstanceCapacity(clusterConfig, instanceConfig);
+      _perInstance.put(instanceConfig.getInstanceName(), instanceCapacity);
+    }
+  }
+
+  /**
+   * Create Default Capacity Map.
+   * This is a utility method to create a default capacity map matching instance capacity map for participants.
+   * This is required as non-WAGED partitions will be placed on same instance and we don't know their actual capacity.
+   * This will generate default values of 0 for all the capacity keys.
+   */
+  public Map<String, Integer> createDefaultParticipantWeight() {
+    // copy the value of first Instance capacity.
+    Map<String, Integer> partCapacity = new HashMap<>(_perInstance.values().iterator().next());
+
+    // reset the value of all capacity to 0.
+    for (String key : partCapacity.keySet()) {
+      partCapacity.put(key, 0);
+    }
+    return partCapacity;
+  }
+
+  /**
+   * Process the pending messages based on the Current states
+   * @param currentState - Current state of the resources.
+   */
+  public void processPendingMessages(CurrentStateOutput currentState) {

Review Comment:
   +1 to this.
   Everything else makes sense to me but this method looks like one method which goes over "all" resources and then reduce available instance capacity. 
   First we can update method name that suggests so. (may be like iteratePendingMsgsAndUpdateInstanceCapacity so something like that). And if that's too long name then atleast description should be updated. 



##########
helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedInstanceCapacity.java:
##########
@@ -0,0 +1,164 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.Message;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.controller.dataproviders.InstanceCapacityDataProvider;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class WagedInstanceCapacity implements InstanceCapacityDataProvider {

Review Comment:
   If I understand correctly then sole purpose of this class is update available instance capacity based on pending messages. If that's true then we can change class name which closely presents that or else add comments for that. 



##########
helix-core/src/main/java/org/apache/helix/controller/rebalancer/waged/WagedInstanceCapacity.java:
##########
@@ -0,0 +1,160 @@
+package org.apache.helix.controller.rebalancer.waged;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.apache.helix.controller.stages.CurrentStateOutput;
+import org.apache.helix.model.ClusterConfig;
+import org.apache.helix.model.InstanceConfig;
+import org.apache.helix.model.Message;
+import org.apache.helix.model.Partition;
+import org.apache.helix.model.ResourceConfig;
+import org.apache.helix.controller.dataproviders.InstanceCapacityDataProvider;
+import org.apache.helix.controller.dataproviders.ResourceControllerDataProvider;
+import org.apache.helix.controller.rebalancer.util.WagedValidationUtil;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class WagedInstanceCapacity implements InstanceCapacityDataProvider {
+  private static final Logger LOG = LoggerFactory.getLogger(WagedInstanceCapacity.class);
+
+  // Available Capacity per Instance
+  private final Map<String, Map<String, Integer>> _instanceCapacityMap;
+  private final ResourceControllerDataProvider _cache;
+
+  public WagedInstanceCapacity(ResourceControllerDataProvider clusterData) {

Review Comment:
   Another dumb que i had was AssignableNode also maintains this right? I understand it doesn't represent "temporary state" of an node but "ideally" we should update "remaining capacity" of the instance only when actual replica is mapped to that instance i..,e successful ST. here we make assumption that pending ST is going to be successful and hence proactively reduce instance capacity which may or may not be true (because that's actually going to be decided in your part 2 solution DelayedAutoRebalancer PR). I can suggest to clearly update the comments that it's temp "available" capacity of the instance.



##########
helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateOutput.java:
##########
@@ -460,4 +460,9 @@ public Map<String, ResourceAssignment> getAssignment(Set<String> resourceSet) {
 
     return currentStateAssignment;
   }
+
+
+  public Map<String, Map<Partition, Map<String, Message>>> getPendingMessages() {

Review Comment:
   We do have getPendingMessageMap() method in CurrentStateOutput.java right? Do we want another getter which gets for all resources?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org