You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2022/01/27 19:18:31 UTC

[GitHub] [helix] qqu0127 opened a new pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

qqu0127 opened a new pull request #1948:
URL: https://github.com/apache/helix/pull/1948


   As part of Virtual topology group project, we introduce its assignment logic, test and benchmark in this PR.
   
   ### Issues
   
   - [ ] My PR addresses the following Helix issues and references them in the PR description:
   Resolve https://github.com/apache/helix/issues/1947
   
   ### Description
   
   - [ ] Here are some details about my PR, including screenshots of any UI changes:
   
   This PR introduces a set of virtual grouping schemes with unit test and benchmark. This is for computing a virtualization based on physical zones/fault domains on cloud environment. 
   The assignment is stateless and deterministic. We introduced 3 schemes for compare. We also introduce some metrics for the assignment in regard to evenness and fault tolerance.  
   
   The following PR will include integration, config change and service logic.
   
   ### Tests
   
   - [ ] The following tests are written for this issue:
   
   new unit test TestVirtualTopologyGroup
   
   - The following is the result of the "mvn test" command on the appropriate module:
   
   (If CI test fails due to known issue, please specify the issue and test PR locally. Then copy & paste the result of "mvn test" to here.)
   
   ### Changes that Break Backward Compatibility (Optional)
   
   - My PR contains changes that break backward compatibility or previous assumptions for certain methods or API. They include:
   
   (Consider including all behavior changes for public methods or API. Also include these changes in merge description so that other developers are aware of these changes. This allows them to make relevant code changes in feature branches accounting for the new method/API behavior.)
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on a change in pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on a change in pull request #1948:
URL: https://github.com/apache/helix/pull/1948#discussion_r797963293



##########
File path: helix-core/src/test/java/org/apache/helix/cloud/virtualTopologyGroup/TestVirtualTopologyGroupAssignment.java
##########
@@ -0,0 +1,160 @@
+package org.apache.helix.cloud.virtualTopologyGroup;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.cloud.constants.VirtualTopologyGroupConstants;
+import org.apache.helix.cloud.topology.VirtualGroupAssignmentAlgorithm;
+import org.apache.helix.cloud.topology.VirtualTopologyGroupScheme;
+import org.apache.helix.util.HelixUtil;
+import org.testng.Assert;
+import org.testng.annotations.BeforeTest;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Test;
+
+public class TestVirtualTopologyGroupAssignment {
+
+  private static final String GROUP_NAME = "test_virtual_group";
+  private final List<String> _flattenExpected = Arrays.asList(
+      "1", "2", "3",
+      "4", "5", "6",
+      "7", "8", "9",
+      "a", "b", "c", "d");
+  private Map<String, Set<String>> _zoneMapping = new HashMap<>();
+
+  @BeforeTest
+  public void prepare() {
+    _zoneMapping = new HashMap<>();
+    _zoneMapping.put("c", Sets.newHashSet("9", "8", "7"));
+    _zoneMapping.put("a", Sets.newHashSet("2", "3", "1"));
+    _zoneMapping.put("z", Sets.newHashSet("b", "c", "d", "a"));
+    _zoneMapping.put("b", Sets.newHashSet("5", "4", "6"));
+  }
+
+  @Test
+  public void testFlattenZoneMapping() {
+    Assert.assertEquals(HelixUtil.sortAndFlattenZoneMapping(_zoneMapping), _flattenExpected);
+  }
+
+  @Test(dataProvider = "getMappingTests")
+  public void testAssignmentScheme(int numGroups, Map<String, Set<String>> expected,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Assert.assertEquals(algorithm.computeAssignment(numGroups, GROUP_NAME, _zoneMapping), expected);
+  }
+
+  @Test
+  public void assignmentBenchmark() {
+    Map<String, Set<String>> testMapping = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    singleAssignmentBenchmark(testMapping, 13);
+    singleAssignmentBenchmark(testMapping, 20);
+
+    int numGroups = 20;
+    Map<String, Set<String>> testMapping2 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping2.get("zone_2").add("2_201");
+    testMapping2.get("zone_2").add("2_202");
+    testMapping2.get("zone_3").add("3_301");
+    testMapping2.get("zone_3").add("3_302");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping2, numGroups, VirtualTopologyGroupScheme.FIFO);
+
+    Map<String, Set<String>> testMapping3 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping3.get("zone_2").remove("2_13");
+    testMapping3.get("zone_2").remove("2_8");
+    testMapping3.get("zone_4").remove("4_8");
+    testMapping3.get("zone_4").remove("4_17");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping3, numGroups, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void singleAssignmentBenchmark(Map<String, Set<String>> testMapping, int numGroups) {
+    System.out.println("Test mapping with numGroups: " + numGroups);
+    System.out.println(testMapping);
+    System.out.println("==========");
+    validate(numGroups, testMapping, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void compareAssignmentUnderDifferentZoneMapping(
+      Map<String, Set<String>> baseMapping,
+      Map<String, Set<String>> testMapping,
+      int numGroups,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Map<String, Set<String>> baseAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, baseMapping);
+    Map<String, Set<String>> testAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, testMapping);
+    AssignmentEvaluation base = new AssignmentEvaluation(baseAssignment, baseMapping);
+    AssignmentEvaluation test = new AssignmentEvaluation(testAssignment, testMapping);
+    System.out.println("Diff for " + algorithm + " : " + base.compareAssignments(test));
+  }

Review comment:
       To clarify, it's for computing the stability of the assignment under cluster change (add or remove instances). It's part of the benchmark, so it's not validating anything, but compute and print statistics for reference.
   I do see some value for the benchmark utils, but if it's not the coding convention, I'm totally ok to remove.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 commented on a change in pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1948:
URL: https://github.com/apache/helix/pull/1948#discussion_r797969090



##########
File path: helix-core/src/test/java/org/apache/helix/cloud/virtualTopologyGroup/TestVirtualTopologyGroupAssignment.java
##########
@@ -0,0 +1,160 @@
+package org.apache.helix.cloud.virtualTopologyGroup;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.cloud.constants.VirtualTopologyGroupConstants;
+import org.apache.helix.cloud.topology.VirtualGroupAssignmentAlgorithm;
+import org.apache.helix.cloud.topology.VirtualTopologyGroupScheme;
+import org.apache.helix.util.HelixUtil;
+import org.testng.Assert;
+import org.testng.annotations.BeforeTest;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Test;
+
+public class TestVirtualTopologyGroupAssignment {
+
+  private static final String GROUP_NAME = "test_virtual_group";
+  private final List<String> _flattenExpected = Arrays.asList(
+      "1", "2", "3",
+      "4", "5", "6",
+      "7", "8", "9",
+      "a", "b", "c", "d");
+  private Map<String, Set<String>> _zoneMapping = new HashMap<>();
+
+  @BeforeTest
+  public void prepare() {
+    _zoneMapping = new HashMap<>();
+    _zoneMapping.put("c", Sets.newHashSet("9", "8", "7"));
+    _zoneMapping.put("a", Sets.newHashSet("2", "3", "1"));
+    _zoneMapping.put("z", Sets.newHashSet("b", "c", "d", "a"));
+    _zoneMapping.put("b", Sets.newHashSet("5", "4", "6"));
+  }
+
+  @Test
+  public void testFlattenZoneMapping() {
+    Assert.assertEquals(HelixUtil.sortAndFlattenZoneMapping(_zoneMapping), _flattenExpected);
+  }
+
+  @Test(dataProvider = "getMappingTests")
+  public void testAssignmentScheme(int numGroups, Map<String, Set<String>> expected,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Assert.assertEquals(algorithm.computeAssignment(numGroups, GROUP_NAME, _zoneMapping), expected);
+  }
+
+  @Test
+  public void assignmentBenchmark() {
+    Map<String, Set<String>> testMapping = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    singleAssignmentBenchmark(testMapping, 13);
+    singleAssignmentBenchmark(testMapping, 20);
+
+    int numGroups = 20;
+    Map<String, Set<String>> testMapping2 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping2.get("zone_2").add("2_201");
+    testMapping2.get("zone_2").add("2_202");
+    testMapping2.get("zone_3").add("3_301");
+    testMapping2.get("zone_3").add("3_302");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping2, numGroups, VirtualTopologyGroupScheme.FIFO);
+
+    Map<String, Set<String>> testMapping3 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping3.get("zone_2").remove("2_13");
+    testMapping3.get("zone_2").remove("2_8");
+    testMapping3.get("zone_4").remove("4_8");
+    testMapping3.get("zone_4").remove("4_17");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping3, numGroups, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void singleAssignmentBenchmark(Map<String, Set<String>> testMapping, int numGroups) {
+    System.out.println("Test mapping with numGroups: " + numGroups);
+    System.out.println(testMapping);
+    System.out.println("==========");
+    validate(numGroups, testMapping, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void compareAssignmentUnderDifferentZoneMapping(
+      Map<String, Set<String>> baseMapping,
+      Map<String, Set<String>> testMapping,
+      int numGroups,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Map<String, Set<String>> baseAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, baseMapping);
+    Map<String, Set<String>> testAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, testMapping);
+    AssignmentEvaluation base = new AssignmentEvaluation(baseAssignment, baseMapping);
+    AssignmentEvaluation test = new AssignmentEvaluation(testAssignment, testMapping);
+    System.out.println("Diff for " + algorithm + " : " + base.compareAssignments(test));
+  }

Review comment:
       So what's the expectation for people that run the test suite? Do they need to check the detail in the log to make sure the test runs correctly? Then it does not make much sense. If you can make it pass or fail with a standard, then you can keep it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] desaikomal commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
desaikomal commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1028296573


   This is my first review, so please correct me if i am wrong.
   1. Since this is a new feature, can we update the description with link to feature document
   2. Can you fill out API/backward compatibility related sections too?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1029294417


   This PR is ready to merge, approved by @zhangmeng916 
   
   Final commit message:
   
   Introduce VirtualTopologyGroup and its assignment logic
   We introduce VirtualTopologyGroup as a virtualization layer on top of physical fault zone, this commit includes its core component and assignment logic.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1028328887


   > don't we have naming convention wrt Interface and implementation Java Classes?
   I think I'm following the convention, what name specifically are you suggesting?
   
   > 
   > I agree that currently you have enum with one value and so in future, you will have more, then you will extend same interface differently?? re-using methods will not be possible with existing code.
   If we are following the enum approach, then each impl is a enum value with different method implementation. I don't see any blocker for code reusing even across impls, despite minor refactor. @desaikomal 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 commented on a change in pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1948:
URL: https://github.com/apache/helix/pull/1948#discussion_r797992524



##########
File path: helix-core/src/main/java/org/apache/helix/cloud/topology/VirtualGroupAssignmentAlgorithm.java
##########
@@ -0,0 +1,38 @@
+package org.apache.helix.cloud.topology;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Map;
+import java.util.Set;
+
+
+public interface VirtualGroupAssignmentAlgorithm {

Review comment:
       Add to @desaikomal 's comment regarding this. The convention is more like `baseProvider` and `dataProvider`, or `verifier` and `zkVerifier`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on a change in pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on a change in pull request #1948:
URL: https://github.com/apache/helix/pull/1948#discussion_r797970116



##########
File path: helix-core/src/test/java/org/apache/helix/cloud/virtualTopologyGroup/TestVirtualTopologyGroupAssignment.java
##########
@@ -0,0 +1,160 @@
+package org.apache.helix.cloud.virtualTopologyGroup;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.cloud.constants.VirtualTopologyGroupConstants;
+import org.apache.helix.cloud.topology.VirtualGroupAssignmentAlgorithm;
+import org.apache.helix.cloud.topology.VirtualTopologyGroupScheme;
+import org.apache.helix.util.HelixUtil;
+import org.testng.Assert;
+import org.testng.annotations.BeforeTest;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Test;
+
+public class TestVirtualTopologyGroupAssignment {
+
+  private static final String GROUP_NAME = "test_virtual_group";
+  private final List<String> _flattenExpected = Arrays.asList(
+      "1", "2", "3",
+      "4", "5", "6",
+      "7", "8", "9",
+      "a", "b", "c", "d");
+  private Map<String, Set<String>> _zoneMapping = new HashMap<>();
+
+  @BeforeTest
+  public void prepare() {
+    _zoneMapping = new HashMap<>();
+    _zoneMapping.put("c", Sets.newHashSet("9", "8", "7"));
+    _zoneMapping.put("a", Sets.newHashSet("2", "3", "1"));
+    _zoneMapping.put("z", Sets.newHashSet("b", "c", "d", "a"));
+    _zoneMapping.put("b", Sets.newHashSet("5", "4", "6"));
+  }
+
+  @Test
+  public void testFlattenZoneMapping() {
+    Assert.assertEquals(HelixUtil.sortAndFlattenZoneMapping(_zoneMapping), _flattenExpected);
+  }
+
+  @Test(dataProvider = "getMappingTests")
+  public void testAssignmentScheme(int numGroups, Map<String, Set<String>> expected,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Assert.assertEquals(algorithm.computeAssignment(numGroups, GROUP_NAME, _zoneMapping), expected);
+  }
+
+  @Test
+  public void assignmentBenchmark() {
+    Map<String, Set<String>> testMapping = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    singleAssignmentBenchmark(testMapping, 13);
+    singleAssignmentBenchmark(testMapping, 20);
+
+    int numGroups = 20;
+    Map<String, Set<String>> testMapping2 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping2.get("zone_2").add("2_201");
+    testMapping2.get("zone_2").add("2_202");
+    testMapping2.get("zone_3").add("3_301");
+    testMapping2.get("zone_3").add("3_302");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping2, numGroups, VirtualTopologyGroupScheme.FIFO);
+
+    Map<String, Set<String>> testMapping3 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping3.get("zone_2").remove("2_13");
+    testMapping3.get("zone_2").remove("2_8");
+    testMapping3.get("zone_4").remove("4_8");
+    testMapping3.get("zone_4").remove("4_17");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping3, numGroups, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void singleAssignmentBenchmark(Map<String, Set<String>> testMapping, int numGroups) {
+    System.out.println("Test mapping with numGroups: " + numGroups);
+    System.out.println(testMapping);
+    System.out.println("==========");
+    validate(numGroups, testMapping, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void compareAssignmentUnderDifferentZoneMapping(
+      Map<String, Set<String>> baseMapping,
+      Map<String, Set<String>> testMapping,
+      int numGroups,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Map<String, Set<String>> baseAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, baseMapping);
+    Map<String, Set<String>> testAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, testMapping);
+    AssignmentEvaluation base = new AssignmentEvaluation(baseAssignment, baseMapping);
+    AssignmentEvaluation test = new AssignmentEvaluation(testAssignment, testMapping);
+    System.out.println("Diff for " + algorithm + " : " + base.compareAssignments(test));
+  }

Review comment:
       Unlikely... That makes sense, I'll remove it in the next commit. Thanks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 merged pull request #1948: Introduce VirtualTopologyGroup and its assignment logic.

Posted by GitBox <gi...@apache.org>.
zhangmeng916 merged pull request #1948:
URL: https://github.com/apache/helix/pull/1948


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 commented on a change in pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1948:
URL: https://github.com/apache/helix/pull/1948#discussion_r797951883



##########
File path: helix-core/src/test/java/org/apache/helix/cloud/virtualTopologyGroup/TestVirtualTopologyGroupAssignment.java
##########
@@ -0,0 +1,160 @@
+package org.apache.helix.cloud.virtualTopologyGroup;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.collect.Sets;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.cloud.constants.VirtualTopologyGroupConstants;
+import org.apache.helix.cloud.topology.VirtualGroupAssignmentAlgorithm;
+import org.apache.helix.cloud.topology.VirtualTopologyGroupScheme;
+import org.apache.helix.util.HelixUtil;
+import org.testng.Assert;
+import org.testng.annotations.BeforeTest;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Test;
+
+public class TestVirtualTopologyGroupAssignment {
+
+  private static final String GROUP_NAME = "test_virtual_group";
+  private final List<String> _flattenExpected = Arrays.asList(
+      "1", "2", "3",
+      "4", "5", "6",
+      "7", "8", "9",
+      "a", "b", "c", "d");
+  private Map<String, Set<String>> _zoneMapping = new HashMap<>();
+
+  @BeforeTest
+  public void prepare() {
+    _zoneMapping = new HashMap<>();
+    _zoneMapping.put("c", Sets.newHashSet("9", "8", "7"));
+    _zoneMapping.put("a", Sets.newHashSet("2", "3", "1"));
+    _zoneMapping.put("z", Sets.newHashSet("b", "c", "d", "a"));
+    _zoneMapping.put("b", Sets.newHashSet("5", "4", "6"));
+  }
+
+  @Test
+  public void testFlattenZoneMapping() {
+    Assert.assertEquals(HelixUtil.sortAndFlattenZoneMapping(_zoneMapping), _flattenExpected);
+  }
+
+  @Test(dataProvider = "getMappingTests")
+  public void testAssignmentScheme(int numGroups, Map<String, Set<String>> expected,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Assert.assertEquals(algorithm.computeAssignment(numGroups, GROUP_NAME, _zoneMapping), expected);
+  }
+
+  @Test
+  public void assignmentBenchmark() {
+    Map<String, Set<String>> testMapping = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    singleAssignmentBenchmark(testMapping, 13);
+    singleAssignmentBenchmark(testMapping, 20);
+
+    int numGroups = 20;
+    Map<String, Set<String>> testMapping2 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping2.get("zone_2").add("2_201");
+    testMapping2.get("zone_2").add("2_202");
+    testMapping2.get("zone_3").add("3_301");
+    testMapping2.get("zone_3").add("3_302");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping2, numGroups, VirtualTopologyGroupScheme.FIFO);
+
+    Map<String, Set<String>> testMapping3 = createZoneMapping(new int[] {40, 40, 40, 40, 40}, "zone");
+    testMapping3.get("zone_2").remove("2_13");
+    testMapping3.get("zone_2").remove("2_8");
+    testMapping3.get("zone_4").remove("4_8");
+    testMapping3.get("zone_4").remove("4_17");
+    compareAssignmentUnderDifferentZoneMapping(testMapping, testMapping3, numGroups, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void singleAssignmentBenchmark(Map<String, Set<String>> testMapping, int numGroups) {
+    System.out.println("Test mapping with numGroups: " + numGroups);
+    System.out.println(testMapping);
+    System.out.println("==========");
+    validate(numGroups, testMapping, VirtualTopologyGroupScheme.FIFO);
+  }
+
+  private static void compareAssignmentUnderDifferentZoneMapping(
+      Map<String, Set<String>> baseMapping,
+      Map<String, Set<String>> testMapping,
+      int numGroups,
+      VirtualGroupAssignmentAlgorithm algorithm) {
+    Map<String, Set<String>> baseAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, baseMapping);
+    Map<String, Set<String>> testAssignment = algorithm.computeAssignment(numGroups, GROUP_NAME, testMapping);
+    AssignmentEvaluation base = new AssignmentEvaluation(baseAssignment, baseMapping);
+    AssignmentEvaluation test = new AssignmentEvaluation(testAssignment, testMapping);
+    System.out.println("Diff for " + algorithm + " : " + base.compareAssignments(test));
+  }

Review comment:
       Why do we need to include this comparison in our test suite? Since your algorithm is deterministic, the result won't change no matter what. And I don't see the point of keeping running it for each commit. What is the value of knowing the diff between base and test case? The test does not really validate/assert anything.

##########
File path: helix-core/src/test/java/org/apache/helix/cloud/virtualTopologyGroup/AssignmentEvaluation.java
##########
@@ -0,0 +1,126 @@
+package org.apache.helix.cloud.virtualTopologyGroup;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+
+public class AssignmentEvaluation {
+  private final Map<String, Set<String>> _assignment;
+  private final Map<String, Set<String>> _coveredZonesByVirtualGroup;
+  private final Map<String, Set<String>> _affectedVirtualGroupByZone;
+  private final Map<String, String> _zoneByInstance;
+  private final Map<String, String> _virtualGroupByInstance;
+
+  public AssignmentEvaluation(Map<String, Set<String>> assignment, Map<String, Set<String>> zoneMapping) {
+    _assignment = assignment;
+
+    _zoneByInstance = new HashMap<>();
+    _virtualGroupByInstance = new HashMap<>();
+    for (Map.Entry<String, Set<String>> entry : zoneMapping.entrySet()) {
+      entry.getValue().forEach(instance -> _zoneByInstance.put(instance, entry.getKey()));
+    }
+    // key is virtual group, value is the set of physical zones that the virtual group span across
+    _coveredZonesByVirtualGroup = new HashMap<>();
+    // key is physical zone, value is the set of affected virtual group if the physical zone is down
+    _affectedVirtualGroupByZone = new HashMap<>();
+    for (String virtualGroup : assignment.keySet()) {
+      _coveredZonesByVirtualGroup.put(virtualGroup, computeCoveredZones(_zoneByInstance, assignment.get(virtualGroup)));
+      for (String instance : assignment.get(virtualGroup)) {
+        String zone = _zoneByInstance.get(instance);
+        _virtualGroupByInstance.put(instance, virtualGroup);
+        _affectedVirtualGroupByZone.putIfAbsent(zone, new HashSet<>());
+        _affectedVirtualGroupByZone.get(zone).add(virtualGroup);
+      }
+    }
+  }
+
+  public void print() {
+    for (Map.Entry<String, Set<String>> entry : _coveredZonesByVirtualGroup.entrySet()) {
+      System.out.println("virtual group: " + entry.getKey() + ", " + "covered physical zones: " + entry.getValue());
+    }
+    for (Map.Entry<String, Set<String>> entry : _affectedVirtualGroupByZone.entrySet()) {
+      System.out.println("physical zone " + entry.getKey() + " impacts " + entry.getValue());
+    }
+    System.out.println(generateStats().toString());

Review comment:
       Are all these prints useful? If it won't help for any debugging purpose, but only gets used in one time testing. I suggest to remove them.

##########
File path: helix-core/src/main/java/org/apache/helix/cloud/topology/VirtualGroupAssignmentAlgorithm.java
##########
@@ -0,0 +1,38 @@
+package org.apache.helix.cloud.topology;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.Map;
+import java.util.Set;
+
+
+public interface VirtualGroupAssignmentAlgorithm {

Review comment:
       I suggest to change the class name to be more self explanatory. Currently it's not easy to see the relation of `VirtualTopologyGroupScheme` and `VirtualGroupAssignmentAlgorithm`. 

##########
File path: helix-core/src/main/java/org/apache/helix/cloud/topology/VirtualTopologyGroupScheme.java
##########
@@ -0,0 +1,73 @@
+package org.apache.helix.cloud.topology;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.cloud.constants.VirtualTopologyGroupConstants;
+import org.apache.helix.util.HelixUtil;
+
+public enum VirtualTopologyGroupScheme implements VirtualGroupAssignmentAlgorithm {
+
+  /**
+   * A strategy that densely assign virtual groups with input instance list, it doesn't move to the next one until
+   * the current one is filled.
+   * Given that instances.size = instancesPerGroup * numGroups + residuals,
+   * we break [residuals] into the first few groups, as a result each virtual group will have
+   * either [instancesPerGroup] or [instancesPerGroup + 1] instances.
+   */
+  FIFO {

Review comment:
       I'm a bit confused about the code structure here. You have an enum with one value only, do you plan to add more in the future? Then will the function `computeVirtualGroupId` work for all future scheme? I think you should have one class for one type of assigning scheme, which implements the interface.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on a change in pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on a change in pull request #1948:
URL: https://github.com/apache/helix/pull/1948#discussion_r797973921



##########
File path: helix-core/src/main/java/org/apache/helix/cloud/topology/VirtualTopologyGroupScheme.java
##########
@@ -0,0 +1,73 @@
+package org.apache.helix.cloud.topology;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import org.apache.helix.cloud.constants.VirtualTopologyGroupConstants;
+import org.apache.helix.util.HelixUtil;
+
+public enum VirtualTopologyGroupScheme implements VirtualGroupAssignmentAlgorithm {
+
+  /**
+   * A strategy that densely assign virtual groups with input instance list, it doesn't move to the next one until
+   * the current one is filled.
+   * Given that instances.size = instancesPerGroup * numGroups + residuals,
+   * we break [residuals] into the first few groups, as a result each virtual group will have
+   * either [instancesPerGroup] or [instancesPerGroup + 1] instances.
+   */
+  FIFO {

Review comment:
       Thanks for the comment.
   My original idea is to use Enum to achieve singleton pattern while implementing an interface. Future impl can be added as enum value.
   However, seems like it's causing more confusion than benefits, let me create standalone impl class instead. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1023580373


   @qqu0127 I would like a clarification please: the final choice of assignment logic is one of the three choices, is that correct? If so, the benchmarking should be done in offline code and only the best one should be selected and checked in. Let me know if I have a misunderstanding. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1023611265


   > @qqu0127 I would like a clarification please: the final choice of assignment logic is one of the three choices, is that correct? If so, the benchmarking should be done in offline code and only the best one should be selected and checked in. Let me know if I have a misunderstanding.
   
   @NealSun96 
   Eventually, yes, I expect only one is used for non-test. I open this and include all choices in order to share the code and as a reference to help better understand the design. 
   We have two options here, one is to merge this PR as it is with all 3, and cleanup in a following PR. This is targeting a project branch after all. 
   Second option is to update this PR after review with a new commit. I prefer the first one though.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1024430863


   @qqu0127 
   Since it's a feature branch, I will weigh your opinions more than my own. Although it seems odd to me that experiment code is checked in, I suppose the clean up can be done during the final merge. 
   Note that since this PR is not targeting master, github CI wouldn't kick off an automatic build & test process. If you wish to verify your changes, you need to do so locally. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] qqu0127 commented on pull request #1948: Introduce VirtualTopologyGroup and its assignment logic with benchmark.

Posted by GitBox <gi...@apache.org>.
qqu0127 commented on pull request #1948:
URL: https://github.com/apache/helix/pull/1948#issuecomment-1028089152


   Cleanup the unused code, this is ready for review. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org