You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/11/24 19:00:16 UTC

[GitHub] [helix] NealSun96 opened a new pull request #1550: Controller-side Task Current State Migration

NealSun96 opened a new pull request #1550:
URL: https://github.com/apache/helix/pull/1550


   ### Issues
   
   - [x] My PR addresses the following Helix issues and references them in the PR description:
   
   Fixes #1549 (Partially)
   
   ### Description
   
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   
   To address excessive ZooKeeper reads caused by task framework CurrentState updates, it is proposed to move task framework CurrentStates to their separate path. 
   
   The change is divided to multiple phases. This is the first phase, where the main focus of the changes are on the controller side. These changes should be backward compatible, and should not break anything even without further changes on participant sides
   
   ### Tests
   
   - [ ] The following tests are written for this issue:
   
   (List the names of added unit/integration tests)
   
   - [x] The following is the result of the "mvn test" command on the appropriate module:
   
   ```
   [INFO] Tests run: 1250, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5,198.587 s - in TestSuite
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 1250, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  01:26 h
   [INFO] Finished at: 2020-11-24T10:49:10-08:00
   [INFO] ------------------------------------------------------------------------
   ```
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534413070



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       When the project was first started, the agreement was that we keep all current behavior, which means merging task and non-task current states; due to possible naming conflicts, we allow regular current states to overwrite task ones. In terms of safety, this is the safest option since it doesn't modify behavior. 
   
   I think we definitely need to split it. For the future project, I am talking about code splitting: the current way is that the 2 pipelines are using one stage, which isn't truly split. I can do it here, but I have concerns over the timeline. This PR is only the first PR, and there are the "non-essential changes" and "participant-side changes" all requiring reviews. 
   
   I'm still looking into how this can be split, but if the intention here is "it's a good to have", then I'm against doing it for this PR due to time concerns. It wasn't how we planned it either. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533068598



##########
File path: helix-core/src/main/java/org/apache/helix/common/caches/TaskCurrentStateCache.java
##########
@@ -0,0 +1,59 @@
+package org.apache.helix.common.caches;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.common.controllers.ControlContextProvider;
+import org.apache.helix.model.CurrentState;
+import org.apache.helix.model.LiveInstance;
+
+
+/**
+ * Cache to hold all task CurrentStates of a cluster.
+ */
+public class TaskCurrentStateCache extends ParticipantStateCache<CurrentState> {

Review comment:
       No, the `PopulateParticipantKeys` function is different: it adds `taskCurrentState` keys. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r529817108



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/WorkflowControllerDataProvider.java
##########
@@ -78,14 +82,17 @@ private void refreshClusterStateChangeFlags(Set<HelixConstants.ChangeType> prope
         // This check (and set) is necessary for now since the current state flag in
         // _propertyDataChangedMap is not used by the BaseControllerDataProvider for now.
         _propertyDataChangedMap.get(HelixConstants.ChangeType.CURRENT_STATE).getAndSet(false)
-            || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE).getAndSet(false)
-            || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || _propertyDataChangedMap.get(HelixConstants.ChangeType.TASK_CURRENT_STATE)
+            .getAndSet(false) || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE)
+            .getAndSet(false) || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || propertyRefreshed.contains(HelixConstants.ChangeType.TASK_CURRENT_STATE)
             || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE);
   }
 
   public synchronized void refresh(HelixDataAccessor accessor) {
     long startTime = System.currentTimeMillis();
     Set<HelixConstants.ChangeType> propertyRefreshed = super.doRefresh(accessor);
+    _taskCurrentStateCache.refresh(accessor, getLiveInstanceCache().getPropertyMap());

Review comment:
       Why do we need to refreh here? Can TaskCurrentState get refreshed like regular current states?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -44,6 +44,7 @@
 import org.apache.helix.common.caches.CurrentStateCache;
 import org.apache.helix.common.caches.InstanceMessagesCache;
 import org.apache.helix.common.caches.PropertyCache;
+import org.apache.helix.common.caches.TaskCurrentStateCache;

Review comment:
       Unused import?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -79,7 +79,25 @@ public void process(ClusterEvent event) throws Exception {
 
     // It's important to get partitions from CurrentState as well since the
     // idealState might be removed.
-    processCurrentStates(cache, resourceMap, resourceToRebalance, idealStates, isTaskCache);
+    Map<String, LiveInstance> availableInstances = cache.getLiveInstances();
+    for (LiveInstance instance : availableInstances.values()) {
+      String instanceName = instance.getInstanceName();
+      String clientSessionId = instance.getEphemeralOwner();
+
+      Map<String, CurrentState> currentStateMap =
+          cache.getCurrentState(instanceName, clientSessionId);
+      processCurrentStates(currentStateMap, resourceMap, resourceToRebalance, idealStates,
+          isTaskCache);
+      // Duplicate resource names between regular and task resources may happen, but most likely
+      // won't. If it does, let regular resources overwrite task resources. To avoid duplicate
+      // resource overwriting, it's better to split regular and task pipelines entirely.

Review comment:
       Good comment

##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +90,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {

Review comment:
       Assuming we only refresh the task current state for the task pipeline then I got my answer to the above question? 👍 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r529833350



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/WorkflowControllerDataProvider.java
##########
@@ -78,14 +82,17 @@ private void refreshClusterStateChangeFlags(Set<HelixConstants.ChangeType> prope
         // This check (and set) is necessary for now since the current state flag in
         // _propertyDataChangedMap is not used by the BaseControllerDataProvider for now.
         _propertyDataChangedMap.get(HelixConstants.ChangeType.CURRENT_STATE).getAndSet(false)
-            || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE).getAndSet(false)
-            || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || _propertyDataChangedMap.get(HelixConstants.ChangeType.TASK_CURRENT_STATE)
+            .getAndSet(false) || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE)
+            .getAndSet(false) || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || propertyRefreshed.contains(HelixConstants.ChangeType.TASK_CURRENT_STATE)
             || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE);
   }
 
   public synchronized void refresh(HelixDataAccessor accessor) {
     long startTime = System.currentTimeMillis();
     Set<HelixConstants.ChangeType> propertyRefreshed = super.doRefresh(accessor);
+    _taskCurrentStateCache.refresh(accessor, getLiveInstanceCache().getPropertyMap());

Review comment:
       It's to save performance on the regular pipeline. Other caches refresh in `super.doRefresh(accessor)`, and TF specific caches can refresh here. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] dasahcc commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
dasahcc commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534415035



##########
File path: helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java
##########
@@ -527,8 +540,11 @@ public void invoke(NotificationContext changeContext) throws Exception {
       _zkClient.unsubscribeChildChanges(path, this);
     }
 
-    // List of children could be empty, but won't be null.
-    return _zkClient.getChildren(path);
+    try {
+      return _zkClient.getChildren(path);
+    } catch (ZkNoNodeException e) {
+      return null;
+    }

Review comment:
       My concern is if original behavior of other types will be changed as well. The exception we will see can be changed from NoNodeException to NullPointerException if you just return null for it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535768942



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -189,56 +189,69 @@ private void processCurrentStates(BaseControllerDataProvider cache,
         String instanceName = instance.getInstanceName();
         String clientSessionId = instance.getEphemeralOwner();
 
-        Map<String, CurrentState> currentStateMap =
-            cache.getCurrentState(instanceName, clientSessionId);
-        if (currentStateMap == null || currentStateMap.size() == 0) {
-          continue;
+        processCurrentStateMap(cache.getCurrentState(instanceName, clientSessionId), resourceMap,

Review comment:
       This logic should be in the cache, actually.
   
   getCurrentState(boolean isTaskPipeline) {
     currentstates = cache.getCurrentState(...)
     if (isTaskPipeline) {
       filter currentStates for the tasks only
       currentstates.addAll(cache.getTaskCurrentState(...))
     } else {
       filter currentStates for the resources only
     }
     return currentStates
   } 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] dasahcc commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
dasahcc commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533645122



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       So we already split them. Is this necessary to combine them into current state output? Can we just change the later logic get current state from cache?

##########
File path: helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java
##########
@@ -527,8 +540,11 @@ public void invoke(NotificationContext changeContext) throws Exception {
       _zkClient.unsubscribeChildChanges(path, this);
     }
 
-    // List of children could be empty, but won't be null.
-    return _zkClient.getChildren(path);
+    try {
+      return _zkClient.getChildren(path);
+    } catch (ZkNoNodeException e) {
+      return null;
+    }

Review comment:
       This changes the behavior, right? Can you elaborate what kind of scenario you will see this?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535507264



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -189,56 +189,69 @@ private void processCurrentStates(BaseControllerDataProvider cache,
         String instanceName = instance.getInstanceName();
         String clientSessionId = instance.getEphemeralOwner();
 
-        Map<String, CurrentState> currentStateMap =
-            cache.getCurrentState(instanceName, clientSessionId);
-        if (currentStateMap == null || currentStateMap.size() == 0) {
-          continue;
+        processCurrentStateMap(cache.getCurrentState(instanceName, clientSessionId), resourceMap,

Review comment:
       This is for backward compatibility, right?
   Can we create another private method to get the full task current state smartly? So you don't need 2 processCurrentStateMap calls here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534432829



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       I think it is just a name so it won't impact any logic.
   But task current state sounds redundent. Task state is good enough.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534499615



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/WorkflowControllerDataProvider.java
##########
@@ -78,14 +82,17 @@ private void refreshClusterStateChangeFlags(Set<HelixConstants.ChangeType> prope
         // This check (and set) is necessary for now since the current state flag in
         // _propertyDataChangedMap is not used by the BaseControllerDataProvider for now.
         _propertyDataChangedMap.get(HelixConstants.ChangeType.CURRENT_STATE).getAndSet(false)
-            || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE).getAndSet(false)
-            || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || _propertyDataChangedMap.get(HelixConstants.ChangeType.TASK_CURRENT_STATE)
+            .getAndSet(false) || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE)
+            .getAndSet(false) || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || propertyRefreshed.contains(HelixConstants.ChangeType.TASK_CURRENT_STATE)
             || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE);
   }
 
   public synchronized void refresh(HelixDataAccessor accessor) {
     long startTime = System.currentTimeMillis();
     Set<HelixConstants.ChangeType> propertyRefreshed = super.doRefresh(accessor);

Review comment:
       Resolved offline. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on pull request #1550:
URL: https://github.com/apache/helix/pull/1550#issuecomment-736132378


   @dasahcc As requested, I have trimmed down this PR: I removed all the non-essential changes, including REST and tools (utils) changes. 
   
   I kept both path building and controller logic change together. It doesn't make much sense for them to be separated. The controller change looks daunting, but it's actually just a refactor, so it's not very hard to review. Splitting it out would result too small of a PR. 
   
   Please let me know if this looks good. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533739162



##########
File path: helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java
##########
@@ -527,8 +540,11 @@ public void invoke(NotificationContext changeContext) throws Exception {
       _zkClient.unsubscribeChildChanges(path, this);
     }
 
-    // List of children could be empty, but won't be null.
-    return _zkClient.getChildren(path);
+    try {
+      return _zkClient.getChildren(path);
+    } catch (ZkNoNodeException e) {
+      return null;
+    }

Review comment:
       This change was because you reminded me about the instances without the task path. :) 
   
   For instance without TASKCURRENTSTATES path, the callback handler is still created, however, during reset(), subscribeChildChange() will be invoked and they will reach the above path. Since the path doesn't exist, this line causes a NoNodeException that's only caught on the outside by reset(). This causes some log pollution. 
   
   I synced with @pkuwm and we determined that this is unexpected and should be fixed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533743239



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       It may be possible to keep them split for the 2 pipelines. Experimenting. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533068390



##########
File path: helix-core/src/main/java/org/apache/helix/PropertyKey.java
##########
@@ -485,6 +486,52 @@ public PropertyKey currentState(String instanceName, String sessionId, String re
       }
     }
 
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance and session. This key
+     * is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentStates(String instanceName, String sessionId) {
+      return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+          sessionId);
+    }
+
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance, session, and
+     * resource. This key is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @param resourceName
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentState(String instanceName, String sessionId, String resourceName) {
+      return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+          sessionId, resourceName);
+    }
+
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance, session, resource,
+     * and bucket name. This key is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @param resourceName
+     * @param bucketName
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentState(String instanceName, String sessionId, String resourceName,
+        String bucketName) {
+      if (bucketName == null) {
+        return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+            sessionId, resourceName);
+
+      } else {

Review comment:
       Yep. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535501257



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       Reduce the length of code is big deal to me. If this is not a big effort, I suggest we do it.
   But this is my own opinion. I guess more ideas here will help. @dasahcc which name do you prefer?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532754427



##########
File path: helix-core/src/main/java/org/apache/helix/PropertyKey.java
##########
@@ -485,6 +486,52 @@ public PropertyKey currentState(String instanceName, String sessionId, String re
       }
     }
 
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance and session. This key

Review comment:
       Both current state and task current state are using the same CurrentState. Logically it's fine, but it is really confusing. Is this the ultimate solution? Do we have plan to rename?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones

Review comment:
       I have a high level question about this PR. It separates current states to different paths and update them separately. How do you ensure the backward compatibility? e.g., routing table provider listens to current state path change, but now task related current state changes are removed from that path, routing table provider does have a different behavior than before, right? This may be what you want? But still this is not backward compatible. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534471895



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       The name should reflect what it represents, and it represents a current state. 
   
   Are you also suggesting that the zk path needs to be named as `TASKSTATES`? If that's the case, then `CURRENTSTATES` should be renamed to `RESOURCESTATES` to show the "counter part" relation. 
   
   All in all, I really don't see a big problem with "task current state". 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] dasahcc commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
dasahcc commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r536313792



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       Personally, TASK_CURRENT_STATE sounds better in this case. That equivalents CURRENT_STATE we have right now. Also if we use TASK_STATE, that could be confused with the TaskState enum class. Actually that is pointing to the concept of Job level state. Although I know this is something not correct, unless we change everything, we need to keep our naming concept consistent.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on pull request #1550:
URL: https://github.com/apache/helix/pull/1550#issuecomment-738932730


   Do not forget to request to merge it to the Task_CurrentState branch


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533892483



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       Looks like the pipelines are stuck, I'm checking again. 
   
   I believe theoretically, it's possible to split the "CurrentStateOutput" for 2 pipelines. However, for this PR I would like to have minimal behavior changes. Combining the "CurrentStateOutput" is to maintain the task-side behavior; the task pipeline processes both regular and task framework resources, so there may be a caveat somewhere that's preventing this from working. If this isn't a big issue, I think it's wise for us to keep this behavior for now until a pipeline splitting project. What do you think? @dasahcc 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535653642



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       TASK_CURRENT_STATE is a completely fine name in all regard, in my opinion. It has to be at least a wrong name to warranty changes on all variables that expand for 3 PRs, that's my take. I'll also defer to @dasahcc .




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534403396



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       It's still CurrentState for now, so I prefer calling it TASK_CURRENT_STATE. It corresponds to the underlying model. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532770264



##########
File path: helix-core/src/main/java/org/apache/helix/PropertyKey.java
##########
@@ -485,6 +486,52 @@ public PropertyKey currentState(String instanceName, String sessionId, String re
       }
     }
 
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance and session. This key

Review comment:
       Down the line, task CurrentStates will have a different model. It's not within the scope of this change, though. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534440011



##########
File path: helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java
##########
@@ -527,8 +540,11 @@ public void invoke(NotificationContext changeContext) throws Exception {
       _zkClient.unsubscribeChildChanges(path, this);
     }
 
-    // List of children could be empty, but won't be null.
-    return _zkClient.getChildren(path);
+    try {
+      return _zkClient.getChildren(path);
+    } catch (ZkNoNodeException e) {
+      return null;
+    }

Review comment:
       All callers of this function (2 occasions) handle null cases. I believe this is the best solution when `path` doesn't exist, unless there's a better idea. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] pkuwm commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
pkuwm commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533006447



##########
File path: helix-core/src/main/java/org/apache/helix/HelixManager.java
##########
@@ -229,6 +229,15 @@ void addCurrentStateChangeListener(CurrentStateChangeListener listener, String i
   void addCurrentStateChangeListener(org.apache.helix.CurrentStateChangeListener listener, String instanceName,
       String sessionId) throws Exception;
 
+  /**
+   * Uses CurrentStateChangeListener since TaskCurrentState shares the same CurrentState model
+   * @see CurrentStateChangeListener#onStateChange(String, List, NotificationContext)
+   * @param listener
+   * @param instanceName
+   */
+  void addTaskCurrentStateChangeListener(CurrentStateChangeListener listener, String instanceName,

Review comment:
       I would suggest we add a default implementation for this newly added API for backwards compatibility. I've noticed that a helix user **ambry** implements this interface. So if we don't add a default implementation, it'll break ambry build if they upgrades helix.
   And another benefit of default implementation is, we don't need to add an empty implementation in those classes like `MockClusterManager` in tests. Thus less changed files. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani merged pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
alirezazamani merged pull request #1550:
URL: https://github.com/apache/helix/pull/1550


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535839252



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -189,56 +189,69 @@ private void processCurrentStates(BaseControllerDataProvider cache,
         String instanceName = instance.getInstanceName();
         String clientSessionId = instance.getEphemeralOwner();
 
-        Map<String, CurrentState> currentStateMap =
-            cache.getCurrentState(instanceName, clientSessionId);
-        if (currentStateMap == null || currentStateMap.size() == 0) {
-          continue;
+        processCurrentStateMap(cache.getCurrentState(instanceName, clientSessionId), resourceMap,

Review comment:
       Right now, it can't be a part of `BaseControllerDataProvider` if that's what you're referring to. `TaskCurrentStateCache` exists in `WorkflowControllerDataProvider` only, and it's for a performance reason since it only needs to be refreshed there, not in the regular pipeline. Without first determining `isTaskPipeline`, we can't parse the base data provider to the workflow data provider.
   
   Of course, we can change how that works. I'll experiment. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on pull request #1550:
URL: https://github.com/apache/helix/pull/1550#issuecomment-740889131


   This PR is ready to be merged, approved by @alirezazamani @dasahcc @jiajunwang 
   Final commit message:
   ## Controller-side Task Current State Migration ##
   First part of task current state migration. All changes made in this PR are on the controller side and are directly pipeline-impacting. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r529832726



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -44,6 +44,7 @@
 import org.apache.helix.common.caches.CurrentStateCache;
 import org.apache.helix.common.caches.InstanceMessagesCache;
 import org.apache.helix.common.caches.PropertyCache;
+import org.apache.helix.common.caches.TaskCurrentStateCache;

Review comment:
       Removed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on pull request #1550:
URL: https://github.com/apache/helix/pull/1550#issuecomment-733193530


   > So the participant side change will come later on. Right? Then this PR will be a new controller just listening on both paths? Basically, the integration tests are passing and they are confirming the controller is able to listen on both paths. But the actual functionality of the new path will be done later when you do participant side change?
   
   The goal I'm trying to achieve is that this PR can be separately rolled out/back without breaking anything. Your understanding is right that there's no code that's making use of the new path, but the listener is ready to listen and respond with the new path. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534586932



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       @dasahcc @jiajunwang I was very distracted by the argument itself that I lost sight to a very simple fact, my apology: splitting doesn't work here due to backward compatibility issues. If we completely split now, all task current states will fail to read for any participants with older helix versions. In another way, the splitting is already done in the sense that regular pipelines are not reading task current states anymore, which is an improvement done by this PR. :) 
   
   This PR itself doesn't introduce participant-side changes, so all task current states are still in the old path. Therefore complete split causes test problems. What I observed as "pipelines stuck" are overflowing logs caused by message handling, preventing me from seeing test outputs. In short, initialization messages are continuously generated due to missing current states from CurrentStateOutput, and that causes a log flood. If you're interested to find out more details, feel free to let me know. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534424426



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -79,7 +79,25 @@ public void process(ClusterEvent event) throws Exception {
 
     // It's important to get partitions from CurrentState as well since the
     // idealState might be removed.
-    processCurrentStates(cache, resourceMap, resourceToRebalance, idealStates, isTaskCache);
+    Map<String, LiveInstance> availableInstances = cache.getLiveInstances();
+    for (LiveInstance instance : availableInstances.values()) {
+      String instanceName = instance.getInstanceName();
+      String clientSessionId = instance.getEphemeralOwner();
+
+      Map<String, CurrentState> currentStateMap =
+          cache.getCurrentState(instanceName, clientSessionId);
+      processCurrentStates(currentStateMap, resourceMap, resourceToRebalance, idealStates,

Review comment:
       Since processing task current states and processing regular current states share the same logic, at one point along the code, the same logic will be factored out. It either happens in `process()` (processCurrentStates() is the "same logic"), or it happens in `processCurrentStates()` (requires another new function). I think it can happen in `process()`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532773007



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones

Review comment:
       That is what we want: routing table provider should not be triggered by task current states; you could see the issue description for more details.
   
   The backward compatibility needs to be ensured on the controller side, since this change should not impact the controller. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] zhangmeng916 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
zhangmeng916 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532785291



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones

Review comment:
       So a related question is that when we roll out this controller version, are we sure no client will notice the difference. I mean are we sure no client relies on current state change for any task related notification?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on pull request #1550:
URL: https://github.com/apache/helix/pull/1550#issuecomment-733190660


   I did initial parsing for this PR. Did not see any specific issue. Will look more soon.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534413070



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       When the project was first started, the agreement was that we keep all current behavior, which means merging task and non-task current states; due to possible naming conflicts, we allow regular current states to overwrite task ones. In terms of safety, this is the safest option since it doesn't modify behavior. 
   
   I think we definitely need to split it. For the future project, I am talking about code splitting: the current way is that the 2 pipelines are using one stage, which isn't truly split. I can do it here, but I have concerns over the timeline. This PR is only the first PR, and there are the "non-essential changes" and "participant-side changes" all requiring reviews. 
   
   I'm still looking into how this can be split, but if the intention here is "it's a good to have", then I'm against doing it for this PR due to time concerns and this is not a low hanging fruit apparently. It wasn't how we planned it either. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535565948



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -189,56 +189,69 @@ private void processCurrentStates(BaseControllerDataProvider cache,
         String instanceName = instance.getInstanceName();
         String clientSessionId = instance.getEphemeralOwner();
 
-        Map<String, CurrentState> currentStateMap =
-            cache.getCurrentState(instanceName, clientSessionId);
-        if (currentStateMap == null || currentStateMap.size() == 0) {
-          continue;
+        processCurrentStateMap(cache.getCurrentState(instanceName, clientSessionId), resourceMap,

Review comment:
       We always need to process current states from both paths. In the resource pipeline, we need the regular current states; in the task pipeline, we need task current states + regular for backward compatibility. 
   Could you elaborate on the "getting smartly" part? Do you mean: combine the current state maps so there's only one call? That would incur additional memory and computation cost. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] dasahcc commented on pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
dasahcc commented on pull request #1550:
URL: https://github.com/apache/helix/pull/1550#issuecomment-740254066


   Yeah. I dont have major concern and already approve it. If there is no other concerns, feel free to check in.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534404266



##########
File path: helix-core/src/main/java/org/apache/helix/PropertyKey.java
##########
@@ -485,6 +486,50 @@ public PropertyKey currentState(String instanceName, String sessionId, String re
       }
     }
 
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance and session. This key
+     * is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentStates(String instanceName, String sessionId) {
+      return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+          sessionId);
+    }
+
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance, session, and
+     * resource. This key is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @param resourceName
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentState(String instanceName, String sessionId, String resourceName) {

Review comment:
       Sure




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532788649



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones

Review comment:
       The assumption I'm under is that no routing table provider should rely on task current states; preventing it from listening to task current states is the main goal of this change. This assumption is proposed before starting this project, so I'm inclined to believe that it's true. Conceptually, it also doesn't make sense for routing table provider to rely on task current states. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534471895



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       The name should reflect what it represents, and it represents a current state. 
   
   Are you also suggesting that the zk path needs to be named as `TASKSTATES`? Personally, this name "TaskCurrentState" has been used everywhere from design doc to all variables in this PR. I don't see a big problem with it, so I don't think it's worth the effort to change all of them. The effort is not large but the return is even smaller. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532988566



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -179,66 +197,53 @@ private void processJobConfigs(WorkflowControllerDataProvider taskDataCache, Map
   /*
    * Construct Resources based on CurrentStates and add them to resource maps
    */
-  private void processCurrentStates(BaseControllerDataProvider cache,

Review comment:
       This change is just a refactor: same code before and after. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533908911



##########
File path: helix-core/src/main/java/org/apache/helix/HelixConstants.java
##########
@@ -34,6 +34,7 @@
     CLUSTER_CONFIG (PropertyType.CONFIGS),
     LIVE_INSTANCE (PropertyType.LIVEINSTANCES),
     CURRENT_STATE (PropertyType.CURRENTSTATES),
+    TASK_CURRENT_STATE (PropertyType.TASKCURRENTSTATES),

Review comment:
       nit, but might be important, can we just call it TASK_STATE for simplicity?

##########
File path: helix-core/src/main/java/org/apache/helix/PropertyKey.java
##########
@@ -485,6 +486,50 @@ public PropertyKey currentState(String instanceName, String sessionId, String re
       }
     }
 
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance and session. This key
+     * is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentStates(String instanceName, String sessionId) {
+      return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+          sessionId);
+    }
+
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance, session, and
+     * resource. This key is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @param resourceName
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentState(String instanceName, String sessionId, String resourceName) {

Review comment:
       Shall we call it jobName instead of resourceName?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +89,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {
+        Map<String, CurrentState> taskCurrentStateMap = ((WorkflowControllerDataProvider) cache)

Review comment:
       Strictly speaking, the pipelines are already split. So there won't be a separate project to split the pipeline further. Note that we have 2 pipelines already. The project we planned to do is split the code. So if you can do it in this PR, we will save some later work. Otherwise, any new code you added here using a branch might need to be re-write soon : )

##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -79,7 +79,25 @@ public void process(ClusterEvent event) throws Exception {
 
     // It's important to get partitions from CurrentState as well since the
     // idealState might be removed.
-    processCurrentStates(cache, resourceMap, resourceToRebalance, idealStates, isTaskCache);
+    Map<String, LiveInstance> availableInstances = cache.getLiveInstances();
+    for (LiveInstance instance : availableInstances.values()) {
+      String instanceName = instance.getInstanceName();
+      String clientSessionId = instance.getEphemeralOwner();
+
+      Map<String, CurrentState> currentStateMap =
+          cache.getCurrentState(instanceName, clientSessionId);
+      processCurrentStates(currentStateMap, resourceMap, resourceToRebalance, idealStates,

Review comment:
       Since processCurrentStates already takes isTaskCache, can we put these complex into the processCurrentStates() method?

##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/WorkflowControllerDataProvider.java
##########
@@ -78,14 +82,17 @@ private void refreshClusterStateChangeFlags(Set<HelixConstants.ChangeType> prope
         // This check (and set) is necessary for now since the current state flag in
         // _propertyDataChangedMap is not used by the BaseControllerDataProvider for now.
         _propertyDataChangedMap.get(HelixConstants.ChangeType.CURRENT_STATE).getAndSet(false)
-            || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE).getAndSet(false)
-            || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || _propertyDataChangedMap.get(HelixConstants.ChangeType.TASK_CURRENT_STATE)
+            .getAndSet(false) || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE)
+            .getAndSet(false) || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
+            || propertyRefreshed.contains(HelixConstants.ChangeType.TASK_CURRENT_STATE)
             || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE);
   }
 
   public synchronized void refresh(HelixDataAccessor accessor) {
     long startTime = System.currentTimeMillis();
     Set<HelixConstants.ChangeType> propertyRefreshed = super.doRefresh(accessor);

Review comment:
       Seems you did not update the propertyRefreshed flag for _taskCurrentStateCache, so the refreshClusterStateChangeFlags() won't work as expected.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] dasahcc commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
dasahcc commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r532999315



##########
File path: helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixAdmin.java
##########
@@ -193,6 +193,8 @@ public void addInstance(String clusterName, InstanceConfig instanceConfig) {
 
     _zkClient.createPersistent(PropertyPathBuilder.instanceMessage(clusterName, nodeId), true);
     _zkClient.createPersistent(PropertyPathBuilder.instanceCurrentState(clusterName, nodeId), true);
+    _zkClient

Review comment:
       Please remember to handle the instances are created but not have this path.

##########
File path: helix-core/src/main/java/org/apache/helix/common/caches/TaskCurrentStateCache.java
##########
@@ -0,0 +1,59 @@
+package org.apache.helix.common.caches;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.PropertyKey;
+import org.apache.helix.common.controllers.ControlContextProvider;
+import org.apache.helix.model.CurrentState;
+import org.apache.helix.model.LiveInstance;
+
+
+/**
+ * Cache to hold all task CurrentStates of a cluster.
+ */
+public class TaskCurrentStateCache extends ParticipantStateCache<CurrentState> {

Review comment:
       This is same logic as regular current state, right? We can create two different object for CurrentStateCache instead of creating a new class

##########
File path: helix-core/src/main/java/org/apache/helix/PropertyKey.java
##########
@@ -485,6 +486,52 @@ public PropertyKey currentState(String instanceName, String sessionId, String re
       }
     }
 
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance and session. This key
+     * is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentStates(String instanceName, String sessionId) {
+      return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+          sessionId);
+    }
+
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance, session, and
+     * resource. This key is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @param resourceName
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentState(String instanceName, String sessionId, String resourceName) {
+      return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+          sessionId, resourceName);
+    }
+
+    /**
+     * Get a property key associated with {@link CurrentState} of an instance, session, resource,
+     * and bucket name. This key is for TaskCurrentState specifically.
+     * @param instanceName
+     * @param sessionId
+     * @param resourceName
+     * @param bucketName
+     * @return {@link PropertyKey}
+     */
+    public PropertyKey taskCurrentState(String instanceName, String sessionId, String resourceName,
+        String bucketName) {
+      if (bucketName == null) {
+        return new PropertyKey(TASKCURRENTSTATES, CurrentState.class, _clusterName, instanceName,
+            sessionId, resourceName);
+
+      } else {

Review comment:
       This "else" keyword is not necessary. You can have it as:
   
   if (xxx) {
   return xxx
   }
   return xxxx;




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r533066464



##########
File path: helix-core/src/main/java/org/apache/helix/HelixManager.java
##########
@@ -229,6 +229,15 @@ void addCurrentStateChangeListener(CurrentStateChangeListener listener, String i
   void addCurrentStateChangeListener(org.apache.helix.CurrentStateChangeListener listener, String instanceName,
       String sessionId) throws Exception;
 
+  /**
+   * Uses CurrentStateChangeListener since TaskCurrentState shares the same CurrentState model
+   * @see CurrentStateChangeListener#onStateChange(String, List, NotificationContext)
+   * @param listener
+   * @param instanceName
+   */
+  void addTaskCurrentStateChangeListener(CurrentStateChangeListener listener, String instanceName,

Review comment:
       Thank you! Great tip!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534436443



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -79,7 +79,25 @@ public void process(ClusterEvent event) throws Exception {
 
     // It's important to get partitions from CurrentState as well since the
     // idealState might be removed.
-    processCurrentStates(cache, resourceMap, resourceToRebalance, idealStates, isTaskCache);
+    Map<String, LiveInstance> availableInstances = cache.getLiveInstances();
+    for (LiveInstance instance : availableInstances.values()) {
+      String instanceName = instance.getInstanceName();
+      String clientSessionId = instance.getEphemeralOwner();
+
+      Map<String, CurrentState> currentStateMap =
+          cache.getCurrentState(instanceName, clientSessionId);
+      processCurrentStates(currentStateMap, resourceMap, resourceToRebalance, idealStates,

Review comment:
       In this case, I will suggest extracting another private method to make process() code cleaner.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535768942



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -189,56 +189,69 @@ private void processCurrentStates(BaseControllerDataProvider cache,
         String instanceName = instance.getInstanceName();
         String clientSessionId = instance.getEphemeralOwner();
 
-        Map<String, CurrentState> currentStateMap =
-            cache.getCurrentState(instanceName, clientSessionId);
-        if (currentStateMap == null || currentStateMap.size() == 0) {
-          continue;
+        processCurrentStateMap(cache.getCurrentState(instanceName, clientSessionId), resourceMap,

Review comment:
       getCurrentState(boolean isTaskPipeline) {
     currentstates = cache.getCurrentState(...)
     if (isTaskPipeline) {
       filter currentStates for the tasks only
       currentstates.addAll(cache.getTaskCurrentState(...))
     } else {
       filter currentStates for the resources only
     }
     return currentStates
   } 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] dasahcc commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
dasahcc commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r535490469



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/WorkflowControllerDataProvider.java
##########
@@ -78,14 +82,16 @@ private void refreshClusterStateChangeFlags(Set<HelixConstants.ChangeType> prope
         // This check (and set) is necessary for now since the current state flag in
         // _propertyDataChangedMap is not used by the BaseControllerDataProvider for now.
         _propertyDataChangedMap.get(HelixConstants.ChangeType.CURRENT_STATE).getAndSet(false)
-            || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE).getAndSet(false)
-            || propertyRefreshed.contains(HelixConstants.ChangeType.CURRENT_STATE)
-            || propertyRefreshed.contains(HelixConstants.ChangeType.LIVE_INSTANCE);
+            || _propertyDataChangedMap.get(HelixConstants.ChangeType.TASK_CURRENT_STATE)
+            .getAndSet(false) || _propertyDataChangedMap.get(HelixConstants.ChangeType.MESSAGE)
+            .getAndSet(false) || propertyRefreshed
+            .contains(HelixConstants.ChangeType.LIVE_INSTANCE);

Review comment:
       Let's make the consistent format




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r529833549



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
##########
@@ -89,6 +90,13 @@ public void process(ClusterEvent event) throws Exception {
       String instanceSessionId = instance.getEphemeralOwner();
 
       // update current states.
+      // Like ResourceComputationStage, we give priority to regular resources, so update task ones
+      // first and allow regular ones to overwrite if there's any name conflicts.
+      if (_isTaskFrameworkPipeline) {

Review comment:
       See above. I believe we're referring to the same thing. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] NealSun96 commented on a change in pull request #1550: Controller-side Task Current State Migration

Posted by GitBox <gi...@apache.org>.
NealSun96 commented on a change in pull request #1550:
URL: https://github.com/apache/helix/pull/1550#discussion_r534515653



##########
File path: helix-core/src/main/java/org/apache/helix/controller/stages/ResourceComputationStage.java
##########
@@ -79,7 +79,25 @@ public void process(ClusterEvent event) throws Exception {
 
     // It's important to get partitions from CurrentState as well since the
     // idealState might be removed.
-    processCurrentStates(cache, resourceMap, resourceToRebalance, idealStates, isTaskCache);
+    Map<String, LiveInstance> availableInstances = cache.getLiveInstances();
+    for (LiveInstance instance : availableInstances.values()) {
+      String instanceName = instance.getInstanceName();
+      String clientSessionId = instance.getEphemeralOwner();
+
+      Map<String, CurrentState> currentStateMap =
+          cache.getCurrentState(instanceName, clientSessionId);
+      processCurrentStates(currentStateMap, resourceMap, resourceToRebalance, idealStates,

Review comment:
       Done. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org