You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/12/09 01:38:33 UTC

[GitHub] [helix] jiajunwang opened a new pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

jiajunwang opened a new pull request #1580:
URL: https://github.com/apache/helix/pull/1580


   ### Issues
   
   - [X] My PR addresses the following Helix issues and references them in the PR description:
   
   #1578 
   
   ### Description
   
   - [X] Here are some details about my PR, including screenshots of any UI changes:
   
   It was observed that the set methods in the Helix Data Provider classes, which are used but the Helix tools or tests, are not updating the derived fields.
   This makes the override incomplete and it causes some potential issues when the tools are used.
   This PR ensures that the set methods are modifying important derived fields.
   
   ### Tests
   
   - [X] The following tests are written for this issue:
   
   The main changed logic has been covered by the existing tests.
   It is hard, and maybe not a good idea to cover all the possible util tool usages.
   
   - [X] The following is the result of the "mvn test" command on the appropriate module:
   
   [ERROR] Failures:
   [ERROR]   TestDisableCustomCodeRunner.test:236 expected:<false> but was:<true>
   [INFO]
   [ERROR] Tests run: 1252, Failures: 1, Errors: 0, Skipped: 0
   [INFO]
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD FAILURE
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time: 01:32 h
   [INFO] Finished at: 2020-12-08T16:37:00-08:00
   [INFO] ------------------------------------------------------------------------
   
   Rerun the failed test:
   
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.117 s - in org.apache.helix.integration.TestDisableCustomCodeRunner
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time: 18.469 s
   [INFO] Finished at: 2020-12-08T17:36:03-08:00
   [INFO] ------------------------------------------------------------------------
   
   ### Documentation (Optional)
   
   - In case of new functionality, my PR adds documentation in the following wiki page:
   
   (Link the GitHub wiki you added)
   
   ### Commits
   
   - My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   
   - My diff has been formatted using helix-style.xml 
   (helix-style-intellij.xml if IntelliJ IDE is used)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-742968121


   > One bigger picture question:
   > 
   > Think from controller caching layer (of zk) point of view. The write of data is normally from tools or participant or even controller. But they do not use (should not use) these setClusterConfig etc methods of BaseControllerDataProvider.
   > 
   > Then who should use these set method? Note, these set method does not really persist the data to ZK. BaseControllerDataProvider is read only cache, right?
   > 
   > In fact, we should mark these set method test only? Otherwise, when people use them, they would be surprised:
   > 
   > * The data is not going to persist to next controller running cycle.
   
   I don't agree with you. The rebalancer tool is not test code. They are really used in production.
   The best we can do is create a child class based on the basic cache class. But that is out of the scope of this PR. And it is not an essential change.
   
   I suggest focusing on the issue fixing here. For the other topic, feel free to create other issues to track : )


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539711637



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());
+    updateDisabledInstances(getInstanceConfigMap().values(), getClusterConfig());

Review comment:
       This it to improve the design so we lower the chance of making such a mistake again.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());
+    updateDisabledInstances(getInstanceConfigMap().values(), getClusterConfig());

Review comment:
       This is to improve the design so we lower the chance of making such a mistake again.

##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());
+    updateDisabledInstances(getInstanceConfigMap().values(), getClusterConfig());

Review comment:
       This is to improve the style so we lower the chance of making such a mistake again.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang merged pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang merged pull request #1580:
URL: https://github.com/apache/helix/pull/1580


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r540445977



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -418,6 +418,8 @@ public ClusterConfig getClusterConfig() {
   public void setClusterConfig(ClusterConfig clusterConfig) {
     _clusterConfig = clusterConfig;
     refreshAbnormalStateResolverMap(_clusterConfig);
+    updateIdealRuleMap(_clusterConfig);
+    updateDisabledInstances(getInstanceConfigMap().values(), _clusterConfig);

Review comment:
       Then why this place we don't use `getClusterConfig()`. 
   
   Either way it is find to me. But please do use one approach. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539704431



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -744,30 +747,31 @@ private void updateOfflineInstanceHistory(HelixDataAccessor accessor) {
     _updateInstanceOfflineTime = false;
   }
 
-  private void updateDisabledInstances() {
+  private void updateDisabledInstances(Collection<InstanceConfig> instanceConfigs,
+      ClusterConfig clusterConfig) {
     // Move the calculating disabled instances to refresh
     _disabledInstanceForPartitionMap.clear();
     _disabledInstanceSet.clear();
-    for (InstanceConfig config : _instanceConfigCache.getPropertyMap().values()) {
+    for (InstanceConfig config : instanceConfigs) {
       Map<String, List<String>> disabledPartitionMap = config.getDisabledPartitionsMap();
       if (!config.getInstanceEnabled()) {
         _disabledInstanceSet.add(config.getInstanceName());
       }
       for (String resource : disabledPartitionMap.keySet()) {
         if (!_disabledInstanceForPartitionMap.containsKey(resource)) {
-          _disabledInstanceForPartitionMap.put(resource, new HashMap<String, Set<String>>());
+          _disabledInstanceForPartitionMap.put(resource, new HashMap<>());
         }
         for (String partition : disabledPartitionMap.get(resource)) {
           if (!_disabledInstanceForPartitionMap.get(resource).containsKey(partition)) {
-            _disabledInstanceForPartitionMap.get(resource).put(partition, new HashSet<String>());
+            _disabledInstanceForPartitionMap.get(resource).put(partition, new HashSet<>());
           }
           _disabledInstanceForPartitionMap.get(resource).get(partition)
               .add(config.getInstanceName());
         }
       }
     }
-    if (_clusterConfig != null && _clusterConfig.getDisabledInstances() != null) {
-      _disabledInstanceSet.addAll(_clusterConfig.getDisabledInstances().keySet());
+    if (clusterConfig != null && clusterConfig.getDisabledInstances() != null) {

Review comment:
       again, is `clusterConfig` passed in would always to the same as `_clusterConfig`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r540696299



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -418,6 +418,8 @@ public ClusterConfig getClusterConfig() {
   public void setClusterConfig(ClusterConfig clusterConfig) {
     _clusterConfig = clusterConfig;
     refreshAbnormalStateResolverMap(_clusterConfig);
+    updateIdealRuleMap(_clusterConfig);
+    updateDisabledInstances(getInstanceConfigMap().values(), _clusterConfig);

Review comment:
       I think I might need to clarify it a little bit more. Since you seem to be confused by the style things...
   
   I'm not sure what is the "approach" that you refer to... Calling the method or referring to the private fields are both OK. They are actually the same thing, right? Unless the method has some concurrent control or so.
   
   I don't see any reason why we have to use getClusterConfig() or _clusterConfig or clusterConfig in this context. They make a minor difference. If I have to compare getClusterConfig() is the worst, since it triggers one more method call.
   
   The difference is whether you pass the **INPUT** as the method parameters or directly refer to the private fields. This makes a difference. The latter introduces hidden dependencies between different private things. So this style is not good.
   This is the small improvement that I'm trying to do here.
   
   And obviously, this rule is not applicable in the set methods. Because they have to refer to the private fields, right?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-742966189


   > Change looks fine to me. Can you elaborate more we do this from external input? Why we cannot set cluster config and call these functions
   
   That's also a possible way. But the fact is that you cannot set and then trigger a partial refresh. So either you set to override one list, but the corresponding derived fields have old values. Or you call refresh and your set will be covered by the ZK data source.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-742863583


   One bigger picture question:
   
   Think from controller caching layer (of zk)  point of view. The write of data is normally from tools or participant or even controller. But they do not use (should not use) these setClusterConfig etc methods of BaseControllerDataProvider.
   
   Then who should use these set method? Note, these set method does not really persist the data to ZK. BaseControllerDataProvider is read only cache, right? 
   
   In fact, we should mark these set method test only? Otherwise, when people use them, they would be surprised:
   
   -  The data is not going to persist to next controller running cycle.
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-744741417


   > > One bigger picture question:
   > > Think from controller caching layer (of zk) point of view. The write of data is normally from tools or participant or even controller. But they do not use (should not use) these setClusterConfig etc methods of BaseControllerDataProvider.
   > > Then who should use these set method? Note, these set method does not really persist the data to ZK. BaseControllerDataProvider is read only cache, right?
   > > In fact, we should mark these set method test only? Otherwise, when people use them, they would be surprised:
   > > 
   > > * The data is not going to persist to next controller running cycle.
   > 
   > I don't agree with you. The rebalancer tool is not test code. They are really used in production.
   > The best we can do is create a child class based on the basic cache class. But that is out of the scope of this PR. And it is not an essential change.
   > 
   > I suggest focusing on the issue fixing here. For the other topic, feel free to create other issues to track : )
   
   The the question is this:
   
   Calling setClusterConfig method of BaseControllerDataProvider by tools would only change data in the cache. However, it is not going to change data in Zookeeper. Next controller run cycle, the data refreshed from BaseController would be old again.
   
    Is this something expected? I doubt.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-746794830


   Approved by @dasahcc, I will merge the change soon.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539698594



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());

Review comment:
       This seems to be a non-essential change? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539711470



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -744,30 +747,31 @@ private void updateOfflineInstanceHistory(HelixDataAccessor accessor) {
     _updateInstanceOfflineTime = false;
   }
 
-  private void updateDisabledInstances() {
+  private void updateDisabledInstances(Collection<InstanceConfig> instanceConfigs,
+      ClusterConfig clusterConfig) {
     // Move the calculating disabled instances to refresh
     _disabledInstanceForPartitionMap.clear();
     _disabledInstanceSet.clear();
-    for (InstanceConfig config : _instanceConfigCache.getPropertyMap().values()) {
+    for (InstanceConfig config : instanceConfigs) {
       Map<String, List<String>> disabledPartitionMap = config.getDisabledPartitionsMap();
       if (!config.getInstanceEnabled()) {
         _disabledInstanceSet.add(config.getInstanceName());
       }
       for (String resource : disabledPartitionMap.keySet()) {
         if (!_disabledInstanceForPartitionMap.containsKey(resource)) {
-          _disabledInstanceForPartitionMap.put(resource, new HashMap<String, Set<String>>());
+          _disabledInstanceForPartitionMap.put(resource, new HashMap<>());
         }
         for (String partition : disabledPartitionMap.get(resource)) {
           if (!_disabledInstanceForPartitionMap.get(resource).containsKey(partition)) {
-            _disabledInstanceForPartitionMap.get(resource).put(partition, new HashSet<String>());
+            _disabledInstanceForPartitionMap.get(resource).put(partition, new HashSet<>());
           }
           _disabledInstanceForPartitionMap.get(resource).get(partition)
               .add(config.getInstanceName());
         }
       }
     }
-    if (_clusterConfig != null && _clusterConfig.getDisabledInstances() != null) {
-      _disabledInstanceSet.addAll(_clusterConfig.getDisabledInstances().keySet());
+    if (clusterConfig != null && clusterConfig.getDisabledInstances() != null) {

Review comment:
       Please refer to the previous comment.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r543661663



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -418,6 +418,8 @@ public ClusterConfig getClusterConfig() {
   public void setClusterConfig(ClusterConfig clusterConfig) {
     _clusterConfig = clusterConfig;
     refreshAbnormalStateResolverMap(_clusterConfig);
+    updateIdealRuleMap(_clusterConfig);
+    updateDisabledInstances(getInstanceConfigMap().values(), _clusterConfig);

Review comment:
       Feel free to resolve this comment thread. The reason I ask is not for styling. It is that initially I not sure about the intention. I thought `getInstanceConfigMap()` had some tricks inside and that was why we did not use `_configMap`. If it is styling, no such intention, then either way is fine to me. Making them same style would also be good, but up to you.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539711223



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());

Review comment:
       Because setClusterConfig() is where you update _clusterConfig. And here, it is using it as an input.
   It is a good practice that we don't refer to a private field directly in a private method. Note that the main reason no one calls this method to update for now is that we hide it too deep, so no one remembers. If the parameters are explicitly passed, meaning input and output are clearly divided, then we avoid potential bugs.
   So, IMO this is tightly related to this change.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-742751878


   One general comment, shall we add a simple test to make sure this is working as expected?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539711637



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());
+    updateDisabledInstances(getInstanceConfigMap().values(), getClusterConfig());

Review comment:
       This it to improve the design so we don't make such a mistake again.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r540688969



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -418,6 +418,8 @@ public ClusterConfig getClusterConfig() {
   public void setClusterConfig(ClusterConfig clusterConfig) {
     _clusterConfig = clusterConfig;
     refreshAbnormalStateResolverMap(_clusterConfig);
+    updateIdealRuleMap(_clusterConfig);
+    updateDisabledInstances(getInstanceConfigMap().values(), _clusterConfig);

Review comment:
       I can change to "clusterConfig" instead of "_clusterConfig". But it makes very little difference.
   
   getClusterConfig() is a different story. It is in set method, we should not call get method in a set method. It is just confusing.
   
   Let's don't make it too complicated. My target is just to remove all the unnecessary private field references. And this specific method is a set method. It is necessary.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539696247



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());

Review comment:
       why here we use getClusterConfig() while later in `setClusterConfig` we use _clusterConfig?
   
   In fact, 
   
   ```
     public ClusterConfig getClusterConfig() {
       return _clusterConfig;
     }
   ```
   
   any specific reason?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang edited a comment on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang edited a comment on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-742966189


   > Change looks fine to me. Can you elaborate more we do this from external input? Why we cannot set cluster config and call these functions
   
   That's also a possible way. But the target refresh method is a private one. And I don't think we want to break the OO design here.
   So as a result you cannot set and then trigger a partial refresh from outside. So either you set to override one list, but the corresponding derived fields have old values. Or you call full refresh but your set data will be covered by the ZK data source.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] kaisun2000 commented on a change in pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
kaisun2000 commented on a change in pull request #1580:
URL: https://github.com/apache/helix/pull/1580#discussion_r539707574



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -384,8 +384,8 @@ public synchronized void refresh(HelixDataAccessor accessor) {
     _instanceMessagesCache.updateRelayMessages(_liveInstanceCache.getPropertyMap(),
         _currentStateCache.getParticipantStatesMap());
 
-    updateIdealRuleMap();
-    updateDisabledInstances();
+    updateIdealRuleMap(getClusterConfig());
+    updateDisabledInstances(getInstanceConfigMap().values(), getClusterConfig());

Review comment:
       This seems to be non-essential too, right? The problem is in the` setClusterConfig` and `setInstanceConfig`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] jiajunwang commented on pull request #1580: Update the internal fields of the Data Providr when overriding the data through set methods.

Posted by GitBox <gi...@apache.org>.
jiajunwang commented on pull request #1580:
URL: https://github.com/apache/helix/pull/1580#issuecomment-745642983


   > > > One bigger picture question:
   > > > Think from controller caching layer (of zk) point of view. The write of data is normally from tools or participant or even controller. But they do not use (should not use) these setClusterConfig etc methods of BaseControllerDataProvider.
   > > > Then who should use these set method? Note, these set method does not really persist the data to ZK. BaseControllerDataProvider is read only cache, right?
   > > > In fact, we should mark these set method test only? Otherwise, when people use them, they would be surprised:
   > > > 
   > > > * The data is not going to persist to next controller running cycle.
   > > 
   > > 
   > > I don't agree with you. The rebalancer tool is not test code. They are really used in production.
   > > The best we can do is create a child class based on the basic cache class. But that is out of the scope of this PR. And it is not an essential change.
   > > I suggest focusing on the issue fixing here. For the other topic, feel free to create other issues to track : )
   > 
   > The the question is this:
   > 
   > Calling setClusterConfig method of BaseControllerDataProvider by tools would only change data in the cache. However, it is not going to change data in Zookeeper. Next controller run cycle, the data refreshed from BaseController would be old again.
   > 
   > Is this something expected? I doubt.
   
   As I mentioned this is out of the scope of this PR. And I already proposed a solution in the previous discussion. Do you have any other comments about the change itself? Or I suggest that we discuss future enhancement in another thread.
   I think what you said is valid. But I'm not convinced that this is very important so we need to address in the near future. But if you don't agree, please feel free to create an issue for your concern. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org