Posted to commits@helix.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/03/09 01:28:00 UTC

[jira] [Commented] (HELIX-676) Controller keeps updating idealstates when there is no real diff.

    [ https://issues.apache.org/jira/browse/HELIX-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392216#comment-16392216 ] 

ASF GitHub Bot commented on HELIX-676:
--------------------------------------

GitHub user jiajunwang opened a pull request:

    https://github.com/apache/helix/pull/143

    [HELIX-676] Fix the issue that the controller keep updating idealstates when there is no real diff.

    The causes of the problem are:
    
    1. A previous issue introduced into IntermediateStateCalcStage prevents ERROR/OFFLINE state replicas from being added to the intermediateState when the controller thinks recovery rebalance is not necessary.
    As a result, the stateMapping processed in the pipeline always differs from the cached IdealStates, which causes endless updating.
    
    2. A separate change in persistAssignmentStage is also related to this issue. When updating the map/list fields, we used putAll. This call keeps all existing items and only overwrites the entries that intersect with the new ones. Our previous assumption was that customized items should be allowed. However, while investigating this issue, we found that allowing customized items in the map/list fields is error-prone: Helix cannot tell whether an item was added by the controller or by users. So we decided to use clear and set instead. This ensures the map/list fields are always up to date.
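
The difference between the two update styles in point 2 can be sketched with plain java.util maps. This is only an illustration, not the Helix code: the class name, method names, and host/state values below are hypothetical, and the real map/list fields live in Helix records rather than bare HashMaps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Helix code base.
class MapFieldUpdateSketch {

    // Merge-style update: entries present only in the persisted map survive.
    static Map<String, String> updateWithPutAll(Map<String, String> persisted,
                                                Map<String, String> computed) {
        Map<String, String> result = new HashMap<>(persisted);
        result.putAll(computed);
        return result;
    }

    // Replace-style update: the result contains exactly the computed entries.
    static Map<String, String> updateWithClearAndSet(Map<String, String> persisted,
                                                     Map<String, String> computed) {
        Map<String, String> result = new HashMap<>(persisted);
        result.clear();
        result.putAll(computed);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> persisted = new HashMap<>();
        persisted.put("host_1", "MASTER");
        persisted.put("host_2", "SLAVE");
        persisted.put("host_3", "OFFLINE"); // absent from the computed mapping

        Map<String, String> computed = new HashMap<>();
        computed.put("host_1", "SLAVE");
        computed.put("host_2", "MASTER");

        // The merged map still carries host_3, so it never equals the computed one.
        System.out.println(updateWithPutAll(persisted, computed).equals(computed));      // false
        // After clear and set, the stored map matches the computed one exactly.
        System.out.println(updateWithClearAndSet(persisted, computed).equals(computed)); // true
    }
}
```

With the merge-style update, a stale entry can never be removed by the controller, so a comparison against the freshly computed mapping keeps reporting a difference.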

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiajunwang/helix helix-676

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/143.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #143
    
----
commit 9083950d9a0e5f5c377672fb5b21af13cdde0d9a
Author: Jiajun Wang <jj...@...>
Date:   2018-02-12T19:45:18Z

    [HELIX-676] Fix the issue that the controller keep updating idealstates when there is no real diff.
    
    The causes of the problem are:
    
    1. A previous issue introduced into IntermediateStateCalcStage prevents ERROR/OFFLINE state replicas from being added to the intermediateState when the controller thinks recovery rebalance is not necessary.
    As a result, the stateMapping processed in the pipeline always differs from the cached IdealStates, which causes endless updating.
    
    2. A separate change in persistAssignmentStage is also related to this issue. When updating the map/list fields, we used putAll. This call keeps all existing items and only overwrites the entries that intersect with the new ones. Our previous assumption was that customized items should be allowed. However, while investigating this issue, we found that allowing customized items in the map/list fields is error-prone: Helix cannot tell whether an item was added by the controller or by users. So we decided to use clear and set instead. This ensures the map/list fields are always up to date.
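
As a rough toy model of why the update never settles, the sketch below counts how many pipeline passes would write to the stored copy. It is not Helix code: the class, the filter, and the pass loop are all hypothetical, and it assumes the write is merge-style (putAll), as described in point 2 of the commit message.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; names and structure are assumptions, not the Helix pipeline.
class EndlessUpdateSketch {

    // Mimics the described behavior: ERROR/OFFLINE replicas are left out of the mapping.
    static Map<String, String> dropErrorAndOffline(Map<String, String> full) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : full.entrySet()) {
            String state = e.getValue();
            if (!state.equals("ERROR") && !state.equals("OFFLINE")) {
                out.put(e.getKey(), state);
            }
        }
        return out;
    }

    // Runs `runs` passes and returns how many of them wrote to the stored copy.
    static int countWrites(Map<String, String> cached, Map<String, String> computed, int runs) {
        Map<String, String> store = new HashMap<>(cached);
        int writes = 0;
        for (int i = 0; i < runs; i++) {
            if (!computed.equals(store)) {
                store.putAll(computed); // merge-style write: stale entries survive
                writes++;
            }
        }
        return writes;
    }

    public static void main(String[] args) {
        Map<String, String> full = new HashMap<>();
        full.put("host_1", "MASTER");
        full.put("host_2", "SLAVE");
        full.put("host_3", "OFFLINE");

        // Filtered mapping never matches the stored copy: one write per pass.
        System.out.println(countWrites(full, dropErrorAndOffline(full), 5)); // 5
        // Complete mapping matches the stored copy: no writes at all.
        System.out.println(countWrites(full, full, 5));                      // 0
    }
}
```

In this model every pass sees a difference and writes again, which matches the symptom of continuous idealstate updates reported in the issue.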

commit 08ff9d0b287ece2a17f2430ba82bd4f2373263e7
Author: Jiajun Wang <jj...@...>
Date:   2018-02-17T00:31:27Z

    [HELIX-676] Adding test for intermediate state cal stage.
    
    Testing 4 cases (recovery, load balance, recovery with transient states, and error partition blocks load balance).
    This test covers the recent fix for HELIX-676.

----


> Controller keeps updating idealstates when there is no real diff.
> -----------------------------------------------------------------
>
>                 Key: HELIX-676
>                 URL: https://issues.apache.org/jira/browse/HELIX-676
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Jiajun Wang
>            Assignee: Jiajun Wang
>            Priority: Major
>
> It has been confirmed that the controller may keep updating idealstates when PERSIST_****_STATE is true.
> This greatly increases ZK traffic and causes performance issues.


