Posted to dev@helix.apache.org by Zhen Zhang <ne...@gmail.com> on 2014/11/21 20:05:21 UTC

Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/
-----------------------------------------------------------

Review request for helix and Shi Lu.


Repository: helix-git


Description
-------

[HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model
This is a workaround to avoid livelock in Helix controller @see HELIX-541


Diffs
-----

  helix-core/src/main/java/org/apache/helix/tools/StateModelConfigGenerator.java b8b3aeb 

Diff: https://reviews.apache.org/r/28342/diff/


Testing
-------


Thanks,

Zhen Zhang


Re: Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

Posted by Shi Lu <lu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62598
-----------------------------------------------------------

Ship it!


We still need an integration test for this.

- Shi Lu




Re: Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

Posted by Shi Lu <lu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62597
-----------------------------------------------------------

Ship it!


Ship It!

- Shi Lu




Re: Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

Posted by Zhen Zhang <ne...@gmail.com>.

> On Nov. 21, 2014, 7:10 p.m., Kishore Gopalakrishna wrote:
> > I am not convinced this is the right thing to do yet. Can we hold off on this? See my comment on https://issues.apache.org/jira/browse/HELIX-541
> 
> Kishore Gopalakrishna wrote:
>     I agree with the workaround, but I am not clear about the behavior described in HELIX-541. The root cause might be something else.

Assume current state is:
Node_0: LEADER
Node_1: STANDBY

We are using full-auto mode. Assume Node_0 holds more partitions than some other nodes, so the rebalancer tries to "migrate" some partitions from Node_0 to nodes that haven't reached their capacity yet. In this case, the rebalancer comes up with a new ideal-state:
Node_2: LEADER
Node_1: STANDBY

Now it's the Helix controller's responsibility to move from the current-state to the new ideal-state. If we think of all possible intermediate mappings as a graph, there are multiple paths from the current state to the ideal state:
Option1) 
Send LEADER->STANDBY to Node_0

Option2)
Send OFFLINE->STANDBY to Node_2

If the controller chooses Option1, it reaches a dead end. The root cause of the problem is that the Helix controller uses a greedy algorithm that only looks one step ahead. Given a graph and constraints on that graph, a greedy algorithm can't guarantee that it finds a feasible path.
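
To make the argument concrete, here is a minimal sketch in Java. It is not Helix code: the graph, the node names (current, afterOption1, afterOption2, ideal) and the class GreedyVsSearchSketch are all made up for illustration, and the fact that afterOption1 has no outgoing edge is simply assumed to mirror the claim above. It only shows the general point that a walk which always takes the highest-priority edge and never backtracks can get stuck where a search over the same graph finds a path.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GreedyVsSearchSketch {

  // Hypothetical toy graph: each key is an intermediate mapping; each value
  // lists the mappings reachable in one step, highest priority first.
  static Map<String, List<String>> toyGraph() {
    Map<String, List<String>> g = new HashMap<>();
    g.put("current", Arrays.asList("afterOption1", "afterOption2"));
    g.put("afterOption1", Collections.<String>emptyList()); // assumed dead end
    g.put("afterOption2", Arrays.asList("ideal"));
    g.put("ideal", Collections.<String>emptyList());
    return g;
  }

  // Greedy walk: always follow the highest-priority edge and never backtrack.
  // Returns the nodes visited; the last one differs from goal if the walk got stuck.
  static List<String> greedyWalk(Map<String, List<String>> g, String start, String goal) {
    List<String> path = new ArrayList<>();
    Set<String> seen = new HashSet<>();
    String node = start;
    while (seen.add(node)) { // stop if we would revisit a node (cycle)
      path.add(node);
      if (node.equals(goal)) {
        return path;
      }
      List<String> next = g.getOrDefault(node, Collections.<String>emptyList());
      if (next.isEmpty()) {
        return path; // dead end: no permitted step from here
      }
      node = next.get(0);
    }
    return path;
  }

  // Breadth-first search: explores all edges, so it finds a path if one exists.
  // Returns an empty list when the goal is unreachable.
  static List<String> breadthFirstPath(Map<String, List<String>> g, String start, String goal) {
    Map<String, String> parent = new HashMap<>();
    Deque<String> queue = new ArrayDeque<>();
    parent.put(start, null);
    queue.add(start);
    while (!queue.isEmpty()) {
      String node = queue.remove();
      if (node.equals(goal)) {
        List<String> path = new ArrayList<>();
        for (String n = node; n != null; n = parent.get(n)) {
          path.add(n);
        }
        Collections.reverse(path);
        return path;
      }
      for (String next : g.getOrDefault(node, Collections.<String>emptyList())) {
        if (!parent.containsKey(next)) {
          parent.put(next, node);
          queue.add(next);
        }
      }
    }
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    Map<String, List<String>> g = toyGraph();
    System.out.println(greedyWalk(g, "current", "ideal"));       // [current, afterOption1]
    System.out.println(breadthFirstPath(g, "current", "ideal")); // [current, afterOption2, ideal]
  }
}
```

In this toy graph, swapping the order of the two edges out of "current" is enough for the greedy walk to succeed, which is the intuition behind reordering transition priorities as a workaround rather than changing the algorithm.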


- Zhen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62601
-----------------------------------------------------------




Re: Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

Posted by Kishore Gopalakrishna <ki...@apache.org>.

> On Nov. 21, 2014, 7:10 p.m., Kishore Gopalakrishna wrote:
> > I am not convinced this is the right thing to do yet. Can we hold off on this? See my comment on https://issues.apache.org/jira/browse/HELIX-541

I agree with the workaround, but I am not clear about the behavior described in HELIX-541. The root cause might be something else.


- Kishore


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62601
-----------------------------------------------------------




Re: Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

Posted by Zhen Zhang <ne...@gmail.com>.

> On Nov. 21, 2014, 7:10 p.m., Kishore Gopalakrishna wrote:
> > I am not convinced this is the right thing to do yet. Can we hold off on this? See my comment on https://issues.apache.org/jira/browse/HELIX-541
> 
> Kishore Gopalakrishna wrote:
>     I agree with the workaround, but I am not clear about the behavior described in HELIX-541. The root cause might be something else.
> 
> Zhen Zhang wrote:
>     Assume current state is:
>     Node_0: LEADER
>     Node_1: STANDBY
>     
>     We are using full-auto mode. Assume Node_0 holds more partitions than some other nodes, so the rebalancer tries to "migrate" some partitions from Node_0 to nodes that haven't reached their capacity yet. In this case, the rebalancer comes up with a new ideal-state:
>     Node_2: LEADER
>     Node_1: STANDBY
>     
>     Now it's the Helix controller's responsibility to move from the current-state to the new ideal-state. If we think of all possible intermediate mappings as a graph, there are multiple paths from the current state to the ideal state:
>     Option1) 
>     Send LEADER->STANDBY to Node_0
>     
>     Option2)
>     Send OFFLINE->STANDBY to Node_2
>     
>     If the controller chooses Option1, it reaches a dead end. The root cause of the problem is that the Helix controller uses a greedy algorithm that only looks one step ahead. Given a graph and constraints on that graph, a greedy algorithm can't guarantee that it finds a feasible path.

Never mind. The priority is already in the correct order, so the root cause should be something else.


- Zhen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62601
-----------------------------------------------------------




Re: Review Request 28342: [HELIX-556] Reorder transition priority in LeaderStandby/MasterSlave state model

Posted by Kishore Gopalakrishna <ki...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28342/#review62601
-----------------------------------------------------------


I am not convinced this is the right thing to do yet. Can we hold off on this? See my comment on https://issues.apache.org/jira/browse/HELIX-541

- Kishore Gopalakrishna

