Posted to commits@helix.apache.org by "subramanian raghunathan (JIRA)" <ji...@apache.org> on 2017/01/26 20:04:24 UTC

[jira] [Commented] (HELIX-652) Double assignment, when participant is not able to establish connection with zookeeper quorum

    [ https://issues.apache.org/jira/browse/HELIX-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840342#comment-15840342 ] 

subramanian raghunathan commented on HELIX-652:
-----------------------------------------------

Thoughts/inputs from Kishore:
Helix can handle this and probably should. A couple of challenges here:
1.	How to generalize this across all use cases. This is a trade-off between availability and ensuring there is only one leader per partition.
2.	There is a pathological case where all ZooKeeper nodes get partitioned, crash, or pause for GC at the same time. In this case, we will make all participants disconnect and assume they no longer own their partitions. But when the ZooKeeper ensemble comes out of GC, it continues as if nothing happened, i.e. it does not account for the time it was down. I can't think of a good solution for this scenario. Moreover, we cannot differentiate between a participant GC'ing or being partitioned vs. the ZK ensemble crashing, being partitioned, or GC'ing. This is typically avoided by ensuring ZK servers are deployed on different racks.
Having said that, I think implementing a config-based solution is worth it. A rough sketch of what such a participant-side policy could look like follows.
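
A minimal sketch of such a config-gated policy, written against the plain ZooKeeper watcher API rather than Helix internals (DisconnectAwareWatcher and dropLocalPartitions() are hypothetical names, not Helix APIs): on Disconnected, wait out one session timeout; if still disconnected, conservatively assume the session and its ephemeral live-instance node are gone and stop serving partitions.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;

    public class DisconnectAwareWatcher implements Watcher {
      private final ScheduledExecutorService timer =
          Executors.newSingleThreadScheduledExecutor();
      private final long sessionTimeoutMs;
      private volatile boolean connected = true;

      public DisconnectAwareWatcher(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
      }

      @Override
      public void process(WatchedEvent event) {
        switch (event.getState()) {
          case Disconnected:
            // Quorum unreachable; the session (and our ephemeral live-instance
            // node) may still be alive server-side, so wait out one session timeout.
            connected = false;
            timer.schedule(this::checkStillDisconnected, sessionTimeoutMs,
                TimeUnit.MILLISECONDS);
            break;
          case Expired:
            // Session is definitely gone: the controller may already be
            // reassigning our partitions, so stop serving them immediately.
            dropLocalPartitions();
            break;
          case SyncConnected:
            connected = true;
            break;
          default:
            break;
        }
      }

      private void checkStillDisconnected() {
        if (!connected) {
          // Still disconnected after a full session timeout: conservatively
          // assume the session expired server-side and another replica may now
          // be ONLINE for our partitions (the T1..T2 window described below).
          dropLocalPartitions();
        }
      }

      private void dropLocalPartitions() {
        // Hypothetical hook: transition every locally-ONLINE replica to OFFLINE.
      }
    }

This trades availability for safety, which is exactly the tension in point 1: a participant whose connection merely blips past the session timeout will drop partitions it could have kept.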


> Double assignment, when participant is not able to establish connection with zookeeper quorum
> ----------------------------------------------------------------------------------------------
>
>                 Key: HELIX-652
>                 URL: https://issues.apache.org/jira/browse/HELIX-652
>             Project: Apache Helix
>          Issue Type: Bug
>          Components: helix-core
>    Affects Versions: 0.7.1, 0.6.4
>            Reporter: subramanian raghunathan
>
> Double assignment, when participant is not able to establish connection with zookeeper quorum
>  
> Following is the setup (a rough code sketch of it follows this list):
> Version(s): Helix 0.7.1, ZooKeeper 3.3.4
>  
> - State model: OnlineOffline
> - Controller (leader elected from one of the cluster nodes)
> - A single resource with multiple partitions
> - FULL_AUTO rebalancer
> - ZooKeeper quorum (3 nodes)
>  
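> A rough sketch of this setup via the standard Helix admin API (the ZK
> addresses, cluster/resource names, and partition count are illustrative,
> not from the actual deployment):
>  
>   import org.apache.helix.HelixAdmin;
>   import org.apache.helix.manager.zk.ZKHelixAdmin;
>   import org.apache.helix.model.IdealState.RebalanceMode;
>   import org.apache.helix.model.StateModelDefinition;
>   import org.apache.helix.tools.StateModelConfigGenerator;
>  
>   public class SetupSketch {
>     public static void main(String[] args) {
>       HelixAdmin admin = new ZKHelixAdmin("zk1:2181,zk2:2181,zk3:2181");
>       admin.addCluster("TEST_CLUSTER");
>       // Register the OnlineOffline state model used by the resource.
>       admin.addStateModelDef("TEST_CLUSTER", "OnlineOffline",
>           new StateModelDefinition(StateModelConfigGenerator.generateConfigForOnlineOffline()));
>       // One resource with FULL_AUTO rebalancing and the OnlineOffline model.
>       admin.addResource("TEST_CLUSTER", "myResource", 8, "OnlineOffline",
>           RebalanceMode.FULL_AUTO.toString());
>       // One ONLINE replica per partition.
>       admin.rebalance("TEST_CLUSTER", "myResource", 1);
>     }
>   }
>  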
> When one participant loses its ZooKeeper connection (it cannot reach any of the ZooKeeper servers; a typical occurrence we faced was a switch failure on that rack or a network switch failure on the node):
>  
>   ----> The partition (P1) for which this participant (say Node N1) is ONLINE remains ONLINE on N1.
>  
> Meanwhile, since N1's ephemeral node in ZooKeeper is lost, the rebalancer gets triggered and reallocates the partition (P1) to another participant (say Node N2), which becomes ONLINE at time T1.
>  
>   ----> After this, both N1 and N2 act as ONLINE for the same partition (P1).
>  
> But as soon as the participant (Node N1) is able to re-establish the ZooKeeper connection, at time T2:
>   ----> Reset gets called on the partition in participant N1.
>  
> Double assignment:
> The question here: is it expected behavior that both nodes N1 and N2 can be ONLINE for the same partition (P1) during the window from T1 to T2?
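>  
> For reference, the "Reset" above lands in the participant's state model code.
> A minimal OnlineOffline state model looks roughly like this (a sketch; the
> class name and comment text are illustrative):
>  
>   import org.apache.helix.NotificationContext;
>   import org.apache.helix.model.Message;
>   import org.apache.helix.participant.statemachine.StateModel;
>   import org.apache.helix.participant.statemachine.StateModelInfo;
>   import org.apache.helix.participant.statemachine.Transition;
>  
>   @StateModelInfo(initialState = "OFFLINE", states = { "ONLINE", "OFFLINE" })
>   public class OnlineOfflineStateModel extends StateModel {
>     @Transition(to = "ONLINE", from = "OFFLINE")
>     public void onBecomeOnlineFromOffline(Message message, NotificationContext context) {
>       // Start serving message.getPartitionName(); this is where N2 went
>       // ONLINE at T1 while N1 still considered itself ONLINE.
>     }
>  
>     @Transition(to = "OFFLINE", from = "ONLINE")
>     public void onBecomeOfflineFromOnline(Message message, NotificationContext context) {
>       // Stop serving the partition.
>     }
>  
>     @Override
>     public void reset() {
>       // Invoked when N1 reconnects at T2, ending the double-assignment window.
>     }
>   }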



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)