Posted to user@helix.apache.org by kishore g <g....@gmail.com> on 2014/02/27 01:08:25 UTC

Dependency on Zookeeper at runtime

Nice article by Pinterest folks on Zookeeper as SPoF.
http://engineering.pinterest.com/post/77933733851/zookeeper-resilience-at-pinterest

Though I agree with the problems, I am not sure I would go to the extent of
having separate daemons; that introduces more fault points.

However, with Helix we have designed the system to continue to work in the
current state if Zookeeper crashes. At least I had that goal during the
initial coding phase.

Basically, the system should work as if nothing happened. The only compromise
is that no more transitions can happen in the system while Zookeeper is down.
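
To make the intended property concrete, here is a minimal toy model of it, not
Helix code. The class ToyParticipant and all of its methods are hypothetical
names invented for this sketch. It assumes a participant keeps its last known
partition states in memory, keeps serving from them when the coordinator is
unreachable, and simply refuses new transitions until the connection returns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model (not a Helix class): a participant that keeps
// its current state while the coordinator (Zookeeper) is down.
class ToyParticipant {
    // Last known state per partition, held locally in memory.
    private final Map<String, String> currentState = new HashMap<>();
    private boolean coordinatorConnected = true;

    void setCoordinatorConnected(boolean connected) {
        coordinatorConnected = connected;
    }

    // Applies a state transition only while the coordinator is reachable.
    // Returns true if the transition was applied, false if it was refused.
    boolean applyTransition(String partition, String toState) {
        if (!coordinatorConnected) {
            return false; // no transitions while the coordinator is down
        }
        currentState.put(partition, toState);
        return true;
    }

    // The last known state stays readable whether or not we are connected.
    String stateOf(String partition) {
        return currentState.get(partition);
    }
}
```

An integration test for the real system would check the same two things: the
state observed before the outage is unchanged during it, and any transition
requested during the outage does not take effect.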

Should we add an integration test to always guarantee this property? Is
this valuable?

thanks,
Kishore G

RE: Dependency on Zookeeper at runtime

Posted by Kanak Biscuitwala <ka...@hotmail.com>.
Agreed that this is important. Looking at the current code, it seems to do the right thing (Helix ignores change in connection state unless there's flapping behavior). I can write a test.
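
For readers unfamiliar with the term, "flapping" here means the connection dropping repeatedly within a short period. The sketch below shows one common way to detect that, a sliding-window count of disconnects. It is an illustration only, not the Helix implementation: the class FlappingDetector, its parameters, and the thresholds are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch (not Helix code): flags "flapping" when more than
// maxDisconnects disconnect events fall inside a sliding time window.
class FlappingDetector {
    private final long windowMillis;
    private final int maxDisconnects;
    // Timestamps of recent disconnects, oldest first.
    private final Deque<Long> disconnectTimes = new ArrayDeque<>();

    FlappingDetector(long windowMillis, int maxDisconnects) {
        this.windowMillis = windowMillis;
        this.maxDisconnects = maxDisconnects;
    }

    // Records a disconnect at nowMillis and returns true if the connection
    // is now considered to be flapping. Timestamps must be non-decreasing.
    boolean recordDisconnect(long nowMillis) {
        disconnectTimes.addLast(nowMillis);
        // Evict disconnects that are older than the window.
        while (!disconnectTimes.isEmpty()
                && nowMillis - disconnectTimes.peekFirst() > windowMillis) {
            disconnectTimes.removeFirst();
        }
        return disconnectTimes.size() > maxDisconnects;
    }
}
```

With a detector like this, an isolated disconnect is ignored and the node keeps its current state, while a burst of disconnects inside the window is treated as a real problem.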

Kanak

________________________________
> Date: Wed, 26 Feb 2014 16:08:25 -0800 
> Subject: Dependency on Zookeeper at runtime 
> From: g.kishore@gmail.com 
> To: dev@helix.apache.org; user@helix.apache.org 
> 
> Nice article by Pinterest folks on Zookeeper as SPoF. 
> http://engineering.pinterest.com/post/77933733851/zookeeper-resilience-at-pinterest 
> 
> Though I agree with the problems, not sure I would go the extent of 
> having separate daemons, it introduces more fault points. 
> 
> However, with Helix we have designed the system to continue to work in 
> the current state if Zookeeper crashes. Atleast I had that goal during 
> initial coding phase. 
> 
> Basically the system to work as if nothing happened. The only 
> compromise is that no more transitions can happen in the system while 
> zookeeper is down. 
> 
> Should we add an integration test to always guarantee this property. Is 
> this valuable. 
> 
> thanks, 
> Kishore G
