Posted to user@helix.apache.org by Ming Fang <mi...@mac.com> on 2013/03/03 06:53:16 UTC

Failure detection time

How can I tune how long it takes to detect a failed node, e.g. one killed with kill -9?
Is it by setting "helixmanager.flappingTimeWindow"?

What is the fastest possible time for a failover?

Re: Failure detection time

Posted by kishore g <g....@gmail.com>.

Thanks. Pushed the fix.



Re: Failure detection time

Posted by Ming Fang <mi...@mac.com>.

It's just a one-line fix:
https://github.com/mingfang/apache-helix/commit/c7a7a840c9347cb362080619c53db23345b5ed10

I'm afraid writing a proper test to detect session timeout is beyond me at this point.


Re: Failure detection time

Posted by kishore g <g....@gmail.com>.

Thanks Ming, good catch. Do you mind submitting a patch and adding a test
case?

https://issues.apache.org/jira/browse/HELIX-55

Thanks,
Kishore G






Re: Failure detection time

Posted by Ming Fang <mi...@mac.com>.

I've tried setting the zk.session.timeout property from my participants, but I don't think it's working.
Looking at org.apache.helix.manager.zk.ZKHelixManager line 155, it seems the session timeout is set to the same value as helixmanager.flappingTimeWindow.
That looks like a bug, since these two values serve different purposes.

As a temporary workaround, this hack works:

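            // Requires: import java.lang.reflect.Field;
            //           import org.apache.helix.manager.zk.ZKHelixManager;
            manager = HelixManagerFactory.getZKHelixManager(CLUSTER_NAME, instanceName, InstanceType.PARTICIPANT, ZK_ADDRESS);
            {
                // Hack: use reflection to overwrite the non-public _sessionTimeout field.
                // getDeclaredField/setInt throw checked exceptions (NoSuchFieldException,
                // IllegalAccessException), so the enclosing method must declare or catch them.
                Field sessionTimeout = ZKHelixManager.class.getDeclaredField("_sessionTimeout");
                sessionTimeout.setAccessible(true);
                sessionTimeout.setInt(manager, 1000); // session timeout in milliseconds
            }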

Also, on the ZooKeeper side I set tickTime=500 and minSessionTimeout=1000.
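
A note on why the ZooKeeper-side settings matter: the server does not simply accept the session timeout a client asks for. It clamps the request between minSessionTimeout and maxSessionTimeout, which default to 2 and 20 times tickTime. The sketch below only models that clamping rule; the class and method names are made up for illustration and are not part of ZooKeeper or Helix.

```java
/** Illustrative model of how a ZooKeeper server bounds a requested session timeout. */
class NegotiatedTimeout {

    /**
     * Returns the timeout the server would grant, in milliseconds.
     * A value of -1 for minMs or maxMs means "not configured", in which case
     * the defaults of 2 * tickTime and 20 * tickTime apply.
     */
    static int negotiate(int requestedMs, int tickTimeMs, int minMs, int maxMs) {
        int min = (minMs == -1) ? 2 * tickTimeMs : minMs;
        int max = (maxMs == -1) ? 20 * tickTimeMs : maxMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        // Stock tickTime of 2000 ms: a 1000 ms request is raised to the 4000 ms minimum.
        System.out.println(negotiate(1000, 2000, -1, -1));
        // Settings from the message above: tickTime=500, minSessionTimeout=1000.
        System.out.println(negotiate(1000, 500, 1000, -1));
    }
}
```

So with an untuned server, a 1000 ms request from the participant would be granted as 4000 ms; lowering tickTime (or minSessionTimeout) is what lets a 1-second session timeout take effect.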


Re: Failure detection time

Posted by Ming Fang <mi...@mac.com>.

Thanks Kishore.

For our system we're going to start small.
It consists of 1 controller, 1 master, and 1 slave.
But the unplanned failover time must be under 1 second.

I tried setting zk.session.timeout to 1000 on the participants, but it doesn't seem to make a difference. It still takes 30 seconds for the controller to detect a killed node.
Do I have to set this property everywhere, e.g. on ZooKeeper, the controller, and the participants?



Re: Failure detection time

Posted by kishore g <g....@gmail.com>.

There are two kinds of failover: planned (e.g. during a software upgrade)
and unplanned (node crash, etc.).

For planned failover, add a JVM shutdown hook that invokes
helixmanager.disconnect(), and then invoke kill <pid>. This allows Helix
to detect the failure almost immediately, in about 5-15 milliseconds.
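
The planned-failover advice above could be sketched roughly as follows. To keep the example runnable without Helix on the classpath, Disconnectable is a hypothetical stand-in for the single HelixManager method used here; in a real participant you would pass the manager's disconnect() instead.

```java
/** Sketch: disconnect cleanly from a JVM shutdown hook. */
class GracefulShutdown {

    /** Hypothetical stand-in for the one HelixManager method used here. */
    interface Disconnectable {
        void disconnect();
    }

    /** Builds (but does not register) a hook thread that calls disconnect(). */
    static Thread newShutdownHook(Disconnectable manager) {
        return new Thread(manager::disconnect, "helix-disconnect-hook");
    }

    public static void main(String[] args) {
        // Placeholder; a real participant would pass manager::disconnect here.
        Disconnectable manager = () -> System.out.println("disconnect() called");

        // Hooks run on normal exit and on SIGTERM (a plain `kill <pid>`),
        // but never on SIGKILL (`kill -9`).
        Runtime.getRuntime().addShutdownHook(newShutdownHook(manager));
        System.out.println("hook registered");
    }
}
```

This is also why the hook only helps the planned case: a process killed with kill -9 gets no chance to run it, so detection falls back to the session timeout.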

For unplanned failover, detection time is determined by the ZooKeeper
session timeout, which defaults to 30 seconds. You can make this more
aggressive, e.g. 5, 10 or 15 seconds; the recommended value is 15 seconds.
You can change it by setting the system property "zk.session.timeout" to
15*1000 (milliseconds).
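
As an illustration of the system-property mechanism only: the property can be passed on the command line (-Dzk.session.timeout=15000) or set in code before the manager is created. The helper below is a hypothetical sketch of how such a property is typically read with a default; it is not Helix's actual code, and elsewhere in this thread the property is reported as not taking effect until the HELIX-55 fix.

```java
/** Sketch: set and read a millisecond timeout via a JVM system property. */
class SessionTimeoutProperty {

    static final String KEY = "zk.session.timeout";
    static final int DEFAULT_MS = 30_000; // the 30-second default mentioned above

    /** Returns the configured timeout in ms, or the default if unset or unparsable. */
    static int sessionTimeoutMs() {
        return Integer.getInteger(KEY, DEFAULT_MS);
    }

    public static void main(String[] args) {
        // Equivalent to starting the JVM with -Dzk.session.timeout=15000.
        System.setProperty(KEY, String.valueOf(15 * 1000));
        System.out.println(KEY + " = " + sessionTimeoutMs() + " ms");
    }
}
```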

helixmanager.flappingTimeWindow and helixmanager.maxDisconnectThreshold can
be tuned if you have a bad network or excessive GC pauses. You probably
don't need to tune these, but let me know if you need additional info on
them.

Failover time depends on the number of partitions, nodes, resources, etc.
in the system. For a 1000-partition system with 10 nodes, failover time for
one node might be 200-300 milliseconds.
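
A back-of-the-envelope way to combine the two figures above, assuming (this is an assumption, not something Helix guarantees) that the end-to-end time for an unplanned failure is roughly the session timeout plus the failover time:

```java
/** Rough estimate only: detection (session timeout) plus failover handling. */
class FailoverBudget {

    /** Assumes the two phases simply add up; real timings will vary. */
    static long estimateMs(long sessionTimeoutMs, long transitionMs) {
        return sessionTimeoutMs + transitionMs;
    }

    public static void main(String[] args) {
        long transitionMs = 300; // upper end of the 200-300 ms figure above
        for (long timeoutMs : new long[] {30_000, 15_000, 5_000}) {
            System.out.println("session timeout " + timeoutMs + " ms -> about "
                    + estimateMs(timeoutMs, transitionMs) + " ms end to end");
        }
    }
}
```

Under that assumption the session timeout dominates, so it is the setting to look at first when the failover target is tight.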

Jason has done a lot of performance improvements on another branch that
might improve this time further.

thanks,
Kishore G
