Posted to dev@zookeeper.apache.org by Patrick Hunt <ph...@apache.org> on 2013/07/02 01:59:13 UTC

Re: Make zookeeper more test friendly

It's come up a bunch of times before; it would be great to find someone
to drive this!
http://markmail.org/message/vvk2ttrdhe6qqp2q

Patrick

On Fri, Jun 28, 2013 at 1:08 AM, kishore g <g....@gmail.com> wrote:
> +1
>
> We had to do similar stuff internally at LinkedIn, and most of the bugs we
> found were in the session expiry/disconnect handling. We used a
> combination of iptables, SIGSTOP, and having another client connect with the
> same session id/password and then close that connection. This is nontrivial
> and requires some effort to wire up the different pieces.
>
> However, I would like to add that even though our test cases worked, we
> had weird issues during GCs, and sometimes during long GC pauses. GC on both
> the server and the client is problematic. For example, clients would get a
> session expiry and then a SyncConnected event, but before SyncConnected was
> processed there would be another session expiry. These scenarios are much
> harder to test for and reproduce.
>
> Thanks for taking this up.
>
> Thanks,
> Kishore G
>
>
> On Thu, Jun 27, 2013 at 10:38 PM, Ted Dunning <te...@gmail.com> wrote:
>
>> +1  this is a very big deal
>>
>>
>>
>>
>> On Thu, Jun 27, 2013 at 6:39 PM, Thawan Kooburat <th...@fb.com> wrote:
>>
>> > Many recent issues that I saw internally are due to incorrect handling of,
>> > or insufficient testing of, ZooKeeper failure scenarios in the custom
>> > wrapper API or in the applications.
>> >
>> > I am thinking that we might be able to expose a few more API calls that
>> > allow users to write unit tests that cover various failure scenarios
>> > (similar to the TestableZooKeeper in the ZooKeeper tests). This should also
>> > minimize the effort of setting up the test framework. Ideally, we would
>> > have a mock client that doesn't need a running server, but I think it is
>> > too much effort to write and maintain for all the languages. Our internal
>> > test facility is a dedicated ensemble used by all unit tests.
>> > This ensures application logic correctness, but it is hard to test various
>> > failure scenarios.
>> >
>> > So my current thought is to expose the following functionality:
>> >
>> >   1.  A zookeeper_close() that doesn't actually send the close request to
>> > the server: This can be used to simulate a client crash without actually
>> > crashing the test program.
>> >   2.  Allow the client to force-trigger a CONNECTION_LOSS or SESSION_EXPIRE
>> > event: This will allow users to test their watchers and callbacks (and
>> > possible race conditions).
>> >
>> > Let me know if you have additional suggestions.
>> >
>> >
>> > --
>> > Thawan Kooburat
>> >
>>
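
For illustration only, the two hooks proposed above can be sketched against a toy in-memory stand-in. Nothing here is ZooKeeper API: FakeServer, FaultInjectingClient, the send_close_request flag, and the inject_* methods are hypothetical names invented for this sketch, and Python is used only to keep it short. The point is the shape of the hooks, not an implementation.

```python
from enum import Enum
from typing import Callable, List, Set


class State(Enum):
    SYNC_CONNECTED = "SyncConnected"
    DISCONNECTED = "Disconnected"
    EXPIRED = "Expired"
    CLOSED = "Closed"


class FakeServer:
    """Hypothetical stand-in for an ensemble; it only tracks live session ids."""

    def __init__(self) -> None:
        self.sessions: Set[int] = set()
        self._next_id = 1

    def open_session(self) -> int:
        sid = self._next_id
        self._next_id += 1
        self.sessions.add(sid)
        return sid

    def close_session(self, sid: int) -> None:
        self.sessions.discard(sid)


class FaultInjectingClient:
    """Hypothetical client exposing the two proposed test hooks."""

    def __init__(self, server: FakeServer, watcher: Callable[[State], None]) -> None:
        self._server = server
        self._watcher = watcher
        self.session_id = server.open_session()
        self._deliver(State.SYNC_CONNECTED)

    def _deliver(self, state: State) -> None:
        # Record the new state, then notify the application's watcher.
        self.state = state
        self._watcher(state)

    def close(self, send_close_request: bool = True) -> None:
        # Hook 1: with send_close_request=False the server is never told, so
        # the session stays registered, as it would after a client crash.
        if send_close_request:
            self._server.close_session(self.session_id)
        self.state = State.CLOSED

    def inject_connection_loss(self) -> None:
        # Hook 2a: the watcher sees a disconnect; the session itself survives.
        self._deliver(State.DISCONNECTED)

    def inject_session_expired(self) -> None:
        # Hook 2b: the server drops the session and the watcher is notified.
        self._server.close_session(self.session_id)
        self._deliver(State.EXPIRED)


# Simulated crash: the handle is closed but the session lingers on the server.
events: List[State] = []
server = FakeServer()
crashed = FaultInjectingClient(server, events.append)
crashed.close(send_close_request=False)
assert crashed.session_id in server.sessions
```

With hooks of this kind, a unit test could drive the expiry, reconnect, expiry ordering described earlier in the thread deterministically, by creating a second client with the same watcher and calling inject_session_expired() on each, instead of waiting for a GC pause to produce it.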

Re: Make zookeeper more test friendly

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
FYI

I've created the first patch for this series - it adds support for injecting a session expiration:

https://issues.apache.org/jira/browse/ZOOKEEPER-1730

-JZ

On Jul 6, 2013, at 8:40 AM, Jordan Zimmerman <jo...@jordanzimmerman.com> wrote:

> Is there a Jira to track this? I'd like to do some work on this. 
> 
> -Jordan
> 


Re: Make zookeeper more test friendly

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
Is there a Jira to track this? I'd like to do some work on this. 

-Jordan
