Posted to dev@zookeeper.apache.org by Ted Dunning <te...@gmail.com> on 2013/06/28 07:38:05 UTC

Re: Make zookeeper more test friendly

+1  this is a very big deal




On Thu, Jun 27, 2013 at 6:39 PM, Thawan Kooburat <th...@fb.com> wrote:

> Many of the recent issues I have seen internally are due to incorrect
> handling, or insufficient testing, of ZooKeeper failure scenarios in custom
> wrapper APIs or in the applications.
>
> I am thinking that we might be able to expose a few more API calls that
> allow users to write unit tests covering various failure scenarios (similar
> to TestableZooKeeper in the ZooKeeper tests). This should also minimize the
> effort of setting up a test framework. Ideally we would have a mock client
> that doesn't need a running server, but I think that is too much effort to
> write and maintain for all the languages. Our internal test facility is a
> dedicated ensemble used by all unit tests. This ensures application logic
> correctness, but it makes it hard to test various failure scenarios.
>
> So my current thought is to expose the following functionality:
>
>   1.  A zookeeper_close() that doesn't actually send a close request to the
> server: This can be used to simulate a client crash without actually
> crashing the test program.
>   2.  Allow the client to force-trigger a CONNECTION_LOSS or SESSION_EXPIRE
> event: This will allow users to test their watchers and callbacks (and
> possible race conditions).
>
> Let me know if you have additional suggestions.
>
>
> --
> Thawan Kooburat
>
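
To make the second item in the proposal concrete, here is a minimal sketch of what a forced-event test could look like. It does not use the ZooKeeper library at all: SessionEvent, FakeSession and LockHolder are hypothetical names invented for this illustration, and the real clients report state changes through their own watcher and callback types. The only ZooKeeper behaviour it assumes is that ephemeral nodes survive a connection loss but are removed when the session expires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical event names, loosely following the ones used in the proposal.
enum SessionEvent { CONNECTED, CONNECTION_LOSS, SESSION_EXPIRED }

// Hypothetical test double: lets a test push session events at listeners
// without a running server.
class FakeSession {
    private final List<Consumer<SessionEvent>> listeners = new ArrayList<>();

    void register(Consumer<SessionEvent> listener) {
        listeners.add(listener);
    }

    // The "force trigger" hook: delivers the event to every listener.
    void inject(SessionEvent event) {
        for (Consumer<SessionEvent> listener : listeners) {
            listener.accept(event);
        }
    }
}

// Example application logic under test: a lock backed by an ephemeral node.
class LockHolder {
    private boolean connected;
    private boolean lockHeld;

    void onEvent(SessionEvent event) {
        switch (event) {
            case CONNECTED:
                connected = true;
                break;
            case CONNECTION_LOSS:
                // The ephemeral node may still exist, so keep the lock
                // but stop acting on it until we reconnect.
                connected = false;
                break;
            case SESSION_EXPIRED:
                // Ephemeral nodes are gone once the session expires.
                connected = false;
                lockHeld = false;
                break;
        }
    }

    // Stand-in for creating the ephemeral lock node.
    void acquire() {
        if (connected) {
            lockHeld = true;
        }
    }

    boolean canAct() { return connected && lockHeld; }

    boolean holdsLock() { return lockHeld; }
}
```

A test would register `holder::onEvent` with a FakeSession, inject CONNECTED, call `acquire()`, and then inject CONNECTION_LOSS or SESSION_EXPIRED to check that the application stops acting on the lock, or gives it up, at the right moment.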

Re: Make zookeeper more test friendly

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.

FYI

I've created the first patch for this series - it adds support for injecting a session expiration:

https://issues.apache.org/jira/browse/ZOOKEEPER-1730

-JZ

On Jul 6, 2013, at 8:40 AM, Jordan Zimmerman <jo...@jordanzimmerman.com> wrote:

> Is there a Jira to track this? I'd like to do some work on this. 
> 
> -Jordan

Re: Make zookeeper more test friendly

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.

Is there a Jira to track this? I'd like to do some work on this. 

-Jordan

On Jul 1, 2013, at 4:59 PM, Patrick Hunt <ph...@apache.org> wrote:

> It's come up a bunch of times before, would be great to find someone
> to drive this!
> http://markmail.org/message/vvk2ttrdhe6qqp2q
> 
> Patrick

Re: Make zookeeper more test friendly

Posted by Patrick Hunt <ph...@apache.org>.

It's come up a bunch of times before, would be great to find someone
to drive this!
http://markmail.org/message/vvk2ttrdhe6qqp2q

Patrick

On Fri, Jun 28, 2013 at 1:08 AM, kishore g <g....@gmail.com> wrote:
> +1
>
> We had to do similar stuff internally at LinkedIn, and most of the bugs we
> found were in session expiry/disconnect handling. We used a combination of
> iptables, SIGSTOP, and having another client connect with the same session
> id/password and then close that connection. This is non-trivial and
> requires some effort to wire up the different pieces.
>
> However, I would like to add that even though our test cases worked, we
> had weird issues during GCs, and sometimes during long GC pauses. GC on
> both server and client is problematic. For example, clients would get a
> session expiry and then a SyncConnected event, but before SyncConnected was
> processed there would be another session expiry. These scenarios are much
> harder to test for and reproduce.
>
> Thanks for taking this up.
>
> Thanks,
> Kishore G

Re: Make zookeeper more test friendly

Posted by kishore g <g....@gmail.com>.

+1

We had to do similar stuff internally at LinkedIn, and most of the bugs we
found were in session expiry/disconnect handling. We used a combination of
iptables, SIGSTOP, and having another client connect with the same session
id/password and then close that connection. This is non-trivial and
requires some effort to wire up the different pieces.

However, I would like to add that even though our test cases worked, we
had weird issues during GCs, and sometimes during long GC pauses. GC on
both server and client is problematic. For example, clients would get a
session expiry and then a SyncConnected event, but before SyncConnected was
processed there would be another session expiry. These scenarios are much
harder to test for and reproduce.

Thanks for taking this up.

Thanks,
Kishore G
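
The expiry / SyncConnected / expiry sequence described above can at least be replayed deterministically in a unit test if the events are modelled as plain data. The sketch below is a simulation under stated assumptions, not ZooKeeper code: SessionStateEvent and DeferredEventHandler are hypothetical names, sessions are identified by a simple integer, and the split between the client's event thread and the application's processing is collapsed into two method calls so that the ordering is fixed. One possible mitigation it illustrates is recording an expiry at delivery time, so that a SyncConnected processed later can tell its session is already dead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical event type: which session it belongs to and what happened.
final class SessionStateEvent {
    enum Kind { SYNC_CONNECTED, EXPIRED }

    final int session;
    final Kind kind;

    SessionStateEvent(int session, Kind kind) {
        this.session = session;
        this.kind = kind;
    }
}

class DeferredEventHandler {
    private final Deque<SessionStateEvent> pending = new ArrayDeque<>();
    private final Set<Integer> expiredSessions = new HashSet<>();
    final List<String> actions = new ArrayList<>();

    // Stand-in for the client's event thread: note the expiry right away,
    // then queue the event for the application to process later.
    void deliver(SessionStateEvent event) {
        if (event.kind == SessionStateEvent.Kind.EXPIRED) {
            expiredSessions.add(event.session);
        }
        pending.add(event);
    }

    // Stand-in for the application catching up, for example after a GC pause.
    void processPending() {
        SessionStateEvent event;
        while ((event = pending.poll()) != null) {
            if (event.kind == SessionStateEvent.Kind.EXPIRED) {
                actions.add("replace session " + event.session);
            } else if (expiredSessions.contains(event.session)) {
                // The session expired before we got around to setting it up.
                actions.add("skip setup for expired session " + event.session);
            } else {
                actions.add("set up session " + event.session);
            }
        }
    }
}
```

Delivering EXPIRED for session 1, SYNC_CONNECTED for session 2 and EXPIRED for session 2 before calling `processPending()` reproduces the scenario: the handler replaces session 1, skips setup for session 2 because it has already expired, and then replaces session 2.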

On Thu, Jun 27, 2013 at 10:38 PM, Ted Dunning <te...@gmail.com> wrote:

> +1  this is a very big deal