Posted to user@zookeeper.apache.org by Adam Rosien <ad...@rosien.net> on 2012/05/18 19:34:47 UTC

cluster member was switched to standalone, detectable?

We have a 5-member 3.3.3 cluster. One node's configuration was
accidentally changed, and that node went into "standalone" mode, thinking
it was a single-node cluster. However, all our zk clients still had the
address of this server, and any client that connected to it got missing
or wrong data.

Is this situation avoidable somehow?

.. Adam

Re: cluster member was switched to standalone, detectable?

Posted by Adam Rosien <ad...@rosien.net>.
Valid modes are: leader, follower, standalone, observer

If one node goes into standalone mode, your code will not detect a problem,
because standalone is a valid mode.

Monitoring checks such as "ensure there is only one leader" and "ensure no
node is standalone" are good to have, but they require looking at all the
nodes.
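
A minimal sketch of such a cluster-wide check, assuming the mode of every
node has already been collected into a mapping of host to mode. The function
name check_ensemble and the host names zk1..zk5 are hypothetical, and the
"exactly one leader" rule assumes a single multi-node ensemble.

```python
from typing import Dict, List


def check_ensemble(modes: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems found in a host -> mode mapping."""
    problems = []
    # Any node reporting standalone has dropped out of the ensemble.
    for host in sorted(h for h, m in modes.items() if m == "standalone"):
        problems.append(f"{host} is in standalone mode")
    # A healthy ensemble should have exactly one leader (assumption of this sketch).
    leaders = [h for h, m in modes.items() if m == "leader"]
    if len(leaders) != 1:
        problems.append(f"expected exactly 1 leader, found {len(leaders)}")
    return problems


# Hypothetical 5-node ensemble in which zk5 was misconfigured.
modes = {
    "zk1": "leader",
    "zk2": "follower",
    "zk3": "follower",
    "zk4": "follower",
    "zk5": "standalone",
}
print(check_ensemble(modes))  # ['zk5 is in standalone mode']
```

The point of taking the whole mapping at once is that neither rule can be
evaluated from a single node's answer.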

On Fri, May 18, 2012 at 1:57 PM, Jordan Zimmerman <jz...@netflix.com> wrote:

> The 'srvr' command lists what mode the instance thinks it's in.
> Unfortunately, you have to manually parse it. If there's a quorum issue it
> outputs something like "This ZooKeeper is not serving requests".
>
> -JZ

Re: cluster member was switched to standalone, detectable?

Posted by Jordan Zimmerman <jz...@netflix.com>.
FYI - for an example you can look at the Exhibitor source.

https://github.com/Netflix/exhibitor/blob/master/exhibitor-core/src/main/java/com/netflix/exhibitor/core/state/Checker.java


It issues an 'ruok' and then a 'srvr'.

-JZ
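
The 'ruok' then 'srvr' sequence can be sketched roughly as follows. This is
not a transcription of Checker.java; the names send_four_letter and probe are
hypothetical, and the sketch assumes the server answers a four-letter command
on its client port and then closes the connection.

```python
import socket
from typing import Callable, Tuple


def send_four_letter(host: str, port: int, cmd: str, timeout: float = 5.0) -> str:
    """Send a four-letter command and return everything the server writes before closing."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: response is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def probe(host: str, port: int,
          send: Callable[[str, int, str], str] = send_four_letter) -> Tuple[bool, str]:
    """Issue 'ruok', then 'srvr'; return (answered_imok, raw_srvr_text)."""
    if send(host, port, "ruok").strip() != "imok":
        return False, ""
    return True, send(host, port, "srvr")
```

Calling probe("zk1.example.com", 2181) (a made-up host on the conventional
client port) would return the raw 'srvr' text for further parsing. The send
parameter exists only so the logic can be exercised without a live server.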

On 5/18/12 1:57 PM, "Jordan Zimmerman" <jz...@netflix.com> wrote:

>The 'srvr' command lists what mode the instance thinks it's in.
>Unfortunately, you have to manually parse it. If there's a quorum issue it
>outputs something like "This ZooKeeper is not serving requests".
>
>-JZ


Re: cluster member was switched to standalone, detectable?

Posted by Jordan Zimmerman <jz...@netflix.com>.
The 'srvr' command lists what mode the instance thinks it's in.
Unfortunately, you have to parse the output manually. If there's a quorum
issue, it outputs something like "This ZooKeeper is not serving requests".

-JZ
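
A minimal sketch of that manual parsing, assuming the 'srvr' response
contains a line of the form "Mode: <mode>". The function name parse_mode and
the sample text are illustrative; they are not taken from ZooKeeper or
Exhibitor.

```python
from typing import Optional


def parse_mode(srvr_output: str) -> Optional[str]:
    """Return the mode reported in 'srvr' output, or None if no Mode line is present."""
    for line in srvr_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "mode":
            return value.strip().lower()
    return None


# Illustrative, abbreviated response text (not captured from a real server).
sample = "Zxid: 0x100000004\nMode: standalone\nNode count: 4\n"
print(parse_mode(sample))  # standalone
```

A response with no Mode line, such as the "not serving requests" message,
yields None, which a monitor can treat as "not in a usable state".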

On 5/18/12 1:55 PM, "Adam Rosien" <ad...@rosien.net> wrote:

>Do the four-letter words tell me if a service joined the quorum correctly?
>What commands and responses will tell me?
>
>How do I know what cluster it joined? What if nodes X & Y are in cluster A
>but Z is in cluster B, should there be a cluster identifier to distinguish
>membership?


Re: cluster member was switched to standalone, detectable?

Posted by Adam Rosien <ad...@rosien.net>.
Do the four-letter words tell me whether a server joined the quorum
correctly? Which commands and responses will tell me?

How do I know which cluster it joined? If nodes X and Y are in cluster A
but Z is in cluster B, should there be a cluster identifier to distinguish
membership?

On Fri, May 18, 2012 at 12:05 PM, Patrick Hunt <ph...@apache.org> wrote:

> That would detect it, I don't think it's avoidable in the sense that
> we can't detect that type of mis-configuration and somehow handle it
> (ie stop). Your best bet would be to automate the process (and test
> that ahead of time), or bring up the new server with the client port
> set to something previously unused, then verify, then restart it with
> the client port set as it was originally. I often do this when
> debugging issues. (but that itself might cause problems wrt config
> typos). Another option is to use iptables (etc...) to turn off access
> to clients until you've verified the server joined the quorum
> correctly, then turn off the filter.
>
> Patrick

Re: cluster member was switched to standalone, detectable?

Posted by Patrick Hunt <ph...@apache.org>.
That would detect it. I don't think it's avoidable, in the sense that
we can't detect that type of misconfiguration and somehow handle it
(i.e., stop). Your best bet would be to automate the process (and test
that ahead of time), or to bring up the new server with the client port
set to something previously unused, verify it, and then restart it with
the client port set as it was originally. I often do this when
debugging issues (though that itself might cause problems with config
typos). Another option is to use iptables (etc.) to block client access
until you've verified that the server joined the quorum correctly, and
then remove the filter.

Patrick
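
The "verify, then expose to clients" step could be sketched as a polling
loop like the one below. It assumes some caller-supplied function that
returns the node's current mode (for example, parsed from 'srvr' output);
wait_until_joined, get_mode, and QUORUM_MODES are hypothetical names, and the
port change or iptables handling itself is not shown.

```python
import time
from typing import Callable, Optional

# Modes that indicate the node is part of an ensemble rather than standalone.
QUORUM_MODES = {"leader", "follower", "observer"}


def wait_until_joined(get_mode: Callable[[], Optional[str]],
                      attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll get_mode() until it reports a quorum mode; return False if it never does."""
    for attempt in range(attempts):
        if get_mode() in QUORUM_MODES:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give the node time to finish leader election
    return False
```

Only when this returns True would the automation restore the original client
port or remove the filter; a False result means the node should stay hidden
from clients.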

On Fri, May 18, 2012 at 11:51 AM, Jordan Zimmerman
<jz...@netflix.com> wrote:
> ZooKeeper has a telnet style interface for periodic querying.
>
> You could also use Exhibitor and query its REST API periodically. I
> should probably add alerting to Exhibitor for this kind of thing.
>
> -JZ

Re: cluster member was switched to standalone, detectable?

Posted by Jordan Zimmerman <jz...@netflix.com>.
ZooKeeper has a telnet-style interface for periodic querying.

You could also use Exhibitor and query its REST API periodically. I
should probably add alerting to Exhibitor for this kind of thing.

-JZ

On 5/18/12 10:34 AM, "Adam Rosien" <ad...@rosien.net> wrote:

>We have a 5-member 3.3.3 cluster. One of the node's configurations was
>accidentally changed, and that node went into "standalone" mode, thinking
>it was a single-node cluster. However, all our zk clients still had the
>address of this server, and when connected obviously got missing or wrong
>data.
>
>Is this situation avoidable somehow?
>
>.. Adam